summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2024-02-19Revert "usb: typec: tcpm: reset counter when enter into unattached state ↵Ondrej Jirman
after try role" The reverted commit makes the state machine only ever go from SRC_ATTACH_WAIT to SNK_TRY in endless loop when toggling. After revert it goes to SRC_ATTACHED after initially trying SNK_TRY earlier, as it should for toggling to ever detect the power source mode and the port is again able to provide power to attached power sinks. This reverts commit 2d6d80127006ae3da26b1f21a65eccf957f2d1e5. Cc: stable@vger.kernel.org Fixes: 2d6d80127006 ("usb: typec: tcpm: reset counter when enter into unattached state after try role") Signed-off-by: Ondrej Jirman <megi@xff.cz> Link: https://lore.kernel.org/r/20240217162023.1719738-1-megi@xff.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19usb: gadget: omap_udc: fix USB gadget regression on Palm TEAaro Koskinen
When upgrading from 6.1 LTS to 6.6 LTS, I noticed the ethernet gadget stopped working on Palm TE. Commit 8825acd7cc8a ("ARM: omap1: remove dead code") deleted Palm TE from machine_without_vbus_sense(), although the board is still used. Fix that. Fixes: 8825acd7cc8a ("ARM: omap1: remove dead code") Cc: stable <stable@kernel.org> Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240217192042.GA372205@darkstar.musicnaut.iki.fi Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19usb: dwc3: gadget: Don't disconnect if not startedThinh Nguyen
Don't go through soft-disconnection sequence if the controller hasn't started. Otherwise, there will be timeout and warning reports from the soft-disconnection flow. Cc: stable@vger.kernel.org Fixes: 61a348857e86 ("usb: dwc3: gadget: Fix NULL pointer dereference in dwc3_gadget_suspend") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Closes: https://lore.kernel.org/linux-usb/20240215233536.7yejlj3zzkl23vjd@synopsys.com/T/#mb0661cd5f9272602af390c18392b9a36da4f96e6 Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/e3be9b929934e0680a6f4b8f6eb11b18ae9c7e07.1708043922.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19usb: cdns3: fix memory double free when handle zero packetFrank Li
829 if (request->complete) { 830 spin_unlock(&priv_dev->lock); 831 usb_gadget_giveback_request(&priv_ep->endpoint, 832 request); 833 spin_lock(&priv_dev->lock); 834 } 835 836 if (request->buf == priv_dev->zlp_buf) 837 cdns3_gadget_ep_free_request(&priv_ep->endpoint, request); Driver append an additional zero packet request when queue a packet, which length mod max packet size is 0. When transfer complete, run to line 831, usb_gadget_giveback_request() will free this requestion. 836 condition is true, so cdns3_gadget_ep_free_request() free this request again. Log: [ 1920.140696][ T150] BUG: KFENCE: use-after-free read in cdns3_gadget_giveback+0x134/0x2c0 [cdns3] [ 1920.140696][ T150] [ 1920.151837][ T150] Use-after-free read at 0x000000003d1cd10b (in kfence-#36): [ 1920.159082][ T150] cdns3_gadget_giveback+0x134/0x2c0 [cdns3] [ 1920.164988][ T150] cdns3_transfer_completed+0x438/0x5f8 [cdns3] Add check at line 829, skip call usb_gadget_giveback_request() if it is additional zero length packet request. Needn't call usb_gadget_giveback_request() because it is allocated in this driver. Cc: stable@vger.kernel.org Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Signed-off-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20240202154217.661867-2-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19usb: cdns3: fixed memory use after free at cdns3_gadget_ep_disable()Frank Li
... cdns3_gadget_ep_free_request(&priv_ep->endpoint, &priv_req->request); list_del_init(&priv_req->list); ... 'priv_req' actually free at cdns3_gadget_ep_free_request(). But list_del_init() use priv_req->list after it. [ 1542.642868][ T534] BUG: KFENCE: use-after-free read in __list_del_entry_valid+0x10/0xd4 [ 1542.642868][ T534] [ 1542.653162][ T534] Use-after-free read at 0x000000009ed0ba99 (in kfence-#3): [ 1542.660311][ T534] __list_del_entry_valid+0x10/0xd4 [ 1542.665375][ T534] cdns3_gadget_ep_disable+0x1f8/0x388 [cdns3] [ 1542.671571][ T534] usb_ep_disable+0x44/0xe4 [ 1542.675948][ T534] ffs_func_eps_disable+0x64/0xc8 [ 1542.680839][ T534] ffs_func_set_alt+0x74/0x368 [ 1542.685478][ T534] ffs_func_disable+0x18/0x28 Move list_del_init() before cdns3_gadget_ep_free_request() to resolve this problem. Cc: stable@vger.kernel.org Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Signed-off-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20240202154217.661867-1-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19usb: roles: don't get/set_role() when usb_role_switch is unregisteredXu Yang
There is a possibility that usb_role_switch device is unregistered before the user put usb_role_switch. In this case, the user may still want to get/set_role() since the user can't sense the changes of usb_role_switch. This will add a flag to show if usb_role_switch is already registered and avoid unwanted behaviors. Fixes: fde0aa6c175a ("usb: common: Small class for USB role switches") cc: stable@vger.kernel.org Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20240129093739.2371530-2-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19usb: roles: fix NULL pointer issue when put module's referenceXu Yang
In current design, usb role class driver will get usb_role_switch parent's module reference after the user get usb_role_switch device and put the reference after the user put the usb_role_switch device. However, the parent device of usb_role_switch may be removed before the user put the usb_role_switch. If so, then, NULL pointer issue will be met when the user put the parent module's reference. This will save the module pointer in structure of usb_role_switch. Then, we don't need to find module by iterating long relations. Fixes: 5c54fcac9a9d ("usb: roles: Take care of driver module reference counting") cc: stable@vger.kernel.org Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20240129093739.2371530-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19usb: cdnsp: fixed issue with incorrect detecting CDNSP family controllersPawel Laszczak
Cadence have several controllers from 0x000403xx family but current driver suuport detecting only one with DID equal 0x0004034E. It causes that if someone uses different CDNSP controller then driver will use incorrect version and register space. Patch fix this issue. cc: stable@vger.kernel.org Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Signed-off-by: Pawel Laszczak <pawell@cadence.com> Link: https://lore.kernel.org/r/20240215121609.259772-1-pawell@cadence.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19usb: cdnsp: blocked some cdns3 specific codePawel Laszczak
host.c file has some parts of code that were introduced for CDNS3 driver and should not be used with CDNSP driver. This patch blocks using these parts of codes by CDNSP driver. These elements include: - xhci_plat_cdns3_xhci object - cdns3 specific XECP_PORT_CAP_REG register - cdns3 specific XECP_AUX_CTRL_REG1 register cc: stable@vger.kernel.org Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Signed-off-by: Pawel Laszczak <pawell@cadence.com> Link: https://lore.kernel.org/r/20240206104018.48272-1-pawell@cadence.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19usb: uhci-grlib: Explicitly include linux/platform_device.hAndreas Larsson
This fixes relying upon linux/of_platform.h to include linux/platform_device.h, which it no longer does, thereby fixing compilation problems like: In file included from drivers/usb/host/uhci-hcd.c:850: drivers/usb/host/uhci-grlib.c: In function 'uhci_hcd_grlib_probe': drivers/usb/host/uhci-grlib.c:92:29: error: invalid use of undefined type 'struct platform_device' 92 | struct device_node *dn = op->dev.of_node; | ^~ Fixes: ef175b29a242 ("of: Stop circularly including of_device.h and of_platform.h") Signed-off-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20240129075056.1511630-1-andreas@gaisler.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-18Merge tag 'irq_urgent_for_v6.8_rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix GICv4.1 affinity update - Restore a quirk for ACPI-based GICv4 systems - Handle non-coherent GICv4 redistributors properly - Prevent spurious interrupts on Broadcom devices using GIC v3 architecture - Other minor fixes * tag 'irq_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update irqchip/gic-v3-its: Restore quirk probing for ACPI-based systems irqchip/gic-v3-its: Handle non-coherent GICv4 redistributors irqchip/qcom-mpm: Fix IS_ERR() vs NULL check in qcom_mpm_init() irqchip/loongson-eiointc: Use correct struct type in eiointc_domain_alloc() irqchip/irq-brcmstb-l2: Add write memory barrier before exit
2024-02-18Merge tag 'i2c-for-6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Two fixes for i801 and qcom-geni devices. Meanwhile, a fix from Arnd addresses a compilation error encountered during compile test on powerpc" * tag 'i2c-for-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: i801: Fix block process call transactions i2c: pasemi: split driver into two separate modules i2c: qcom-geni: Correct I2C TRE sequence
2024-02-18net: bcmasp: Sanity check is off by oneJustin Chen
A sanity check for OOB write is off by one leading to a false positive when the array is full. Fixes: 9b90aca97f6d ("net: ethernet: bcmasp: fix possible OOB write in bcmasp_netfilt_get_all_active()") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-18net: bcmasp: Indicate MAC is in charge of PHY PMFlorian Fainelli
Avoid the PHY library call unnecessarily into the suspend/resume functions by setting phydev->mac_managed_pm to true. The ASP driver essentially does exactly what mdio_bus_phy_resume() does. Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Justin Chen <justin.chen@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-17net: stmmac: Fix incorrect dereference in interrupt handlersPavel Sakharov
If 'dev' or 'data' is NULL, the 'priv' variable has an incorrect address when dereferencing calling netdev_err(). Since we get as 'dev_id' or 'data' what was passed as the 'dev' argument to request_irq() during interrupt initialization (that is, the net_device and rx/tx queue pointers initialized at the time of the call) and since there are usually no checks for the 'dev_id' argument in such handlers in other drivers, remove these checks from the handlers in stmmac driver. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 8532f613bc78 ("net: stmmac: introduce MSI Interrupt routines for mac, safety, RX & TX") Signed-off-by: Pavel Sakharov <p.sakharov@ispras.ru> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-17Merge tag 'driver-core-6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are some driver core fixes, a kobject fix, and a documentation update for 6.8-rc5. In detail these changes are: - devlink fixes for reported issues with 6.8-rc1 - topology scheduling regression fix that has been reported by many - kobject loosening of checks change in -rc1 is now reverted as some codepaths seemed to need the checks - documentation update for the CVE process. Has been reviewed by many, the last minute change to the document was to bring the .rst format back into the the new style rules, the contents did not change. All of these, except for the documentation update, have been in linux-next for over a week. The documentation update has been reviewed for weeks by a group of developers, and in public for a week and the wording has stabilized for now. If future changes are needed, we can do so before 6.8-final is out (or anytime after that)" * tag 'driver-core-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Documentation: Document the Linux Kernel CVE process Revert "kobject: Remove redundant checks for whether ktype is NULL" driver core: fw_devlink: Improve logs for cycle detection driver core: fw_devlink: Improve detection of overlapping cycles driver core: Fix device_link_flag_is_sync_state_only() topology: Set capacity_freq_ref in all cases
2024-02-17Merge tag 'char-misc-6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / miscdriver fixes from Greg KH: "Here is a small set of char/misc and IIO driver fixes for 6.8-rc5. Included in here are: - lots of iio driver fixes for reported issues - nvmem device naming fixup for reported problem - interconnect driver fixes for reported issues All of these have been in linux-next for a while with no reported the issues (the nvmem patch was included in a different branch in linux-next before sent to me for inclusion here)" * tag 'char-misc-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits) nvmem: include bit index in cell sysfs file name iio: adc: ad4130: only set GPIO_CTRL if pin is unused iio: adc: ad4130: zero-initialize clock init data interconnect: qcom: x1e80100: Add missing ACV enable_mask interconnect: qcom: sm8650: Use correct ACV enable_mask iio: accel: bma400: Fix a compilation problem iio: commom: st_sensors: ensure proper DMA alignment iio: hid-sensor-als: Return 0 for HID_USAGE_SENSOR_TIME_TIMESTAMP iio: move LIGHT_UVA and LIGHT_UVB to the end of iio_modifier staging: iio: ad5933: fix type mismatch regression iio: humidity: hdc3020: fix temperature offset iio: adc: ad7091r8: Fix error code in ad7091r8_gpio_setup() iio: adc: ad_sigma_delta: ensure proper DMA alignment iio: imu: adis: ensure proper DMA alignment iio: humidity: hdc3020: Add Makefile, Kconfig and MAINTAINERS entry iio: imu: bno055: serdev requires REGMAP iio: magnetometer: rm3100: add boundary check for the value read from RM3100_REG_TMRC iio: pressure: bmp280: Add missing bmp085 to SPI id table iio: core: fix memleak in iio_device_register_sysfs interconnect: qcom: sm8550: Enable sync_state ...
2024-02-17Merge tag 'tty-6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial fixes from Greg KH: "Here are three small tty and serial driver fixes for 6.8-rc5: - revert a 8250_pci1xxxx off-by-one change that was incorrect - two changes to fix the transmit path of the mxs-auart driver, fixing a regression in the 6.2 release All of these have been in linux-next for over a week with no reported issues" * tag 'tty-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: mxs-auart: fix tx serial: core: introduce uart_port_tx_flags() serial: 8250_pci1xxxx: partially revert off by one patch
2024-02-17Merge tag 'usb-6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt fixes from Greg KH: "Here are two small fixes for 6.8-rc5: - thunderbolt to fix a reported issue on many platforms - dwc3 driver revert of a commit that caused problems in -rc1 Both of these changes have been in linux-next for over a week with no reported issues" * tag 'usb-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: Revert "usb: dwc3: Support EBC feature of DWC_usb31" thunderbolt: Fix setting the CNS bit in ROUTER_CS_5
2024-02-17Merge tag 'media/v6.8-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - regression fix for rkisp1 shared IRQ logic - fix atomisp breakage due to a kAPI change - permission fix for remote controller BPF support - memleak fix in ir_toy driver - Kconfig dependency fix for pwm-ir-rx * tag 'media/v6.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: pwm-ir-tx: Depend on CONFIG_HIGH_RES_TIMERS media: ir_toy: fix a memleak in irtoy_tx media: rc: bpf attach/detach requires write permission media: atomisp: Adjust for v4l2_subdev_state handling changes in 6.8 media: rkisp1: Fix IRQ handling due to shared interrupts media: Revert "media: rkisp1: Drop IRQF_SHARED"
2024-02-17Merge tag 'pci-v6.8-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Keep bridges in D0 if we need to poll downstream devices for PME to resolve a v6.6 regression where we failed to enumerate devices below bridges put in D3hot by runtime PM, e.g., NVMe drives connected via Thunderbolt or USB4 docks (Alex Williamson) - Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer * tag 'pci-v6.8-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: MAINTAINERS: Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer PCI: Fix active state requirement in PME polling
2024-02-17Merge tag 'i2c-host-fixes-6.8-rc5' of ↵Wolfram Sang
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current Three fixes are included here. Two are strictly hardware-related for the i801 and qcom-geni devices. Meanwhile, a fix from Arnd addresses a compilation error encountered during compile test on powerpc.
2024-02-16cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS windowRobert Richter
The Linux CXL subsystem is built on the assumption that HPA == SPA. That is, the host physical address (HPA) the HDM decoder registers are programmed with are system physical addresses (SPA). During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1, 8.1.3.8) are checked if the memory is enabled and the CXL range is in a HPA window that is described in a CFMWS structure of the CXL host bridge (cxl-3.1, 9.18.1.3). Now, if the HPA is not an SPA, the CXL range does not match a CFMWS window and the CXL memory range will be disabled then. The HDM decoder stops working which causes system memory being disabled and further a system hang during HDM decoder initialization, typically when a CXL enabled kernel boots. Prevent a system hang and do not disable the HDM decoder if the decoder's CXL range is not found in a CFMWS window. Note the change only fixes a hardware hang, but does not implement HPA/SPA translation. Support for this can be added in a follow on patch series. Signed-off-by: Robert Richter <rrichter@amd.com> Fixes: 34e37b4c432c ("cxl/port: Enable HDM Capability after validating DVSEC Ranges") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240216160113.407141-1-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-16cxl: Fix sysfs export of qos_class for memdevDave Jiang
Current implementation exports only to /sys/bus/cxl/devices/.../memN/qos_class. With both ram and pmem exposed, the second registered sysfs attribute is rejected as duplicate. It's not possible to create qos_class under the dev_groups via the driver due to the ram and pmem sysfs sub-directories already created by the device sysfs groups. Move the ram and pmem qos_class to the device sysfs groups and add a call to sysfs_update() after the perf data are validated so the qos_class can be visible. The end results should be /sys/bus/cxl/devices/.../memN/ram/qos_class and /sys/bus/cxl/devices/.../memN/pmem/qos_class. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240206190431.1810289-4-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-16cxl: Remove unnecessary type cast in cxl_qos_class_verify()Dave Jiang
The passed in host bridge parameter for device_for_each_child() has unnecessary void * type cast. Remove the type cast. Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240206190431.1810289-3-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-16cxl: Change 'struct cxl_memdev_state' *_perf_list to single 'struct ↵Dave Jiang
cxl_dpa_perf' In order to address the issue with being able to expose qos_class sysfs attributes under 'ram' and 'pmem' sub-directories, the attributes must be defined as static attributes rather than under driver->dev_groups. To avoid implementing locking for accessing the 'struct cxl_dpa_perf` lists, convert the list to a single 'struct cxl_dpa_perf' entry in preparation to move the attributes to statically defined. While theoretically a partition may have multiple qos_class via CDAT, this has not been encountered with testing on available hardware. The code is simplified for now to not support the complex case until a use case is needed to support that. Link: https://lore.kernel.org/linux-cxl/65b200ba228f_2d43c29468@dwillia2-mobl3.amr.corp.intel.com.notmuch/ Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240206190431.1810289-2-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-16cxl/region: Allow out of order assembly of autodiscovered regionsAlison Schofield
Autodiscovered regions can fail to assemble if they are not discovered in HPA decode order. The user will see failure messages like: [] cxl region0: endpoint5: HPA order violation region1 [] cxl region0: endpoint5: failed to allocate region reference The check that is causing the failure helps the CXL driver enforce a CXL spec mandate that decoders be committed in HPA order. The check is needless for autodiscovered regions since their decoders are already programmed. Trying to enforce order in the assembly of these regions is useless because they are assembled once all their member endpoints arrive, and there is no guarantee on the order in which endpoints are discovered during probe. Keep the existing check, but for autodiscovered regions, allow the out of order assembly after a sanity check that the lesser numbered decoder has the lesser HPA starting address. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Wonjae Lee <wj28.lee@samsung.com> Link: https://lore.kernel.org/r/3dec69ee97524ab229a20c6739272c3000b18408.1706736863.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-16cxl/region: Handle endpoint decoders in cxl_region_find_decoder()Alison Schofield
In preparation for adding a new caller of cxl_region_find_decoders() teach it to find a decoder from a cxl_endpoint_decoder structure. Combining switch and endpoint decoder lookup in one function prevents code duplication in call sites. Update the existing caller. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Wonjae Lee <wj28.lee@samsung.com> Link: https://lore.kernel.org/r/79ae6d72978ef9f3ceec9722e1cb793820553c8e.1706736863.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-16Merge tag 'md-6.8-20240216' of ↵Jens Axboe
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-6.8 Pull MD fixes from Song: "1. Fix issues reported for dm-raid [1], by Yu Kuai. Please note that this PR only contains the first half of the set [2]. We still need more fixes in dm and md code (the rest of the set, or alternative fixes). 2. Fix active_io leak, by Yu Kuai. The fix was posted in the same set [2]. But it actually fixes a separate issue [3]. [1] https://lore.kernel.org/linux-raid/e5e8afe2-e9a8-49a2-5ab0-958d4065c55e@redhat.com/ [2] https://lore.kernel.org/linux-raid/20240201092559.910982-1-yukuai1@huaweicloud.com/ [3] https://lore.kernel.org/linux-raid/20240130172524.0000417b@linux.intel.com/ " * tag 'md-6.8-20240216' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md: Don't suspend the array for interrupted reshape md: Don't register sync_thread for reshape directly md: Make sure md_do_sync() will set MD_RECOVERY_DONE md: Don't ignore read-only array in md_check_recovery() md: Don't ignore suspended array in md_check_recovery() md: Fix missing release of 'active_io' for flush
2024-02-16Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes: the two fnic ones are a revert and a refix, which is why the diffstat is a bit big. The target one also extracts a function to add a check for configuration and so looks bigger than it is" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: fnic: Move fnic_fnic_flush_tx() to a work queue scsi: Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock" scsi: target: Fix unmap setup during configuration
2024-02-16Merge tag 'block-6.8-2024-02-16' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: "Just an nvme pull request via Keith: - Fabrics connection error handling (Chaitanya) - Use relaxed effects to reduce unnecessary queue freezes (Keith)" * tag 'block-6.8-2024-02-16' of git://git.kernel.dk/linux: nvmet: remove superfluous initialization nvme: implement support for relaxed effects nvme-fabrics: fix I/O connect error handling
2024-02-16Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "It's a little busier than normal, but it's still not a lot of code and things seem fairly quiet in general: - Fix allocation failure during SVE coredumps - Fix handling of SVE context on signal delivery - Enable Neoverse N2 CPU errata workarounds for Microsoft's "Azure Cobalt 100" clone - Work around CMN PMU erratum in AmpereOneX implementation - Fix typo in CXL PMU event definition - Fix jump label asm constraints" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/sve: Lower the maximum allocation for the SVE ptrace regset arm64: Subscribe Microsoft Azure Cobalt 100 to ARM Neoverse N2 errata perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count) arm64: jump_label: use constraints "Si" instead of "i" arm64: fix typo in comments perf: CXL: fix mismatched cpmu event opcode arm64/signal: Don't assume that TIF_SVE means we saved SVE state
2024-02-16drm/nouveau/mmu/r535: uninitialized variable in r535_bar_new_()Dan Carpenter
If gf100_bar_new_() fails then "bar" is not initialized. Fixes: 5bf0257136a2 ("drm/nouveau/mmu/r535: initial support") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/dab21df7-4d90-4479-97d8-97e5d228c714@moroto.mountain
2024-02-16nouveau: fix function cast warningsArnd Bergmann
clang-16 warns about casting between incompatible function types: drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c:161:10: error: cast from 'void (*)(const struct firmware *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict] 161 | .fini = (void(*)(void *))release_firmware, This one was done to use the generic shadow_fw_release() function as a callback for struct nvbios_source. Change it to use the same prototype as the other five instances, with a trivial helper function that actually calls release_firmware. Fixes: 70c0f263cc2e ("drm/nouveau/bios: pull in basic vbios subdev, more to come later") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213095753.455062-1-arnd@kernel.org
2024-02-16Merge tag 'drm-fixes-2024-02-16' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Regular weekly fixes, nothing too major, mostly amdgpu, then i915, xe, msm and nouveau with some scattered bits elsewhere. crtc: - fix uninit variable prime: - support > 4GB page arrays buddy: - fix error handling in allocations i915: - fix blankscreen on JSL chromebooks - stable fix to limit DP sst link rates xe: - Fix an out-of-bounds shift. - Fix the display code thinking xe uses shmem - Fix a warning about index out-of-bound - Fix a clang-16 compilation warning amdgpu: - PSR fixes - Suspend/resume fixes - Link training fix - Aspect ratio fix - DCN 3.5 fixes - VCN 4.x fix - GFX 11 fix - Misc display fixes - Misc small fixes amdkfd: - Cache size reporting fix - SIMD distribution fix msm: - GPU: - dmabuf vmap fix - a610 UBWC corruption fix (incorrect hbb) - revert a commit that was making GPU recovery unreliable - tlb invalidation fix ivpu: - suspend/resume fix nouveau: - fix scheduler cleanup path - fix pointless scheduler creation - fix kvalloc argument order rockchip: - vop2 locking fix" * tag 'drm-fixes-2024-02-16' of git://anongit.freedesktop.org/drm/drm: (38 commits) drm/amdgpu: Fix implicit assumtion in gfx11 debug flags drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards drm/amd/display: Increase ips2_eval delay for DCN35 drm/amdgpu/display: Initialize gamma correction mode variable in dcn30_get_gamcor_current() drm/amdgpu/soc21: update VCN 4 max HEVC encoding resolution drm/amd/display: fixed integer types and null check locations drm/amd/display: Fix array-index-out-of-bounds in dcn35_clkmgr drm/amd/display: Preserve original aspect ratio in create stream drm/amd/display: Fix possible NULL dereference on device remove/driver unload Revert "drm/amd/display: increased min_dcfclk_mhz and min_fclk_mhz" drm/amd/display: Add align done check Revert "drm/amd: flush any delayed gfxoff on suspend entry" drm/amd: Stop evicting resources on APUs in suspend drm/amd/display: Fix possible buffer overflow in 'find_dcfclk_for_voltage()' drm/amd/display: Fix possible use of uninitialized 'max_chunks_fbc_mode' in 'calculate_bandwidth()' drm/amd/display: Initialize 'wait_time_microsec' variable in link_dp_training_dpia.c drm/amd/display: Fix && vs || typos drm/amdkfd: Fix L2 cache size reporting in GFX9.4.3 drm/amdgpu: make damage clips support configurable drm/msm: Wire up tlb ops ...
2024-02-16drm/buddy: Modify duplicate list_splice_tail callArunpravin Paneer Selvam
Remove the duplicate list_splice_tail call when the total_allocated < size condition is true. Cc: <stable@vger.kernel.org> # 6.7+ Fixes: 8746c6c9dfa3 ("drm/buddy: Fix alloc_range() error handling code") Reported-by: Bert Karwatzki <spasswolf@web.de> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240216100048.4101-1-Arunpravin.PaneerSelvam@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2024-02-16net: ethernet: adi: requires PHYLIB supportRandy Dunlap
This driver uses functions that are supplied by the Kconfig symbol PHYLIB, so select it to ensure that they are built as needed. When CONFIG_ADIN1110=y and CONFIG_PHYLIB=m, there are multiple build (linker) errors that are resolved by this Kconfig change: ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_net_open': drivers/net/ethernet/adi/adin1110.c:933: undefined reference to `phy_start' ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_probe_netdevs': drivers/net/ethernet/adi/adin1110.c:1603: undefined reference to `get_phy_device' ld: drivers/net/ethernet/adi/adin1110.c:1609: undefined reference to `phy_connect' ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_disconnect_phy': drivers/net/ethernet/adi/adin1110.c:1226: undefined reference to `phy_disconnect' ld: drivers/net/ethernet/adi/adin1110.o: in function `devm_mdiobus_alloc': include/linux/phy.h:455: undefined reference to `devm_mdiobus_alloc_size' ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_register_mdiobus': drivers/net/ethernet/adi/adin1110.c:529: undefined reference to `__devm_mdiobus_register' ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_net_stop': drivers/net/ethernet/adi/adin1110.c:958: undefined reference to `phy_stop' ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_disconnect_phy': drivers/net/ethernet/adi/adin1110.c:1226: undefined reference to `phy_disconnect' ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_adjust_link': drivers/net/ethernet/adi/adin1110.c:1077: undefined reference to `phy_print_status' ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_ioctl': drivers/net/ethernet/adi/adin1110.c:790: undefined reference to `phy_do_ioctl' ld: drivers/net/ethernet/adi/adin1110.o:(.rodata+0xf60): undefined reference to `phy_ethtool_get_link_ksettings' ld: drivers/net/ethernet/adi/adin1110.o:(.rodata+0xf68): undefined reference to `phy_ethtool_set_link_ksettings' Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402070626.eZsfVHG5-lkp@intel.com/ Cc: Lennart Franzen <lennart@lfdomain.com> Cc: Alexandru Tachici <alexandru.tachici@analog.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-16Merge tag 'drm-msm-fixes-2024-02-15' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-fixes Fixes for v6.8-rc5 GPU: - dmabuf vmap fix - a610 UBWC corruption fix (incorrect hbb) - revert a commit that was making GPU recovery unreliable - tlb invalidation fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGszDSiw66+a=ttBr-hat+zrcBtfc_cZ4LQqXu89DJ0UeQ@mail.gmail.com
2024-02-16Merge tag 'amd-drm-fixes-6.8-2024-02-15-2' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.8-2024-02-15-2: amdgpu: - PSR fixes - Suspend/resume fixes - Link training fix - Aspect ratio fix - DCN 3.5 fixes - VCN 4.x fix - GFX 11 fix - Misc display fixes - Misc small fixes amdkfd: - Cache size reporting fix - SIMD distribution fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240215192452.11805-1-alexander.deucher@amd.com
2024-02-16Merge tag 'drm-xe-fixes-2024-02-15' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix an out-of-bounds shift. - Fix the display code thinking xe uses shmem - Fix a warning about index out-of-bound - Fix a clang-16 compilation warning Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Zc4GpcrbFVqdK9Ws@fedora
2024-02-16Merge tag 'drm-intel-fixes-2024-02-15' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes Fix for #10172: Blank screen on JSL Chromebooks. Stable fix to limit DP SST link rate to <=8.1Gbps. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Zc37W27F5OvoeSkG@jlahtine-mobl.ger.corp.intel.com
2024-02-16Merge tag 'drm-misc-fixes-2024-02-15' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A suspend/resume error fix for ivpu, a couple of scheduler fixes for nouveau, a patch to support large page arrays in prime, a uninitialized variable fix in crtc, a locking fix in rockchip/vop2 and a buddy allocator error reporting fix. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/b4ffqzigtfh6cgzdpwuk6jlrv3dnk4hu6etiizgvibysqgtl2p@42n2gdfdd5eu
2024-02-15md: Don't suspend the array for interrupted reshapeYu Kuai
md_start_sync() will suspend the array if there are spares that can be added or removed from conf, however, if reshape is still in progress, this won't happen at all or data will be corrupted(remove_and_add_spares won't be called from md_choose_sync_action for reshape), hence there is no need to suspend the array if reshape is not done yet. Meanwhile, there is a potential deadlock for raid456: 1) reshape is interrupted; 2) set one of the disk WantReplacement, and add a new disk to the array, however, recovery won't start until the reshape is finished; 3) then issue an IO across reshpae position, this IO will wait for reshape to make progress; 4) continue to reshape, then md_start_sync() found there is a spare disk that can be added to conf, mddev_suspend() is called; Step 4 and step 3 is waiting for each other, deadlock triggered. Noted this problem is found by code review, and it's not reporduced yet. Fix this porblem by don't suspend the array for interrupted reshape, this is safe because conf won't be changed until reshape is done. Fixes: bc08041b32ab ("md: suspend array in md_start_sync() if array need reconfiguration") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240201092559.910982-6-yukuai1@huaweicloud.com
2024-02-15md: Don't register sync_thread for reshape directlyYu Kuai
Currently, if reshape is interrupted, then reassemble the array will register sync_thread directly from pers->run(), in this case 'MD_RECOVERY_RUNNING' is set directly, however, there is no guarantee that md_do_sync() will be executed, hence stop_sync_thread() will hang because 'MD_RECOVERY_RUNNING' can't be cleared. Last patch make sure that md_do_sync() will set MD_RECOVERY_DONE, however, following hang can still be triggered by dm-raid test shell/lvconvert-raid-reshape.sh occasionally: [root@fedora ~]# cat /proc/1982/stack [<0>] stop_sync_thread+0x1ab/0x270 [md_mod] [<0>] md_frozen_sync_thread+0x5c/0xa0 [md_mod] [<0>] raid_presuspend+0x1e/0x70 [dm_raid] [<0>] dm_table_presuspend_targets+0x40/0xb0 [dm_mod] [<0>] __dm_destroy+0x2a5/0x310 [dm_mod] [<0>] dm_destroy+0x16/0x30 [dm_mod] [<0>] dev_remove+0x165/0x290 [dm_mod] [<0>] ctl_ioctl+0x4bb/0x7b0 [dm_mod] [<0>] dm_ctl_ioctl+0x11/0x20 [dm_mod] [<0>] vfs_ioctl+0x21/0x60 [<0>] __x64_sys_ioctl+0xb9/0xe0 [<0>] do_syscall_64+0xc6/0x230 [<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74 Meanwhile mddev->recovery is: MD_RECOVERY_RUNNING | MD_RECOVERY_INTR | MD_RECOVERY_RESHAPE | MD_RECOVERY_FROZEN Fix this problem by remove the code to register sync_thread directly from raid10 and raid5. And let md_check_recovery() to register sync_thread. Fixes: f67055780caa ("[PATCH] md: Checkpoint and allow restart of raid5 reshape") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240201092559.910982-5-yukuai1@huaweicloud.com
2024-02-15md: Make sure md_do_sync() will set MD_RECOVERY_DONEYu Kuai
stop_sync_thread() will interrupt md_do_sync(), and md_do_sync() must set MD_RECOVERY_DONE, so that follow up md_check_recovery() will unregister sync_thread, clear MD_RECOVERY_RUNNING and wake up stop_sync_thread(). If MD_RECOVERY_WAIT is set or the array is read-only, md_do_sync() will return without setting MD_RECOVERY_DONE, and after commit f52f5c71f3d4 ("md: fix stopping sync thread"), dm-raid switch from md_reap_sync_thread() to stop_sync_thread() to unregister sync_thread from md_stop() and md_stop_writes(), causing the test shell/lvconvert-raid-reshape.sh hang. We shouldn't switch back to md_reap_sync_thread() because it's problematic in the first place. Fix the problem by making sure md_do_sync() will set MD_RECOVERY_DONE. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Closes: https://lore.kernel.org/all/ece2b06f-d647-6613-a534-ff4c9bec1142@redhat.com/ Fixes: d5d885fd514f ("md: introduce new personality funciton start()") Fixes: 5fd6c1dce06e ("[PATCH] md: allow checkpoint of recovery with version-1 superblock") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240201092559.910982-4-yukuai1@huaweicloud.com
2024-02-15md: Don't ignore read-only array in md_check_recovery()Yu Kuai
Usually if the array is not read-write, md_check_recovery() won't register new sync_thread in the first place. And if the array is read-write and sync_thread is registered, md_set_readonly() will unregister sync_thread before setting the array read-only. md/raid follow this behavior hence there is no problem. After commit f52f5c71f3d4 ("md: fix stopping sync thread"), following hang can be triggered by test shell/integrity-caching.sh: 1) array is read-only. dm-raid update super block: rs_update_sbs ro = mddev->ro mddev->ro = 0 -> set array read-write md_update_sb 2) register new sync thread concurrently. 3) dm-raid set array back to read-only: rs_update_sbs mddev->ro = ro 4) stop the array: raid_dtr md_stop stop_sync_thread set_bit(MD_RECOVERY_INTR, &mddev->recovery); md_wakeup_thread_directly(mddev->sync_thread); wait_event(..., !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) 5) sync thread done: md_do_sync set_bit(MD_RECOVERY_DONE, &mddev->recovery); md_wakeup_thread(mddev->thread); 6) daemon thread can't unregister sync thread: md_check_recovery if (!md_is_rdwr(mddev) && !test_bit(MD_RECOVERY_NEEDED, &mddev->recovery)) return; -> -> MD_RECOVERY_RUNNING can't be cleared, hence step 4 hang; The root cause is that dm-raid manipulate 'mddev->ro' by itself, however, dm-raid really should stop sync thread before setting the array read-only. Unfortunately, I need to read more code before I can refacter the handler of 'mddev->ro' in dm-raid, hence let's fix the problem the easy way for now to prevent dm-raid regression. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Closes: https://lore.kernel.org/all/9801e40-8ac7-e225-6a71-309dcf9dc9aa@redhat.com/ Fixes: ecbfb9f118bc ("dm raid: add raid level takeover support") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240201092559.910982-3-yukuai1@huaweicloud.com
2024-02-15md: Don't ignore suspended array in md_check_recovery()Yu Kuai
mddev_suspend() never stop sync_thread, hence it doesn't make sense to ignore suspended array in md_check_recovery(), which might cause sync_thread can't be unregistered. After commit f52f5c71f3d4 ("md: fix stopping sync thread"), following hang can be triggered by test shell/integrity-caching.sh: 1) suspend the array: raid_postsuspend mddev_suspend 2) stop the array: raid_dtr md_stop __md_stop_writes stop_sync_thread set_bit(MD_RECOVERY_INTR, &mddev->recovery); md_wakeup_thread_directly(mddev->sync_thread); wait_event(..., !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) 3) sync thread done: md_do_sync set_bit(MD_RECOVERY_DONE, &mddev->recovery); md_wakeup_thread(mddev->thread); 4) daemon thread can't unregister sync thread: md_check_recovery if (mddev->suspended) return; -> return directly md_read_sync_thread clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery); -> MD_RECOVERY_RUNNING can't be cleared, hence step 2 hang; This problem is not just related to dm-raid, fix it by ignoring suspended array in md_check_recovery(). And follow up patches will improve dm-raid better to frozen sync thread during suspend. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Closes: https://lore.kernel.org/all/8fb335e-6d2c-dbb5-d7-ded8db5145a@redhat.com/ Fixes: 68866e425be2 ("MD: no sync IO while suspended") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240201092559.910982-2-yukuai1@huaweicloud.com
2024-02-15scsi: jazz_esp: Only build if SCSI core is builtinRandy Dunlap
JAZZ_ESP is a bool kconfig symbol that selects SCSI_SPI_ATTRS. When CONFIG_SCSI=m, this results in SCSI_SPI_ATTRS=m while JAZZ_ESP=y, which causes many undefined symbol linker errors. Fix this by only offering to build this driver when CONFIG_SCSI=y. [mkp: JAZZ_ESP is unique in that it does not support being compiled as a module unlike the remaining SPI SCSI HBA drivers] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20240214055953.9612-1-rdunlap@infradead.org Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: James E.J. Bottomley <jejb@linux.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Cc: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402112222.Gl0udKyU-lkp@intel.com/ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-02-15scsi: smartpqi: Fix disable_managed_interruptsDon Brace
Correct blk-mq registration issue with module parameter disable_managed_interrupts enabled. When we turn off the default PCI_IRQ_AFFINITY flag, the driver needs to register with blk-mq using blk_mq_map_queues(). The driver is currently calling blk_mq_pci_map_queues() which results in a stack trace and possibly undefined behavior. Stack Trace: [ 7.860089] scsi host2: smartpqi [ 7.871934] WARNING: CPU: 0 PID: 238 at block/blk-mq-pci.c:52 blk_mq_pci_map_queues+0xca/0xd0 [ 7.889231] Modules linked in: sd_mod t10_pi sg uas smartpqi(+) crc32c_intel scsi_transport_sas usb_storage dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse [ 7.924755] CPU: 0 PID: 238 Comm: kworker/0:3 Not tainted 4.18.0-372.88.1.el8_6_smartpqi_test.x86_64 #1 [ 7.944336] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 03/08/2022 [ 7.963026] Workqueue: events work_for_cpu_fn [ 7.978275] RIP: 0010:blk_mq_pci_map_queues+0xca/0xd0 [ 7.978278] Code: 48 89 de 89 c7 e8 f6 0f 4f 00 3b 05 c4 b7 8e 01 72 e1 5b 31 c0 5d 41 5c 41 5d 41 5e 41 5f e9 7d df 73 00 31 c0 e9 76 df 73 00 <0f> 0b eb bc 90 90 0f 1f 44 00 00 41 57 49 89 ff 41 56 41 55 41 54 [ 7.978280] RSP: 0018:ffffa95fc3707d50 EFLAGS: 00010216 [ 7.978283] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000010 [ 7.978284] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff9190c32d4310 [ 7.978286] RBP: 0000000000000000 R08: ffffa95fc3707d38 R09: ffff91929b81ac00 [ 7.978287] R10: 0000000000000001 R11: ffffa95fc3707ac0 R12: 0000000000000000 [ 7.978288] R13: ffff9190c32d4000 R14: 00000000ffffffff R15: ffff9190c4c950a8 [ 7.978290] FS: 0000000000000000(0000) GS:ffff9193efc00000(0000) knlGS:0000000000000000 [ 7.978292] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.172814] CR2: 000055d11166c000 CR3: 00000002dae10002 CR4: 00000000007706f0 [ 8.172816] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8.172817] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8.172818] PKRU: 55555554 [ 8.172819] Call Trace: [ 8.172823] blk_mq_alloc_tag_set+0x12e/0x310 [ 8.264339] scsi_add_host_with_dma.cold.9+0x30/0x245 [ 8.279302] pqi_ctrl_init+0xacf/0xc8e [smartpqi] [ 8.294085] ? pqi_pci_probe+0x480/0x4c8 [smartpqi] [ 8.309015] pqi_pci_probe+0x480/0x4c8 [smartpqi] [ 8.323286] local_pci_probe+0x42/0x80 [ 8.337855] work_for_cpu_fn+0x16/0x20 [ 8.351193] process_one_work+0x1a7/0x360 [ 8.364462] ? create_worker+0x1a0/0x1a0 [ 8.379252] worker_thread+0x1ce/0x390 [ 8.392623] ? create_worker+0x1a0/0x1a0 [ 8.406295] kthread+0x10a/0x120 [ 8.418428] ? set_kthread_struct+0x50/0x50 [ 8.431532] ret_from_fork+0x1f/0x40 [ 8.444137] ---[ end trace 1bf0173d39354506 ]--- Fixes: cf15c3e734e8 ("scsi: smartpqi: Add module param to disable managed ints") Tested-by: Yogesh Chandra Pandey <YogeshChandra.Pandey@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20240213162200.1875970-2-don.brace@microchip.com Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-02-15scsi: ufs: Uninitialized variable in ufshcd_devfreq_target()Dan Carpenter
There is one goto where "sched_clk_scaling_suspend_work" is true but "scale_up" is uninitialized. It leads to a Smatch uninitialized variable warning: drivers/ufs/core/ufshcd.c:1589 ufshcd_devfreq_target() error: uninitialized symbol 'scale_up'. Fixes: 1d969731b87f ("scsi: ufs: core: Only suspend clock scaling if scaling down") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/c787d37f-1107-4512-8991-bccf80e74a35@moroto.mountain Reviewed-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>