summaryrefslogtreecommitdiff
path: root/drivers/pci
AgeCommit message (Collapse)Author
2023-08-25PCI: Simplify pcie_capability_clear_and_set_word() control flowBjorn Helgaas
Return early for errors in pcie_capability_clear_and_set_word_unlocked() and pcie_capability_clear_and_set_dword() to simplify the control flow. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-13-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Tidy config space save/restore messagesBjorn Helgaas
Update config space save/restore debug messages so they line up better. Previously: nvme 0000:05:00.0: saving config space at offset 0x4 (reading 0x20100006) nvme 0000:05:00.0: saving config space at offset 0x8 (reading 0x1080200) nvme 0000:05:00.0: saving config space at offset 0xc (reading 0x0) nvme 0000:05:00.0: restoring config space at offset 0x4 (was 0x0, writing 0x20100006) Now: nvme 0000:05:00.0: save config 0x04: 0x20100006 nvme 0000:05:00.0: save config 0x08: 0x01080200 nvme 0000:05:00.0: save config 0x0c: 0x00000000 nvme 0000:05:00.0: restore config 0x04: 0x00000000 -> 0x20100006 No functional change intended. Enable these messages by setting CONFIG_DYNAMIC_DEBUG=y and adding 'dyndbg="file drivers/pci/* +p"' to kernel parameters. Link: https://lore.kernel.org/r/20230823191831.476579-1-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2023-08-25PCI: Fix code formatting inconsistenciesBjorn Helgaas
Remove unnecessary "return;" in void functions and format consistently. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-12-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Fix typos in docs and commentsBjorn Helgaas
Fix typos in docs and comments. Link: https://lore.kernel.org/r/20230824193712.542167-11-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Fix pci_bus_resetable(), pci_slot_resetable() name typosBjorn Helgaas
Fix typos in the pci_bus_resetable() and pci_slot_resetable() function names. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-10-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Simplify pci_dev_driver()Bjorn Helgaas
Simplify pci_dev_driver() by removing the "else". The "if" case always returns, so the "else" is superfluous. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-9-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Simplify pci_pio_to_address()Bjorn Helgaas
Simplify pci_pio_to_address() by removing an unnecessary local variable. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-8-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI/AER: Simplify AER_RECOVER_RING_SIZE definitionBjorn Helgaas
ACPI Platform Error Interfaces (APEI) convey error information to the OS. If the APEI GHES driver receives information about PCI errors, it queues it in aer_recover_ring for processing by the PCI AER code. AER_RECOVER_RING_SIZE is the size of the aer_recover_ring FIFO and is arbitrary, with no direct connection to hardware. AER_RECOVER_RING_ORDER was only used to compute AER_RECOVER_RING_SIZE. Remove it and define AER_RECOVER_RING_SIZE directly. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-7-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Use consistent put_user() pointer typesBjorn Helgaas
We used u8, u16, and u32 for get_user() pointer types, but "unsigned char", "unsigned short", and "unsigned int" for put_user(). Use u8, u16, and u32 for put_user() for consistency. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-6-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Fix printk field formattingBjorn Helgaas
Previously we used "%#08x" to print a 32-bit value. This fills an 8-character field with "0x...", but of course many 32-bit values require a 10-character field "0x12345678" for this format. Fix the formats to avoid confusion. Link: https://lore.kernel.org/r/20230824193712.542167-5-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Remove unnecessary initializationsBjorn Helgaas
We always assign "fields" immediately, so remove the unnecessary initializations. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-4-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Unexport pcie_port_bus_typeBjorn Helgaas
pcie_port_bus_type is used only in pci-driver.c and pcie/portdrv_core.c and pcie/portdrv_pci.c. None of these can be built as modules, so pcie_port_bus_type doesn't need to be exported. Unexport it. Link: https://lore.kernel.org/r/20230824193712.542167-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: mvebu: Remove unused busn memberPali Rohár
The busn member of struct mvebu_pcie is unused, so drop it. Link: https://lore.kernel.org/r/20220905192310.22786-5-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
2023-08-24PCI: Remove unused function declarationsYue Haibing
The following declarations have never been implemented since the beginning of git history, so remove them: u8 acpiphp_get_attention_status(struct acpiphp_slot *slot); u8 cpci_get_latch_status(struct slot *slot); u8 cpci_get_adapter_status(struct slot *slot); int ibmphp_get_total_hp_slots(void); void ibmphp_free_ibm_slot(struct slot *); void pdev_enable_device(struct pci_dev *); Link: https://lore.kernel.org/r/20230811095933.28652-1-yuehaibing@huawei.com Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-08-24PCI/VGA: Fix typosSui Jingfeng
Fix typos, rewrap to fill 78 columns, convert to conventional multi-line style. [bhelgaas: squash and add more fixes] Link: https://lore.kernel.org/r/20230808223412.1743176-7-sui.jingfeng@linux.dev Link: https://lore.kernel.org/r/20230808223412.1743176-9-sui.jingfeng@linux.dev Link: https://lore.kernel.org/r/20230808223412.1743176-10-sui.jingfeng@linux.dev Link: https://lore.kernel.org/r/20230808223412.1743176-11-sui.jingfeng@linux.dev Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-08-24PCI: brcmstb: Remove stale commentJim Quinlan
A comment says that Multi-MSI is not supported by the driver. A past commit [1] added this feature, so the comment is incorrect and is removed. [1] commit 198acab1772f22f2 ("PCI: brcmstb: Enable Multi-MSI") Link: https://lore.kernel.org/r/20230623144100.34196-6-james.quinlan@broadcom.com Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com>
2023-08-24PCI: brcmstb: Assert PERST# on BCM2711Jim Quinlan
The current PCIe driver assumes PERST# is asserted when probe() is invoked. Some older versions of the 2711/RPi bootloader left PERST# unasserted, as the Raspian OS does assert PERST# on probe(). For this reason, we assert PERST# for BCM2711 SOCs (i.e. RPi). Link: https://lore.kernel.org/r/20230623144100.34196-5-james.quinlan@broadcom.com Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-08-24PCI: layerscape: Add power management support for ls1028aHou Zhiqiang
Add PME_Turn_off/PME_TO_Ack handshake sequence for ls1028a platform. Implemented on top of common dwc dw_pcie_suspend(resume)_noirq() functions to handle system enter/exit suspend states. Link: https://lore.kernel.org/r/20230821184815.2167131-4-Frank.Li@nxp.com Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Manivannan Sadhasivam <mani@kernel.org>
2023-08-24PCI: dwc: Implement generic suspend/resume functionalityFrank Li
Introduce an helper function (dw_pcie_get_ltssm()) to retrieve SMLH_LTSS_STATE. Add common dw_pcie_suspend(resume)_noirq() API to implement the DWC controller generic suspend/resume functionality. Add a controller specific callback to send the PME_Turn_Off message (ie .pme_turn_off) for controller platform specific PME handling. Link: https://lore.kernel.org/r/20230821184815.2167131-3-Frank.Li@nxp.com Signed-off-by: Frank Li <Frank.Li@nxp.com> [lpieralisi@kernel.org: commit log] Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Manivannan Sadhasivam <mani@kernel.org>
2023-08-24PCI: Add PCIE_PME_TO_L2_TIMEOUT_US L2 ready timeout valueFrank Li
Add the PCIE_PME_TO_L2_TIMEOUT_US macro to define the L2 ready timeout as described in the PCI specifications. Link: https://lore.kernel.org/r/20230821184815.2167131-2-Frank.Li@nxp.com Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Manivannan Sadhasivam <mani@kernel.org>
2023-08-24PCI: layerscape: Add workaround for lost link capabilities during resetXiaowei Bao
The endpoint controller loses the Maximum Link Width and Supported Link Speed value from the Link Capabilities Register - initially configured by the Reset Configuration Word (RCW) - during a link-down or hot reset event. Address this issue in the endpoint event handler. Link: https://lore.kernel.org/r/20230720135834.1977616-2-Frank.Li@nxp.com Fixes: a805770d8a22 ("PCI: layerscape: Add EP mode support") Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Manivannan Sadhasivam <mani@kernel.org>
2023-08-24PCI: layerscape: Add support for link-down notificationFrank Li
Add support to pass link-down notification to the endpoint function driver so that it can process the LINK_DOWN event. Link: https://lore.kernel.org/r/20230720135834.1977616-1-Frank.Li@nxp.com Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Manivannan Sadhasivam <mani@kernel.org>
2023-08-23PCI/VGA: Simplify vga_client_register()Sui Jingfeng
Reorganize vga_client_register() to avoid the goto and the need to save the return value. Update the kernel-doc to reflect -ENODEV on failure. No functional change intended. [bhelgaas: drop "ret" variable, commit log] Link: https://lore.kernel.org/r/20230808223412.1743176-8-sui.jingfeng@linux.dev Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-08-23PCI/VGA: Simplify vga_arbiter_notify_clients()Sui Jingfeng
In vga_arbiter_notify_clients(), "new_state" was computed during every loop iteration even though it doesn't depend on anything that changes during the loop. Move the computation outside the loop. [bhelgaas: drop renames that obscure the purpose, commit log] Link: https://lore.kernel.org/r/20230808223412.1743176-6-sui.jingfeng@linux.dev Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-08-23PCI/VGA: Correct vga_update_device_decodes() parameter typeSui Jingfeng
Previously vga_update_device_decodes() took "int new_decodes", but the callers pass "unsigned int new_decodes". Correct the vga_update_device_decodes() parameter type to "unsigned int" to match. In vga_arbiter_notify_clients(), the return from vgadev->set_decode() is "unsigned int" but was stored as "uint32_t new_decodes". Correct the new_decodes type to "unsigned int". [bhelgaas: use correct type for ->set_decode() return, commit log] Link: https://lore.kernel.org/r/20230808223412.1743176-5-sui.jingfeng@linux.dev Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-08-23PCI/VGA: Correct vga_str_to_iostate() io_state parameter typeSui Jingfeng
Previously vga_str_to_iostate() took "int *io_state", but vga_arb_write() is the only caller and it passes "unsigned int *". Make the vga_str_to_iostate() parameter type "unsigned int *" to match. [bhelgaas: commit log] Link: https://lore.kernel.org/r/20230808223412.1743176-2-sui.jingfeng@linux.dev Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-23PCI: fu740: Set the number of MSI vectorsYong-Xuan Wang
The iMSI-RX module of the DW PCIe controller provides multiple sets of MSI_CTRL_INT_i_* registers, and each set is capable of handling 32 MSI interrupts. However, the fu740 PCIe controller driver only enabled one set of MSI_CTRL_INT_i_* registers, as the total number of supported interrupts was not specified. Set the supported number of MSI vectors to enable all the MSI_CTRL_INT_i_* registers on the fu740 PCIe core, allowing the system to fully utilize the available MSI interrupts. Link: https://lore.kernel.org/r/20230807055621.2431-1-yongxuan.wang@sifive.com Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
2023-08-22of: unittest: Add pci_dt_testdrv pci driverLizhi Hou
pci_dt_testdrv is bound to QEMU PCI Test Device. It reads overlay_pci_node fdt fragment and apply it to Test Device. Then it calls of_platform_default_populate() to populate the platform devices. Tested-by: Herve Codina <herve.codina@bootlin.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/1692120000-46900-6-git-send-email-lizhi.hou@amd.com Signed-off-by: Rob Herring <robh@kernel.org>
2023-08-22PCI: Add quirks to generate device tree node for Xilinx Alveo U50Lizhi Hou
The Xilinx Alveo U50 PCI card exposes multiple hardware peripherals on its PCI BAR. The card firmware provides a flattened device tree to describe the hardware peripherals on its BARs. This allows U50 driver to load the flattened device tree and generate the device tree node for hardware peripherals underneath. To generate device tree node for U50 card, add PCI quirks to call of_pci_make_dev_node() for U50. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/1692120000-46900-4-git-send-email-lizhi.hou@amd.com Signed-off-by: Rob Herring <robh@kernel.org>
2023-08-22PCI: Create device tree node for bridgeLizhi Hou
The PCI endpoint device such as Xilinx Alveo PCI card maps the register spaces from multiple hardware peripherals to its PCI BAR. Normally, the PCI core discovers devices and BARs using the PCI enumeration process. There is no infrastructure to discover the hardware peripherals that are present in a PCI device, and which can be accessed through the PCI BARs. Apparently, the device tree framework requires a device tree node for the PCI device. Thus, it can generate the device tree nodes for hardware peripherals underneath. Because PCI is self discoverable bus, there might not be a device tree node created for PCI devices. Furthermore, if the PCI device is hot pluggable, when it is plugged in, the device tree nodes for its parent bridges are required. Add support to generate device tree node for PCI bridges. Add an of_pci_make_dev_node() interface that can be used to create device tree node for PCI devices. Add a PCI_DYNAMIC_OF_NODES config option. When the option is turned on, the kernel will generate device tree nodes for PCI bridges unconditionally. Initially, add the basic properties for the dynamically generated device tree nodes which include #address-cells, #size-cells, device_type, compatible, ranges, reg. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/1692120000-46900-3-git-send-email-lizhi.hou@amd.com Signed-off-by: Rob Herring <robh@kernel.org>
2023-08-22PCI: hv: Fix a crash in hv_pci_restore_msi_msg() during hibernationDexuan Cui
When a Linux VM with an assigned PCI device runs on Hyper-V, if the PCI device driver is not loaded yet (i.e. MSI-X/MSI is not enabled on the device yet), doing a VM hibernation triggers a panic in hv_pci_restore_msi_msg() -> msi_lock_descs(&pdev->dev), because pdev->dev.msi.data is still NULL. Avoid the panic by checking if MSI-X/MSI is enabled. Link: https://lore.kernel.org/r/20230816175939.21566-1-decui@microsoft.com Fixes: dc2b453290c4 ("PCI: hv: Rework MSI handling") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: sathyanarayanan.kuppuswamy@linux.intel.com Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org
2023-08-22PCI: vmd: Disable bridge window for domain resetNirmal Patel
During domain reset process vmd_domain_reset() clears PCI configuration space of VMD root ports. But certain platform has observed following errors and failed to boot. ... DMAR: VT-d detected Invalidation Queue Error: Reason f DMAR: VT-d detected Invalidation Time-out Error: SID ffff DMAR: VT-d detected Invalidation Completion Error: SID ffff DMAR: QI HEAD: UNKNOWN qw0 = 0x0, qw1 = 0x0 DMAR: QI PRIOR: UNKNOWN qw0 = 0x0, qw1 = 0x0 DMAR: Invalidation Time-out Error (ITE) cleared The root cause is that memset_io() clears prefetchable memory base/limit registers and prefetchable base/limit 32 bits registers sequentially. This seems to be enabling prefetchable memory if the device disabled prefetchable memory originally. Here is an example (before memset_io()): PCI configuration space for 10000:00:00.0: 86 80 30 20 06 00 10 00 04 00 04 06 00 00 01 00 00 00 00 00 00 00 00 00 00 01 01 00 00 00 00 20 00 00 00 00 01 00 01 00 ff ff ff ff 75 05 00 00 ... So, prefetchable memory is ffffffff00000000-575000fffff, which is disabled. When memset_io() clears prefetchable base 32 bits register, the prefetchable memory becomes 0000000000000000-575000fffff, which is enabled and incorrect. Here is the quote from section 7.5.1.3.9 of PCI Express Base 6.0 spec: The Prefetchable Memory Limit register must be programmed to a smaller value than the Prefetchable Memory Base register if there is no prefetchable memory on the secondary side of the bridge. This is believed to be the reason for the failure and in addition the sequence of operation in vmd_domain_reset() is not following the PCIe specs. Disable the bridge window by executing a sequence of operations borrowed from pci_disable_bridge_window() and pci_setup_bridge_io(), that comply with the PCI specifications. Link: https://lore.kernel.org/r/20230810215029.1177379-1-nirmal.patel@linux.intel.com Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-08-18PCI: rpaphp: Error out on busy status from get-sensor-stateMahesh Salgaonkar
When certain PHB HW failure causes pHyp to recover PHB, it marks the PE state as temporarily unavailable until recovery is complete. This also triggers an EEH handler in Linux which needs to notify drivers, and perform recovery. But before notifying the driver about the PCI error it uses get_adapter_status()->rpaphp_get_sensor_state()->rtas_call(get-sensor-state) operation of the hotplug_slot to determine if the slot contains a device or not. If the slot is empty, the recovery is skipped entirely. eeh_event_handler() ->eeh_handle_normal_event() ->eeh_slot_presence_check() ->get_adapter_status() ->rpaphp_get_sensor_state() ->rtas_get_sensor() ->rtas_call(get-sensor-state) However on certain PHB failures, the RTAS call rtas_call(get-sensor-state) returns extended busy error (9902) until PHB is recovered by pHyp. Once PHB is recovered, the rtas_call(get-sensor-state) returns success with correct presence status. The RTAS call interface rtas_get_sensor() loops over the RTAS call on extended delay return code (9902) until the return value is either success (0) or error (-1). This causes the EEH handler to get stuck for ~6 seconds before it could notify that the PCI error has been detected and stop any active operations. Hence with running I/O traffic, during this 6 seconds, the network driver continues its operation and hits a timeout (netdev watchdog). ------------ [52732.244731] DEBUG: ibm_read_slot_reset_state2() [52732.244762] DEBUG: ret = 0, rets[0]=5, rets[1]=1, rets[2]=4000, rets[3]=> [52732.244798] DEBUG: in eeh_slot_presence_check [52732.244804] DEBUG: error state check [52732.244807] DEBUG: Is slot hotpluggable [52732.244810] DEBUG: hotpluggable ops ? [52732.244953] DEBUG: Calling ops->get_adapter_status [52732.244958] DEBUG: calling rpaphp_get_sensor_state [52736.564262] ------------[ cut here ]------------ [52736.564299] NETDEV WATCHDOG: enP64p1s0f3 (tg3): transmit queue 0 timed o> [52736.564324] WARNING: CPU: 1442 PID: 0 at net/sched/sch_generic.c:478 dev> [...] [52736.564505] NIP [c000000000c32368] dev_watchdog+0x438/0x440 [52736.564513] LR [c000000000c32364] dev_watchdog+0x434/0x440 ------------ On timeouts, network driver starts dumping debug information to console (e.g bnx2 driver calls bnx2x_panic_dump()), and go into recovery path while pHyp is still recovering the PHB. As part of recovery, the driver tries to reset the device and it keeps failing since every PCI read/write returns ff's. And when EEH recovery kicks-in, the driver is unable to recover the device. This impacts the ssh connection and leads to the system being inaccessible. To get the NIC working again it needs a reboot or re-assign the I/O adapter from HMC. [ 9531.168587] EEH: Beginning: 'slot_reset' [ 9531.168601] PCI 0013:01:00.0#10000: EEH: Invoking bnx2x->slot_reset() [...] [ 9614.110094] bnx2x: [bnx2x_func_stop:9129(enP19p1s0f0)]FUNC_STOP ramrod failed. Running a dry transaction [ 9614.110300] bnx2x: [bnx2x_igu_int_disable:902(enP19p1s0f0)]BUG! Proper val not read from IGU! [ 9629.178067] bnx2x: [bnx2x_fw_command:3055(enP19p1s0f0)]FW failed to respond! [ 9629.178085] bnx2x 0013:01:00.0 enP19p1s0f0: bc 7.10.4 [ 9629.178091] bnx2x: [bnx2x_fw_dump_lvl:789(enP19p1s0f0)]Cannot dump MCP info while in PCI error [ 9644.241813] bnx2x: [bnx2x_io_slot_reset:14245(enP19p1s0f0)]IO slot reset --> driver unload [...] [ 9644.241819] PCI 0013:01:00.0#10000: EEH: bnx2x driver reports: 'disconnect' [ 9644.241823] PCI 0013:01:00.1#10000: EEH: Invoking bnx2x->slot_reset() [ 9644.241827] bnx2x: [bnx2x_io_slot_reset:14229(enP19p1s0f1)]IO slot reset initializing... [ 9644.241916] bnx2x 0013:01:00.1: enabling device (0140 -> 0142) [ 9644.258604] bnx2x: [bnx2x_io_slot_reset:14245(enP19p1s0f1)]IO slot reset --> driver unload [ 9644.258612] PCI 0013:01:00.1#10000: EEH: bnx2x driver reports: 'disconnect' [ 9644.258615] EEH: Finished:'slot_reset' with aggregate recovery state:'disconnect' [ 9644.258620] EEH: Unable to recover from failure from PHB#13-PE#10000. [ 9644.261811] EEH: Beginning: 'error_detected(permanent failure)' [...] [ 9644.261823] EEH: Finished:'error_detected(permanent failure)' Hence, it becomes important to inform driver about the PCI error detection as early as possible, so that driver is aware of PCI error and waits for EEH handler's next action for successful recovery. Current implementation uses rtas_get_sensor() API which blocks the slot check state until RTAS call returns success. To avoid this, fix the PCI hotplug driver (rpaphp) to return an error (-EBUSY) if the slot presence state can not be detected immediately while PE is in EEH recovery state. Change rpaphp_get_sensor_state() to invoke rtas_call(get-sensor-state) directly only if the respective PE is in EEH recovery state, and take actions based on RTAS return status. This way EEH handler will not be blocked on rpaphp_get_sensor_state() and can immediately notify driver about the PCI error and stop any active operations. In normal cases (non-EEH case) rpaphp_get_sensor_state() will continue to invoke rtas_get_sensor() as it was earlier with no change in existing behavior. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/169235815601.193557.13989873835811325343.stgit@jupiter
2023-08-14PCI/P2PDMA: Use pci_dev_id() to simplify the codeZheng Zengkai
When we have a struct pci_dev *, use pci_dev_id() instead of manually composing the ID from dev->bus->number and dev->devfn. [bhelgaas: commit log] Link: https://lore.kernel.org/r/20230811111057.31900-1-zhengzengkai@huawei.com Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2023-08-11PCI: Fix runtime PM race with PME pollingAlex Williamson
Testing that a device is not currently in a low power state provides no guarantees that the device is not imminently transitioning to such a state. Increment the PM usage counter before accessing the device. Since we don't wish to wake the device for PME polling, do so only if the device is already active by using pm_runtime_get_if_active(). Link: https://lore.kernel.org/r/20230803171233.3810944-3-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-08-11PCI/VPD: Add runtime power management to sysfs interfaceAlex Williamson
Unlike default access to config space through sysfs, the VPD read and write functions don't actively manage the runtime power management state of the device during access. Since commit 7ab5e10eda02 ("vfio/pci: Move the unused device into low power state with runtime PM"), the vfio-pci driver will use runtime power management and release unused devices to make use of low power states. Attempting to access VPD information in D3cold can result in incorrect information or kernel crashes depending on the system behavior. Wrap the VPD read/write bin attribute handlers in runtime PM and take into account the potential quirk to select the correct device to wake. Link: https://lore.kernel.org/r/20230803171233.3810944-2-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> [bhelgaas: tweak pci_dev_put() test to match the pci_get_func0_dev() test] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-08-11Merge tag 'pci-v6.5-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Add Manivannan Sadhasivam as DesignWare PCIe driver co-maintainer (Krzysztof Wilczyński) - Revert "PCI: dwc: Wait for link up only if link is started" to fix a regression on Qualcomm platforms that don't reach interconnect sync state if the slot is empty (Johan Hovold) - Revert "PCI: mvebu: Mark driver as BROKEN" so people can use pci-mvebu even though some others report problems (Bjorn Helgaas) - Avoid a NULL pointer dereference when using acpiphp for root bus hotplug to fix a regression added in v6.5-rc1 (Igor Mammedov) * tag 'pci-v6.5-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus Revert "PCI: mvebu: Mark driver as BROKEN" Revert "PCI: dwc: Wait for link up only if link is started" MAINTAINERS: Add Manivannan Sadhasivam as DesignWare PCIe driver maintainer
2023-08-10PCI/ASPM: Use RMW accessors for changing LNKCTLIlpo Järvinen
Don't assume that the device is fully under the control of ASPM and use RMW capability accessors which do proper locking to avoid losing concurrent updates to the register values. If configuration fails in pcie_aspm_configure_common_clock(), the function attempts to restore the old PCI_EXP_LNKCTL_CCC settings. Store only the old PCI_EXP_LNKCTL_CCC bit for the relevant devices rather than the content of the whole LNKCTL registers. It aligns better with how pcie_lnkctl_clear_and_set() expects its parameter and makes the code more obvious to understand. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: 2a42d9dba784 ("PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch") Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/20230717120503.15276-5-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
2023-08-10PCI: pciehp: Use RMW accessors for changing LNKCTLIlpo Järvinen
As hotplug is not the only driver touching LNKCTL, use the RMW capability accessor which handles concurrent changes correctly. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: 7f822999e12a ("PCI: pciehp: Add Disable/enable link functions") Link: https://lore.kernel.org/r/20230717120503.15276-4-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
2023-08-10PCI: Make link retraining use RMW accessors for changing LNKCTLIlpo Järvinen
Don't assume that the device is fully under the control of PCI core. Use RMW capability accessors in link retraining which do proper locking to avoid losing concurrent updates to the register values. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: 4ec73791a64b ("PCI: Work around Pericom PCIe-to-PCI bridge Retrain Link erratum") Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/20230717120503.15276-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
2023-08-10PCI: Add locking to RMW PCI Express Capability Register accessorsIlpo Järvinen
Many places in the kernel write the Link Control and Root Control PCI Express Capability Registers without proper concurrency control and this could result in losing the changes one of the writers intended to make. Add pcie_cap_lock spinlock into the struct pci_dev and use it to protect bit changes made in the RMW capability accessors. Protect only a selected set of registers by differentiating the RMW accessor internally to locked/unlocked variants using a wrapper which has the same signature as pcie_capability_clear_and_set_word(). As the Capability Register (pos) given to the wrapper is always a constant, the compiler should be able to simplify all the dead-code away. So far only the Link Control Register (ASPM, hotplug, link retraining, various drivers) and the Root Control Register (AER & PME) seem to require RMW locking. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: c7f486567c1d ("PCI PM: PCIe PME root port service driver") Fixes: f12eb72a268b ("PCI/ASPM: Use PCI Express Capability accessors") Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Fixes: affa48de8417 ("staging/rdma/hfi1: Add support for enabling/disabling PCIe ASPM") Fixes: 849a9366cba9 ("misc: rtsx: Add support new chip rts5228 mmc: rtsx: Add support MMC_CAP2_NO_MMC") Fixes: 3d1e7aa80d1c ("misc: rtsx: Use pcie_capability_clear_and_set_word() for PCI_EXP_LNKCTL") Fixes: c0e5f4e73a71 ("misc: rtsx: Add support for RTS5261") Fixes: 3df4fce739e2 ("misc: rtsx: separate aspm mode into MODE_REG and MODE_CFG") Fixes: 121e9c6b5c4c ("misc: rtsx: modify and fix init_hw function") Fixes: 19f3bd548f27 ("mfd: rtsx: Remove LCTLR defination") Fixes: 773ccdfd9cc6 ("mfd: rtsx: Read vendor setting from config space") Fixes: 8275b77a1513 ("mfd: rts5249: Add support for RTS5250S power saving") Fixes: 5da4e04ae480 ("misc: rtsx: Add support for RTS5260") Fixes: 0f49bfbd0f2e ("tg3: Use PCI Express Capability accessors") Fixes: 5e7dfd0fb94a ("tg3: Prevent corruption at 10 / 100Mbps w CLKREQ") Fixes: b726e493e8dc ("r8169: sync existing 8168 device hardware start sequences with vendor driver") Fixes: e6de30d63eb1 ("r8169: more 8168dp support.") Fixes: 8a06127602de ("Bluetooth: hci_bcm4377: Add new driver for BCM4377 PCIe boards") Fixes: 6f461f6c7c96 ("e1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata") Fixes: 1eae4eb2a1c7 ("e1000e: Disable L1 ASPM power savings for 82573 mobile variants") Fixes: 8060e169e02f ("ath9k: Enable extended synch for AR9485 to fix L0s recovery issue") Fixes: 69ce674bfa69 ("ath9k: do btcoex ASPM disabling at initialization time") Fixes: f37f05503575 ("mt76: mt76x2e: disable pcie_aspm by default") Link: https://lore.kernel.org/r/20230717120503.15276-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org>
2023-08-09PCI: Mark NVIDIA T4 GPUs to avoid bus resetWu Zongyong
NVIDIA T4 GPUs do not work with SBR. This problem is found when the T4 card is direct attached to a Root Port only. Avoid bus reset by marking T4 GPUs PCI_DEV_FLAGS_NO_BUS_RESET. Fixes: 4c207e7121fa ("PCI: Mark some NVIDIA GPUs to avoid bus reset") Link: https://lore.kernel.org/r/2dcebea53a6eb9bd212ec6d8974af2e5e0333ef6.1681129861.git.wuzongyong@linux.alibaba.com Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-08-09PCI: switchtec: Add support for PCIe Gen5 devicesKelvin Cao
Advertise support of Gen5 devices in the driver's device ID table and add the same IDs for the switchtec quirks. Also update driver code to accommodate them. Link: https://lore.kernel.org/r/20230624000003.2315364-3-kelvin.cao@microchip.com Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2023-08-09PCI: switchtec: Use normal comment styleKelvin Cao
Use normal comment style '/* */' for device ID description. Link: https://lore.kernel.org/r/20230624000003.2315364-2-kelvin.cao@microchip.com Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2023-08-09PCI: move OF status = "disabled" detection to dev->match_driverVladimir Oltean
The blamed commit has broken probing on arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi when &enetc_port0 (PCI function 0) has status = "disabled". Background: pci_scan_slot() has logic to say that if the function 0 of a device is absent, the entire device is absent and we can skip the other functions entirely. Traditionally, this has meant that pci_bus_read_dev_vendor_id() returns an error code for that function. However, since the blamed commit, there is an extra confounding condition: function 0 of the device exists and has a valid vendor id, but it is disabled in the device tree. In that case, pci_scan_slot() would incorrectly skip the entire device instead of just that function. In the case of NXP LS1028A, status = "disabled" does not mean that the PCI function's config space is not available for reading. It is, but the Ethernet port is just not functionally useful with a particular SerDes protocol configuration (0x9999) due to pinmuxing constraints of the Soc. So, pci_scan_slot() skips all other functions on the ENETC ECAM (enetc_port1, enetc_port2, enetc_mdio_pf3 etc) when just enetc_port0 had to not be probed. There is an additional regression introduced by the change, caused by its fundamental premise. The enetc driver needs to run code for all PCI functions, regardless of whether they're enabled or not in the device tree. That is no longer possible if the driver's probe function is no longer called. But Rob recommends that we move the of_device_is_available() detection to dev->match_driver, and this makes the PCI fixups still run on all functions, while just probing drivers for those functions that are enabled. So, a separate change in the enetc driver will have to move the workarounds to a PCI fixup. Fixes: 6fffbc7ae137 ("PCI: Honor firmware's device disabled status") Link: https://lore.kernel.org/netdev/CAL_JsqLsVYiPLx2kcHkDQ4t=hQVCR7NHziDwi9cCFUFhx48Qow@mail.gmail.com/ Suggested-by: Rob Herring <robh@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-08PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root busIgor Mammedov
40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") changed acpiphp hotplug to use pci_assign_unassigned_bridge_resources() which depends on bridge being available, however enable_slot() can be called without bridge associated: 1. Legitimate case of hotplug on root bus (widely used in virt world) 2. A (misbehaving) firmware, that sends ACPI Bus Check notifications to non existing root ports (Dell Inspiron 7352/0W6WV0), which end up at enable_slot(..., bridge = 0) where bus has no bridge assigned to it. acpihp doesn't know that it's a bridge, and bus specific 'PCI subsystem' can't augment ACPI context with bridge information since the PCI device to get this data from is/was not available. Issue is easy to reproduce with QEMU's 'pc' machine, which supports PCI hotplug on hostbridge slots. To reproduce, boot kernel at commit 40613da52b13 in VM started with following CLI (assuming guest root fs is installed on sda1 partition): # qemu-system-x86_64 -M pc -m 1G -enable-kvm -cpu host \ -monitor stdio -serial file:serial.log \ -kernel arch/x86/boot/bzImage \ -append "root=/dev/sda1 console=ttyS0" \ guest_disk.img Once guest OS is fully booted at qemu prompt: (qemu) device_add e1000 (check serial.log) it will cause NULL pointer dereference at: void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge) { struct pci_bus *parent = bridge->subordinate; BUG: kernel NULL pointer dereference, address: 0000000000000018 ? pci_assign_unassigned_bridge_resources+0x1f/0x260 enable_slot+0x21f/0x3e0 acpiphp_hotplug_notify+0x13d/0x260 acpi_device_hotplug+0xbc/0x540 acpi_hotplug_work_fn+0x15/0x20 process_one_work+0x1f7/0x370 worker_thread+0x45/0x3b0 The issue was discovered on Dell Inspiron 7352/0W6WV0 laptop with following sequence: 1. Suspend to RAM 2. Wake up with the same backtrace being observed: 3. 2nd suspend to RAM attempt makes laptop freeze Fix it by using __pci_bus_assign_resources() instead of pci_assign_unassigned_bridge_resources() as we used to do, but only in case when bus doesn't have a bridge associated (to cover for the case of ACPI event on hostbridge or non existing root port). That lets us keep hotplug on root bus working like it used to and at the same time keeps resource reassignment usable on root ports (and other 1st level bridges) that was fixed by 40613da52b13. Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") Link: https://lore.kernel.org/r/20230726123518.2361181-2-imammedo@redhat.com Reported-by: Woody Suwalski <terraluna977@gmail.com> Tested-by: Woody Suwalski <terraluna977@gmail.com> Tested-by: Michal Koutný <mkoutny@suse.com> Link: https://lore.kernel.org/r/11fc981c-af49-ce64-6b43-3e282728bd1a@gmail.com Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2023-08-08PCI: dwc: Provide deinit callback for i.MXMark Brown
The i.MX integration for the DesignWare PCI controller has a _host_exit() operation which undoes everything that the _host_init() operation does but does not wire this up as the host_deinit callback for the core, or call it in any path other than suspend. This means that if we ever unwind the initial probe of the device, for example because it fails, the regulator core complains that the regulators for the device were left enabled: imx6q-pcie 33800000.pcie: iATU: unroll T, 4 ob, 4 ib, align 64K, limit 16G imx6q-pcie 33800000.pcie: Phy link never came up imx6q-pcie 33800000.pcie: Phy link never came up imx6q-pcie: probe of 33800000.pcie failed with error -110 ------------[ cut here ]------------ WARNING: CPU: 2 PID: 46 at drivers/regulator/core.c:2396 _regulator_put+0x110/0x128 Wire up the callback so that the core can clean up after itself. Link: https://lore.kernel.org/r/20230731-pci-imx-regulator-cleanup-v2-1-fc8fa5c9893d@kernel.org Tested-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2023-08-08PCI: microchip: Re-partition code between probe() and init()Daire McNamara
Continuing to use pci_host_common_probe() for the PCIe Root Complex on PolarFire SoC is leading to an extremely large _init() function and some unnatural code flow. Re-partition the code so that some tasks are done in a _probe() routine, which calls pci_host_common_probe() and then use a much smaller _init() function, mainly to enable interrupts after address translation tables are set up. Link: https://lore.kernel.org/r/20230728131401.1615724-8-daire.mcnamara@microchip.com Signed-off-by: Daire McNamara <daire.mcnamara@microchip.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
2023-08-08PCI: microchip: Gather MSI information from hardware config registersDaire McNamara
The PCIe Root Complex on PolarFire SoC is configured at bitstream creation time using Libero. Key MSI-related parameters include the number of MSIs (1/2/4/8/16/32) and the MSI address. In the device driver, extract this information from hardware registers at init time, and use it to configure MSI system, including configuring MSI capability structure correctly in configuration space. Link: https://lore.kernel.org/r/20230728131401.1615724-7-daire.mcnamara@microchip.com Signed-off-by: Daire McNamara <daire.mcnamara@microchip.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
2023-08-08PCI: microchip: Clean up initialisation of interruptsDaire McNamara
Refactor interrupt handling in _init() function into disable_interrupts(), init_interrupts(), clear_sec_errors() and clear ded_errors() because current code is unwieldy and prone to bugs. Disable interrupts as soon as possible and only enable interrupts after address translation is setup to prevent spurious axi2pcie and pcie2axi translation errors being reported. Link: https://lore.kernel.org/r/20230728131401.1615724-6-daire.mcnamara@microchip.com Signed-off-by: Daire McNamara <daire.mcnamara@microchip.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>