summaryrefslogtreecommitdiff
path: root/drivers/pci/pci.c
AgeCommit message (Collapse)Author
2024-02-17Merge tag 'pci-v6.8-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Keep bridges in D0 if we need to poll downstream devices for PME to resolve a v6.6 regression where we failed to enumerate devices below bridges put in D3hot by runtime PM, e.g., NVMe drives connected via Thunderbolt or USB4 docks (Alex Williamson) - Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer * tag 'pci-v6.8-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: MAINTAINERS: Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer PCI: Fix active state requirement in PME polling
2024-02-09PCI: Fix active state requirement in PME pollingAlex Williamson
The commit noted in fixes added a bogus requirement that runtime PM managed devices need to be in the RPM_ACTIVE state for PME polling. In fact, only devices in low power states should be polled. However there's still a requirement that the device config space must be accessible, which has implications for both the current state of the polled device and the parent bridge, when present. It's not sufficient to assume the bridge remains in D0 and cases have been observed where the bridge passes the D0 test, but the PM state indicates RPM_SUSPENDING and config space of the polled device becomes inaccessible during pci_pme_wakeup(). Therefore, since the bridge is already effectively required to be in the RPM_ACTIVE state, formalize this in the code and elevate the PM usage count to maintain the state while polling the subordinate device. This resolves a regression reported in the bugzilla below where a Thunderbolt/USB4 hierarchy fails to scan for an attached NVMe endpoint downstream of a bridge in a D3hot power state. Link: https://lore.kernel.org/r/20240123185548.1040096-1-alex.williamson@redhat.com Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling") Reported-by: Sanath S <sanath.s@amd.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Sanath S <sanath.s@amd.com> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> Cc: Lukas Wunner <lukas@wunner.de> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
2024-01-31PCI/ASPM: Fix deadlock when enabling ASPMJohan Hovold
A last minute revert in 6.7-final introduced a potential deadlock when enabling ASPM during probe of Qualcomm PCIe controllers as reported by lockdep: ============================================ WARNING: possible recursive locking detected 6.7.0 #40 Not tainted -------------------------------------------- kworker/u16:5/90 is trying to acquire lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc but task is already holding lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(pci_bus_sem); lock(pci_bus_sem); *** DEADLOCK *** Call trace: print_deadlock_bug+0x25c/0x348 __lock_acquire+0x10a4/0x2064 lock_acquire+0x1e8/0x318 down_read+0x60/0x184 pcie_aspm_pm_state_change+0x58/0xdc pci_set_full_power_state+0xa8/0x114 pci_set_power_state+0xc4/0x120 qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom] pci_walk_bus+0x64/0xbc qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom] The deadlock can easily be reproduced on machines like the Lenovo ThinkPad X13s by adding a delay to increase the race window during asynchronous probe where another thread can take a write lock. Add a new pci_set_power_state_locked() and associated helper functions that can be called with the PCI bus semaphore held to avoid taking the read lock twice. Link: https://lore.kernel.org/r/ZZu0qx2cmn7IwTyQ@hovoldconsulting.com Link: https://lore.kernel.org/r/20240130100243.11011-1-johan+linaro@kernel.org Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> # 6.7
2024-01-18Merge tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds
Pull CXL (Compute Express Link) updates from Dan Williams: "The bulk of this update is support for enumerating the performance capabilities of CXL memory targets and connecting that to a platform CXL memory QoS class. Some follow-on work remains to hook up this data into core-mm policy, but that is saved for v6.9. The next significant update is unifying how CXL event records (things like background scrub errors) are processed between so called "firmware first" and native error record retrieval. The CXL driver handler that processes the record retrieved from the device mailbox is now the handler for that same record format coming from an EFI/ACPI notification source. This also contains miscellaneous feature updates, like Get Timestamp, and other fixups. Summary: - Add support for parsing the Coherent Device Attribute Table (CDAT) - Add support for calculating a platform CXL QoS class from CDAT data - Unify the tracing of EFI CXL Events with native CXL Events. - Add Get Timestamp support - Miscellaneous cleanups and fixups" * tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (41 commits) cxl/core: use sysfs_emit() for attr's _show() cxl/pci: Register for and process CPER events PCI: Introduce cleanup helpers for device reference counts and locks acpi/ghes: Process CXL Component Events cxl/events: Create a CXL event union cxl/events: Separate UUID from event structures cxl/events: Remove passing a UUID to known event traces cxl/events: Create common event UUID defines cxl/events: Promote CXL event structures to a core header cxl: Refactor to use __free() for cxl_root allocation in cxl_endpoint_port_probe() cxl: Refactor to use __free() for cxl_root allocation in cxl_find_nvdimm_bridge() cxl: Fix device reference leak in cxl_port_perf_data_calculate() cxl: Convert find_cxl_root() to return a 'struct cxl_root *' cxl: Introduce put_cxl_root() helper cxl/port: Fix missing target list lock cxl/port: Fix decoder initialization when nr_targets > interleave_ways cxl/region: fix x9 interleave typo cxl/trace: Pass UUID explicitly to event traces cxl/region: use %pap format to print resource_size_t cxl/region: Add dev_dbg() detail on failure to allocate HPA space ...
2024-01-17Merge tag 'pci-v6.8-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Reserve ECAM so we don't assign it to PCI BARs; this works around bugs where BIOS included ECAM in a PNP0A03 host bridge window, didn't reserve it via a PNP0C02 motherboard device, and didn't allocate space for SR-IOV VF BARs (Bjorn Helgaas) - Add MMCONFIG/ECAM debug logging (Bjorn Helgaas) - Rename 'MMCONFIG' to 'ECAM' to match spec usage (Bjorn Helgaas) - Log device type (Root Port, Switch Port, etc) during enumeration (Bjorn Helgaas) - Log bridges before downstream devices so the dmesg order is more logical (Bjorn Helgaas) - Log resource names (BAR 0, VF BAR 0, bridge window, etc) consistently instead of a mix of names and "reg 0x10" (Puranjay Mohan, Bjorn Helgaas) - Fix 64GT/s effective data rate calculation to use 1b/1b encoding rather than the 8b/10b or 128b/130b used by lower rates (Ilpo Järvinen) - Use PCI_HEADER_TYPE_* instead of literals in x86, powerpc, SCSI lpfc (Ilpo Järvinen) - Clean up open-coded PCIBIOS return code mangling (Ilpo Järvinen) Resource management: - Restructure pci_dev_for_each_resource() to avoid computing the address of an out-of-bounds array element (the bounds check was performed later so the element was never actually *read*, but it's nicer to avoid even computing an out-of-bounds address) (Andy Shevchenko) Driver binding: - Convert pci-host-common.c platform .remove() callback to .remove_new() returning 'void' since it's not useful to return error codes here (Uwe Kleine-König) - Convert exynos, keystone, kirin from .remove() to .remove_new(), which returns void instead of int (Uwe Kleine-König) - Drop unused struct pci_driver.node member (Mathias Krause) Virtualization: - Add ACS quirk for more Zhaoxin Root Ports (LeoLiuoc) Error handling: - Log AER errors as "Correctable" (not "Corrected") or "Uncorrectable" to match spec terminology (Bjorn Helgaas) - Decode Requester ID when no error info found instead of printing the raw hex value (Bjorn Helgaas) Endpoint framework: - Use a unique test pattern for each BAR in the pci_endpoint_test to make it easier to debug address translation issues (Niklas Cassel) Broadcom STB PCIe controller driver: - Add DT property "brcm,clkreq-mode" and driver support for different CLKREQ# modes to make ASPM L1.x states possible (Jim Quinlan) Freescale Layerscape PCIe controller driver: - Add suspend/resume support for Layerscape LS1043a and LS1021a, including software-managed PME_Turn_Off and transitions between L0, L2/L3_Ready Link states (Frank Li) MediaTek PCIe controller driver: - Clear MSI interrupt status before handler to avoid missing MSIs that occur after the handler (qizhong cheng) MediaTek PCIe Gen3 controller driver: - Update mediatek-gen3 translation window setup to handle MMIO space that is not a power of two in size (Jianjun Wang) Qualcomm PCIe controller driver: - Increase qcom iommu-map maxItems to accommodate SDX55 (five entries) and SDM845 (sixteen entries) (Krzysztof Kozlowski) - Describe qcom,pcie-sc8180x clocks and resets accurately (Krzysztof Kozlowski) - Describe qcom,pcie-sm8150 clocks and resets accurately (Krzysztof Kozlowski) - Correct the qcom "reset-name" property, previously incorrectly called "reset-names" (Krzysztof Kozlowski) - Document qcom,pcie-sm8650, based on qcom,pcie-sm8550 (Neil Armstrong) Renesas R-Car PCIe controller driver: - Replace of_device.h with explicit of.h include to untangle header usage (Rob Herring) - Add DT and driver support for optional miniPCIe 1.5v and 3.3v regulators on KingFisher (Wolfram Sang) SiFive FU740 PCIe controller driver: - Convert fu740 CONFIG_PCIE_FU740 dependency from SOC_SIFIVE to ARCH_SIFIVE (Conor Dooley) Synopsys DesignWare PCIe controller driver: - Align iATU mapping for endpoint MSI-X (Niklas Cassel) - Drop "host_" prefix from struct dw_pcie_host_ops members (Yoshihiro Shimoda) - Drop "ep_" prefix from struct dw_pcie_ep_ops members (Yoshihiro Shimoda) - Rename struct dw_pcie_ep_ops.func_conf_select() to .get_dbi_offset() to be more descriptive (Yoshihiro Shimoda) - Add Endpoint DBI accessors to encapsulate offset lookups (Yoshihiro Shimoda) TI J721E PCIe driver: - Add j721e DT and driver support for 'num-lanes' for devices that support x1, x2, or x4 Links (Matt Ranostay) - Add j721e DT compatible strings and driver support for j784s4 (Matt Ranostay) - Make TI J721E Kconfig depend on ARCH_K3 since the hardware is specific to those TI SoC parts (Peter Robinson) TI Keystone PCIe controller driver: - Hold power management references to all PHYs while enabling them to avoid a race when one provides clocks to others (Siddharth Vadapalli) Xilinx XDMA PCIe controller driver: - Remove redundant dev_err(), since platform_get_irq() and platform_get_irq_byname() already log errors (Yang Li) - Fix uninitialized symbols in xilinx_pl_dma_pcie_setup_irq() (Krzysztof Wilczyński) - Fix xilinx_pl_dma_pcie_init_irq_domain() error return when irq_domain_add_linear() fails (Harshit Mogalapalli) MicroSemi Switchtec management driver: - Do dma_mrpc cleanup during switchtec_pci_remove() to match its devm ioremapping in switchtec_pci_probe(). Previously the cleanup was done in stdev_release(), which used stale pointers if stdev->cdev happened to be open when the PCI device was removed (Daniel Stodden) Miscellaneous: - Convert interrupt terminology from "legacy" to "INTx" to be more specific and match spec terminology (Damien Le Moal) - In dw-xdata-pcie, pci_endpoint_test, and vmd, replace usage of deprecated ida_simple_*() API with ida_alloc() and ida_free() (Christophe JAILLET)" * tag 'pci-v6.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits) PCI: Fix kernel-doc issues PCI: brcmstb: Configure HW CLKREQ# mode appropriate for downstream device dt-bindings: PCI: brcmstb: Add property "brcm,clkreq-mode" PCI: mediatek-gen3: Fix translation window size calculation PCI: mediatek: Clear interrupt status before dispatching handler PCI: keystone: Fix race condition when initializing PHYs PCI: xilinx-xdma: Fix error code in xilinx_pl_dma_pcie_init_irq_domain() PCI: xilinx-xdma: Fix uninitialized symbols in xilinx_pl_dma_pcie_setup_irq() PCI: rcar-gen4: Fix -Wvoid-pointer-to-enum-cast error PCI: iproc: Fix -Wvoid-pointer-to-enum-cast warning PCI: dwc: Add dw_pcie_ep_{read,write}_dbi[2] helpers PCI: dwc: Rename .func_conf_select to .get_dbi_offset in struct dw_pcie_ep_ops PCI: dwc: Rename .ep_init to .init in struct dw_pcie_ep_ops PCI: dwc: Drop host prefix from struct dw_pcie_host_ops members misc: pci_endpoint_test: Use a unique test pattern for each BAR PCI: j721e: Make TI J721E depend on ARCH_K3 PCI: j721e: Add TI J784S4 PCIe configuration PCI/AER: Use explicit register sizes for struct members PCI/AER: Decode Requester ID when no error info found PCI/AER: Use 'Correctable' and 'Uncorrectable' spec terms for errors ...
2024-01-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c e009b2efb7a8 ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()") 0f2b21477988 ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL") https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-02Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"Bjorn Helgaas
This reverts commit 08d0cc5f34265d1a1e3031f319f594bd1970976c. Michael reported that when attempting to resume from suspend to RAM on ASUS mini PC PN51-BB757MDE1 (DMI model: MINIPC PN51-E1), 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") caused a 12-second delay with no output, followed by a reboot. Workarounds include: - Reverting 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") - Booting with "pcie_aspm=off" - Booting with "pcie_aspm.policy=performance" - "echo 0 | sudo tee /sys/bus/pci/devices/0000:03:00.0/link/l1_aspm" before suspending - Connecting a USB flash drive Link: https://lore.kernel.org/r/20240102232550.1751655-1-helgaas@kernel.org Fixes: 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") Reported-by: Michael Schaller <michael@5challer.de> Link: https://lore.kernel.org/r/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org>
2023-12-22cxl: Calculate and store PCI link latency for the downstream portsDave Jiang
The latency is calculated by dividing the flit size over the bandwidth. Add support to retrieve the flit size for the CXL switch device and calculate the latency of the PCIe link. Cache the latency number with cxl_dport. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319621931.2212653.6800240203604822886.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-15PCI: Use resource names in PCI log messagesPuranjay Mohan
Use the pci_resource_name() to get the name of the resource and use it while printing log messages. [bhelgaas: rename to match struct resource * names, also use names in other BAR messages] Link: https://lore.kernel.org/r/20211106112606.192563-3-puranjay12@gmail.com Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-15PCI: Update BAR # and window messagesPuranjay Mohan
The PCI log messages print the register offsets at some places and BAR numbers at other places. There is no uniformity in this logging mechanism. It would be better to print names than register offsets. Add a helper function that aids in printing more meaningful information about the BAR numbers like "VF BAR", "ROM", "bridge window", etc. This function can be called while printing PCI log messages. [bhelgaas: fold in Lukas' static array suggestion from https://lore.kernel.org/all/20211106115831.GA7452@wunner.de/] Link: https://lore.kernel.org/r/20211106112606.192563-2-puranjay12@gmail.com Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-11-18PCI: Add debug print for device ready delayIdo Schimmel
Currently, the time it took a PCI device to become ready after reset is only printed if it was longer than 1000ms ('PCI_RESET_WAIT'). However, for debugging purposes it is useful to know this time even if it was shorter. For example, with the device I am working on, hardware engineers asked to verify that it becomes ready on the first try (no delay). To that end, add a debug level print that can be enabled using dynamic debug. Example: # echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset # dmesg -c | grep ready # echo "file drivers/pci/pci.c +p" > /sys/kernel/debug/dynamic_debug/control # echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset # dmesg -c | grep ready [ 396.060335] mlxsw_spectrum4 0000:01:00.0: ready 0ms after bus reset # echo "file drivers/pci/pci.c -p" > /sys/kernel/debug/dynamic_debug/control # echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset # dmesg -c | grep ready Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-28Merge branch 'pci/field-get'Bjorn Helgaas
- Use FIELD_GET()/FIELD_PREP() when possible throughout drivers/pci/ (Ilpo Järvinen, Bjorn Helgaas) - Rework DPC control programming for clarity (Ilpo Järvinen) * pci/field-get: PCI/portdrv: Use FIELD_GET() PCI/VC: Use FIELD_GET() PCI/PTM: Use FIELD_GET() PCI/PME: Use FIELD_GET() PCI/ATS: Use FIELD_GET() PCI/ATS: Show PASID Capability register width in bitmasks PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirk PCI: Use FIELD_GET() PCI/MSI: Use FIELD_GET/PREP() PCI/DPC: Use defines with DPC reason fields PCI/DPC: Use defined fields with DPC_CTL register PCI/DPC: Use FIELD_GET() PCI: hotplug: Use FIELD_GET/PREP() PCI: dwc: Use FIELD_GET/PREP() PCI: cadence: Use FIELD_GET() PCI: Use FIELD_GET() to extract Link Width PCI: mvebu: Use FIELD_PREP() with Link Width PCI: tegra194: Use FIELD_GET()/FIELD_PREP() with Link Width fields # Conflicts: # drivers/pci/controller/dwc/pcie-tegra194.c
2023-10-28Merge branch 'pci/config-errs'Bjorn Helgaas
- Simplify config accessor error checking (Ilpo Järvinen) * pci/config-errs: scsi: ipr: Do PCI error checks on own line PCI: xgene: Do PCI error check on own line & keep return value PCI: Do error check on own line to split long "if" conditions atm: iphase: Do PCI error checks on own line sh: pci: Do PCI error check on own line alpha: Streamline convoluted PCI error handling
2023-10-24PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirkBjorn Helgaas
Use FIELD_GET() to remove dependences on the field position, i.e., the shift value. No functional change intended. Separate because this isn't as trivial as the other FIELD_GET() changes. See 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse") Link: https://lore.kernel.org/r/20231010204436.1000644-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@amd.com>
2023-10-24PCI: Use FIELD_GET()Bjorn Helgaas
Use FIELD_GET() and FIELD_PREP() to remove dependences on the field position, i.e., the shift value. No functional change intended. Link: https://lore.kernel.org/r/20231010204436.1000644-2-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-10-17PCI: Use FIELD_GET() to extract Link WidthIlpo Järvinen
Use FIELD_GET() to extract PCIe Negotiated and Maximum Link Width fields instead of custom masking and shifting. Link: https://lore.kernel.org/r/20230919125648.1920-7-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: drop duplicate include of <linux/bitfield.h>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2023-10-10PCI: Do error check on own line to split long "if" conditionsIlpo Järvinen
Placing PCI error code check inside "if" condition usually results in need to split lines. Combined with additional conditions the "if" condition becomes messy. Convert to the usual error handling pattern with an additional variable to improve code readability. In addition, reverse the logic in pci_find_vsec_capability() to get rid of &&. No functional changes intended. Link: https://lore.kernel.org/r/20230911125354.25501-5-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: PCI_POSSIBLE_ERROR()] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-10-03PCI: Use PCI_HEADER_TYPE_* instead of literalsIlpo Järvinen
Replace literals under drivers/pci/ with PCI_HEADER_TYPE_MASK, PCI_HEADER_TYPE_NORMAL, and PCI_HEADER_TYPE_MFD. Also replace !! boolean conversions with FIELD_GET(). Link: https://lore.kernel.org/r/20231003125300.5541-4-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> # for Renesas R-Car
2023-08-29Merge branch 'pci/misc'Bjorn Helgaas
- Reorder struct pci_dev to avoid holes and reduce size (Christophe JAILLET) - Change pdev->rom_attr_enabled to single bit since it's only a boolean value (Christophe JAILLET) - Use struct_size() in pirq_convert_irt_table() instead of hand-writing it (Christophe JAILLET) - Explicitly include correct DT includes to untangle headers (Rob Herring) - Fix a DOE race between destroy_work_on_stack() and the stack-allocated task->work struct going out of scope in pci_doe() (Ira Weiny) - Use pci_dev_id() when possible instead of manually composing ID from dev->bus->number and dev->devfn (Xiongfeng Wang, Zheng Zengkai) - Move pci_create_resource_files() declarations to linux/pci.h for alpha build warnings (Arnd Bergmann) - Remove unused hotplug function declarations (Yue Haibing) - Remove unused mvebu struct mvebu_pcie.busn (Pali Rohár) - Unexport pcie_port_bus_type (Bjorn Helgaas) - Remove unnecessary sysfs ID local variable initialization (Bjorn Helgaas) - Fix BAR value printk formatting to accommodate 32-bit values (Bjorn Helgaas) - Use consistent pointer types for config access syscall get_user() and put_user() uses (Bjorn Helgaas) - Simplify AER_RECOVER_RING_SIZE definition (Bjorn Helgaas) - Simplify pci_pio_to_address() (Bjorn Helgaas) - Simplify pci_dev_driver() (Bjorn Helgaas) - Fix pci_bus_resetable(), pci_slot_resetable() name typos (Bjorn Helgaas) - Fix code and doc typos and code formatting (Bjorn Helgaas) - Tidy config space save/restore messages (Bjorn Helgaas) * pci/misc: PCI: Tidy config space save/restore messages PCI: Fix code formatting inconsistencies PCI: Fix typos in docs and comments PCI: Fix pci_bus_resetable(), pci_slot_resetable() name typos PCI: Simplify pci_dev_driver() PCI: Simplify pci_pio_to_address() PCI/AER: Simplify AER_RECOVER_RING_SIZE definition PCI: Use consistent put_user() pointer types PCI: Fix printk field formatting PCI: Remove unnecessary initializations PCI: Unexport pcie_port_bus_type PCI: mvebu: Remove unused busn member PCI: Remove unused function declarations PCI/sysfs: Move declarations to linux/pci.h PCI/P2PDMA: Use pci_dev_id() to simplify the code PCI/IOV: Use pci_dev_id() to simplify the code PCI/AER: Use pci_dev_id() to simplify the code PCI: apple: Use pci_dev_id() to simplify the code PCI/DOE: Fix destroy_work_on_stack() race PCI: Explicitly include correct DT includes x86/PCI: Use struct_size() in pirq_convert_irt_table() PCI: Change pdev->rom_attr_enabled to single bit PCI: Reorder pci_dev fields to reduce holes
2023-08-29Merge branch 'pci/vpd'Bjorn Helgaas
- Ensure device is accessible before VPD access via sysfs (Alex Williamson) - Ensure device doesn't go to a low-power state while we're polling for PME (Alex Williamson) * pci/vpd: PCI: Fix runtime PM race with PME polling PCI/VPD: Add runtime power management to sysfs interface
2023-08-29Merge branch 'pci/pm'Bjorn Helgaas
- Only read PCI_PM_CTRL register when available, to avoid reading the wrong register and corrupting dev->current_state (Feiyang Chen) * pci/pm: PCI/PM: Only read PCI_PM_CTRL register when available
2023-08-25PCI/PM: Only read PCI_PM_CTRL register when availableFeiyang Chen
For a device with no Power Management Capability, pci_power_up() previously returned 0 (success) if the platform was able to put the device in D0, which led to pci_set_full_power_state() trying to read PCI_PM_CTRL, even though it doesn't exist. Since dev->pm_cap == 0 in this case, pci_set_full_power_state() actually read the wrong register, interpreted it as PCI_PM_CTRL, and corrupted dev->current_state. This led to messages like this in some cases: pci 0000:01:00.0: Refused to change power state from D3hot to D0 To prevent this, make pci_power_up() always return a negative failure code if the device lacks a Power Management Capability, even if non-PCI platform power management has been able to put the device in D0. The failure will prevent pci_set_full_power_state() from trying to access PCI_PM_CTRL. Fixes: e200904b275c ("PCI/PM: Split pci_power_up()") Link: https://lore.kernel.org/r/20230824013738.1894965-1-chenfeiyang@loongson.cn Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org> Cc: stable@vger.kernel.org # v5.19+
2023-08-25PCI: Tidy config space save/restore messagesBjorn Helgaas
Update config space save/restore debug messages so they line up better. Previously: nvme 0000:05:00.0: saving config space at offset 0x4 (reading 0x20100006) nvme 0000:05:00.0: saving config space at offset 0x8 (reading 0x1080200) nvme 0000:05:00.0: saving config space at offset 0xc (reading 0x0) nvme 0000:05:00.0: restoring config space at offset 0x4 (was 0x0, writing 0x20100006) Now: nvme 0000:05:00.0: save config 0x04: 0x20100006 nvme 0000:05:00.0: save config 0x08: 0x01080200 nvme 0000:05:00.0: save config 0x0c: 0x00000000 nvme 0000:05:00.0: restore config 0x04: 0x00000000 -> 0x20100006 No functional change intended. Enable these messages by setting CONFIG_DYNAMIC_DEBUG=y and adding 'dyndbg="file drivers/pci/* +p"' to kernel parameters. Link: https://lore.kernel.org/r/20230823191831.476579-1-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2023-08-25PCI: Fix typos in docs and commentsBjorn Helgaas
Fix typos in docs and comments. Link: https://lore.kernel.org/r/20230824193712.542167-11-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Fix pci_bus_resetable(), pci_slot_resetable() name typosBjorn Helgaas
Fix typos in the pci_bus_resetable() and pci_slot_resetable() function names. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-10-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25PCI: Simplify pci_pio_to_address()Bjorn Helgaas
Simplify pci_pio_to_address() by removing an unnecessary local variable. No functional change intended. Link: https://lore.kernel.org/r/20230824193712.542167-8-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-11PCI: Fix runtime PM race with PME pollingAlex Williamson
Testing that a device is not currently in a low power state provides no guarantees that the device is not imminently transitioning to such a state. Increment the PM usage counter before accessing the device. Since we don't wish to wake the device for PME polling, do so only if the device is already active by using pm_runtime_get_if_active(). Link: https://lore.kernel.org/r/20230803171233.3810944-3-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-08-10PCI: Make link retraining use RMW accessors for changing LNKCTLIlpo Järvinen
Don't assume that the device is fully under the control of PCI core. Use RMW capability accessors in link retraining which do proper locking to avoid losing concurrent updates to the register values. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: 4ec73791a64b ("PCI: Work around Pericom PCIe-to-PCI bridge Retrain Link erratum") Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/20230717120503.15276-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
2023-06-26Merge branch 'pci/pm'Bjorn Helgaas
- Reduce wait time for secondary bus to be ready to speed up resume (Mika Westerberg) - Avoid putting EloPOS E2/S2/H2 (as well as Elo i2) PCIe Ports in D3cold (Ondrej Zary) - Call _REG when transitioning D-states so AML that uses the PCI config space OpRegion works, which fixes some ASMedia GPIO controllers (Mario Limonciello) * pci/pm: PCI/ACPI: Call _REG when transitioning D-states PCI/ACPI: Validate acpi_pci_set_power_state() parameter PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold PCI/PM: Shorten pci_bridge_wait_for_secondary_bus() wait time for slow links
2023-06-26Merge branch 'pci/enumeration'Bjorn Helgaas
- Add PCI_EXT_CAP_ID_PL_32GT define (Ben Dooks) - Propagate firmware node by calling device_set_node() for better modularity (Andy Shevchenko) - Discover Data Link Layer Link Active Reporting earlier so quirks can take advantage of it (Maciej W. Rozycki) - Use cached Data Link Layer Link Active Reporting capability in pciehp, powerpc/eeh, and mlx5 (Maciej W. Rozycki) - Run quirk for devices that require OS to clear Retrain Link earlier, so later quirks can rely on it (Maciej W. Rozycki) - Export pcie_retrain_link() for use outside ASPM (Maciej W. Rozycki) - Add Data Link Layer Link Active Reporting as another way for pcie_retrain_link() to determine the link is up (Maciej W. Rozycki) - Work around link training failures (especially on the ASMedia ASM2824 switch) by training first at 2.5GT/s and then attempting higher rates (Maciej W. Rozycki) * pci/enumeration: PCI: Add failed link recovery for device reset events PCI: Work around PCIe link training failures PCI: Use pcie_wait_for_link_status() in pcie_wait_for_link_delay() PCI: Add support for polling DLLLA to pcie_retrain_link() PCI: Export pcie_retrain_link() for use outside ASPM PCI: Export PCIe link retrain timeout PCI: Execute quirk_enable_clear_retrain_link() earlier PCI/ASPM: Factor out waiting for link training to complete PCI/ASPM: Avoid unnecessary pcie_link_state use PCI/ASPM: Use distinct local vars in pcie_retrain_link() net/mlx5: Rely on dev->link_active_reporting powerpc/eeh: Rely on dev->link_active_reporting PCI: pciehp: Rely on dev->link_active_reporting PCI: Initialize dev->link_active_reporting earlier PCI: of: Propagate firmware node by calling device_set_node() PCI: Add PCI_EXT_CAP_ID_PL_32GT define # Conflicts: # drivers/pci/pcie/aspm.c
2023-06-23PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3coldOndrej Zary
The quirk for Elo i2 introduced in commit 92597f97a40b ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") is also needed by EloPOS E2/S2/H2 which uses the same Continental Z2 board. Change the quirk to match the board instead of system. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215715 Link: https://lore.kernel.org/r/20230614074253.22318-1-linux@zary.sk Signed-off-by: Ondrej Zary <linux@zary.sk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
2023-06-20PCI: Add failed link recovery for device reset eventsMaciej W. Rozycki
Request failed link recovery with any upstream PCIe bridge where a device has not come back after reset within PCI_RESET_WAIT time. Reset the polling interval if recovery succeeded, otherwise continue as usual. [bhelgaas: inline pcie_parent_link_retrain()] Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306111631050.64925@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20PCI: Work around PCIe link training failuresMaciej W. Rozycki
Attempt to handle cases such as with a downstream port of the ASMedia ASM2824 PCIe switch where link training never completes and the link continues switching between speeds indefinitely with the data link layer never reaching the active state. It has been observed with a downstream port of the ASMedia ASM2824 Gen 3 switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, P/N 41433, wired to a SiFive HiFive Unmatched board. In this setup the switches should negotiate a link speed of 5.0GT/s, falling back to 2.5GT/s if necessary. Instead the link continues oscillating between the two speeds, at the rate of 34-35 times per second, with link training reported repeatedly active ~84% of the time. Limiting the target link speed to 2.5GT/s with the upstream ASM2824 device makes the two switches communicate correctly. Removing the speed restriction afterwards makes the two devices switch to 5.0GT/s then. Make use of these observations and detect the inability to train the link by checking for the Data Link Layer Link Active status bit being off while the Link Bandwidth Management Status indicating that hardware has changed the link speed or width in an attempt to correct unreliable link operation. Restrict the speed to 2.5GT/s then with the Target Link Speed field, request a retrain and wait 200ms for the data link to go up. If this is successful, lift the restriction, letting the devices negotiate a higher speed. Also check for a 2.5GT/s speed restriction the firmware may have already arranged and lift it too with ports of devices known to continue working afterwards (currently only ASM2824), that already report their data link being up. [bhelgaas: reorder and squash stubs from https://lore.kernel.org/r/alpine.DEB.2.21.2306111619570.64925@angie.orcam.me.uk to avoid adding stubs that do nothing] Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203022037020.56670@angie.orcam.me.uk/ Link: https://source.denx.de/u-boot/u-boot/-/commit/a398a51ccc68 Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310038540.59226@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20PCI: Use pcie_wait_for_link_status() in pcie_wait_for_link_delay()Maciej W. Rozycki
Remove a DLLLA status bit polling loop from pcie_wait_for_link_delay() and call almost identical code in pcie_wait_for_link_status() instead. This reduces the lower bound on the polling interval from 10ms to 1ms, possibly increasing the CPU load on the system in favour to reducing the wait time. Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306111611170.64925@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20PCI: Add support for polling DLLLA to pcie_retrain_link()Maciej W. Rozycki
Let the caller of pcie_retrain_link() specify whether they want to use the LT bit or the DLLLA bit of the Link Status Register to determine if link training has completed. It is up to the caller to verify whether the use of the DLLLA bit, the implementation of which is optional, is valid for the device requested. Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306110310540.64925@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20PCI: Export pcie_retrain_link() for use outside ASPMMaciej W. Rozycki
Export pcie_retrain_link() for link retrain needs outside ASPM. Struct pcie_link_state is local to ASPM and only used by pcie_retrain_link() to get at the associated PCI device, so change the operand and adjust the lone call site accordingly. Document the interface. No functional change at this point. Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306110229010.64925@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20PCI: Export PCIe link retrain timeoutMaciej W. Rozycki
Convert LINK_RETRAIN_TIMEOUT from jiffies to milliseconds, accordingly rename to PCIE_LINK_RETRAIN_TIMEOUT_MS, and make available via "pci.h" for the PCI core to use. Use in pcie_wait_for_link_delay(). Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310030280.59226@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-06PCI/PM: Shorten pci_bridge_wait_for_secondary_bus() wait time for slow linksMika Westerberg
With slow links (<= 5GT/s) active link reporting is not mandatory, so if a device is disconnected during system sleep we might end up waiting for it to respond for ~60s, which slows down resume time. PCIe r6.0, sec 6.6.1, mandates that software must wait for at least 1s before it can assume a device is broken, so use that minimum requirement for slow links and bail out if the device doesn't respond within 1s. However, if the port supports active link reporting we can wait longer as we do with the fast links. This should make system resume time faster for slow links as well while still following the PCIe spec. While there move the PCI_RESET_WAIT constant into pci.c because it is not used outside of that file anymore. Link: https://lore.kernel.org/r/20230425064751.24951-1-mika.westerberg@linux.intel.com Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-04-27Merge tag 'driver-core-6.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.4-rc1. Once again, a busy development cycle, with lots of changes happening in the driver core in the quest to be able to move "struct bus" and "struct class" into read-only memory, a task now complete with these changes. This will make the future rust interactions with the driver core more "provably correct" as well as providing more obvious lifetime rules for all busses and classes in the kernel. The changes required for this did touch many individual classes and busses as many callbacks were changed to take const * parameters instead. All of these changes have been submitted to the various subsystem maintainers, giving them plenty of time to review, and most of them actually did so. Other than those changes, included in here are a small set of other things: - kobject logging improvements - cacheinfo improvements and updates - obligatory fw_devlink updates and fixes - documentation updates - device property cleanups and const * changes - firwmare loader dependency fixes. All of these have been in linux-next for a while with no reported problems" * tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits) device property: make device_property functions take const device * driver core: update comments in device_rename() driver core: Don't require dynamic_debug for initcall_debug probe timing firmware_loader: rework crypto dependencies firmware_loader: Strip off \n from customized path zram: fix up permission for the hot_add sysfs file cacheinfo: Add use_arch[|_cache]_info field/function arch_topology: Remove early cacheinfo error message if -ENOENT cacheinfo: Check cache properties are present in DT cacheinfo: Check sib_leaf in cache_leaves_are_shared() cacheinfo: Allow early level detection when DT/ACPI info is missing/broken cacheinfo: Add arm64 early level initializer implementation cacheinfo: Add arch specific early level initializer tty: make tty_class a static const structure driver core: class: remove struct class_interface * from callbacks driver core: class: mark the struct class in struct class_interface constant driver core: class: make class_register() take a const * driver core: class: mark class_release() as taking a const * driver core: remove incorrect comment for device_create* MIPS: vpe-cmp: remove module owner pointer from struct class usage. ...
2023-04-20Merge branch 'pci/resource'Bjorn Helgaas
- Add pci_dev_for_each_resource() and pci_bus_for_each_resource() iterators to simplify loops (Andy Shevchenko) * pci/resource: EISA: Drop unused pci_bus_for_each_resource() index argument PCI: Make pci_bus_for_each_resource() index optional PCI: Document pci_bus_for_each_resource() PCI: Introduce pci_dev_for_each_resource() PCI: Introduce pci_resource_n()
2023-04-11PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameterMika Westerberg
All callers of pci_bridge_wait_for_secondary_bus() supply a timeout of PCIE_RESET_READY_POLL_MS, so drop the parameter. Move the definition of PCIE_RESET_READY_POLL_MS into pci.c, the only user. [bhelgaas: extracted from https://lore.kernel.org/r/20230404052714.51315-3-mika.westerberg@linux.intel.com] Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-04-05PCI: Make pci_bus_for_each_resource() index optionalAndy Shevchenko
Refactor pci_bus_for_each_resource() in the same way as pci_dev_for_each_resource(). This allows the index to be hidden inside the implementation so the caller can omit it when it's not used otherwise. No functional changes intended. Link: https://lore.kernel.org/r/20230330162434.35055-6-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-03-23driver core: bus: mark the struct bus_type for sysfs callbacks as constantGreg Kroah-Hartman
struct bus_type should never be modified in a sysfs callback as there is nothing in the structure to modify, and frankly, the structure is almost never used in a sysfs callback, so mark it as constant to allow struct bus_type to be moved to read-only memory. Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Harald Freudenberger <freude@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hu Haowen <src.res@email.cn> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Stuart Yoder <stuyoder@gmail.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl Reviewed-by: Alex Shi <alexs@kernel.org> Acked-by: Iwona Winiarska <iwona.winiarska@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-24Merge tag 'pci-v6.3-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Rework portdrv shutdown so it disables interrupts but doesn't disable bus mastering, which leads to hangs on Loongson LS7A - Add mechanism to prevent Max_Read_Request_Size (MRRS) increases, again to avoid hardware issues on Loongson LS7A (and likely other devices based on DesignWare IP) - Ignore devices with a firmware (DT or ACPI) node that says the device is disabled Resource management: - Distribute spare resources to unconfigured hotplug bridges at boot-time (not just when hot-adding such a bridge), which makes hot-adding devices to docks work better. Tried this in v6.1 but had to revert for regressions, so try again - Fix root bus issue that dropped resources that happened to end at 0, e.g., [bus 00] PCI device hotplug: - Remove device locking when marking device as disconnected so this doesn't have to wait for concurrent driver bind/unbind to complete - Quirk more Qualcomm bridges that don't fully implement the PCIe Slot Status 'Command Completed' bit Power management: - Account for _S0W of the target bridge in acpi_pci_bridge_d3() so we don't miss hot-add notifications for USB4 docks, Thunderbolt, etc Reset: - Observe delay after reset, e.g., resuming from system sleep, regardless of whether a bridge can suspend to D3cold at runtime - Wait for secondary bus to become ready after a bridge reset Virtualization: - Avoid FLR on some AMD FCH AHCI adapters where it doesn't work - Allow independent IOMMU groups for some Wangxun NICs that prevent peer-to-peer transactions but don't advertise an ACS Capability Error handling: - Configure End-to-End-CRC (ECRC) only if Linux owns the AER Capability - Remove redundant Device Control Error Reporting Enable in the AER service driver since this is already done for all devices during enumeration ASPM: - Add pci_enable_link_state() interface to allow drivers to enable ASPM link state Endpoint framework: - Move dra7xx and tegra194 linkup processing from hard IRQ to threaded IRQ handler - Add a separate lock for endpoint controller list of endpoint function drivers to prevent deadlock in callbacks - Pass events from endpoint controller to endpoint function drivers via callbacks instead of notifiers Synopsys DesignWare eDMA controller driver (acked by Vinod): - Fix CPU vs PCI address issues - Fix source vs destination address issues - Fix issues with interleaved transfer semantics - Fix channel count initialization issue (issue still exists in several other drivers) - Clean up and improve debugfs usage so it will work on platforms with several eDMA devices Baikal T-1 PCIe controller driver: - Set a 64-bit DMA mask Freescale i.MX6 PCIe controller driver: - Add i.MX8MM, i.MX8MQ, i.MX8MP endpoint mode DT binding and driver support Intel VMD host bridge driver: - Add quirk to configure PCIe ASPM and LTR. This is normally done by BIOS, and will be for future products Marvell MVEBU PCIe controller driver: - Mark this driver as broken in Kconfig since bugs prevent its daily usage MediaTek MT7621 PCIe controller driver: - Delay PHY port initialization to improve boot reliability for ZBT WE1326, ZBT WF3526-P, and some Netgear models Qualcomm PCIe controller driver: - Add MSM8998 DT compatible string - Unify MSM8996 and MSM8998 clock orderings - Add SM8350 DT binding and driver support - Add IPQ8074 Gen3 DT binding and driver support - Correct qcom,perst-regs in DT binding - Add qcom_pcie_host_deinit() so the PHY is powered off and regulators and clocks are disabled on late host-init errors Socionext UniPhier Pro5 controller driver: - Clean up uniphier-ep reg, clocks, resets, and their names in DT binding Synopsys DesignWare PCIe controller driver: - Restrict coherent DMA mask to 32 bits for MSI, but allow controller drivers to set 64-bit streaming DMA mask - Add eDMA engine support in both Root Port and Endpoint controllers Miscellaneous: - Remove MODULE_LICENSE from boolean drivers so they don't look like modules so modprobe can complain about them" * tag 'pci-v6.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (86 commits) PCI: dwc: Add Root Port and Endpoint controller eDMA engine support PCI: bt1: Set 64-bit DMA mask PCI: dwc: Restrict only coherent DMA mask for MSI address allocation dmaengine: dw-edma: Prepare dw_edma_probe() for builtin callers dmaengine: dw-edma: Depend on DW_EDMA instead of selecting it dmaengine: dw-edma: Add mem-mapped LL-entries support PCI: Remove MODULE_LICENSE so boolean drivers don't look like modules PCI: hv: Drop duplicate PCI_MSI dependency PCI/P2PDMA: Annotate RCU dereference PCI/sysfs: Constify struct kobj_type pci_slot_ktype PCI: hotplug: Allow marking devices as disconnected during bind/unbind PCI: pciehp: Add Qualcomm quirk for Command Completed erratum PCI: qcom: Add IPQ8074 Gen3 port support dt-bindings: PCI: qcom: Add IPQ8074 Gen3 port dt-bindings: PCI: qcom: Sort compatibles alphabetically PCI: qcom: Fix host-init error handling PCI: qcom: Add SM8350 support dt-bindings: PCI: qcom: Add SM8350 dt-bindings: PCI: qcom-ep: Correct qcom,perst-regs dt-bindings: PCI: qcom: Unify MSM8996 and MSM8998 clock order ...
2023-02-22Merge branch 'pci/reset'Bjorn Helgaas
- Always observe reset delay when waking devices from D3cold, e.g., after system sleep, regardless of whether we're allowed to runtime-suspend to D3cold (Lukas Wunner) - Unify reset and resume delays to wait for downstream devices after a bridge reset (Lukas Wunner) - Wait for downstream devices after a DPC-induced bridge reset (Lukas Wunner) * pci/reset: PCI/DPC: Await readiness of secondary bus after reset PCI: Unify delay handling for reset and resume PCI/PM: Observe reset delay irrespective of bridge_d3
2023-02-10Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"Bjorn Helgaas
This reverts commit 4ff116d0d5fd8a025604b0802d93a2d5f4e465d1. Tasev Nikola and Mark Enriquez reported that resume from suspend was broken in v6.1-rc1. Tasev bisected to a47126ec29f5 ("PCI/PTM: Cache PTM Capability offset"), but we can't figure out how that could be related. Mark saw the same symptoms and bisected to 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"), which does have a connection: it restores L1 Substates configuration while ASPM L1 may be enabled: pci_restore_state pci_restore_aspm_l1ss_state aspm_program_l1ss pci_write_config_dword(PCI_L1SS_CTL1, ctl1) # L1SS restore pci_restore_pcie_state pcie_capability_write_word(PCI_EXP_LNKCTL, cap[i++]) # L1 restore which is a problem because PCIe r6.0, sec 5.5.4, requires that: If setting either or both of the enable bits for ASPM L1 PM Substates, both ports must be configured as described in this section while ASPM L1 is disabled. Separately, Thomas Witt reported that 5e85eba6f50d ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") broke suspend/resume, and it depends on 4ff116d0d5fd. Revert 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") to fix the resume issue and enable revert of 5e85eba6f50d to fix the issue Thomas reported. Note that reverting 4ff116d0d5fd means L1 Substates config may be lost on suspend/resume. As far as we know the system will use more power but will still *work* correctly. Fixes: 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877 Reported-by: Tasev Nikola <tasev.stefanoska@skynet.be> Reported-by: Mark Enriquez <enriquezmark36@gmail.com> Reported-by: Thomas Witt <kernel@witt.link> Tested-by: Mark Enriquez <enriquezmark36@gmail.com> Tested-by: Thomas Witt <kernel@witt.link> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v6.1+ Cc: Vidya Sagar <vidyas@nvidia.com>
2023-02-09PCI/DPC: Await readiness of secondary bus after resetLukas Wunner
pci_bridge_wait_for_secondary_bus() is called after a Secondary Bus Reset, but not after a DPC-induced Hot Reset. As a result, the delays prescribed by PCIe r6.0 sec 6.6.1 are not observed and devices on the secondary bus may be accessed before they're ready. One affected device is Intel's Ponte Vecchio HPC GPU. It comprises a PCIe switch whose upstream port is not immediately ready after reset. Because its config space is restored too early, it remains in D0uninitialized, its subordinate devices remain inaccessible and DPC recovery fails with messages such as: i915 0000:8c:00.0: can't change power state from D3cold to D0 (config space inaccessible) intel_vsec 0000:8e:00.1: can't change power state from D3cold to D0 (config space inaccessible) pcieport 0000:89:02.0: AER: device recovery failed Fix it. Link: https://lore.kernel.org/r/9f5ff00e1593d8d9a4b452398b98aa14d23fca11.1673769517.git.lukas@wunner.de Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org
2023-02-07PCI: Unify delay handling for reset and resumeLukas Wunner
Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait for devices on the secondary bus to become accessible after reset: Although it does call pci_dev_wait(), it erroneously passes the bridge's pci_dev rather than that of a child. The bridge of course is always accessible while its secondary bus is reset, so pci_dev_wait() returns immediately. Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait() function which is called from pci_bridge_secondary_bus_reset(): https://lore.kernel.org/linux-pci/20220523171517.32407-1-windy.bi.enflame@gmail.com/ However we already have pci_bridge_wait_for_secondary_bus() which does almost exactly what we need. So far it's only called on resume from D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8). Re-using it for Secondary Bus Resets is a leaner and more rational approach than introducing a new function. That only requires a few minor tweaks: - Amend pci_bridge_wait_for_secondary_bus() to await accessibility of the first device on the secondary bus by calling pci_dev_wait() after performing the prescribed delays. pci_dev_wait() needs two parameters, a reset reason and a timeout, which callers must now pass to pci_bridge_wait_for_secondary_bus(). The timeout is 1 sec for resume (PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit 821cdad5c46c ("PCI: Wait up to 60 seconds for device to become ready after FLR")). Introduce a PCI_RESET_WAIT macro for the 1 sec timeout. - Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or -ENOTTY on error for consumption by pci_bridge_secondary_bus_reset(). - Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which is now performed by pci_bridge_wait_for_secondary_bus(). A static delay this long is only necessary for Conventional PCI, so modern PCIe systems benefit from shorter reset times as a side effect. Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset") Link: https://lore.kernel.org/r/da77c92796b99ec568bd070cbe4725074a117038.1673769517.git.lukas@wunner.de Reported-by: Sheng Bi <windy.bi.enflame@gmail.com> Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org # v4.17+
2023-02-07PCI/PM: Observe reset delay irrespective of bridge_d3Lukas Wunner
If a PCI bridge is suspended to D3cold upon entering system sleep, resuming it entails a Fundamental Reset per PCIe r6.0 sec 5.8. The delay prescribed after a Fundamental Reset in PCIe r6.0 sec 6.6.1 is sought to be observed by: pci_pm_resume_noirq() pci_pm_bridge_power_up_actions() pci_bridge_wait_for_secondary_bus() However, pci_bridge_wait_for_secondary_bus() bails out if the bridge_d3 flag is not set. That flag indicates whether a bridge is allowed to suspend to D3cold at *runtime*. Hence *no* delay is observed on resume from system sleep if runtime D3cold is forbidden. That doesn't make any sense, so drop the bridge_d3 check from pci_bridge_wait_for_secondary_bus(). The purpose of the bridge_d3 check was probably to avoid delays if a bridge remained in D0 during suspend. However the sole caller of pci_bridge_wait_for_secondary_bus(), pci_pm_bridge_power_up_actions(), is only invoked if the previous power state was D3cold. Hence the additional bridge_d3 check seems superfluous. Fixes: ad9001f2f411 ("PCI/PM: Add missing link delays required by the PCIe spec") Link: https://lore.kernel.org/r/eb37fa345285ec8bacabbf06b020b803f77bdd3d.1673769517.git.lukas@wunner.de Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org # v5.5+
2023-02-01PCI: loongson: Prevent LS7A MRRS increasesHuacai Chen
Except for isochronous-configured devices, software may set Max_Read_Request_Size (MRRS) to any value up to 4096. If a device issues a read request with size greater than the completer's Max_Payload_Size (MPS), the completer is required to break the response into multiple completions. Instead of correctly responding with multiple completions to a large read request, some LS7A Root Ports respond with a Completer Abort. To prevent this, the MRRS must be limited to an implementation-specific value. The OS cannot detect that value, so rely on BIOS to configure MRRS before booting, and quirk the Root Ports so we never set an MRRS larger than that BIOS value for any downstream device. N.B. Hot-added devices are not configured by BIOS, and they power up with MRRS = 512 bytes, so these devices will be limited to 512 bytes. If the LS7A limit is smaller, those hot-added devices may not work correctly, but per [1], hotplug is not supported with this chipset revision. [1] https://lore.kernel.org/r/073638a7-ae68-2847-ac3d-29e5e760d6af@loongson.cn [bhelgaas: commit log] Link: https://bugzilla.kernel.org/show_bug.cgi?id=216884 Link: https://lore.kernel.org/r/20230201043018.778499-3-chenhuacai@loongson.cn Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>