linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-02-28	PCI: of: Use device_{add,remove}_of_node() to attach of_node to existing device	Herve Codina
	The commit 407d1a51921e ("PCI: Create device tree node for bridge") creates of_node for PCI devices. The newly created of_node is attached to an existing device. This is done setting directly pdev->dev.of_node in the code. Even if pdev->dev.of_node cannot be previously set, this doesn't handle the fwnode field of the struct device. Indeed, this field needs to be set if it hasn't already been set. device_{add,remove}_of_node() have been introduced to handle this case. Use them instead of the direct setting. Link: https://lore.kernel.org/r/20250224141356.36325-3-herve.codina@bootlin.com Signed-off-by: Herve Codina <herve.codina@bootlin.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
2025-02-28	PCI: brcmstb: Expand inbound window size up to 64GB	Stanimir Varbanov
	The BCM2712 memory map can support up to 64GB of system memory, thus expand the inbound window size in calculation helper function. The change is safe for the currently supported SoCs that have smaller inbound window sizes. Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com> Tested-by: Ivan T. Ivanov <iivanov@suse.de> Link: https://lore.kernel.org/r/20250224083559.47645-7-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-28	PCI: brcmstb: Reuse pcie_cfg_data structure	Stanimir Varbanov
	Instead of copying fields from the pcie_cfg_data structure to brcm_pcie, reference it directly. Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelil <florian.fainelli@broadcom.com> Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com> Tested-by: Ivan T. Ivanov <iivanov@suse.de> Link: https://lore.kernel.org/r/20250224083559.47645-6-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-28	PCI: brcmstb: Add a softdep to MIP MSI-X driver	Stanimir Varbanov
	Then the brcmstb PCIe driver and MIP MSI-X interrupt controller drivers are built as modules there could be a race in probing. To avoid this, add a softdep to MIP driver to guarantee that MIP driver will be load first. Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Ivan T. Ivanov <iivanov@suse.de> Link: https://lore.kernel.org/r/20250224083559.47645-5-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-24	PCI: hv: Switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap()	Yury Norov
	Calling cpumask_next_wrap_old() with starting CPU == nr_cpu_ids is effectively the same as request to find first CPU, starting from a given one and wrapping around if needed. cpumask_next_wrap() is a proper replacement for that. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2025-02-24	cpumask: deprecate cpumask_next_wrap()	Yury Norov
	The next patch aligns implementation of cpumask_next_wrap() with the find_next_bit_wrap(), and it changes function signature. To make the transition smooth, this patch deprecates current implementation by adding an _old suffix. The following patches switch current users to the new implementation one by one. No functional changes were intended. Signed-off-by: Yury Norov <yury.norov@gmail.com>
2025-02-24	PCI: qcom-ep: Enable EP mode support for SAR2130P	Dmitry Baryshkov
	Enable PCIe Endpoint mode support for the Qualcomm SAR2130P platform. This is needed, as it is not possible to use a compatible fallback on any other platform since SAR2130P uses slightly different set of clocks. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20250221-sar2130p-pci-v3-6-61a0fdfb75b4@linaro.org Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-24	PCI: brcmstb: Fix missing of_node_put() in brcm_pcie_probe()	Stanimir Varbanov
	A call to of_parse_phandle() is incrementing the refcount, and as such, the of_node_put() must be called when the reference is no longer needed. Thus, refactor the existing code and add a missing of_node_put() call following the check to ensure that "msi_np" matches "pcie->np" and after MSI initialization, but only if the MSI support is enabled system-wide. Cc: stable@vger.kernel.org # v5.10+ Fixes: 40ca1bf580ef ("PCI: brcmstb: Add MSI support") Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250122222955.1752778-1-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-24	PCI: qcom-ep: Mark BAR0/BAR2 as 64bit BARs and BAR1/BAR3 as RESERVED	Manivannan Sadhasivam
	On all Qcom endpoint SoCs, BAR0/BAR2 are 64bit BARs by default and software cannot change the type. So, mark the those BARs as 64bit BARs and also mark the successive BAR1/BAR3 as RESERVED BARs so that the EPF drivers cannot use them. Cc: stable+noautosel@kernel.org # depends on patch introducing only_64bit flag Fixes: f55fee56a631 ("PCI: qcom-ep: Add Qualcomm PCIe Endpoint controller driver") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241231130224.38206-3-manivannan.sadhasivam@linaro.org [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-22	x86/kaslr: Reduce KASLR entropy on most x86 systems	Balbir Singh
	When CONFIG_PCI_P2PDMA=y (which is basically enabled on all large x86 distros), it maps the PFN's via a ZONE_DEVICE mapping using devm_memremap_pages(). The mapped virtual address range corresponds to the pci_resource_start() of the BAR address and size corresponding to the BAR length. When KASLR is enabled, the direct map range of the kernel is reduced to the size of physical memory plus additional padding. If the BAR address is beyond this limit, PCI peer to peer DMA mappings fail. Fix this by not shrinking the size of the direct map when CONFIG_PCI_P2PDMA=y. This reduces the total available entropy, but it's better than the current work around of having to disable KASLR completely. [ mingo: Clarified the changelog to point out the broad impact ... ] Signed-off-by: Balbir Singh <balbirs@nvidia.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci/Kconfig Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/ Link: https://lore.kernel.org/r/20250206234234.1912585-1-balbirs@nvidia.com -- arch/x86/mm/kaslr.c \| 10 ++++++++-- drivers/pci/Kconfig \| 6 ++++++ 2 files changed, 14 insertions(+), 2 deletions(-)
2025-02-22	PCI: cpcihp: Remove unused .get_power() and .set_power()	Guilherme Giacomo Simoes
	The .get_power() and .set_power() function pointers in struct cpci_hp_controller_ops were declared but never implemented by any driver. Thus, to improve code readability and reduce resource usage, remove these pointers and the code that has never been used. Link: https://lore.kernel.org/r/20250217185638.398925-1-trintaeoitogc@gmail.com Signed-off-by: Guilherme Giacomo Simoes <trintaeoitogc@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-21	PCI/ERR: Handle TLP Log in Flit mode	Ilpo Järvinen
	Flit mode introduced in PCIe r6.0 alters how the TLP Header Log is presented through AER and DPC Capability registers. The TLP Prefix Log Register is not present with Flit mode, and the register becomes an extension of the TLP Header Log (PCIe r6.1 secs 7.8.4.12 & 7.9.14.13). Adapt pcie_read_tlp_log() and struct pcie_tlp_log to read and store the extended TLP Header Log when the Link is in Flit mode. As the Prefix Log and Extended TLP Header are not present at the same time, a C union can be used. Determining whether the error occurred while the Link was in Flit mode is a bit complicated. In case of AER, the Advanced Error Capabilities and Control Register directly tells whether the error was logged in Flit mode or not (PCIe r6.1 sec 7.8.4.7). The DPC Capability (PCIe r6.1 sec 7.9.14), unfortunately, does not contain the same information. Unlike AER, the DPC Capability does not provide a way to discern whether the error was logged in Flit mode (this is confirmed by PCI WG to be an oversight in the spec). DPC will bring the Link down immediately following an error, which makes it impossible to acquire the Flit Mode Status directly from the Link Status 2 register because Flit Mode Status is only set in certain Link states (PCIe r6.1 sec 7.5.3.20). As a workaround, use the flit_mode value stored into the struct pci_bus. Link: https://lore.kernel.org/r/20250207161836.2755-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-21	PCI: Track Flit Mode Status & print it with link status	Ilpo Järvinen
	PCIe r6.0 added Flit mode, which mainly alters HW behavior, but there are some OS visible changes. The OS visible changes include differences in the layout of some capabilities and interpretation of the TLP headers (in diagnostics situations). To be able to determine which mode the PCIe Link is using, store the Flit Mode Status (PCIe r6.1 sec 7.5.3.20) information in addition to the Link speed into struct pci_bus in pcie_update_link_speed(). Link: https://lore.kernel.org/r/20250207161836.2755-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: use unsigned int:1 instead of bool, update flit_mode setting] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-21	PCI/AER: Descope pci_printk() to aer_printk()	Ilpo Järvinen
	include/linux/pci.h provides low-level pci_printk() interface that is only used by AER because it needs to print the same message with different levels depending on the error severity. No other PCI code uses that functionality and calls pci_<level>() logging functions directly with the appropriate level. Descope pci_printk() into AER as aer_printk(). Link: https://lore.kernel.org/r/20241216161012.1774-5-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: retain pci_printk() for now since shpchp still uses it and I moved those patches to a different branch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-21	PCI/ACS: Fix 'pci=config_acs=' parameter	Tushar Dave
	Commit 47c8846a49ba ("PCI: Extend ACS configurability") introduced bugs that fail to configure ACS ctrl to the value specified by the kernel parameter. Essentially there are two bugs: 1) When ACS is configured for multiple PCI devices using 'config_acs' kernel parameter, it results into error "PCI: Can't parse ACS command line parameter". This is due to a bug that doesn't preserve the ACS mask, but instead overwrites the mask with value 0. For example, using 'config_acs' to configure ACS ctrl for multiple BDFs fails: Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p" PCI: Can't parse ACS command line parameter pci 0020:02:00.0: ACS mask = 0x007f pci 0020:02:00.0: ACS flags = 0x007b pci 0020:02:00.0: Configured ACS to 0x007b After this fix: Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p" pci 0020:02:00.0: ACS mask = 0x007f pci 0020:02:00.0: ACS flags = 0x007b pci 0020:02:00.0: ACS control = 0x005f pci 0020:02:00.0: ACS fw_ctrl = 0x0053 pci 0020:02:00.0: Configured ACS to 0x007b pci 0039:00:00.0: ACS mask = 0x0070 pci 0039:00:00.0: ACS flags = 0x0050 pci 0039:00:00.0: ACS control = 0x001d pci 0039:00:00.0: ACS fw_ctrl = 0x0000 pci 0039:00:00.0: Configured ACS to 0x0050 2) In the bit manipulation logic, we copy the bit from the firmware settings when mask bit 0. For example, 'disable_acs_redir' fails to clear all three ACS P2P redir bits due to the wrong bit fiddling: Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p" pci 0020:02:00.0: ACS mask = 0x002c pci 0020:02:00.0: ACS flags = 0xffd3 pci 0020:02:00.0: Configured ACS to 0xfffb pci 0030:02:00.0: ACS mask = 0x002c pci 0030:02:00.0: ACS flags = 0xffd3 pci 0030:02:00.0: Configured ACS to 0xffdf pci 0039:00:00.0: ACS mask = 0x002c pci 0039:00:00.0: ACS flags = 0xffd3 pci 0039:00:00.0: Configured ACS to 0xffd3 After this fix: Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p" pci 0020:02:00.0: ACS mask = 0x002c pci 0020:02:00.0: ACS flags = 0xffd3 pci 0020:02:00.0: ACS control = 0x007f pci 0020:02:00.0: ACS fw_ctrl = 0x007b pci 0020:02:00.0: Configured ACS to 0x0053 pci 0030:02:00.0: ACS mask = 0x002c pci 0030:02:00.0: ACS flags = 0xffd3 pci 0030:02:00.0: ACS control = 0x005f pci 0030:02:00.0: ACS fw_ctrl = 0x005f pci 0030:02:00.0: Configured ACS to 0x0053 pci 0039:00:00.0: ACS mask = 0x002c pci 0039:00:00.0: ACS flags = 0xffd3 pci 0039:00:00.0: ACS control = 0x001d pci 0039:00:00.0: ACS fw_ctrl = 0x0000 pci 0039:00:00.0: Configured ACS to 0x0000 Link: https://lore.kernel.org/r/20250207030338.456887-1-tdave@nvidia.com Fixes: 47c8846a49ba ("PCI: Extend ACS configurability") Signed-off-by: Tushar Dave <tdave@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2025-02-21	PCI: epf-mhi: Update device ID for SA8775P	Mrinmay Sarkar
	Update device ID for the Qcom SA8775P SoC. Signed-off-by: Mrinmay Sarkar <quic_msarkar@quicinc.com> Link: https://lore.kernel.org/r/20241205065422.2515086-3-quic_msarkar@quicinc.com [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-21	PCI: mediatek-gen3: Remove leftover mac_reset assert for Airoha EN7581 SoC	Lorenzo Bianconi
	Remove a leftover assert for mac_reset line in mtk_pcie_en7581_power_up(). This is not harmful since EN7581 does not requires mac_reset and mac_reset is not defined in EN7581 device tree. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Link: https://lore.kernel.org/r/20250201-pcie-en7581-remove-mac_reset-v2-1-a06786cdc683@kernel.org [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-21	PCI/pwrctrl: Add pwrctrl driver for PCI slots	Manivannan Sadhasivam
	This driver is used to control the power state of the devices attached to the PCI slots. Currently, it controls the voltage rails of the PCI slots defined in the devicetree node of the root port. The voltage rails for PCI slots are documented in the DT-schema: https://github.com/devicetree-org/dt-schema/blob/v2024.11/dtschema/schemas/pci/pci-bus-common.yaml#L153 Since this driver has to work with different kind of slots (PCIe x1/x4/x8/x16, Mini PCIe, PCI, etc.), the driver is thus using the of_regulator_bulk_get_all() API to obtain the voltage regulators defined in the DT node, instead of hardcoding them. As such, the DT node of the root port should define the relevant supply properties corresponding to the voltage rails of the PCI slot. Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-5-827473c8fbf4@linaro.org [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20	PCI: hv: Correct a comment	Easwar Hariharan
	The VF driver controls an endpoint attached to the PCI HyperV controller. An invalidation sent by the PF driver in the host would be delivered to the endpoint driver by the controller driver. Thus, correct the wording within the comment. Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250207190716.89995-1-eahariha@linux.microsoft.com [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20	PCI/pwrctrl: Skip scanning for the device further if pwrctrl device is created	Manivannan Sadhasivam
	The pwrctrl core will rescan the bus once the device is powered on. So there is no need to continue scanning for the device further. Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-3-827473c8fbf4@linaro.org Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20	PCI/pwrctrl: Move pci_pwrctrl_unregister() to pci_destroy_dev()	Manivannan Sadhasivam
	The PCI core will try to access the devices even after pci_stop_dev() for things like Data Object Exchange (DOE), ASPM, etc. So, move pci_pwrctrl_unregister() to the near end of pci_destroy_dev() to make sure that the devices are powered down only after the PCI core is done with them. Suggested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Lukas Wunner <lukas@wunner.de> Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-2-827473c8fbf4@linaro.org [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20	PCI/pwrctrl: Move creation of pwrctrl devices to pci_scan_device()	Manivannan Sadhasivam
	Current way of creating pwrctrl devices requires iterating through the child devicetree nodes of the PCI bridge in pci_pwrctrl_create_devices(). Even though it works, it creates confusion as there is no symmetry between this and pci_pwrctrl_unregister() function that removes the pwrctrl devices. So to make these two functions symmetric, move the creation of pwrctrl devices to pci_scan_device(). During the scan of each device in a slot, the devicetree node (if exists) for the PCI device will be checked. If it has the supplies populated, then the pwrctrl device will be created. Since the PCI device scan happens so early, there would be no "struct pci_dev" available for the device. So the host bridge is used as the parent of all pwrctrl devices. One nice side effect of this move is that, it is now possible to have pwrctrl devices for PCI bridges as well (to control the supplies of PCI slots). Suggested-by: Lukas Wunner <lukas@wunner.de> Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-1-827473c8fbf4@linaro.org [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20	PCI/ASPM: Fix link state exit during switch upstream function removal	Daniel Stodden
	Before 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free"), we would free the ASPM link only after the last function on the bus pertaining to the given link was removed. That was too late. If function 0 is removed before sibling function, link->downstream would point to free'd memory after. After above change, we freed the ASPM parent link state upon any function removal on the bus pertaining to a given link. That is too early. If the link is to a PCIe switch with MFD on the upstream port, then removing functions other than 0 first would free a link which still remains parent_link to the remaining downstream ports. The resulting GPFs are especially frequent during hot-unplug, because pciehp removes devices on the link bus in reverse order. On that switch, function 0 is the virtual P2P bridge to the internal bus. Free exactly when function 0 is removed -- before the parent link is obsolete, but after all subordinate links are gone. Link: https://lore.kernel.org/r/e12898835f25234561c9d7de4435590d957b85d9.1734924854.git.dns@arista.com Fixes: 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free") Signed-off-by: Daniel Stodden <dns@arista.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-19	PCI: shpchp: Remove 'shpchp_debug' module parameter	Ilpo Järvinen
	The "shpchp_debug" module parameter is used to enable debug logging. The generic ability to turn on/off debug prints dynamically covers this use case already so there is no need for module specific debug handling. The ctrl_dbg() wrapper also uses a low-level pci_printk() despite always using KERN_DEBUG level. Remove "shpchp_debug" parameter and convert ctrl_dbg() to use pci_dbg(). From now on, shpchp can be debugged using the normal dynamic debugger by setting CONFIG_DYNAMIC_DEBUG=y and then either adding to kernel cmdline: dyndbg="file drivers/pci/hotplug/shpchp* +p" or using this command on a running kernel: echo 'file drivers/pci/hotplug/shpchp* +p' > /sys/kernel/debug/dynamic_debug/control Link: https://lore.kernel.org/r/20250217095550.2789-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-19	PCI: shpchp: Remove unused logging wrappers	Ilpo Järvinen
	The shpchp hotplug driver defines logging wrapper with generic names which are just duplicates of existing generic printk() wrappers. They are also unused so remove them. Link: https://lore.kernel.org/r/20250217095550.2789-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-19	PCI: shpchp: Change dbg() -> ctrl_dbg()	Ilpo Järvinen
	Convert the last user of dbg() to use ctrl_dbg(). Link: https://lore.kernel.org/r/20241216161012.1774-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-19	PCI: shpchp: Remove logging from module init/exit functions	Ilpo Järvinen
	The logging in shpchp module init/exit functions is not very useful. Remove it. Link: https://lore.kernel.org/r/20241216161012.1774-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-19	PM: sleep: Use DPM_FLAG_SMART_SUSPEND conditionally	Rafael J. Wysocki
	A recent discussion has revealed that using DPM_FLAG_SMART_SUSPEND unconditionally is generally problematic because it may lead to situations in which the device's runtime PM information is internally inconsistent or does not reflect its real state [1]. For this reason, change the handling of DPM_FLAG_SMART_SUSPEND so that it is only taken into account if it is consistently set by the drivers of all devices having any PM callbacks throughout dependency graphs in accordance with the following rules: - The "smart suspend" feature is only enabled for devices whose drivers ask for it (that is, set DPM_FLAG_SMART_SUSPEND) and for devices without PM callbacks unless they have never had runtime PM enabled. - The "smart suspend" feature is not enabled for a device if it has not been enabled for the device's parent unless the parent does not take children into account or it has never had runtime PM enabled. - The "smart suspend" feature is not enabled for a device if it has not been enabled for one of the device's suppliers taking runtime PM into account unless that supplier has never had runtime PM enabled. Namely, introduce a new device PM flag called smart_suspend that is only set if the above conditions are met and update all DPM_FLAG_SMART_SUSPEND users to check power.smart_suspend instead of directly checking the latter. At the same time, drop the power.set_active flage introduced recently in commit 3775fc538f53 ("PM: sleep: core: Synchronize runtime PM status of parents and children") because it is now sufficient to check power.smart_suspend along with the dev_pm_skip_resume() return value to decide whether or not pm_runtime_set_active() needs to be called for the device. Link: https://lore.kernel.org/linux-pm/CAPDyKFroyU3YDSfw_Y6k3giVfajg3NQGwNWeteJWqpW29BojhQ@mail.gmail.com/ [1] Fixes: 7585946243d6 ("PM: sleep: core: Restrict power.set_active propagation") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci Link: https://patch.msgid.link/1914558.tdWV9SEqCh@rjwysocki.net
2025-02-18	PCI: Rework optional resource handling	Ilpo Järvinen
	Remove and rescan cycle can result in failure to assign a bridge window if it becomes larger than before the remove. The bridge window size will include space for disabled Expansion ROM, which can causes the bridge window to not fit anymore into the same address space slot on rescan if the Expansion ROM resource was not assigned before the remove. In addition, the optional resource handling is not internally consistent. The resource fitting logic supports three main types of optional resources: - IOV BARs - Expansion ROMs - Bridge window size variation due to optional resources In addition to the above, resizable BARs beyond their current size will require handling optional variation in resource sizes within the resource fitting algorithm (not yet done by the resource fitting code). There are multiple inconsistencies related to optional resource handling: a) The allocation failure of disabled expansion ROM requires special case inside assign_requested_resources_sorted(). b) The optionality of disabled expansion ROM is not considered during bridge window sizing in pbus_size_mem(). c) Setting resource size to zero for optional resource in pbus_size_mem() is problematic because it makes also the alignment invalid, which is checked by pdev_sort_resources(). Optional IOV resources have their size set to zero by pbus_size_mem() but the information about size is stored externally in struct pci_sriov and complex call-chain trickery in pci_resource_alignment() ensures IOV resources return a valid alignment despite having zero resource size. A solution that is specific to IOV resources makes it hard to use the same solution for other types of resources such as expansion ROM. Simply changing pbus_size_mem() is not sufficient to fully address the main issue because it would introduce disparity between bridge window sizing and resource allocation. Due to size-based ordering of the resource list during assignment loop, an Expansion ROM resource could steal space from some other resource and make the other resource not fit if the Expansion ROM is larger than the other resource. Thus, the resource assignment functions need to be changed as well. Make optional resource handling more straightforward. Use pci_resource_is_optional() to determine if a resource is optional in both bridge window sizing and assignment failure classification to ensure they always align. Indicate with a parameter to assign_requested_resources_sorted() whether it should attempt to allocate optional resources or not. Always try first to assign all resources (also when realloc_head is not provided). This is required for calls from pci_assign_unassigned_root_bus_resources() that provide realloc_head only with some of its iterations. Non-bridge-window optional resources in realloc_head now have add_size 0. This condition has to be detected in reassign_resources_sorted() before reassigning them (which would fail as there is no size change). Removing add_size=0 optional resources entirely from realloc_head might eventually be doable but further rework in __assign_resources_sorted() is needed first to support such a change. Link: https://lore.kernel.org/r/20241216175632.4175-26-ilpo.jarvinen@linux.intel.com Reported-by: Jia Yao <jia.yao@intel.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219547 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jia Yao <jia.yao@intel.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Perform reset_resource() and build fail list in sync	Ilpo Järvinen
	Resetting a resource is problematic as it prevents attempting to allocate the resource later, unless something in between restores the resource. Similarly, if fail_head does not contain all resources that were reset, those resources cannot be restored later. The entire reset/restore cycle adds complexity and leaving resources in the reset state causes issues to other code such as for checks done in pci_enable_resources(). Take a small step towards not resetting resources by delaying reset until the end of resource assignment and build failure list (fail_head) in sync with the reset to avoid leaving behind resources that cannot be restored (for the case where the caller provides fail_head in the first place to allow restore somewhere in the callchain, as is not all callers pass non-NULL fail_head). Leave the Expansion ROM check temporarily in place while building the failure list until an upcoming change that reworks optional resource handling. Ideally, whole resource reset could be removed but doing that in one step would be non-tractable due to complexity of all related code. Link: https://lore.kernel.org/r/20241216175632.4175-25-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Use res->parent to check if resource is assigned	Ilpo Järvinen
	reassign_resources_sorted() uses resource_size() to select between pci_assign_resource() and pci_reassign_resource(). Due to twisted way bridge window sizing in pbus_size_mem() sets resource sizes to 0, it works to match into IOV resources but that is going to be changed by an upcoming change. Replace resource_size() check with res->parent check that is the true dividing line in between whether assign or reassign function should be used for the resource. Link: https://lore.kernel.org/r/20241216175632.4175-24-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Add debug print when releasing resources before retry	Ilpo Järvinen
	PCI resource fitting is somewhat hard to track because it performs many actions without logging them. In the case inside __assign_resources_sorted(), the resources are released before resource assignment is going to be retried in a different order. That is just one level of retries the resource fitting performs overall so tracking it through repeated assignments or failures of a resource gets messy rather quickly. Simply announce the release explicitly using pci_dbg() so it is clear what is going on with each resource. Link: https://lore.kernel.org/r/20241216175632.4175-23-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Indicate optional resource assignment failures	Ilpo Järvinen
	Add pci_dbg() to note that an assignment failure was for an optional resource and reword existing message about resource resize to say the change was optional. Link: https://lore.kernel.org/r/20241216175632.4175-22-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Always have realloc_head in __assign_resources_sorted()	Ilpo Järvinen
	Add a dummy list to always have a non-NULL realloc head in __assign_resources_sorted() as it allows only checking list_empty(). In future, it would be good to ensure all callers provide a valid realloc_head but that is relatively complex to do in practice and not necessary for the subsequent optional resource handling fix. Link: https://lore.kernel.org/r/20241216175632.4175-21-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Extend enable to check for any optional resource	Ilpo Järvinen
	pci_enable_resources() checks if device's io and mem resources are all assigned and disallows enable if any resource failed to assign () but makes an exception for the case of disabled extension ROM. There are other optional resources, however. Add pci_resource_is_optional() and use it instead of pci_resource_is_disabled_rom() to cover also IOV resources that are also optional as per pbus_size_mem(). As there will be more users of pci_resource_is_optional() inside setup-bus.c in changes coming up after this one, the function is placed there. () In practice, resource fitting code calls reset_resource() for any resource it fails to assign which clears resource's ->flags causing pci_enable_resources() to never detect failed resource assignments. This seems undesirable internal logic inconsistency, effectively reset_resource() prevents pci_enable_resources() from functioning as intended. This is one step of many that will be needed towards removing reset_resource(). Link: https://lore.kernel.org/r/20241216175632.4175-20-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Add restore_dev_resource()	Ilpo Järvinen
	Resource fitting needs to restore the saved dev resources in a few places. Add a restore_dev_resource() helper for that. Link: https://lore.kernel.org/r/20241216175632.4175-19-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Remove incorrect comment from pci_reassign_resource()	Ilpo Järvinen
	Commit a4ac9fea016f ("PCI : Calculate right add_size") removed including min_align into new_size in pci_reassign_resource() which is the correct thing to do. However, it also added a snakeoil comment that the resource would already be aligned with min_align which is incorrect. A resource that is assigned earlier is aligned with the old alignment, NOT with the new requested alignment (min_align) until later deep within the reassignment callchain. Thus, remove the incorrect comment. Link: https://lore.kernel.org/r/20241216175632.4175-18-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com> Cc: Yinghai Lu <yinghai@kernel.org>
2025-02-18	PCI: Consolidate assignment loop next round preparation	Ilpo Järvinen
	pci_assign_unassigned_root_bus_resources() and pci_assign_unassigned_bridge_resources() have a loop that may perform several rounds to assign resources. The code to prepare for the next round is identical. Consolidate the code that prepares for the next assignment round into pci_prepare_next_assign_round(). Link: https://lore.kernel.org/r/20241216175632.4175-17-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Rename retval to ret	Ilpo Järvinen
	Rename 'retval' to 'ret' in pci_assign_unassigned_bridge_resources(). Link: https://lore.kernel.org/r/20241216175632.4175-16-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Use while loop and break instead of gotos	Ilpo Järvinen
	pci_assign_unassigned_root_bus_resources() and pci_assign_unassigned_bridge_resources() contain ad hoc loops using backwards goto and gotos out of the loop. Replace them with while loops and break statements. While reindenting the loop bodies, add braces & remove parenthesis. Link: https://lore.kernel.org/r/20241216175632.4175-15-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Refactor pdev_sort_resources() & __dev_sort_resources()	Ilpo Järvinen
	Reduce level of call nesting by calling pdev_sort_resources() directly and by moving the tests done inside __dev_sort_resources() into pdev_resources_assignable() helper. Link: https://lore.kernel.org/r/20241216175632.4175-14-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Converge return paths in __assign_resources_sorted()	Ilpo Järvinen
	All return paths want to free head list in __assign_resources_sorted(), so add a label and use goto. Link: https://lore.kernel.org/r/20241216175632.4175-13-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Add dev & res local variables to resource assignment funcs	Ilpo Järvinen
	Many PCI resource allocation related functions process struct pci_dev_resource items which hold the struct pci_dev and resource pointers. Reduce the number of lines that need indirection by adding 'dev' and 'res' local variable to hold the pointers. Link: https://lore.kernel.org/r/20241216175632.4175-12-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Add pci_resource_num() helper	Ilpo Järvinen
	A few places in PCI code, mainly in setup-bus.c, need to reverse lookup the index of a resource in pci_dev's resource array. Create pci_resource_num() helper to avoid repeating the pointer arithmetic trick used to calculate the index. Link: https://lore.kernel.org/r/20241216175632.4175-11-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Check resource_size() separately	Ilpo Järvinen
	Instead of chaining logic inside if () condition so that multiple lines are required, make !resource_size() a separate check and use continue. Link: https://lore.kernel.org/r/20241216175632.4175-10-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Add pci_resource_is_iov() to identify IOV resources	Michał Winiarski
	There are multiple places where special handling is required for IOV resources. Extract the identification of IOV resources to pci_resource_is_iov() and drop a few ifdefs. Link: https://lore.kernel.org/r/20241216175632.4175-9-ilpo.jarvinen@linux.intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Use resource_set_{range,size}() helpers	Ilpo Järvinen
	A few sites that could use resource_set_range/size() in setup-bus.c were not picked up earlier due to them no matching the usual pattern. Convert them now. These are more cases similar to 783602c920e9 ("PCI: Use resource_set_{range,size}() helpers"). Link: https://lore.kernel.org/r/20241216175632.4175-8-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: add 783602c920e9 history] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Use SZ_* instead of literals in setup-bus.c	Ilpo Järvinen
	Convert literals in setup-bus.c to SZ_* defines that make the size more human readable. As the code is now self-explanatory, eliminate comments about the size. Link: https://lore.kernel.org/r/20241216175632.4175-7-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Fix old_size lower bound in calculate_iosize() too	Ilpo Järvinen
	Commit 903534fa7d30 ("PCI: Fix resource double counting on remove & rescan") fixed double counting of mem resources because of old_size being applied too early. Fix a similar counting bug on the io resource side. Link: https://lore.kernel.org/r/20241216175632.4175-6-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18	PCI: Allow relaxed bridge window tail sizing for optional resources	Ilpo Järvinen
	Commit 566f1dd52816 ("PCI: Relax bridge window tail sizing rules") relaxed the bridge window requirements for non-optional size (size0) but pbus_size_mem() also handles optional sizes (IOV resources) using size1. This can manifest, e.g., as a failure to resize a BAR back to its original size after it was first shrunk when device has a VF BAR resource because the bridge window (size1) is enlarged beyond what is strictly required to fit the downstream resources. Allow using relaxed bridge window tail sizing rules also with the optional resources (size1) so that the remove/realloc cycle during BAR resize (smaller and back to the original size) does not fail unexpectedly due to increase in bridge window size demand. Also move add_align calculation to more logical place next to size1 assignment as they are strongly related to each other. Link: https://lore.kernel.org/r/20241216175632.4175-5-ilpo.jarvinen@linux.intel.com Fixes: 566f1dd52816 ("PCI: Relax bridge window tail sizing rules") Reported-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com>