Age | Commit message (Collapse) | Author |
|
The PCI hotplug core keeps a list of all registered slots. Its sole
purpose is to WARN() on slot removal if another slot is using the same
name.
But this can never happen because already on slot creation, an error is
returned and multiple messages are emitted if a slot's name is
duplicated:
pci_hp_register()
__pci_hp_register()
__pci_hp_initialize()
pci_create_slot()
kobject_init_and_add()
kobject_add_varg()
kobject_add_internal()
create_dir()
sysfs_create_dir_ns()
kernfs_create_dir_ns()
sysfs_warn_dup()
pr_warn("cannot create duplicate filename ...")
pr_err("%s failed for %s with -EEXIST, ...");
Drop the superfluous list.
Link: https://lore.kernel.org/r/603735bc50eb370bc7f1c358441ac671360bab25.1740501868.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The test shell script "set_pcie_speed.sh" is not installed in INSTALL_PATH.
Attempting to execute set_pcie_cooling_state.sh shows warning:
./set_pcie_cooling_state.sh: line 119: ./set_pcie_speed.sh: No such file or directory
Add "set_pcie_speed.sh" to TEST_PROGS.
Link: https://lore.kernel.org/r/Z8FfK8rN30lKzvVV@ly-workstation
Fixes: 838f12c3d551 ("selftests/pcie_bwctrl: Create selftests")
Signed-off-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Log pci_dbg() messages about the reset methods we attempt and any errors
(-ENOTTY means "try the next method").
Set CONFIG_DYNAMIC_DEBUG=y and enable by booting with
dyndbg="file drivers/pci/* +p" or enable at runtime:
# echo "file drivers/pci/* +p" > /sys/kernel/debug/dynamic_debug/control
Link: https://lore.kernel.org/r/20250303204220.197172-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
|
|
Make the cast to the irq_hw_number_t type for the parameter passed to
irq_domain_set_info() function explicit.
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-9-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
The hardware has been updated with two changes to the MDIO packet
format.
The CMD field used to be 12 bits and now is only 1 bit. This change
is backwards compatible because the field's starting bit position is
unchanged, and the only commands we've used have values 0 and 1.
The PORT field's width has been changed from 4 bits to 5 bits. When
written, the new bit is not contiguous with the other four. However,
this change is backwards compatible because the driver never used
anything other than 0 for the port field's value.
Thus, update the existing code to handle new changes to the hardware
in a backwards-compatible manner.
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-8-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
The constants EXT_CFG_DATA and EXT_CFG_INDEX vary by SoC, where one of
the map_bus methods used these constants, and the other used a different
set of constants.
Thankfully, there was no problem because the SoCs that used the latter
map_bus method all had the same register constants.
Thus, remove redundant constants and adjust the code to use the correct
constants accordingly.
While at it, update the value of EXT_CFG_DATA to use the 4k-page based
configuration space access system, which is what the second map_bus
method was already using.
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20250214173944.47506-7-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
The platform supports enabling and disabling regulators only on
ports below the Root Complex.
Thus, we need to verify this both when adding and removing the bus,
otherwise regulators may be disabled prematurely when a bus further
down the topology is removed.
Fixes: 9e6be018b263 ("PCI: brcmstb: Enable child bus device regulators from DT")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-6-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
If the regulator_bulk_get() returns an error and no regulators
are created, we need to set their number to zero.
If we don't do this and the PCIe link up fails, a call to the
regulator_bulk_free() will result in a kernel panic.
While at it, print the error value, as we cannot return an error
upwards as the kernel will WARN() on an error from add_bus().
Fixes: 9e6be018b263 ("PCI: brcmstb: Enable child bus device regulators from DT")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20250214173944.47506-5-james.quinlan@broadcom.com
[kwilczynski: commit log, use comma in the message to match style with
other similar messages]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
When setting the LNKCAP and LNKCTL2 register fields, it was assumed
that the field started at the LSB of the register.
Although the masks do indeed start at the LSB, and this will probably
not change, it is prudent to use a method that makes no assumption
about the mask's placement in the register.
Thus, use the u{16,32}p_replace_bits() helpers since they are already
wildly used in this driver.
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-4-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
The driver has been mistakenly writing to a read-only (RO)
configuration space register (PCI_EXP_LNKCAP) to change the
PCIe link capability.
Although harmless in this case, the proper write destination
is an internal register that is reflected by PCI_EXP_LNKCAP.
Thus, fix the brcm_pcie_set_gen() function to correctly update
the link capability.
Fixes: c0452137034b ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-3-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
When the user elects to limit the PCIe generation via the appropriate
devicetree property, apply the settings before the PCIe link up, not
after.
Fixes: c0452137034b ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-2-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Add a bare minimum amount of changes in order to support PCIe Root
Complex hardware IP found on RPi5. The PCIe controller on BCM2712
is based on BCM7712 and as such it inherits register offsets, PERST#
assertion, bridge_reset ops, and inbound windows count.
Although, the implementation for BCM2712 needs a workaround related to
the control of the bridge_reset where turning off of the Root Port must
not shutdown the bridge_reset and this must be avoided. To implement
this workaround a quirks field is introduced in pcie_cfg_data struct.
The controller also needs adjustment of PHY PLL setup to use a 54MHz
input refclk. The default input reference clock for the PHY PLL is
100Mhz, except for some devices where it is 54Mhz like BCM2712C1 and
BCM2712D0.
To implement those adjustments introduce a new .post_setup op in
pcie_cfg_data and call it at the end of brcm_pcie_setup function.
The BCM2712 .post_setup callback implements the required MDIO writes
that switch the PLL refclk and also change PHY PM clock period.
Without this RPi5 PCIex1 is unable to enumerate endpoint devices on
the expansion connector.
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-8-svarbanov@suse.de
[commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Per the Cadence's "PCIe Controller IP for AX14" user guide, Version
1.04, Section 9.1.7.1, "AXI Subordinate to PCIe Address Translation
Registers", Table 9.4, the bit 16 of the AXI Subordinate Address
(axi_s_awaddr) when set corresponds to MSG with data, and when not set,
to MSG without data.
However, the driver is currently doing the opposite and due to this,
the INTx is never received on the host.
So, fix the driver to reflect the documentation and also make INTx work.
Fixes: 37dddf14f1ae ("PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller")
Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Hans Zhang <hans.zhang@cixtech.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214165724.184599-1-18255117159@163.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Add AMD Versal2 MDB (Multimedia DMA Bridge) PCIe Root Port Bridge.
Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250228093351.923615-3-thippeswamy.havalige@amd.com
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Add support for MDB System Level Control and Status Register (SLCR)
aperture that is only supported for AMD Versal2 devices.
Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250228093351.923615-2-thippeswamy.havalige@amd.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
PCI devices device tree nodes can be already created. This was introduced
by commit 407d1a51921e ("PCI: Create device tree node for bridge").
In order to have device tree nodes related to PCI devices attached on their
PCI root bus (the PCI bus handled by the PCI host bridge), a PCI root bus
device tree node is needed. This root bus node will be used as the parent
node of the first level devices scanned on the bus. On device tree based
systems, this PCI root bus device tree node is set to the node of the
related PCI host bridge. The PCI host bridge node is available in the
device tree used to describe the hardware passed at boot.
On non device tree based system (such as ACPI), a device tree node for the
PCI host bridge or for the root bus does not exist. Indeed, the PCI host
bridge is not described in a device tree used at boot simply because no
device tree is passed at boot.
The device tree PCI host bridge node creation needs to be done at runtime.
This is done in the same way as for the creation of the PCI device nodes.
I.e. node and properties are created based on computed information done by
the PCI core. Also, as is done on device tree based systems, this PCI host
bridge node is used for the PCI root bus.
With this done, hardware available in a PCI device that doesn't follow the
PCI model consisting in one PCI function handled by one driver can be
described by a device tree overlay loaded by the PCI device driver on non
device tree based systems. Those PCI devices provide a single PCI function
that includes several functionalities that require different drivers. The
device tree overlay describes the internal devices and their relationships.
It allows to load drivers needed by those different devices in order to
have functionalities handled.
Link: https://lore.kernel.org/r/20250224141356.36325-6-herve.codina@bootlin.com
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The res parameter has no reason to be a pointer to an un-const struct
resource. Indeed, struct resource is not supposed to be modified by the
function.
Constify the res parameter.
Link: https://lore.kernel.org/r/20250224141356.36325-5-herve.codina@bootlin.com
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The pdev (pointer to a struct pci_dev) parameter of of_pci_set_address()
cannot be NULL.
In order to use of_pci_set_address() when creating the PCI root bus node,
it needs to support a NULL pdev parameter. Indeed, in the case of the PCI
root bus node creation, no pdev is available and of_pci_set_address() will
be used with the bridge windows.
Allow to call of_pci_set_address() with a NULL pdev.
Link: https://lore.kernel.org/r/20250224141356.36325-4-herve.codina@bootlin.com
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The commit 407d1a51921e ("PCI: Create device tree node for bridge")
creates of_node for PCI devices. The newly created of_node is attached
to an existing device. This is done setting directly pdev->dev.of_node
in the code.
Even if pdev->dev.of_node cannot be previously set, this doesn't handle
the fwnode field of the struct device. Indeed, this field needs to be
set if it hasn't already been set.
device_{add,remove}_of_node() have been introduced to handle this case.
Use them instead of the direct setting.
Link: https://lore.kernel.org/r/20250224141356.36325-3-herve.codina@bootlin.com
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
|
|
An of_node can be set to a device using device_set_node(), which does not
prevent any of_node and/or fwnode overwrites.
When adding an of_node on an already present device, the following
operations need to be done:
- Attach the of_node only if no of_node is already attached
- Attach the of_node as a fwnode if no fwnode were already attached
This is the purpose of device_add_of_node(). device_remove_of_node()
reverts the operations done by device_add_of_node().
Link: https://lore.kernel.org/r/20250224141356.36325-2-herve.codina@bootlin.com
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The BCM2712 memory map can support up to 64GB of system memory, thus
expand the inbound window size in calculation helper function.
The change is safe for the currently supported SoCs that have smaller
inbound window sizes.
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-7-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Instead of copying fields from the pcie_cfg_data structure to
brcm_pcie, reference it directly.
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelil <florian.fainelli@broadcom.com>
Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-6-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Then the brcmstb PCIe driver and MIP MSI-X interrupt controller
drivers are built as modules there could be a race in probing.
To avoid this, add a softdep to MIP driver to guarantee that
MIP driver will be load first.
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-5-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Add an interrupt controller driver for MSI-X Interrupt Peripheral (MIP)
hardware block found in BCM2712. The interrupt controller is used to
handle MSI-X interrupts from peripherials behind PCIe endpoints like
RPi1 south bridge found in RPi5.
There are two MIPs on BCM2712, the first has 64 consecutive SPIs
assigned to 64 output vectors, and the second has 17 SPIs, but only
8 of them are consecutive starting at the 8th output vector.
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-4-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Update PCIe controller bindings with BCM2712 support.
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-3-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Add bindings for BCM2712 MSI-X interrupt peripheral controller.
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-2-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
A call to of_parse_phandle() is incrementing the refcount, and as such,
the of_node_put() must be called when the reference is no longer needed.
Thus, refactor the existing code and add a missing of_node_put() call
following the check to ensure that "msi_np" matches "pcie->np" and after
MSI initialization, but only if the MSI support is enabled system-wide.
Cc: stable@vger.kernel.org # v5.10+
Fixes: 40ca1bf580ef ("PCI: brcmstb: Add MSI support")
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250122222955.1752778-1-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
The .get_power() and .set_power() function pointers in struct
cpci_hp_controller_ops were declared but never implemented by any
driver.
Thus, to improve code readability and reduce resource usage,
remove these pointers and the code that has never been used.
Link: https://lore.kernel.org/r/20250217185638.398925-1-trintaeoitogc@gmail.com
Signed-off-by: Guilherme Giacomo Simoes <trintaeoitogc@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Flit mode introduced in PCIe r6.0 alters how the TLP Header Log is
presented through AER and DPC Capability registers. The TLP Prefix Log
Register is not present with Flit mode, and the register becomes an
extension of the TLP Header Log (PCIe r6.1 secs 7.8.4.12 & 7.9.14.13).
Adapt pcie_read_tlp_log() and struct pcie_tlp_log to read and store the
extended TLP Header Log when the Link is in Flit mode. As the Prefix Log
and Extended TLP Header are not present at the same time, a C union can be
used.
Determining whether the error occurred while the Link was in Flit mode is a
bit complicated. In case of AER, the Advanced Error Capabilities and
Control Register directly tells whether the error was logged in Flit mode
or not (PCIe r6.1 sec 7.8.4.7). The DPC Capability (PCIe r6.1 sec 7.9.14),
unfortunately, does not contain the same information.
Unlike AER, the DPC Capability does not provide a way to discern whether
the error was logged in Flit mode (this is confirmed by PCI WG to be an
oversight in the spec). DPC will bring the Link down immediately following
an error, which makes it impossible to acquire the Flit Mode Status
directly from the Link Status 2 register because Flit Mode Status is only
set in certain Link states (PCIe r6.1 sec 7.5.3.20). As a workaround, use
the flit_mode value stored into the struct pci_bus.
Link: https://lore.kernel.org/r/20250207161836.2755-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
PCIe r6.0 added Flit mode, which mainly alters HW behavior, but there are
some OS visible changes. The OS visible changes include differences in the
layout of some capabilities and interpretation of the TLP headers (in
diagnostics situations).
To be able to determine which mode the PCIe Link is using, store the Flit
Mode Status (PCIe r6.1 sec 7.5.3.20) information in addition to the Link
speed into struct pci_bus in pcie_update_link_speed().
Link: https://lore.kernel.org/r/20250207161836.2755-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: use unsigned int:1 instead of bool, update flit_mode setting]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
include/linux/pci.h provides low-level pci_printk() interface that is
only used by AER because it needs to print the same message with
different levels depending on the error severity. No other PCI code
uses that functionality and calls pci_<level>() logging functions
directly with the appropriate level.
Descope pci_printk() into AER as aer_printk().
Link: https://lore.kernel.org/r/20241216161012.1774-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: retain pci_printk() for now since shpchp still uses it and I
moved those patches to a different branch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Commit 47c8846a49ba ("PCI: Extend ACS configurability") introduced bugs
that fail to configure ACS ctrl to the value specified by the kernel
parameter. Essentially there are two bugs:
1) When ACS is configured for multiple PCI devices using 'config_acs'
kernel parameter, it results into error "PCI: Can't parse ACS command
line parameter". This is due to a bug that doesn't preserve the ACS
mask, but instead overwrites the mask with value 0.
For example, using 'config_acs' to configure ACS ctrl for multiple BDFs
fails:
Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
PCI: Can't parse ACS command line parameter
pci 0020:02:00.0: ACS mask = 0x007f
pci 0020:02:00.0: ACS flags = 0x007b
pci 0020:02:00.0: Configured ACS to 0x007b
After this fix:
Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
pci 0020:02:00.0: ACS mask = 0x007f
pci 0020:02:00.0: ACS flags = 0x007b
pci 0020:02:00.0: ACS control = 0x005f
pci 0020:02:00.0: ACS fw_ctrl = 0x0053
pci 0020:02:00.0: Configured ACS to 0x007b
pci 0039:00:00.0: ACS mask = 0x0070
pci 0039:00:00.0: ACS flags = 0x0050
pci 0039:00:00.0: ACS control = 0x001d
pci 0039:00:00.0: ACS fw_ctrl = 0x0000
pci 0039:00:00.0: Configured ACS to 0x0050
2) In the bit manipulation logic, we copy the bit from the firmware
settings when mask bit 0.
For example, 'disable_acs_redir' fails to clear all three ACS P2P redir
bits due to the wrong bit fiddling:
Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
pci 0020:02:00.0: ACS mask = 0x002c
pci 0020:02:00.0: ACS flags = 0xffd3
pci 0020:02:00.0: Configured ACS to 0xfffb
pci 0030:02:00.0: ACS mask = 0x002c
pci 0030:02:00.0: ACS flags = 0xffd3
pci 0030:02:00.0: Configured ACS to 0xffdf
pci 0039:00:00.0: ACS mask = 0x002c
pci 0039:00:00.0: ACS flags = 0xffd3
pci 0039:00:00.0: Configured ACS to 0xffd3
After this fix:
Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
pci 0020:02:00.0: ACS mask = 0x002c
pci 0020:02:00.0: ACS flags = 0xffd3
pci 0020:02:00.0: ACS control = 0x007f
pci 0020:02:00.0: ACS fw_ctrl = 0x007b
pci 0020:02:00.0: Configured ACS to 0x0053
pci 0030:02:00.0: ACS mask = 0x002c
pci 0030:02:00.0: ACS flags = 0xffd3
pci 0030:02:00.0: ACS control = 0x005f
pci 0030:02:00.0: ACS fw_ctrl = 0x005f
pci 0030:02:00.0: Configured ACS to 0x0053
pci 0039:00:00.0: ACS mask = 0x002c
pci 0039:00:00.0: ACS flags = 0xffd3
pci 0039:00:00.0: ACS control = 0x001d
pci 0039:00:00.0: ACS fw_ctrl = 0x0000
pci 0039:00:00.0: Configured ACS to 0x0000
Link: https://lore.kernel.org/r/20250207030338.456887-1-tdave@nvidia.com
Fixes: 47c8846a49ba ("PCI: Extend ACS configurability")
Signed-off-by: Tushar Dave <tdave@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
|
|
Update device ID for the Qcom SA8775P SoC.
Signed-off-by: Mrinmay Sarkar <quic_msarkar@quicinc.com>
Link: https://lore.kernel.org/r/20241205065422.2515086-3-quic_msarkar@quicinc.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
This driver is used to control the power state of the devices attached to
the PCI slots. Currently, it controls the voltage rails of the PCI slots
defined in the devicetree node of the root port.
The voltage rails for PCI slots are documented in the DT-schema:
https://github.com/devicetree-org/dt-schema/blob/v2024.11/dtschema/schemas/pci/pci-bus-common.yaml#L153
Since this driver has to work with different kind of slots (PCIe
x1/x4/x8/x16, Mini PCIe, PCI, etc.), the driver is thus using the
of_regulator_bulk_get_all() API to obtain the voltage regulators defined
in the DT node, instead of hardcoding them.
As such, the DT node of the root port should define the relevant supply
properties corresponding to the voltage rails of the PCI slot.
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-5-827473c8fbf4@linaro.org
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
The "pciclass" is an existing prefix used to identify the PCI bridge
devices, but it is not a vendor prefix. So document it in the non-vendor
prefix list.
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-4-827473c8fbf4@linaro.org
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
The pwrctrl core will rescan the bus once the device is powered on. So
there is no need to continue scanning for the device further.
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-3-827473c8fbf4@linaro.org
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
The PCI core will try to access the devices even after pci_stop_dev()
for things like Data Object Exchange (DOE), ASPM, etc.
So, move pci_pwrctrl_unregister() to the near end of pci_destroy_dev()
to make sure that the devices are powered down only after the PCI core
is done with them.
Suggested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-2-827473c8fbf4@linaro.org
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Current way of creating pwrctrl devices requires iterating through the
child devicetree nodes of the PCI bridge in pci_pwrctrl_create_devices().
Even though it works, it creates confusion as there is no symmetry between
this and pci_pwrctrl_unregister() function that removes the pwrctrl
devices.
So to make these two functions symmetric, move the creation of pwrctrl
devices to pci_scan_device(). During the scan of each device in a slot,
the devicetree node (if exists) for the PCI device will be checked. If it
has the supplies populated, then the pwrctrl device will be created.
Since the PCI device scan happens so early, there would be no "struct
pci_dev" available for the device. So the host bridge is used as the
parent of all pwrctrl devices.
One nice side effect of this move is that, it is now possible to have
pwrctrl devices for PCI bridges as well (to control the supplies of PCI
slots).
Suggested-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-1-827473c8fbf4@linaro.org
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Before 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to
avoid use-after-free"), we would free the ASPM link only after the last
function on the bus pertaining to the given link was removed.
That was too late. If function 0 is removed before sibling function,
link->downstream would point to free'd memory after.
After above change, we freed the ASPM parent link state upon any function
removal on the bus pertaining to a given link.
That is too early. If the link is to a PCIe switch with MFD on the upstream
port, then removing functions other than 0 first would free a link which
still remains parent_link to the remaining downstream ports.
The resulting GPFs are especially frequent during hot-unplug, because
pciehp removes devices on the link bus in reverse order.
On that switch, function 0 is the virtual P2P bridge to the internal bus.
Free exactly when function 0 is removed -- before the parent link is
obsolete, but after all subordinate links are gone.
Link: https://lore.kernel.org/r/e12898835f25234561c9d7de4435590d957b85d9.1734924854.git.dns@arista.com
Fixes: 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free")
Signed-off-by: Daniel Stodden <dns@arista.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
The "shpchp_debug" module parameter is used to enable debug logging. The
generic ability to turn on/off debug prints dynamically covers this use
case already so there is no need for module specific debug handling. The
ctrl_dbg() wrapper also uses a low-level pci_printk() despite always using
KERN_DEBUG level.
Remove "shpchp_debug" parameter and convert ctrl_dbg() to use pci_dbg().
From now on, shpchp can be debugged using the normal dynamic debugger by
setting CONFIG_DYNAMIC_DEBUG=y and then either adding to kernel cmdline:
dyndbg="file drivers/pci/hotplug/shpchp* +p"
or using this command on a running kernel:
echo 'file drivers/pci/hotplug/shpchp* +p' > /sys/kernel/debug/dynamic_debug/control
Link: https://lore.kernel.org/r/20250217095550.2789-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The shpchp hotplug driver defines logging wrapper with generic names which
are just duplicates of existing generic printk() wrappers. They are also
unused so remove them.
Link: https://lore.kernel.org/r/20250217095550.2789-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Convert the last user of dbg() to use ctrl_dbg().
Link: https://lore.kernel.org/r/20241216161012.1774-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The logging in shpchp module init/exit functions is not very useful.
Remove it.
Link: https://lore.kernel.org/r/20241216161012.1774-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Remove and rescan cycle can result in failure to assign a bridge window if
it becomes larger than before the remove. The bridge window size will
include space for disabled Expansion ROM, which can causes the bridge
window to not fit anymore into the same address space slot on rescan if the
Expansion ROM resource was not assigned before the remove. In addition, the
optional resource handling is not internally consistent.
The resource fitting logic supports three main types of optional resources:
- IOV BARs
- Expansion ROMs
- Bridge window size variation due to optional resources
In addition to the above, resizable BARs beyond their current size will
require handling optional variation in resource sizes within the resource
fitting algorithm (not yet done by the resource fitting code).
There are multiple inconsistencies related to optional resource handling:
a) The allocation failure of disabled expansion ROM requires special case
inside assign_requested_resources_sorted().
b) The optionality of disabled expansion ROM is not considered during
bridge window sizing in pbus_size_mem().
c) Setting resource size to zero for optional resource in pbus_size_mem()
is problematic because it makes also the alignment invalid, which is
checked by pdev_sort_resources().
Optional IOV resources have their size set to zero by pbus_size_mem()
but the information about size is stored externally in struct pci_sriov
and complex call-chain trickery in pci_resource_alignment() ensures IOV
resources return a valid alignment despite having zero resource size. A
solution that is specific to IOV resources makes it hard to use the same
solution for other types of resources such as expansion ROM.
Simply changing pbus_size_mem() is not sufficient to fully address the main
issue because it would introduce disparity between bridge window sizing and
resource allocation. Due to size-based ordering of the resource list during
assignment loop, an Expansion ROM resource could steal space from some
other resource and make the other resource not fit if the Expansion ROM is
larger than the other resource. Thus, the resource assignment functions
need to be changed as well.
Make optional resource handling more straightforward. Use
pci_resource_is_optional() to determine if a resource is optional in both
bridge window sizing and assignment failure classification to ensure they
always align. Indicate with a parameter to
assign_requested_resources_sorted() whether it should attempt to allocate
optional resources or not.
Always try first to assign all resources (also when realloc_head is not
provided). This is required for calls from
pci_assign_unassigned_root_bus_resources() that provide realloc_head only
with some of its iterations.
Non-bridge-window optional resources in realloc_head now have add_size 0.
This condition has to be detected in reassign_resources_sorted() before
reassigning them (which would fail as there is no size change). Removing
add_size=0 optional resources entirely from realloc_head might eventually
be doable but further rework in __assign_resources_sorted() is needed first
to support such a change.
Link: https://lore.kernel.org/r/20241216175632.4175-26-ilpo.jarvinen@linux.intel.com
Reported-by: Jia Yao <jia.yao@intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219547
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jia Yao <jia.yao@intel.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Resetting a resource is problematic as it prevents attempting to allocate
the resource later, unless something in between restores the resource.
Similarly, if fail_head does not contain all resources that were reset,
those resources cannot be restored later.
The entire reset/restore cycle adds complexity and leaving resources in the
reset state causes issues to other code such as for checks done in
pci_enable_resources(). Take a small step towards not resetting resources
by delaying reset until the end of resource assignment and build failure
list (fail_head) in sync with the reset to avoid leaving behind resources
that cannot be restored (for the case where the caller provides fail_head
in the first place to allow restore somewhere in the callchain, as is not
all callers pass non-NULL fail_head).
Leave the Expansion ROM check temporarily in place while building the
failure list until an upcoming change that reworks optional resource
handling.
Ideally, whole resource reset could be removed but doing that in one step
would be non-tractable due to complexity of all related code.
Link: https://lore.kernel.org/r/20241216175632.4175-25-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
reassign_resources_sorted() uses resource_size() to select between
pci_assign_resource() and pci_reassign_resource(). Due to twisted way
bridge window sizing in pbus_size_mem() sets resource sizes to 0, it works
to match into IOV resources but that is going to be changed by an upcoming
change.
Replace resource_size() check with res->parent check that is the true
dividing line in between whether assign or reassign function should be used
for the resource.
Link: https://lore.kernel.org/r/20241216175632.4175-24-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
PCI resource fitting is somewhat hard to track because it performs many
actions without logging them. In the case inside
__assign_resources_sorted(), the resources are released before resource
assignment is going to be retried in a different order. That is just one
level of retries the resource fitting performs overall so tracking it
through repeated assignments or failures of a resource gets messy rather
quickly.
Simply announce the release explicitly using pci_dbg() so it is clear what
is going on with each resource.
Link: https://lore.kernel.org/r/20241216175632.4175-23-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Add pci_dbg() to note that an assignment failure was for an optional
resource and reword existing message about resource resize to say the
change was optional.
Link: https://lore.kernel.org/r/20241216175632.4175-22-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Add a dummy list to always have a non-NULL realloc head in
__assign_resources_sorted() as it allows only checking list_empty().
In future, it would be good to ensure all callers provide a valid
realloc_head but that is relatively complex to do in practice and not
necessary for the subsequent optional resource handling fix.
Link: https://lore.kernel.org/r/20241216175632.4175-21-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
pci_enable_resources() checks if device's io and mem resources are all
assigned and disallows enable if any resource failed to assign (*) but
makes an exception for the case of disabled extension ROM. There are other
optional resources, however.
Add pci_resource_is_optional() and use it instead of
pci_resource_is_disabled_rom() to cover also IOV resources that are also
optional as per pbus_size_mem().
As there will be more users of pci_resource_is_optional() inside
setup-bus.c in changes coming up after this one, the function is placed
there.
(*) In practice, resource fitting code calls reset_resource() for any
resource it fails to assign which clears resource's ->flags causing
pci_enable_resources() to never detect failed resource assignments.
This seems undesirable internal logic inconsistency, effectively
reset_resource() prevents pci_enable_resources() from functioning as
intended. This is one step of many that will be needed towards removing
reset_resource().
Link: https://lore.kernel.org/r/20241216175632.4175-20-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|