Age | Commit message | Author |
|
- Remove unused tools 'pci' build target left over after moving tests to
tools/testing/selftests/pci_endpoint (Jianfeng Liu)
- Fix typos and whitespace errors (Bjorn Helgaas)
* pci/misc:
PCI: Fix typos
tools/Makefile: Remove pci target
# Conflicts:
# drivers/pci/endpoint/functions/pci-epf-test.c
|
|
pci_setup_bridge() is only used within setup-bus.c. Therefore, make it a
static function.
Link: https://lore.kernel.org/r/20250311174701.3586-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Fix typos and whitespace errors.
Link: https://lore.kernel.org/r/20250307231715.438518-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Per PCIe r6.0, sec 7.8.6.2, devices can advertise Resizable BAR sizes up to
128 TB in the Resizable BAR Capability register. Larger sizes can be
advertised via the Resizable BAR Control register, but that requires an API
change.
Update pci_rebar_get_possible_sizes() and pbus_size_mem() to increase the
sizes we currently support from 512 GB to 128 TB.
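As a rough illustration of the size encoding (a sketch, not the kernel
code): each set bit in the size field of the Capability register stands for
one supported BAR size, the lowest size bit meaning 1 MB and each following
bit doubling it. The helper names and mask macros below are made up for this
example; the two masks are simply derived from the 512 GB and 128 TB limits
stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical masks for the size field: bits 4..23 (old), bits 4..31 (new). */
#define SKETCH_REBAR_SIZES_OLD 0x00fffff0u
#define SKETCH_REBAR_SIZES_NEW 0xfffffff0u

/* Size index 0 means 1 MB; every further index doubles the size. */
static uint64_t sketch_rebar_index_to_bytes(unsigned int index)
{
	assert(index <= 27);	/* 28 size bits fit in bits 4..31 */
	return 1ULL << (index + 20);
}

/* Largest size expressible by a size-field mask (the mask must be non-zero). */
static uint64_t sketch_rebar_max_bytes(uint32_t sizes_mask)
{
	uint32_t bits = sizes_mask >> 4;	/* drop the four low non-size bits */
	unsigned int top = 0;

	assert(bits != 0);
	while (bits >> (top + 1))
		top++;
	return sketch_rebar_index_to_bytes(top);
}
```

With these assumptions the old mask tops out at 2^39 bytes (512 GB) and the
new one at 2^47 bytes (128 TB).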
Link: https://lore.kernel.org/r/20250307053535.44918-1-daizhiyuan@phytium.com.cn
Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
A remove and rescan cycle can result in failure to assign a bridge window if
it becomes larger than before the remove. The bridge window size will
include space for a disabled Expansion ROM, which can cause the bridge
window to not fit anymore into the same address space slot on rescan if the
Expansion ROM resource was not assigned before the remove. In addition, the
optional resource handling is not internally consistent.
The resource fitting logic supports three main types of optional resources:
- IOV BARs
- Expansion ROMs
- Bridge window size variation due to optional resources
In addition to the above, resizable BARs beyond their current size will
require handling optional variation in resource sizes within the resource
fitting algorithm (not yet done by the resource fitting code).
There are multiple inconsistencies related to optional resource handling:
a) The allocation failure of a disabled Expansion ROM requires a special
case inside assign_requested_resources_sorted().
b) The optionality of a disabled Expansion ROM is not considered during
bridge window sizing in pbus_size_mem().
c) Setting the resource size to zero for an optional resource in
pbus_size_mem() is problematic because it also makes the alignment
invalid, which is checked by pdev_sort_resources().
Optional IOV resources have their size set to zero by pbus_size_mem()
but the information about size is stored externally in struct pci_sriov
and complex call-chain trickery in pci_resource_alignment() ensures IOV
resources return a valid alignment despite having zero resource size. A
solution that is specific to IOV resources makes it hard to use the same
solution for other types of resources such as expansion ROM.
Simply changing pbus_size_mem() is not sufficient to fully address the main
issue because it would introduce disparity between bridge window sizing and
resource allocation. Due to size-based ordering of the resource list during
assignment loop, an Expansion ROM resource could steal space from some
other resource and make the other resource not fit if the Expansion ROM is
larger than the other resource. Thus, the resource assignment functions
need to be changed as well.
Make optional resource handling more straightforward. Use
pci_resource_is_optional() to determine if a resource is optional in both
bridge window sizing and assignment failure classification to ensure they
always align. Indicate with a parameter to
assign_requested_resources_sorted() whether it should attempt to allocate
optional resources or not.
Always try first to assign all resources (also when realloc_head is not
provided). This is required for calls from
pci_assign_unassigned_root_bus_resources() that provide realloc_head only
with some of its iterations.
Non-bridge-window optional resources in realloc_head now have add_size 0.
This condition has to be detected in reassign_resources_sorted() before
reassigning them (which would fail as there is no size change). Removing
add_size=0 optional resources entirely from realloc_head might eventually
be doable but further rework in __assign_resources_sorted() is needed first
to support such a change.
Link: https://lore.kernel.org/r/20241216175632.4175-26-ilpo.jarvinen@linux.intel.com
Reported-by: Jia Yao <jia.yao@intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219547
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jia Yao <jia.yao@intel.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Resetting a resource is problematic as it prevents attempting to allocate
the resource later, unless something in between restores the resource.
Similarly, if fail_head does not contain all resources that were reset,
those resources cannot be restored later.
The entire reset/restore cycle adds complexity and leaving resources in the
reset state causes issues for other code, such as the checks done in
pci_enable_resources(). Take a small step towards not resetting resources
by delaying the reset until the end of resource assignment and building the
failure list (fail_head) in sync with the reset to avoid leaving behind
resources that cannot be restored (for the case where the caller provides
fail_head in the first place to allow restore somewhere in the callchain,
as not all callers pass a non-NULL fail_head).
Leave the Expansion ROM check temporarily in place while building the
failure list until an upcoming change that reworks optional resource
handling.
Ideally, the whole resource reset could be removed but doing that in one
step would be intractable due to the complexity of all the related code.
Link: https://lore.kernel.org/r/20241216175632.4175-25-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
reassign_resources_sorted() uses resource_size() to select between
pci_assign_resource() and pci_reassign_resource(). Due to the twisted way
bridge window sizing in pbus_size_mem() sets resource sizes to 0, the check
happens to match IOV resources, but that is going to be changed by an
upcoming change.
Replace the resource_size() check with a res->parent check, which is the
true dividing line between whether the assign or the reassign function
should be used for the resource.
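A minimal sketch of the selection, assuming (this is a reading of the text
above, not a copy of the kernel code) that a resource without a parent has
never been placed in the resource tree and therefore needs a fresh
assignment, while a resource with a parent is reassigned. The struct and
names are stand-ins invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a resource; only ->parent matters for the choice. */
struct sketch_resource {
	unsigned long long start;
	unsigned long long end;
	struct sketch_resource *parent;	/* non-NULL once placed in the tree */
};

enum sketch_action { SKETCH_ASSIGN, SKETCH_REASSIGN };

/* Pick the operation from tree membership, not from the resource size. */
static enum sketch_action sketch_pick_action(const struct sketch_resource *res)
{
	assert(res != NULL);
	return res->parent ? SKETCH_REASSIGN : SKETCH_ASSIGN;
}
```

The point of the design is that the size of a resource says nothing
reliable about whether it already sits in the tree, whereas ->parent does.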
Link: https://lore.kernel.org/r/20241216175632.4175-24-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
PCI resource fitting is somewhat hard to track because it performs many
actions without logging them. In the case inside
__assign_resources_sorted(), the resources are released before resource
assignment is going to be retried in a different order. That is just one
level of retries the resource fitting performs overall so tracking it
through repeated assignments or failures of a resource gets messy rather
quickly.
Simply announce the release explicitly using pci_dbg() so it is clear what
is going on with each resource.
Link: https://lore.kernel.org/r/20241216175632.4175-23-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Add pci_dbg() to note that an assignment failure was for an optional
resource and reword existing message about resource resize to say the
change was optional.
Link: https://lore.kernel.org/r/20241216175632.4175-22-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Add a dummy list to always have a non-NULL realloc head in
__assign_resources_sorted() as it allows only checking list_empty().
In future, it would be good to ensure all callers provide a valid
realloc_head but that is relatively complex to do in practice and not
necessary for the subsequent optional resource handling fix.
Link: https://lore.kernel.org/r/20241216175632.4175-21-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
pci_enable_resources() checks if a device's io and mem resources are all
assigned and disallows enable if any resource failed to assign (*) but
makes an exception for the case of a disabled Expansion ROM. There are
other optional resources, however.
Add pci_resource_is_optional() and use it instead of
pci_resource_is_disabled_rom() to also cover IOV resources, which are
optional as per pbus_size_mem().
As there will be more users of pci_resource_is_optional() inside
setup-bus.c in changes coming up after this one, the function is placed
there.
(*) In practice, the resource fitting code calls reset_resource() for any
resource it fails to assign, which clears the resource's ->flags, causing
pci_enable_resources() to never detect failed resource assignments.
This seems an undesirable internal logic inconsistency; effectively,
reset_resource() prevents pci_enable_resources() from functioning as
intended. This is one step of many that will be needed towards removing
reset_resource().
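A sketch of what such a predicate could look like, based only on the two
optional cases named above (IOV resources and a disabled Expansion ROM).
The index values, the flag bit and the struct are illustrative assumptions,
not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the kernel defines its own resource numbering. */
enum {
	SKETCH_ROM_RESOURCE = 6,	/* index of the Expansion ROM */
	SKETCH_IOV_FIRST = 7,		/* first IOV (VF BAR) index */
	SKETCH_IOV_LAST = 12,		/* last IOV (VF BAR) index */
	SKETCH_NUM_RESOURCES = 13,
};
#define SKETCH_ROM_ENABLE 0x1ul	/* stand-in for the "ROM enabled" flag */

struct sketch_dev {
	unsigned long flags[SKETCH_NUM_RESOURCES];	/* per-resource flags */
};

static bool sketch_resource_is_optional(const struct sketch_dev *dev, int resno)
{
	assert(resno >= 0 && resno < SKETCH_NUM_RESOURCES);

	/* IOV resources are optional. */
	if (resno >= SKETCH_IOV_FIRST && resno <= SKETCH_IOV_LAST)
		return true;
	/* An Expansion ROM is optional only while it is disabled. */
	if (resno == SKETCH_ROM_RESOURCE &&
	    !(dev->flags[resno] & SKETCH_ROM_ENABLE))
		return true;
	return false;
}
```

Keeping the rule in one predicate lets every caller classify a resource the
same way instead of repeating the ROM and IOV checks.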
Link: https://lore.kernel.org/r/20241216175632.4175-20-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Resource fitting needs to restore the saved dev resources in a few places.
Add a restore_dev_resource() helper for that.
Link: https://lore.kernel.org/r/20241216175632.4175-19-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
pci_assign_unassigned_root_bus_resources() and
pci_assign_unassigned_bridge_resources() have a loop that may perform
several rounds to assign resources. The code to prepare for the next round
is identical.
Consolidate the code that prepares for the next assignment round into
pci_prepare_next_assign_round().
Link: https://lore.kernel.org/r/20241216175632.4175-17-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Rename 'retval' to 'ret' in pci_assign_unassigned_bridge_resources().
Link: https://lore.kernel.org/r/20241216175632.4175-16-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
pci_assign_unassigned_root_bus_resources() and
pci_assign_unassigned_bridge_resources() contain ad hoc loops using
backwards goto and gotos out of the loop. Replace them with while loops
and break statements.
While reindenting the loop bodies, add braces & remove parentheses.
Link: https://lore.kernel.org/r/20241216175632.4175-15-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Reduce level of call nesting by calling pdev_sort_resources() directly
and by moving the tests done inside __dev_sort_resources() into
pdev_resources_assignable() helper.
Link: https://lore.kernel.org/r/20241216175632.4175-14-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
All return paths want to free head list in __assign_resources_sorted(), so
add a label and use goto.
Link: https://lore.kernel.org/r/20241216175632.4175-13-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Many PCI resource allocation related functions process struct
pci_dev_resource items which hold the struct pci_dev and resource pointers.
Reduce the number of lines that need indirection by adding 'dev' and 'res'
local variables to hold the pointers.
Link: https://lore.kernel.org/r/20241216175632.4175-12-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
A few places in PCI code, mainly in setup-bus.c, need to reverse lookup the
index of a resource in pci_dev's resource array. Create pci_resource_num()
helper to avoid repeating the pointer arithmetic trick used to calculate
the index.
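The pointer arithmetic in question can be sketched as follows. The types
are simplified stand-ins invented for the example (the real struct pci_dev
and struct resource carry far more state), and the array size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NUM_RESOURCES 8	/* arbitrary size for the example */

struct sketch_resource {
	unsigned long long start, end;
};

struct sketch_dev {
	struct sketch_resource resource[SKETCH_NUM_RESOURCES];
};

/* Reverse lookup: turn a pointer into dev->resource[] back into its index. */
static int sketch_resource_num(const struct sketch_dev *dev,
			       const struct sketch_resource *res)
{
	ptrdiff_t resno = res - dev->resource;

	/* The pointer must point into this device's array. */
	assert(resno >= 0 && resno < SKETCH_NUM_RESOURCES);
	return (int)resno;
}
```

Subtracting the array base from the element pointer yields the element
index directly, which is why a single helper can replace every open-coded
copy of the calculation.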
Link: https://lore.kernel.org/r/20241216175632.4175-11-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Instead of chaining logic inside if () condition so that multiple lines are
required, make !resource_size() a separate check and use continue.
Link: https://lore.kernel.org/r/20241216175632.4175-10-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
There are multiple places where special handling is required for IOV
resources.
Extract the identification of IOV resources to pci_resource_is_iov() and
drop a few ifdefs.
Link: https://lore.kernel.org/r/20241216175632.4175-9-ilpo.jarvinen@linux.intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
A few sites that could use resource_set_range/size() in setup-bus.c were
not picked up earlier due to them not matching the usual pattern. Convert
them now.
These are more cases similar to 783602c920e9 ("PCI: Use
resource_set_{range,size}() helpers").
Link: https://lore.kernel.org/r/20241216175632.4175-8-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: add 783602c920e9 history]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Convert literals in setup-bus.c to SZ_* defines that make the size more
human readable.
As the code is now self-explanatory, eliminate comments about the size.
Link: https://lore.kernel.org/r/20241216175632.4175-7-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Commit 903534fa7d30 ("PCI: Fix resource double counting on remove &
rescan") fixed double counting of mem resources because of old_size being
applied too early.
Fix a similar counting bug on the io resource side.
Link: https://lore.kernel.org/r/20241216175632.4175-6-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Commit 566f1dd52816 ("PCI: Relax bridge window tail sizing rules")
relaxed the bridge window requirements for non-optional size (size0)
but pbus_size_mem() also handles optional sizes (IOV resources) using
size1. This can manifest, e.g., as a failure to resize a BAR back to
its original size after it was first shrunk when device has a VF BAR
resource because the bridge window (size1) is enlarged beyond what is
strictly required to fit the downstream resources.
Allow using relaxed bridge window tail sizing rules also with the optional
resources (size1) so that the remove/realloc cycle during BAR resize
(smaller and back to the original size) does not fail unexpectedly due to
increase in bridge window size demand.
Also move add_align calculation to more logical place next to size1
assignment as they are strongly related to each other.
Link: https://lore.kernel.org/r/20241216175632.4175-5-ilpo.jarvinen@linux.intel.com
Fixes: 566f1dd52816 ("PCI: Relax bridge window tail sizing rules")
Reported-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
In pbus_size_io() and pbus_size_mem(), a complex ?: operation is performed
to set size1. Decompose this so it's easier to read.
In the case of pbus_size_mem(), simply initializing size1 to zero ensures
the size1 checks work as expected.
Link: https://lore.kernel.org/r/20241216175632.4175-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Commit 566f1dd52816 ("PCI: Relax bridge window tail sizing rules")
relaxed bridge window tail alignment rule for the non-optional part
(size0, no add_size/add_align). The required alignment given for
pbus_upstream_space_available(), however, was add_align which relates
only to size1 alignment.
As pbus_upstream_space_available() only selects between normal and relaxed
tail alignment of the bridge window, the different alignment only causes
relaxed tail alignment to be used more often than was intended, which
should be harmless because relaxed tail alignment itself should work in all
cases.
For consistency, change pbus_upstream_space_available() call to use
min_align which is the alignment that is going to be used for the bridge
window in the case where size0 sized allocation is attempted.
Link: https://lore.kernel.org/r/20241216175632.4175-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
Commit 566f1dd52816 ("PCI: Relax bridge window tail sizing rules")
relaxed bridge window tail alignment rule for the non-optional part
(size0, no add_size/add_align). The change, however, also overwrote
add_align, which is only related to the case where an optional size1
related entry is added into realloc head.
Correct this by removing the add_align overwrite.
Link: https://lore.kernel.org/r/20241216175632.4175-2-ilpo.jarvinen@linux.intel.com
Fixes: 566f1dd52816 ("PCI: Relax bridge window tail sizing rules")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
|
|
7180c1d08639 ("PCI: Distribute available resources for root buses, too")
breaks BAR assignment on some devices:
pci 0006:03:00.0: BAR 0 [mem 0x6300c0000000-0x6300c1ffffff 64bit pref]: assigned
pci 0006:03:00.1: BAR 0 [mem 0x6300c2000000-0x6300c3ffffff 64bit pref]: assigned
pci 0006:03:00.2: BAR 0 [mem size 0x00800000 64bit pref]: can't assign; no space
pci 0006:03:00.0: VF BAR 0 [mem size 0x02000000 64bit pref]: can't assign; no space
pci 0006:03:00.1: VF BAR 0 [mem size 0x02000000 64bit pref]: can't assign; no space
The apertures of domain 0006 before 7180c1d08639:
6300c0000000-63ffffffffff : PCI Bus 0006:00
6300c0000000-6300c9ffffff : PCI Bus 0006:01
6300c0000000-6300c9ffffff : PCI Bus 0006:02 # 160MB
6300c0000000-6300c8ffffff : PCI Bus 0006:03 # 144MB
6300c0000000-6300c1ffffff : 0006:03:00.0 # 32MB
6300c2000000-6300c3ffffff : 0006:03:00.1 # 32MB
6300c4000000-6300c47fffff : 0006:03:00.2 # 8MB
6300c4800000-6300c67fffff : 0006:03:00.0 # 32MB
6300c6800000-6300c87fffff : 0006:03:00.1 # 32MB
6300c9000000-6300c9bfffff : PCI Bus 0006:04 # 12MB
6300c9000000-6300c9bfffff : PCI Bus 0006:05 # 12MB
6300c9000000-6300c91fffff : PCI Bus 0006:06 # 2MB
6300c9200000-6300c93fffff : PCI Bus 0006:07 # 2MB
6300c9400000-6300c95fffff : PCI Bus 0006:08 # 2MB
6300c9600000-6300c97fffff : PCI Bus 0006:09 # 2MB
After 7180c1d08639:
6300c0000000-63ffffffffff : PCI Bus 0006:00
6300c0000000-6300c9ffffff : PCI Bus 0006:01
6300c0000000-6300c9ffffff : PCI Bus 0006:02 # 160MB
6300c0000000-6300c43fffff : PCI Bus 0006:03 # 68MB
6300c0000000-6300c1ffffff : 0006:03:00.0 # 32MB
6300c2000000-6300c3ffffff : 0006:03:00.1 # 32MB
--- no space --- : 0006:03:00.2 # 8MB
--- no space --- : 0006:03:00.0 # 32MB
--- no space --- : 0006:03:00.1 # 32MB
6300c4400000-6300c4dfffff : PCI Bus 0006:04 # 10MB
6300c4400000-6300c4dfffff : PCI Bus 0006:05 # 10MB
6300c4400000-6300c45fffff : PCI Bus 0006:06 # 2MB
6300c4600000-6300c47fffff : PCI Bus 0006:07 # 2MB
6300c4800000-6300c49fffff : PCI Bus 0006:08 # 2MB
6300c4a00000-6300c4bfffff : PCI Bus 0006:09 # 2MB
We can see that the window to 0006:03 is shrunk too much and 0006:04
eats away the window for 0006:03:00.2.
The offending commit distributes the upstream bridge's resources multiple
times to every downstream bridge, hence makes the aperture smaller than
desired because calculation of io_per_b, mmio_per_b and mmio_pref_per_b
becomes incorrect.
Instead, distribute downstream bridges' own resources to resolve the issue.
Link: https://lore.kernel.org/r/20241204022457.51322-1-kaihengf@nvidia.com
Fixes: 7180c1d08639 ("PCI: Distribute available resources for root buses, too")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219540
Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Carol Soto <csoto@nvidia.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Chris Chiu <chris.chiu@canonical.com>
|
|
Use pci_resource_name() helper in pdev_sort_resources() to print resources
in user-friendly format. Also replace the vague "bogus alignment" with a
more precise explanation of the problem.
Link: https://lore.kernel.org/r/20241017095545.1424-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
|
|
pci_bus_distribute_available_resources() performs alignment in case of
non-zero alignment requirement on three occasions.
Add ALIGN_DOWN_IF_NONZERO() helper to avoid coding the non-zero check three
times.
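The idea behind the helper, sketched as plain functions rather than a macro.
The names are invented for the example; the only behavior taken from the
text is "align down when the alignment is non-zero, otherwise leave the
value alone".

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of align; align must be a power of two. */
static uint64_t sketch_align_down(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* Same, but an alignment of zero means "no requirement": addr is unchanged. */
static uint64_t sketch_align_down_if_nonzero(uint64_t addr, uint64_t align)
{
	return align ? sketch_align_down(addr, align) : addr;
}
```

For example, 0x12345 with alignment 0x1000 becomes 0x12000, while the same
value with alignment 0 stays 0x12345.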
Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240614100606.15830-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Convert open-coded resource size calculations to use
resource_set_{range,size}() helpers.
While at it, use SZ_* for size parameter where appropriate which makes the
intent of code more obvious.
Also, cast sizes to resource_size_t, not u64.
Link: https://lore.kernel.org/r/20240614100606.15830-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
During remove & rescan cycle, PCI subsystem will recalculate and adjust
the bridge window sizing that was initially done by "BIOS". The size
calculation is based on the required alignment of the largest resource
among the downstream resources as per pbus_size_mem() (unimportant or
zero parameters marked with "..."):
min_align = calculate_mem_align(aligns, max_order);
size0 = calculate_memsize(size, ..., min_align);
inside calculate_mem_align(), for the largest alignment:
min_align = align1 >> 1;
...
return min_align;
and then in calculate_memsize():
return ALIGN(max(size, ...), align);
If the original bridge window sizing tried to conserve space, this will
lead to massive increase of the required bridge window size when the
downstream has a large disparity in BAR sizes. E.g., with 16MiB and
16GiB BARs this results in 24GiB bridge window size even if 16MiB BAR
does not require gigabytes of space to fit.
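The 24GiB figure can be reproduced with a few lines of arithmetic. This is
only a sketch of the calculation described above, with invented helper
names: the tail alignment is taken as half of the largest BAR alignment and
the sum of the BAR sizes is rounded up to it.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MIB (1ULL << 20)
#define SKETCH_GIB (1ULL << 30)

/* Round size up to a multiple of align; align must be a power of two. */
static uint64_t sketch_align_up(uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Window size when the tail is aligned to half of the largest alignment. */
static uint64_t sketch_window_size_strict(uint64_t bars_total,
					  uint64_t largest_align)
{
	return sketch_align_up(bars_total, largest_align >> 1);
}
```

With 16MiB + 16GiB of BARs and a largest alignment of 16GiB, the tail
alignment is 8GiB and the result is 0x600000000 (24GiB), the size reported
in the failing log.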
When doing remove & rescan for a bus that contains such a PCI device, a
larger bridge window is suddenly required on rescan but when there is a
bridge window upstream that is already assigned based on the original
size, it cannot be enlarged to the new requirement. This causes the
allocation of the bridge window to fail (0x600000000 > 0x400ffffff):
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]
pci 0000:01:00.0: PCI bridge to [bus 02-04]
pci 0000:01:00.0: bridge window [mem 0x40400000-0x406fffff]
pci 0000:01:00.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]
pci 0000:03:00.0: device released
pci 0000:02:01.0: device released
pcieport 0000:01:00.0: scanning [bus 02-04] behind bridge, pass 0
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]
pci 0000:02:01.0: scanning [bus 03-03] behind bridge, pass 0
pci 0000:03:00.0: BAR 0 [mem 0x6400000000-0x6400ffffff 64bit pref]
pci 0000:03:00.0: BAR 2 [mem 0x6000000000-0x63ffffffff 64bit pref]
pci 0000:03:00.0: ROM [mem 0x40400000-0x405fffff pref]
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: scanning [bus 03-03] behind bridge, pass 1
pcieport 0000:01:00.0: scanning [bus 02-04] behind bridge, pass 1
pci 0000:02:01.0: bridge window [mem size 0x600000000 64bit pref]: can't assign; no space
pci 0000:02:01.0: bridge window [mem size 0x600000000 64bit pref]: failed to assign
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]: assigned
pci 0000:03:00.0: BAR 2 [mem size 0x400000000 64bit pref]: can't assign; no space
pci 0000:03:00.0: BAR 2 [mem size 0x400000000 64bit pref]: failed to assign
pci 0000:03:00.0: BAR 0 [mem size 0x01000000 64bit pref]: can't assign; no space
pci 0000:03:00.0: BAR 0 [mem size 0x01000000 64bit pref]: failed to assign
pci 0000:03:00.0: ROM [mem 0x40400000-0x405fffff pref]: assigned
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]
This is a major surprise for users who are suddenly left with a device that
was working fine with the original bridge window sizing.
Even if the already assigned bridge window could be enlarged by
reallocation in some cases (something the current code does not attempt
to do), it is not possible in the general case and the large amount of
wasted space at the tail of the bridge window may lead to other
resource exhaustion problems at the Root Complex level (think of multiple
PCIe cards with VFs and BAR size disparity in a single system).
PCI BARs only need natural alignment (PCIe r6.1, sec 7.5.1.2.1) and bridge
memory windows need 1MiB (sec 7.5.1.3). The current bridge window tail
alignment rule was introduced in the commit 5d0a8965aea9 ("[PATCH] 2.5.14:
New PCI allocation code (alpha, arm, parisc) [2/2]") that only states:
"pbus_size_mem: core stuff; tested with randomly generated sets of
resources". It does not explain the motivation for the extra tail space
allocated that is not truly needed by the downstream resources. As such, it
is far from clear if it ever has been required by any HW.
To prevent devices with BAR size disparity from becoming unusable after
remove & rescan cycle, attempt to do a truly minimal allocation for memory
resources if needed. First check if the normally calculated bridge window
will not fit into an already assigned upstream resource. In such case, try
with relaxed bridge window tail sizing rules instead where no extra tail
space is requested beyond what the downstream resources require. Only
enforce the alignment requirement of the bridge window itself (normally
1MiB).
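The fallback can be sketched as below. This is a simplification under stated
assumptions: the upstream check is reduced to a single "space available"
number, whereas the real code inspects the assigned upstream resources, and
all names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MIB (1ULL << 20)
#define SKETCH_GIB (1ULL << 30)

/* Round size up to a multiple of align; align must be a power of two. */
static uint64_t sketch_align_up(uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/*
 * Try the normally calculated size first; if it does not fit into the
 * already assigned upstream space, fall back to a size that only honors
 * the bridge window's own alignment.
 */
static uint64_t sketch_pick_window_size(uint64_t needed, uint64_t strict_align,
					uint64_t window_align,
					uint64_t upstream_space)
{
	uint64_t strict = sketch_align_up(needed, strict_align);

	if (strict <= upstream_space)
		return strict;
	return sketch_align_up(needed, window_align);
}
```

For the 16MiB + 16GiB example with an upstream window of 0x401000000 bytes,
the strict size (0x600000000) does not fit and the relaxed size with 1MiB
alignment is 0x401000000, which fits exactly.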
With this patch, the resources are successfully allocated:
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: scanning [bus 03-03] behind bridge, pass 1
pcieport 0000:01:00.0: scanning [bus 02-04] behind bridge, pass 1
pcieport 0000:01:00.0: Assigned bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] to [bus 02-04] cannot fit 0x600000000 required for 0000:02:01.0 bridging to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] to [bus 03] requires relaxed alignment rules
pcieport 0000:01:00.0: Assigned bridge window [mem 0x40400000-0x406fffff] to [bus 02-04] free space at [mem 0x40400000-0x405fffff]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]: assigned
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]: assigned
pci 0000:03:00.0: BAR 2 [mem 0x6000000000-0x63ffffffff 64bit pref]: assigned
pci 0000:03:00.0: BAR 0 [mem 0x6400000000-0x6400ffffff 64bit pref]: assigned
pci 0000:03:00.0: ROM [mem 0x40400000-0x405fffff pref]: assigned
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]
This patch draws inspiration from the initial investigations and work by
Mika Westerberg.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216795
Link: https://lore.kernel.org/linux-pci/20190812144144.2646-1-mika.westerberg@linux.intel.com/
Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
Link: https://lore.kernel.org/r/20240507102523.57320-9-ilpo.jarvinen@linux.intel.com
Tested-by: Lidong Wang <lidong.wang@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
|
|
Calculations related to bridge window size contain literal 20 that is the
minimum alignment for a bridge window. Make the code more obvious by
converting the literal 20 to __ffs(SZ_1M).
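Why the two spellings are equivalent, as a standalone sketch: 1MiB is 2^20,
so the index of its lowest set bit is 20. The helper below is a portable
stand-in written for this example, not the kernel's __ffs() implementation.

```c
#include <assert.h>

#define SKETCH_SZ_1M 0x00100000ul

/* Index of the lowest set bit; the argument must be non-zero. */
static unsigned int sketch_ffs(unsigned long word)
{
	unsigned int bit = 0;

	assert(word != 0);
	while (!(word & 1ul)) {
		word >>= 1;
		bit++;
	}
	return bit;
}
```

sketch_ffs(SKETCH_SZ_1M) evaluates to 20, so the named form documents where
the number comes from without changing the value.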
Link: https://lore.kernel.org/r/20240507102523.57320-8-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash https://lore.kernel.org/r/20240612093250.17544-1-ilpo.jarvinen@linux.intel.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
|
|
pbus_size_mem() keeps the size of the optional resources in
children_add_size. When calculating the PCI bridge window size,
calculate_memsize() lower bounds size by old_size before adding
children_add_size and performing the window size alignment. This
results in double counting for the resources in children_add_size
because old_size may be based on the previous size of the bridge
window after it has already included children_add_size (that is,
size1 in pbus_size_mem() from an earlier invocation of that
function).
As a result, on repeated remove of the bus & rescan cycles the resource
size keeps increasing when children_add_size is non-zero as can be seen
from this extract:
iomem0: 23fffd00000-23fffdfffff : PCI Bus 0000:03 # 1MiB
iomem1: 20000000000-200001fffff : PCI Bus 0000:03 # 2MiB
iomem2: 20000000000-200002fffff : PCI Bus 0000:03 # 3MiB
iomem3: 20000000000-200003fffff : PCI Bus 0000:03 # 4MiB
iomem4: 20000000000-200004fffff : PCI Bus 0000:03 # 5MiB
Solve the double counting by moving old_size check later in
calculate_memsize() so that children_add_size is already accounted for.
After the patch, the bridge window retains its size as expected:
iomem0: 23fffd00000-23fffdfffff : PCI Bus 0000:03 # 1MiB
iomem1: 20000000000-200000fffff : PCI Bus 0000:03 # 1MiB
iomem2: 20000000000-200000fffff : PCI Bus 0000:03 # 1MiB
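The effect of moving the old_size check can be reproduced with a simplified
model. The functions below keep only the parameters relevant to the bug
(min_size and add_size are left out), and the input values (no mandatory
size, 1MiB of optional size, 1MiB alignment) are assumptions chosen to
mirror the pattern in the extracts, not values taken from the reported
system.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MIB (1ULL << 20)

static uint64_t sketch_max(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/* Round size up to a multiple of align; align must be a power of two. */
static uint64_t sketch_align_up(uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Before: old_size is applied first, then the optional part goes on top. */
static uint64_t sketch_memsize_before(uint64_t size, uint64_t children_add_size,
				      uint64_t old_size, uint64_t align)
{
	size = sketch_max(size, old_size);
	return sketch_align_up(size + children_add_size, align);
}

/* After: old_size is applied once the optional part is already included. */
static uint64_t sketch_memsize_after(uint64_t size, uint64_t children_add_size,
				     uint64_t old_size, uint64_t align)
{
	size += children_add_size;
	return sketch_align_up(sketch_max(size, old_size), align);
}

/* Feed each result back in as old_size, as repeated rescans would. */
static uint64_t sketch_rescan(int cycles, int fixed, uint64_t old_size)
{
	while (cycles-- > 0)
		old_size = fixed ?
			sketch_memsize_after(0, SKETCH_MIB, old_size, SKETCH_MIB) :
			sketch_memsize_before(0, SKETCH_MIB, old_size, SKETCH_MIB);
	return old_size;
}
```

Starting from a 1MiB window, four cycles of the "before" variant grow it to
5MiB, while the "after" variant keeps it at 1MiB.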
Fixes: a4ac9fea016f ("PCI : Calculate right add_size")
Link: https://lore.kernel.org/r/20240507102523.57320-2-ilpo.jarvinen@linux.intel.com
Tested-by: Lidong Wang <lidong.wang@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
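The effect of the reordering can be reproduced with a simplified model. This
is a sketch under stated assumptions, not the kernel's calculate_memsize():
min_size and add_size are left out, the 512 KiB figures are invented for
illustration, and memsize_before(), memsize_after() and after_cycles() are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (UINT64_C(1) << 20)

/* Round x up to a power-of-two alignment a. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/* Old ordering: old_size bounds size before children_add is added. */
static uint64_t memsize_before(uint64_t size, uint64_t children_add,
			       uint64_t old_size, uint64_t align)
{
	size = max_u64(size, old_size);
	return align_up(size + children_add, align);
}

/* New ordering: children_add is accounted for before the old_size check. */
static uint64_t memsize_after(uint64_t size, uint64_t children_add,
			      uint64_t old_size, uint64_t align)
{
	return align_up(max_u64(size + children_add, old_size), align);
}

/* Feed each result back in as old_size, as a remove/rescan cycle would. */
static uint64_t after_cycles(uint64_t (*calc)(uint64_t, uint64_t,
					      uint64_t, uint64_t),
			     unsigned int cycles)
{
	uint64_t old_size = 0;
	unsigned int i;

	/* Assumed inputs: 512 KiB required, 512 KiB optional, 1 MiB align. */
	for (i = 0; i < cycles; i++)
		old_size = calc(MIB / 2, MIB / 2, old_size, MIB);
	return old_size;
}
```

With these inputs the old ordering grows the window by 1 MiB per cycle,
matching the pattern in the first extract, while the new ordering settles at
1 MiB.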
|
|
Use pci_resource_name() to get the name of the resource and use it when
printing log messages.
[bhelgaas: rename to match struct resource * names, also use names in other
BAR messages]
Link: https://lore.kernel.org/r/20211106112606.192563-3-puranjay12@gmail.com
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Fix a section mismatch warning on Sparc 32-bit:
WARNING: modpost: vmlinux: section mismatch in reference: leon_pci_init+0xf8 (section: .text) -> pci_assign_unassigned_resources (section: .init.text)
This is due to this comment from arch/sparc/kernel/leon_pci.c:
The LEON architecture does not rely on a BIOS or bootloader to setup PCI
for us. The Linux generic routines are used to setup resources, reset
values of configuration-space register settings are preserved.
Link: https://lore.kernel.org/r/20230925042316.15415-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
|
|
Fix typos in docs and comments.
Link: https://lore.kernel.org/r/20230824193712.542167-11-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Refactor pci_bus_for_each_resource() in the same way as
pci_dev_for_each_resource(). This allows the index to be hidden inside the
implementation so the caller can omit it when it's not used otherwise.
No functional changes intended.
Link: https://lore.kernel.org/r/20230330162434.35055-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
Instead of open-coding it everywhere, introduce a tiny helper that can be
used to iterate over each resource of a PCI device, and convert the most
obvious users to it.
While at it, drop the doubled empty line before pdev_sort_resources().
No functional changes intended.
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230330162434.35055-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
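The general shape of such an iteration helper can be sketched with a toy
device type. Everything below is hypothetical (struct toy_dev,
toy_dev_for_each_resource(), toy_count_assigned()) and only illustrates the
idea of hiding the loop index inside the macro; it is not the kernel's
pci_dev_for_each_resource() definition. It needs C99 or later.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_RESOURCES 4

struct toy_resource {
	unsigned long start;
	unsigned long end;
};

struct toy_dev {
	struct toy_resource resource[TOY_NUM_RESOURCES];
};

/*
 * Iterate over every resource slot of a toy_dev. The index is declared
 * inside the macro, so callers that do not need it never see it.
 */
#define toy_dev_for_each_resource(dev, res)				\
	for (unsigned int idx_ = 0;					\
	     idx_ < TOY_NUM_RESOURCES &&				\
	     ((res) = &(dev)->resource[idx_]);				\
	     idx_++)

/* Count the slots that describe a non-empty range. */
static size_t toy_count_assigned(struct toy_dev *dev)
{
	struct toy_resource *res;
	size_t n = 0;

	assert(dev != NULL);
	toy_dev_for_each_resource(dev, res) {
		if (res->end > res->start)
			n++;
	}
	return n;
}
```

The caller only declares the resource pointer; the open-coded
"for (i = 0; i < N; i++)" and the array lookup disappear from each user.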
|
|
Previously we distributed spare resources only upon hot-add, so if the
initial root bus scan found devices that had not been fully configured by
the BIOS, we allocated only enough resources to cover what was then
present. If some of those devices were hotplug bridges, we did not leave
any additional resource space for future expansion.
Distribute the available resources for root buses, too, to make this work
the same way as the normal hotplug case.
A previous commit to do this was reverted due to a regression reported by
Jonathan Cameron:
e96e27fc6f79 ("PCI: Distribute available resources for root buses, too")
5632e2beaf9d ("Revert "PCI: Distribute available resources for root buses, too"")
This commit changes pci_bridge_resources_not_assigned() to work with
bridges that do not have all the resource windows programmed by the boot
firmware (previously we expected all I/O, memory and prefetchable memory
were programmed).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216000
Link: https://lore.kernel.org/r/20220905080232.36087-5-mika.westerberg@linux.intel.com
Link: https://lore.kernel.org/r/20230131092405.29121-4-mika.westerberg@linux.intel.com
Reported-by: Chris Chiu <chris.chiu@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
A PCI bridge may reside on a bus with other devices as well. The resource
distribution code does not take this into account and therefore it expands
the bridge resource windows too much, not leaving space for the other
devices (or functions of a multifunction device). This leads to an issue
that Jonathan reported when running QEMU with the following topology (QEMU
parameters):
-device pcie-root-port,port=0,id=root_port13,chassis=0,slot=2 \
-device x3130-upstream,id=sw1,bus=root_port13,multifunction=on \
-device e1000,bus=root_port13,addr=0.1 \
-device xio3130-downstream,id=fun1,bus=sw1,chassis=0,slot=3 \
-device e1000,bus=fun1
The first e1000 NIC here is another function in the switch upstream port.
This leads to the following errors:
pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04]
pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04]
pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000]
e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0]
Fix this by taking into account bridge windows, device BARs and SR-IOV PF
BARs on the bus (PF BARs include space for VF BARs, so only account for PF
BARs), including the ones belonging to the bridges themselves, if they have
any.
Link: https://lore.kernel.org/linux-pci/20221014124553.0000696f@huawei.com/
Link: https://lore.kernel.org/linux-pci/6053736d-1923-41e7-def9-7585ce1772d9@ixsystems.com/
Link: https://lore.kernel.org/r/20230131092405.29121-3-mika.westerberg@linux.intel.com
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reported-by: Alexander Motin <mav@ixsystems.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
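The accounting idea can be illustrated with the numbers from the log above.
This is a deliberately simplified sketch, not the kernel implementation: it
ignores alignment and placement, and space_for_bridge() is a hypothetical
helper that only subtracts the sizes of sibling resources from the upstream
window before the bridge may claim the rest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Space left for a bridge window once the other resources on the same bus
 * have been reserved. Returns 0 if the siblings already use everything.
 */
static uint64_t space_for_bridge(uint64_t window_size,
				 const uint64_t *sibling_sizes, size_t n)
{
	uint64_t left = window_size;
	size_t i;

	assert(sibling_sizes != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (sibling_sizes[i] >= left)
			return 0;
		left -= sibling_sizes[i];
	}
	return left;
}
```

For the 2 MiB window [mem 0x10200000-0x103fffff] and the e1000 BAR 0 of size
0x20000, this leaves 0x1e0000 bytes for the 02:00.0 bridge instead of the
whole window.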
|
|
After division the extra resource space per hotplug bridge may not be
aligned according to the window alignment, so align it before passing it
down for further distribution.
Link: https://lore.kernel.org/r/20230131092405.29121-2-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
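A minimal sketch of the arithmetic, assuming the per-bridge share is rounded
down to the window alignment so that the shares never exceed the available
space (the commit message only says "align"). The helper names align_down()
and extra_per_hotplug_bridge() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round x down to a power-of-two alignment a. */
static uint64_t align_down(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/*
 * Split the available extra space evenly between hotplug bridges and align
 * each share. An align of 0 is treated as "no alignment requirement".
 */
static uint64_t extra_per_hotplug_bridge(uint64_t available,
					 unsigned int hotplug_bridges,
					 uint64_t align)
{
	uint64_t share;

	assert(hotplug_bridges > 0);
	share = available / hotplug_bridges;
	return align ? align_down(share, align) : share;
}
```

For example, 5 MiB split between three hotplug bridges gives an unaligned
share of 0x1aaaaa bytes; with a 1 MiB window alignment each bridge is passed
0x100000 instead.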
|
|
This reverts commit e96e27fc6f7971380283768e9a734af16b1716ee.
Jonathan reported that this commit broke this topology, where all the space
available on bus 02 was assigned to the 02:00.0 bridge window, leaving none
for the e1000 device at 02:00.1:
pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04]
pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04]
pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000]
e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0]
Link: https://lore.kernel.org/r/20221014124553.0000696f@huawei.com
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Drop two empty lines from pci_scan_child_bus_extend() and correct
indentation in pci_bridge_distribute_available_resources() to better
follow the kernel coding style.
No functional impact.
Link: https://lore.kernel.org/r/20220905080232.36087-6-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
Previously we distributed spare resources only upon hot-add, so if the
initial root bus scan found devices that had not been fully configured by
the BIOS, we allocated only enough resources to cover what was then
present. If some of those devices were hotplug bridges, we did not leave
any additional resource space for future expansion.
Distribute the available resources for root buses, too, to make this work
the same way as the normal hotplug case.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216000
Link: https://lore.kernel.org/r/20220905080232.36087-5-mika.westerberg@linux.intel.com
Reported-by: Chris Chiu <chris.chiu@canonical.com>
Tested-by: Chris Chiu <chris.chiu@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
We need to be able to call pci_bridge_distribute_available_resources()
from this function, so move it accordingly to avoid the need for a forward
declaration.
No functional impact.
Link: https://lore.kernel.org/r/20220905080232.36087-4-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
- Update the aer-inject URL (Yicong Yang)
- Declare pci_filp_private only when HAVE_PCI_MMAP to avoid unused struct
definition (Krzysztof Wilczyński)
- Remove unused assignments (Bjorn Helgaas)
- Add #includes to asm/pci_x86.h to prevent build errors (Randy Dunlap)
* pci/misc:
x86/PCI: Add #includes to asm/pci_x86.h
PCI: ibmphp: Remove unused assignments
PCI: cpqphp: Remove unused assignments
PCI: fu740: Remove unused assignments
PCI: kirin: Remove unused assignments
PCI: Remove unused assignments
PCI: Declare pci_filp_private only when HAVE_PCI_MMAP
PCI/AER: Update aer-inject URL
|
|
Remove variables and assignments that are never used.
Found by Krzysztof using cppcheck, e.g.,
$ cppcheck --enable=all --force
uselessAssignmentPtrArg drivers/pci/proc.c:102 Assignment of function parameter has no effect outside the function. Did you forget dereferencing it?
unreadVariable drivers/pci/setup-bus.c:1528 Variable 'old_flags' is assigned a value that is never used.
Reported-by: Krzysztof Wilczyński <kw@linux.com>
Link: https://lore.kernel.org/r/20220313192933.434746-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The current kernel reports that BARs larger than 128GB, e.g., this 4TB BAR,
are disabled:
pci 0000:01:00.0: disabling BAR 4: [mem 0x00000000-0x3ffffffffff 64bit pref] (bad alignment 0x40000000000)
Increase the maximum BAR size from 128GB to 8TB for future expansion.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20220118092117.10089-1-liudongdong3@huawei.com
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
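The relationship between the size limit and the "bad alignment" message can
be sketched as follows. This assumes the limit comes from a fixed table with
one bucket per power-of-two alignment starting at 1 MiB, which the commit
message does not spell out; MIN_ORDER, buckets_for() and
alignment_supported() are hypothetical names, and __builtin_ctzll() is a
GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_ORDER 20u	/* log2(1 MiB), the smallest bucket */

/* log2 of a non-zero power of two. */
static unsigned int log2_u64(uint64_t pow2)
{
	assert(pow2 != 0 && (pow2 & (pow2 - 1)) == 0);
	return (unsigned int)__builtin_ctzll(pow2);
}

/* Buckets needed to cover alignments from 1 MiB up to max_align. */
static unsigned int buckets_for(uint64_t max_align)
{
	assert(max_align >= (UINT64_C(1) << MIN_ORDER));
	return log2_u64(max_align) - MIN_ORDER + 1;
}

/* Whether an alignment has a bucket in a table of nbuckets entries. */
static int alignment_supported(uint64_t align, unsigned int nbuckets)
{
	unsigned int order = log2_u64(align);
	/* Alignments below 1 MiB share the first bucket in this model. */
	unsigned int idx = order > MIN_ORDER ? order - MIN_ORDER : 0;

	return idx < nbuckets;
}
```

Under this model a 128GB limit needs 18 buckets and an 8TB limit needs 24,
so the 4TB alignment 0x40000000000 from the log falls outside the former
and inside the latter.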
|