Age | Commit message (Collapse) | Author |
|
The bio structure consumes a substantial part of dm_buffer. The bio
structure is only needed when doing I/O on the buffer, thus we don't
have to embed it in the buffer.
Allocate the bio structure only when doing I/O.
We don't need to create a bio_set because, in case of allocation
failure, dm-bufio falls back to using dm-io (which keeps its own
bio_set).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Support block sizes that are not a power-of-two (but they must be a
multiple of 512b). As always, a slab cache is used for allocations.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
kmalloc padded to the next power of two, using a slab cache avoids this.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Reorder fields in dm_buffer structure to improve packing and reduce
structure size. The compiler allocates 32-bit integer for field 'enum
data_mode', so change it to unsigned char.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
The I/O buffer doesn't have to be aligned on block size granularity,
relax alignment to ARCH_KMALLOC_MINALIGN (required to allow DMA from
slab cache memory on some architectures).
Also, set SLAB_RECLAIM_ACCOUNT so that the memory allocated from the
cache is accounted as reclaimable and doesn't inflate the 'used' entry
in the free command.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
All slab allocators can merge duplicate caches. So dm-bufio doesn't
need extra slab merging logic. Instead it can just allocate one slab
cache per client and let the allocator merge them.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
dm-bufio keeps the dm_bufio_cache_names array that holds names of the
slab caches.
Since the commit db265eca7700 ("mm/sl[aou]b: Move duping of slab name to
slab_common.c"), the kernel automatically duplicates the slab cache name
when creating the slab cache, so we no longer have to keep the name
allocated.
Remove the code that allocates the slab names and keeps them around.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Move dm-bufio.h to include/linux/ so that external GPL'd DM target
modules can use it.
It is better to allow the use of dm-bufio than force external modules
to implement the equivalent buffered IO mechanism in some new way. The
hope is this will encourage the use of dm-bufio; which will then make it
easier for a GPL'd external DM target module to be included upstream.
A couple dm-bufio EXPORT_SYMBOL exports have also been updated to use
EXPORT_SYMBOL_GPL.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
This comment was true when dm-bufio was written but, since 4.3, bios can
now have arbitrary size and the driver splits them.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Set QUEUE_FLAG_SECERASE in DM device's queue_flags if a DM table's
data devices support secure erase.
Also, add support for secure erase to both the linear and striped
targets.
Signed-off-by: Denis Semakin <d.semakin@omprussia.ru>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Otherwise, these abnormal IOs would be sent to the DM target
regardless of whether the target advertised support for them.
Factor out __process_abnormal_io() from __split_and_process_non_flush()
so that discards, write same, etc may be conditionally processed.
Fixes: 978e51ba3 ("dm: optimize bio-based NVMe IO submission")
Cc: stable@vger.kernel.org # 4.16
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Fix a race for "nosync" activations providing "aa.." device health
characters and "0/N" sync ratio rather than "AA..." and "N/N". Occurs
when status for the raid set is retrieved during resume before the MD
sync thread starts and clears the MD_RECOVERY_NEEDED flag.
Cc: stable@vger.kernel.org # 4.16+
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
process_queued_bios()
Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Ideally, we'd like to get rid of all VLAs in the kernel and add -Wvla to
the build args: https://lkml.org/lkml/2018/3/7/621
This one is a simple case, since we don't actually need the VLA at all: we
can just iterate over the stripes twice, once to emit their names, and the
second time to emit status (i.e. trade memory for time). Since the number
of stripes is probably low, this is hopefully not that expensive.
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
So developer could distinguish data and metadata bios easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Since crypto API commit 9fa68f62004 ("crypto: hash - prevent using keyed
hashes without setting key") dm-integrity cannot use keyed algorithms
without the key being set.
The dm-integrity recognizes this too late (during use of HMAC), so it
allows creation and formatting of superblock, but the device is in fact
unusable.
Fix it by detecting the key requirement in integrity table constructor.
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Reviewed-by: Scott Bauer <Scott.Bauer@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
This target's kernel module being named dm-unstripe.ko doesn't allow
lvm2's DM module autoload capability to load the dm-unstripe.ko
because lvm2 looks for dm-unstriped.ko due to the target name being
"unstriped".
Add the "dm-unstriped" module alias to resolve this oversight.
NOTE: this isn't needed for the "striped" target, despite its source
file being named dm-stripe.c, because it is part of dm-mod.ko.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Address "FIXME: must support non power of 2 chunk_size, dm-stripe.c does".
Bump target version to indicate change.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Tested-by: Scott Bauer <Scott.Bauer@intel.com>
Reviewed-by: Scott Bauer <Scott.Bauer@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
dm-crypt consumes an excessive amount memory when the user attempts to
zero a dm-crypt device with "blkdiscard -z". The command "blkdiscard -z"
calls the BLKZEROOUT ioctl, it goes to the function __blkdev_issue_zeroout,
__blkdev_issue_zeroout sends a large amount of write bios that contain
the zero page as their payload.
For each incoming page, dm-crypt allocates another page that holds the
encrypted data, so when processing "blkdiscard -z", dm-crypt tries to
allocate the amount of memory that is equal to the size of the device.
This can trigger OOM killer or cause system crash.
Fix this by limiting the amount of memory that dm-crypt allocates to 2%
of total system memory. This limit is system-wide and is divided by the
number of active dm-crypt devices and each device receives an equal
share.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Could be useful for a target to return stats or other information.
If a target does DMEMIT() anything to @result from its .message method
then it must return 1 to the caller.
Signed-off-By: Mike Snitzer <snitzer@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"A very small set of updates for the regulator API this time around,
there's a few bug fixes and also:
- Conversion of the regulator API to use GPIO descriptors rather than
numbers from Linus Walleij.
- New drivers for Marvell 88PG86x and Qualcomm PM8998 and PMI8998"
* tag 'regulator-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: qcom: smd: Add pm8998 and pmi8998 regulators
regulator: core: Add missing blank line between functions
regulator: qcom_smd: Drop regulator/{machine,of_regulator} includes
regulator: giving regulator controlling gpios a non-empty label when used through the devicetree.
regulator: gpio: Fix some error handling paths in 'gpio_regulator_probe()'
regulator: 88pg86x: new i2c dual regulator chip
regulator: 88pg86x: add DT bindings document
regulator: da9211: Pass descriptors instead of GPIO numbers
regulator: da9055: Pass descriptor instead of GPIO number
regulator: core: Support passing an initialized GPIO enable descriptor
regulator: dt: regulator-name is required property
regulator: of: Add a missing 'of_node_put()' in an error handling path of 'of_regulator_match()'
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"This is a fairly large set of updates for regmap, mainly bugfixes.
The biggest bit of this is some fixes for the bulk operations code
which had issues in some use cases, Charles Keepax has sorted them
out. We also gained the ability to use debugfs with syscon regmaps and
to specify the clock to be used with MMIO regmaps"
* tag 'regmap-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (21 commits)
regmap: debugfs: Improve warning message on debugfs_create_dir() failure
regmap: debugfs: Free map->debugfs_name when debugfs_create_dir() failed
regmap: debugfs: Don't leak dummy names
regmap: debugfs: Disambiguate dummy debugfs file name
regmap: mmio: Add function to attach a clock
regmap: Merge redundant handling in regmap_bulk_write
regmap: Tidy up regmap_raw_write chunking code
regmap: Move the handling for max_raw_write into regmap_raw_write
regmap: Remove unnecessary printk for failed allocation
regmap: Format data for raw write in regmap_bulk_write
regmap: use debugfs even when no device
regmap: Allow missing device in regmap_name_read_file()
regmap: Use _regmap_read in regmap_bulk_read
regmap: Tidy up regmap_raw_read chunking code
regmap: Move the handling for max_raw_read into regmap_raw_read
regmap: Use helper function for register offset
regmap: Don't use format_val in regmap_bulk_read
regmap: Correct comparison in regmap_cached
regmap: Correct offset handling in regmap_volatile_range
regmap-i2c: Off by one in regmap_i2c_smbus_i2c_read/write()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These update the cpuidle poll state definition to reduce excessive
energy usage related to it, add new CPU ID to the RAPL power capping
driver, update the ACPI system suspend code to handle some special
cases better, extend the PM core's device links code slightly, add new
sysfs attribute for better suspend-to-idle diagnostics and easier
hibernation handling, update power management tools and clean up
cpufreq quite a bit.
Specifics:
- Modify the cpuidle poll state implementation to prevent CPUs from
staying in the loop in there for excessive times (Rafael Wysocki).
- Add Intel Cannon Lake chips support to the RAPL power capping
driver (Joe Konno).
- Add reference counting to the device links handling code in the PM
core (Lukas Wunner).
- Avoid reconfiguring GPEs on suspend-to-idle in the ACPI system
suspend code (Rafael Wysocki).
- Allow devices to be put into deeper low-power states via ACPI if
both _SxD and _SxW are missing (Daniel Drake).
- Reorganize the core ACPI suspend-to-idle wakeup code to avoid a
keyboard wakeup issue on Asus UX331UA (Chris Chiu).
- Prevent the PCMCIA library code from aborting suspend-to-idle due
to noirq suspend failures resulting from incorrect assumptions
(Rafael Wysocki).
- Add coupled cpuidle supprt to the Exynos3250 platform (Marek
Szyprowski).
- Add new sysfs file to make it easier to specify the image storage
location during hibernation (Mario Limonciello).
- Add sysfs files for collecting suspend-to-idle usage and time
statistics for CPU idle states (Rafael Wysocki).
- Update the pm-graph utilities (Todd Brandt).
- Reduce the kernel log noise related to reporting Low-power Idle
constraings by the ACPI system suspend code (Rafael Wysocki).
- Make it easier to distinguish dedicated wakeup IRQs in the
/proc/interrupts output (Tony Lindgren).
- Add the frequency table validation in cpufreq to the core and drop
it from a number of cpufreq drivers (Viresh Kumar).
- Drop "cooling-{min|max}-level" for CPU nodes from a couple of DT
bindings (Viresh Kumar).
- Clean up the CPU online error code path in the cpufreq core (Viresh
Kumar).
- Fix assorted issues in the SCPI, CPPC, mediatek and tegra186
cpufreq drivers (Arnd Bergmann, Chunyu Hu, George Cherian, Viresh
Kumar).
- Drop memory allocation error messages from a few places in cpufreq
and cpuildle drivers (Markus Elfring)"
* tag 'pm-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (56 commits)
ACPI / PM: Fix keyboard wakeup from suspend-to-idle on ASUS UX331UA
cpufreq: CPPC: Use transition_delay_us depending transition_latency
PM / hibernate: Change message when writing to /sys/power/resume
PM / hibernate: Make passing hibernate offsets more friendly
cpuidle: poll_state: Avoid invoking local_clock() too often
PM: cpuidle/suspend: Add s2idle usage and time state attributes
cpuidle: Enable coupled cpuidle support on Exynos3250 platform
cpuidle: poll_state: Add time limit to poll_idle()
cpufreq: tegra186: Don't validate the frequency table twice
cpufreq: speedstep: Don't validate the frequency table twice
cpufreq: sparc: Don't validate the frequency table twice
cpufreq: sh: Don't validate the frequency table twice
cpufreq: sfi: Don't validate the frequency table twice
cpufreq: scpi: Don't validate the frequency table twice
cpufreq: sc520: Don't validate the frequency table twice
cpufreq: s3c24xx: Don't validate the frequency table twice
cpufreq: qoirq: Don't validate the frequency table twice
cpufreq: pxa: Don't validate the frequency table twice
cpufreq: ppc_cbe: Don't validate the frequency table twice
cpufreq: powernow: Don't validate the frequency table twice
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA code in the kernel to follow upstream revision
20180313 which includes fixes related to the so-called module-level
AML (mostly "if" type of statements outside of any methods) that
should improve the handling of systems that load alternative SSDTs
depending on the current configuration, for example, and event
handling fixes related to disabling and enabling GPEs on system
startup and on suspend/resume.
Moreover, the ACPICA license boilerplate is replaced with SPDX license
IDs which alone reduces the number of lines of ACPICA code in the
kernel quite a bit.
Also added is a new driver for the generic ACPI Time and Alarm Device
(TAD). At the moment it only handles the most basic capabilities of
the TAD, however.
In addition to that the ACPI battery driver is improved to handle
battery thresholds on ThinkPads, among other things, some bugs are
fixed, a new backlight quirk is added and some documentation is
updated.
Specifics:
- Update the in-kernel ACPICA code to upstream revision 20180313
including:
* Module-level AML code handling fixes and simplifications (Bob
Moore, Erik Schmauss).
* Fixes and cleanups related to messaging (Bob Moore).
* Events handling fixes related to disabling and enabling GPEs
(Erik Schmauss).
* Introduction of SPDX license identifiers and removal of license
boilerplate in multiple files (Erik Schmauss).
* Assorted fixes and cleanups (Bob Moore, Erik Schmauss, Hans de
Goede, Seunghun Han).
- Add new basic driver for the ACPI Time and Alarm Device (Rafael
Wysocki).
- Modify the ACPI battery driver to support battery thresholds on
Lenovo ThinkPads (Ognjen Galic, Colin Ian King).
- Avoid reporting battery capacity over 100 in the ACPI battery
driver in some cases (Laszlo Toth).
- Make the kernel recognize an OEM _OSI string from Dell to avoid
power management issues with NVidia GPUs in Dell platforms (Alex
Hung).
- Make the PCI IRQ management code handle missing _PRS cleanly (Alex
Hung).
- Fix uevent notifications related to device hotplut (Lee, Chun-Yi).
- Prevent the ACPI PAD driver from leaking memory (Lenny Szubowicz).
- Update the ACPI CPPC library code to include subspace IDs in the
kernel messages logged by it (George Cherian).
- Add backlight quirk for Samsung 670Z5E (Hans de Goede).
- Add the NFIT and HMAT tables to the list of ACPI tables that can be
overridden via initrd (Dan Williams).
- Fix and clean up some ACPI documentation and Kconfig help language
(Aishwarya Pant, Randy Dunlap).
- Replace license boilerplate with an SPDX license ID in the ACPI
PMIC operation region handling code (Rajmohan Mani)"
* tag 'acpi-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits)
ACPI: acpi_pad: Fix memory leak in power saving threads
ACPI / video: Add quirk to force acpi-video backlight on Samsung 670Z5E
ACPI: Add Time and Alarm Device (TAD) driver
ACPI / scan: Send change uevent with offine environmental data
ACPI / Kconfig: Update ACPI_PROCFS_POWER help text
ACPI / OSI: Add OEM _OSI strings to disable NVidia RTD3
ACPICA: Update version to 20180313
ACPICA: Cleanup/simplify module-level code support
ACPICA: Events: add a return on failure from acpi_hw_register_read
ACPICA: adding SPDX headers
ACPICA: Rename a global for clarity, no functional change
ACPICA: macros: fix ACPI_ERROR_NAMESPACE macro
ACPICA: Change a compile-time option to a runtime option
ACPICA: Remove calling of _STA from acpi_get_object_info()
ACPICA: AML Debug Object: Don't ignore output of zero-length strings
ACPICA: Fix memory leak on unusual memory leak
ACPICA: Events: Dispatch GPEs after enabling for the first time
ACPICA: Events: Add parallel GPE handling support to fix potential redundant _Exx evaluations
ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume
ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c
...
|
|
Commit 37dddf14f1ae ("PCI: cadence: Add EndPoint Controller driver for
Cadence PCIe controller") created the /drivers/pci/cadence directory to
keep in a single place Cadence host and endpoint controller drivers.
Since code in /drivers/pci/cadence falls within the PCI native host
bridge and endpoint controllers mainteinance remit, that maintainer
entry should have been updated too by adding the /drivers/pci/cadence
directory to it but it actually was not.
Update the MAINTAINERS entry accordingly, fixing the omission.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Alan Douglas <adouglas@cadence.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
|
|
The device tree code looks for CONFIG_CMDLINE_FORCE, but we were using
CONFIG_CMDLINE_OVERRIDE. It looks like this was just a hold over from
before our device tree conversion -- in fact, we'd already removed the
support for CONFIG_CMDLINE_OVERRIDE from our arch-specific code so it
didn't even work any more.
Thanks to Mortiz and Trung for finding the original bug, and for Michael
for suggeting a better fix.
CC: Trung Tran <trung.tran@ettus.com>
CC: Michael J Clark <mjc@sifive.com>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
Check on return values and goto label mbx_err are unnecessary.
Addresses-Coverity-ID: 1271151 ("Identical code for different branches")
Addresses-Coverity-ID: 1268788 ("Identical code for different branches")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
This structure is not needed since the introduction of commit
'c42687784b9a ("IB/ipoib: Scatter-Gather support in connected mode")'
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
This avoids a lot of -Wunused warnings such as:
====================
kernel/debug/debug_core.c: In function ‘kgdb_cpu_enter’:
./arch/sparc/include/asm/cmpxchg_64.h:55:22: warning: value computed is not used [-Wunused-value]
#define xchg(ptr,x) ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
./arch/sparc/include/asm/atomic_64.h:86:30: note: in expansion of macro ‘xchg’
#define atomic_xchg(v, new) (xchg(&((v)->counter), new))
^~~~
kernel/debug/debug_core.c:508:4: note: in expansion of macro ‘atomic_xchg’
atomic_xchg(&kgdb_active, cpu);
^~~~~~~~~~~
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.
pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link. For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself. This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.
Note that the driver previously used dev_warn() to suggest using a
different slot, but pcie_print_link_status() uses dev_info() because if the
platform has no faster slot available, the user can't do anything about the
warning and may not want to be bothered with it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
|
|
Use the new pci_bandwidth_available() function to calculate maximum
available bandwidth through the PCI chain instead of computing it ourselves
with mlx5e_get_pci_bw().
This is used to detect when the device is capable of more bandwidth than is
available in the current slot. The driver may adjust compression settings
accordingly.
Note that pci_bandwidth_available() accounts for PCIe encoding overhead, so
it is more accurate than mlx5e_get_pci_bw() was.
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: remove mlx5e_get_pci_bw() wrapper altogether]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
|
|
Use pcie_print_link_status() to report PCIe link speed and possible
limitations.
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
|
|
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Add pcie_print_link_status(). This logs the current settings of the link
(speed, width, and total available bandwidth).
If the device is capable of more bandwidth but is limited by a slower
upstream link, we include information about the link that limits the
device's performance.
The user may be able to move the device to a different slot for better
performance.
This provides a unified method for all PCI devices to report status and
issues, instead of each device reporting in a different way, using
different code.
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: changelog, reword log messages, print device capabilities when
not limited, print bandwidth in Gb/s]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Add pcie_bandwidth_available() to compute the bandwidth available to a
device. This may be limited by the device itself or by a slower upstream
link leading to the device.
The available bandwidth at each link along the path is computed as:
link_width * link_speed * (1 - encoding_overhead)
2.5 and 5.0 GT/s links use 8b/10b encoding, which reduces the raw bandwidth
available by 20%; 8.0 GT/s and faster links use 128b/130b encoding, which
reduces it by about 1.5%.
The result is in Mb/s, i.e., megabits/second, of raw bandwidth.
Also return the device with the slowest link and the speed and width of
that link.
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: changelog, leave pcie_get_minimum_link() alone for now, return
bw directly, use pci_upstream_bridge(), check "next_bw <= bw" to find
uppermost limiting device, return speed/width of the limiting device]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
A 64-bit BAR consists of a BAR pair, where the second BAR has the
upper bits, so we cannot simply call pci_ioremap_bar() on every single
BAR index.
The second BAR in a BAR pair will not have the IORESOURCE_MEM resource
flag set. Only call ioremap on BARs that have the IORESOURCE_MEM
resource flag set.
pci 0000:01:00.0: BAR 4: assigned [mem 0xc0300000-0xc031ffff 64bit]
pci 0000:01:00.0: BAR 2: assigned [mem 0xc0320000-0xc03203ff 64bit]
pci 0000:01:00.0: BAR 0: assigned [mem 0xc0320400-0xc03204ff 64bit]
pci-endpoint-test 0000:01:00.0: can't ioremap BAR 1: [??? 0x00000000 flags 0x0]
pci-endpoint-test 0000:01:00.0: failed to read BAR1
pci-endpoint-test 0000:01:00.0: can't ioremap BAR 3: [??? 0x00000000 flags 0x0]
pci-endpoint-test 0000:01:00.0: failed to read BAR3
pci-endpoint-test 0000:01:00.0: can't ioremap BAR 5: [??? 0x00000000 flags 0x0]
pci-endpoint-test 0000:01:00.0: failed to read BAR5
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
Since a 64-bit BAR consists of a BAR pair, we need to write to both
BARs in the BAR pair to clear the BAR properly.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
Since a 64-bit BAR consists of a BAR pair, and since there is no
BAR after BAR_5, BAR_5 cannot be 64-bits wide.
This sanity check is done in pci_epc_clear_bar(), so that we don't need
to do this sanity check in all epc->ops->clear_bar() implementations.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
*epf_bar
Make epc->ops->clear_bar()/pci_epc_clear_bar() take struct *epf_bar.
This is needed so that epc->ops->clear_bar() can clear the BAR pair,
if the BAR is 64-bits wide.
This also makes it possible for pci_epc_clear_bar() to sanity check
the flags.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
|
|
If a 64-bit BAR was set-up, we need to skip a BAR,
since a 64-bit BAR consists of a BAR pair.
We need to check what BAR width the epc->ops->set_bar() specific
implementation actually did set-up, since some drivers, like the
Cadence EP controller, sometimes sets up a 64-bit BAR, even though
a 32-bit BAR was requested.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
cdns_pcie_ep_set_bar() does some round-up of the BAR size, which means
that a 64-bit BAR can be set-up, even when the flag
PCI_BASE_ADDRESS_MEM_TYPE_64 isn't set.
If a 64-bit BAR was set-up, set the flag PCI_BASE_ADDRESS_MEM_TYPE_64,
so that the calling function can know what BAR width that was actually
set-up.
I'm not sure why cdns_pcie_ep_set_bar() doesn't obey the flag
PCI_BASE_ADDRESS_MEM_TYPE_64, but I leave this for the MAINTAINER to
fix, since there might be a reason why this flag is ignored.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Alan Douglas <adouglas@cadence.com>
|
|
Since a 64-bit BAR consists of a BAR pair, we need to write to both
BARs in the BAR pair to setup the BAR properly.
Link: https://lkml.kernel.org/r/20180328115018.31921-7-niklas.cassel@axis.com
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
[lorenzo.pieralisi@arm.com: updated code according to review]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
|
|
Setting a BAR size > 4 GB is invalid if PCI_BASE_ADDRESS_MEM_TYPE_64
flag is not set.
This sanity check is done in pci_epc_set_bar(), so that we don't need
to do this sanity check in all epc->ops->set_bar() implementations.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
If flag PCI_BASE_ADDRESS_SPACE_IO is set, also having any
PCI_BASE_ADDRESS_MEM_* bit set is invalid.
This sanity check is done in pci_epc_set_bar(), so that we don't need
to do this sanity check in all epc->ops->set_bar() implementations.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
Since a 64-bit BAR consists of a BAR pair, and since there is no
BAR after BAR_5, BAR_5 cannot be 64-bits wide.
This sanity check is done in pci_epc_set_bar(), so that we don't need
to do this sanity check in all epc->ops->set_bar() implementations.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
Add barno and flags to struct epf_bar.
That way we can simplify epc->ops->set_bar()/pci_epc_set_bar()
by passing a struct *epf_bar instead of a whole lot of arguments.
This is needed so that epc->ops->set_bar() implementations can
modify BAR flags. Will be utilized in a succeeding patch.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
If a BAR supports 64-bit width or not depends on the hardware,
and should thus not depend on sizeof(dma_addr_t).
If a certain hardware doesn't support 64-bit BARs, its
epc->ops->set_bar() implementation should return -EINVAL
when PCI_BASE_ADDRESS_MEM_TYPE_64 is set.
We can't change pci_epc_set_bar() to only set
PCI_BASE_ADDRESS_MEM_TYPE_64 based on size, since if the user,
for some reason, wants to configure a BAR with a 64-bit width,
even though the BAR size is less than 4 GB, he should be able
to do that.
However, since pci-epf-test is simply a test and not an API,
we can set PCI_BASE_ADDRESS_MEM_TYPE_64 in pci-epf-test itself
only based on size.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
|