summaryrefslogtreecommitdiff
path: root/drivers/pci
AgeCommit message (Collapse)Author
2017-04-24PCI: rockchip: Update PCI config space remap functionLorenzo Pieralisi
PCI configuration space should be mapped with a memory region type that generates on the CPU host bus non-posted write transations. Update the driver to use the devm_pci_remap_cfg* interface to make sure the correct memory mappings for PCI configuration space are used. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Wenrui Li <wenrui.li@rock-chips.com> Cc: Shawn Lin <shawn.lin@rock-chips.com>
2017-04-24PCI: spear13xx: Update PCI config space remap functionLorenzo Pieralisi
PCI configuration space should be mapped with a memory region type that generate on the CPU host bus non-posted write transations. Update the driver to use the devm_pci_remap_cfg* interface to make sure the correct memory mappings for PCI configuration space are used. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Pratyush Anand <pratyush.anand@gmail.com>
2017-04-24PCI: xilinx-nwl: Update PCI config space remap functionLorenzo Pieralisi
PCI configuration space should be mapped with a memory region type that generates on the CPU host bus non-posted write transations. Update the driver to use the devm_pci_remap_cfg* interface to make sure the correct memory mappings for PCI configuration space are used. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com> Cc: Michal Simek <michal.simek@xilinx.com>
2017-04-24PCI: xilinx: Update PCI config space remap functionLorenzo Pieralisi
PCI configuration space should be mapped with a memory region type that generates on the CPU host bus non-posted write transations. Update the driver to use the devm_pci_remap_cfg* interface to make sure the correct memory mappings for PCI configuration space are used. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com> Cc: Michal Simek <michal.simek@xilinx.com>
2017-04-24PCI: ECAM: Map config region with pci_remap_cfgspace()Lorenzo Pieralisi
The current ECAM kernel implementation uses ioremap() to map the ECAM configuration space memory region; this is not safe in that on some architectures the ioremap interface provides mappings that allow posted write transactions. This, as highlighted in the PCIe specifications (4.0 - Rev0.3, "Ordering Considerations for the Enhanced Configuration Address Mechanism"), can create ordering issues for software because posted writes transactions on the CPU host bus are non posted in the PCI express fabric. Update the ioremap() interface to use pci_remap_cfgspace() whose mapping attributes guarantee that non-posted writes transactions are issued for memory writes within the ECAM memory mapped address region. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Jayachandran C <jnair@caviumnetworks.com>
2017-04-24PCI: Implement devm_pci_remap_cfgspace()Lorenzo Pieralisi
The introduction of the pci_remap_cfgspace() interface allows PCI host controller drivers to map PCI config space through a dedicated kernel interface. Current PCI host controller drivers use the devm_ioremap_*() devres interfaces to map PCI configuration space regions so in order to update them to the new pci_remap_cfgspace() mapping interface a new set of devres interfaces should be implemented so that PCI host controller drivers can make use of them. Introduce two new functions in the PCI kernel layer and Devres documentation: - devm_pci_remap_cfgspace() - devm_pci_remap_cfg_resource() so that PCI host controller drivers can make use of them to map PCI configuration space regions. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Jonathan Corbet <corbet@lwn.net>
2017-04-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Both conflict were simple overlapping changes. In the kaweth case, Eric Dumazet's skb_cow() bug fix overlapped the conversion of the driver in net-next to use in-netdev stats. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21PCI: rockchip: ModularizeBrian Norris
Now that we've exported pci_remap_iospace() and added proper remove() support, there's no reason this can't be a loadable module. Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
2017-04-21PCI: Export pci_remap_iospace() and pci_unmap_iospace()Brian Norris
These are useful for PCIe host drivers, and those drivers can be modules. [bhelgaas: don't remove __weak; it's removed elsewhere] Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
2017-04-21PCI: rockchip: Add remove() supportBrian Norris
Currently, if we try to unbind the platform device, the remove will succeed, but the removal won't undo most of the registration, leaving partially-configured PCI devices in the system. This allows, for example, a simple 'lspci' to crash the system, as it will try to touch the freed (via devm_*) driver structures, e.g., on RK3399: # echo f8000000.pcie > /sys/bus/platform/drivers/rockchip-pcie/unbind # lspci So let's implement device remove(). Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
2017-04-20of/acpi: Configure dma operations at probe time for platform/amba/pci bus ↵Sricharan R
devices Configuring DMA ops at probe time will allow deferring device probe when the IOMMU isn't available yet. The dma_configure for the device is now called from the generic device_attach callback just before the bus/driver probe is called. This way, configuring the DMA ops for the device would be called at the same place for all bus_types, hence the deferred probing mechanism should work for all buses as well. pci_bus_add_devices (platform/amba)(_device_create/driver_register) | | pci_bus_add_device (device_add/driver_register) | | device_attach device_initial_probe | | __device_attach_driver __device_attach_driver | driver_probe_device | really_probe | dma_configure Similarly on the device/driver_unregister path __device_release_driver is called which inturn calls dma_deconfigure. This patch changes the dma ops configuration to probe time for both OF and ACPI based platform/amba/pci bus devices. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci part) Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-04-20PCI: Call pcie_flr() from reset_chelsio_generic_dev()Christoph Hellwig
Instead of copy & pasting and old version of the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20PCI: Call pcie_flr() from reset_intel_82599_sfp_virtfn()Christoph Hellwig
The 82599 quirk contained an outdated copy of the FLR code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20PCI: Export pcie_flr()Christoph Hellwig
Currently we opencode the FLR sequence in lots of place; export a core helper instead. We split out the probing for FLR support as all the non-core callers already know their hardware. Note that in the new pci_has_flr() function the quirk check has been moved before the capability check as there is no point in reading the capability in this case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20PCI: Add sysfs sriov_drivers_autoprobe to control VF driver bindingBodong Wang
Sometimes it is not desirable to bind SR-IOV VFs to drivers. This can save host side resource usage by VF instances that will be assigned to VMs. Add a new PCI sysfs interface "sriov_drivers_autoprobe" to control that from the PF. To modify it, echo 0/n/N (disable probe) or 1/y/Y (enable probe) to: /sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe Note that this must be done before enabling VFs. The change will not take effect if VFs are already enabled. Simply, one can disable VFs by setting sriov_numvfs to 0, choose whether to probe or not, and then re-enable the VFs by restoring sriov_numvfs. [bhelgaas: changelog, ABI doc] Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2017-04-20PCI: Add I/O BAR support to generic pci_mmap_resource_range()David Woodhouse
This will need to call into an arch-provided pci_iobar_pfn() function. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20PCI: Add pci_mmap_resource_range() and use it for ARM64David Woodhouse
Starting to leave behind the legacy of the pci_mmap_page_range() interface which takes "user-visible" BAR addresses. This takes just the resource and offset. For now, both APIs coexist and depending on the platform, one is implemented as a wrapper around the other. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20PCI: Add BAR index argument to pci_mmap_page_range()David Woodhouse
In all cases we know which BAR it is. Passing it in means that arch code (or generic code; watch this space) won't have to go looking for it again. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20PCI: Use BAR index in sysfs attr->private instead of resource pointerDavid Woodhouse
We store the pointer, and then on *every* use of it we loop over the device's resources to find out the index. That's kind of silly. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20Annotate hardware config module parameters in drivers/pci/hotplug/David Howells
When the kernel is running in secure boot mode, we lock down the kernel to prevent userspace from modifying the running kernel image. Whilst this includes prohibiting access to things like /dev/mem, it must also prevent access by means of configuring driver modules in such a way as to cause a device to access or modify the kernel image. To this end, annotate module_param* statements that refer to hardware configuration and indicate for future reference what type of parameter they specify. The parameter parser in the core sees this information and can skip such parameters with an error message if the kernel is locked down. The module initialisation then runs as normal, but just sees whatever the default values for those parameters is. Note that we do still need to do the module initialisation because some drivers have viable defaults set in case parameters aren't specified and some drivers support automatic configuration (e.g. PNP or PCI) in addition to manually coded parameters. This patch annotates drivers in drivers/pci/hotplug/. Suggested-by: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> cc: Scott Murray <scott@spiteful.org> cc: linux-pci@vger.kernel.org
2017-04-19PCI: Remove __weak tag from pci_remap_iospace()Lorenzo Pieralisi
pci_remap_iospace() is marked as a weak symbol even though no architecture is currently overriding it; given that its implementation internals have already code paths that are arch specific (ie PCI_IOBASE and ioremap_page_range() attributes) there is no need to leave the weak symbol in the kernel since the same functionality can be achieved by customizing per-arch the corresponding functionality. Remove the __weak symbol from pci_remap_iospace(). Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
2017-04-19PCI: Don't resize resources when realigning all devices in systemYongji Xie
The "pci=resource_alignment" argument aligns BARs of designated devices by artificially increasing their size. Increasing the size increases the alignment and prevents other resources from being assigned in the same alignment region, e.g., in the same page, but it can break drivers that use the BAR size to locate things, e.g., ilo_map_device() does this: off = pci_resource_len(pdev, bar) - 0x2000; The new pcibios_default_alignment() interface allows an arch to request that *all* BARs in the system be aligned to a larger size. In this case, we don't need to artificially increase the resource size because we know every BAR of every device will be realigned, so nothing will share the same alignment region. Use IORESOURCE_STARTALIGN to request realignment of PCI BARs when we know we're realigning all BARs in the system. [bhelgaas: comment, changelog] Signed-off-by: Yongji Xie <elohimes@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-19PCI: Don't reassign resources that are already alignedBjorn Helgaas
The "pci=resource_alignment=" kernel argument designates devices for which we want alignment greater than is required by the PCI specs. Previously we set IORESOURCE_UNSET for every MEM resource of those devices, even if the resource was *already* sufficiently aligned. If a resource is already sufficiently aligned, leave it alone and don't try to reassign it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-19PCI: Factor pci_reassigndev_resource_alignment()Bjorn Helgaas
Pull the BAR size adjustment out into a new function, pci_request_resource_alignment(), and add a comment about how and why we increase the resource size and alignment. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-19PCI: Add pcibios_default_alignment() for arch-specific alignment controlYongji Xie
When VFIO passes through a PCI device to a guest, it does not allow the guest to mmap BARs that are smaller than PAGE_SIZE unless it can reserve the rest of the page (see vfio_pci_probe_mmaps()). This is because a page might contain several small BARs for unrelated devices and a guest should not be able to access all of them. VFIO emulates guest accesses to non-mappable BARs, which is functional but slow. On systems with large page sizes, e.g., PowerNV with 64K pages, BARs are more likely to share a page and performance is more likely to be a problem. Add a weak function to set default alignment for all PCI devices. An arch can override it to force the PCI core to place memory BARs on their own pages. Signed-off-by: Yongji Xie <elohimes@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-19PCI: Include PCI-to-PCIe bridges as "Downstream Ports"Bjorn Helgaas
A PCI/PCI-X to PCI Express bridge, sometimes referred to as a "reverse bridge", is a bridge with conventional PCI or PCI-X on its primary side and a PCI Express Port on its secondary (downstream) side. That PCIe Port is a Downstream Port and could be connected to a slot, just like a Root Port or a Switch Downstream Port. Make pcie_downstream_port() return true for them, so we can access the Slot registers in the PCIe capability. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-18PCI: Freeze PME scan before suspending devicesLukas Wunner
Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790) crashes during suspend tests. Geert Uytterhoeven managed to reproduce the issue on an M2-W Koelsch board (r8a7791): It occurs when the PME scan runs, once per second. During PME scan, the PCI host bridge (rcar-pci) registers are accessed while its module clock has already been disabled, leading to the crash. One reproducer is to configure s2ram to use "s2idle" instead of "deep" suspend: # echo 0 > /sys/module/printk/parameters/console_suspend # echo s2idle > /sys/power/mem_sleep # echo mem > /sys/power/state Another reproducer is to write either "platform" or "processors" to /sys/power/pm_test. It does not (or is less likely) to happen during full system suspend ("core" or "none") because system suspend also disables timers, and thus the workqueue handling PME scans no longer runs. Geert believes the issue may still happen in the small window between disabling module clocks and disabling timers: # echo 0 > /sys/module/printk/parameters/console_suspend # echo platform > /sys/power/pm_test # Or "processors" # echo mem > /sys/power/state (Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.) Rafael Wysocki agrees that PME scans should be suspended before the host bridge registers become inaccessible. To that end, queue the task on a workqueue that gets frozen before devices suspend. Rafael notes however that as a result, some wakeup events may be missed if they are delivered via PME from a device without working IRQ (which hence must be polled) and occur after the workqueue has been frozen. If that turns out to be an issue in practice, it may be possible to solve it by calling pci_pme_list_scan() once directly from one of the host bridge's pm_ops callbacks. Stacktrace for posterity: PM: Syncing filesystems ... [ 38.566237] done. PM: Preparing system for sleep (mem) Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done. Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. PM: Suspending system (mem) PM: suspend of devices complete after 152.456 msecs PM: late suspend of devices complete after 2.809 msecs PM: noirq suspend of devices complete after 29.863 msecs suspend debug: Waiting for 5 second(s). Unhandled fault: asynchronous external abort (0x1211) at 0x00000000 pgd = c0003000 [00000000] *pgd=80000040004003, *pmd=00000000 Internal error: : 1211 [#1] SMP ARM Modules linked in: CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383 Hardware name: Generic R8A7791 (Flattened Device Tree) Workqueue: events pci_pme_list_scan task: eb56e140 task.stack: eb58e000 PC is at pci_generic_config_read+0x64/0x6c LR is at rcar_pci_cfg_base+0x64/0x84 pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093 sp : eb58fe98 ip : c041d750 fp : 00000008 r10: c0e2283c r9 : 00000000 r8 : 600d0013 r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4 r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 30c5387d Table: 6a9f6c80 DAC: 55555555 Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210) Stack: (0xeb58fe98 to 0xeb590000) fe80: 00000002 00000044 fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000 fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830 fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100 ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000 ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380 ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000 ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0 ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000 ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd [<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>] (pci_bus_read_config_word+0x58/0x80) [<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>] (pci_check_pme_status+0x34/0x78) [<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54) [<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4) [<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>] (process_one_work+0x1bc/0x308) [<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0) [<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc) [<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c) Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000) ---[ end trace 667d43ba3aa9e589 ]--- Fixes: df17e62e5bff ("PCI: Add support for polling PME state on suspended legacy PCI devices") Reported-and-tested-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable@vger.kernel.org # 2.6.37+ Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Cc: Simon Horman <horms+renesas@verge.net.au> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org>
2017-04-18PCI: Fix calculation of bridge window's size and alignmentYongji Xie
In case that one device's alignment is greater than its size, we may get an incorrect size and alignment for its bus's memory window in pbus_size_mem(). Fix this case. Signed-off-by: Yongji Xie <elohimes@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-18PCI: Ignore requested alignment for IOV BARsYongji Xie
We would call pci_reassigndev_resource_alignment() before pci_init_capabilities(). So the requested alignment would never work for IOV BARs. Furthermore, it's meaningless to request additional alignment for IOV BARs, the IOV BAR alignment is only determined by the VF BAR size. Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2017-04-18PCI: Make PCI_ROM_ADDRESS_MASK a 32-bit constantMatthias Kaehlcke
A 64-bit value is not needed since a PCI ROM address consists in 32 bits. This fixes a clang warning about "implicit conversion from 'unsigned long' to 'u32'". Also remove now unnecessary casts to u32 from __pci_read_base() and pci_std_update_resource(). Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-18PCI: Improve __pci_read_base() robustnessMarc Gonzalez
Local variables 'l' and 'sz' are uninitialized. Normally, they would be initialized by pci_read_config_dword() but when an error occurs, some drivers immediately return an error code, which leaves the argument uninitialized. Provide a safe initial value to make the code more robust. Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-18PCI/irq: Add pci_request_irq() and pci_free_irq() helpersChristoph Hellwig
These are small wrappers around request_threaded_irq() and free_irq(), which dynamically allocate space for the device name so that drivers don't need to keep static buffers for these around. Additionally it works with device-relative vector numbers to make the usage easier, and force the IRQF_SHARED flag on given that it has no runtime overhead and should be supported by all PCI devices. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-18PCI: Add arch_can_pci_mmap_io() on architectures which can mmap() I/O spaceDavid Woodhouse
This is relatively esoteric, and knowing that we don't have it makes life easier in some cases rather than just an eventual -EINVAL from pci_mmap_page_range(). Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-18PCI: Add arch_can_pci_mmap_wc() macroDavid Woodhouse
Most of the almost-identical versions of pci_mmap_page_range() silently ignore the 'write_combine' argument and give uncached mappings. Yet we allow the PCIIOC_WRITE_COMBINE ioctl in /proc/bus/pci, expose the 'resourceX_wc' file in sysfs, and allow an attempted mapping to apparently succeed. To fix this, introduce a macro arch_can_pci_mmap_wc() which indicates whether the platform can do a write-combining mapping. On x86 this ends up being pat_enabled(), while the few other platforms that support it can just set it to a literal '1'. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-18PCI: Only allow WC mmap on prefetchable resourcesDavid Woodhouse
The /proc/bus/pci mmap interface allows the user to specify whether they want WC or not. Don't let them do so on non-prefetchable BARs. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
2017-04-18PCI: Fix another sanity check bug in /proc/pci mmapDavid Woodhouse
Don't match MMIO maps with I/O BARs and vice versa. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
2017-04-18PCI: hv: Convert hv_pci_dev.refs from atomic_t to refcount_tElena Reshetova
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
2017-04-17PCI: Avoid generating invalid ThunderX2 DMA aliasesJayachandran C
On Cavium ThunderX2 arm64 SoCs (formerly known as Broadcom Vulcan), the PCI topology is slightly unusual. For a multi-node system, it looks like: 00:00.0 PCI bridge to [bus 01-1e] 01:0a.0 PCI-to-PCIe bridge to [bus 02-04] 02:00.0 PCIe Root Port bridge to [bus 03-04] (XLATE_ROOT) 03:00.0 PCIe Endpoint pci_for_each_dma_alias() assumes IOMMU translation is done at the root of the PCI hierarchy. It generates 03:00.0, 01:0a.0, and 00:00.0 as DMA aliases for 03:00.0 because buses 01 and 00 are non-PCIe buses that don't carry the Requester ID. Because the ThunderX2 IOMMU is at 02:00.0, the Requester IDs 01:0a.0 and 00:00.0 are never valid for the endpoint. This quirk stops alias generation at the XLATE_ROOT bridge so we won't generate 01:0a.0 or 00:00.0. The current IOMMU code only maps the last alias (this is a separate bug in itself). Prior to this quirk, we only created IOMMU mappings for the invalid Requester ID 00:00:0, which never matched any DMA transactions. With this quirk, we create IOMMU mappings for a valid Requester ID, which fixes devices with no aliases but leaves devices with aliases still broken. The last alias for the endpoint is also used by the ARM GICv3 MSI-X code. Without this quirk, the GIC Interrupt Translation Tables are setup with the invalid Requester ID, and the MSI-X generated by the device fails to be translated and routed. Link: https://bugzilla.kernel.org/show_bug.cgi?id=195447 Signed-off-by: Jayachandran C <jnair@caviumnetworks.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: David Daney <david.daney@cavium.com>
2017-04-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts were simply overlapping changes. In the net/ipv4/route.c case the code had simply moved around a little bit and the same fix was made in both 'net' and 'net-next'. In the net/sched/sch_generic.c case a fix in 'net' happened at the same time that a new argument was added to qdisc_hash_add(). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13PCI: Add device flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOTJayachandran C
Add a new quirk flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT to limit the DMA alias search to go no further than the bridge where the IOMMU unit is attached. The flag will be used to indicate a bridge device which forwards the address translation requests to the IOMMU, i.e., where the interrupt and DMA requests leave the PCIe hierarchy and go into the system blocks. Usually this happens at the PCI RC, so this flag is not needed. But on systems where there are bridges that introduce aliases above the IOMMU, this flag prevents pci_for_each_dma_alias() from generating aliases that the IOMMU will never see. The function pci_for_each_dma_alias() is updated to stop when it see a bridge with this flag set. Link: https://bugzilla.kernel.org/show_bug.cgi?id=195447 Signed-off-by: Jayachandran C <jnair@caviumnetworks.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: David Daney <david.daney@cavium.com>
2017-04-12PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platformsDavid Woodhouse
In the PCI_MMAP_PROCFS case when the address being passed by the user is a 'user visible' resource address based on the bus window, and not the actual contents of the resource, that's what we need to be checking it against. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
2017-04-12PCI: Add bridge DMA alias quirk for ITE 8893 bridgeJarod Wilson
The ITE 8893 bridge has the same problems as the ITE 8892, which were resulting in crippling an older PCI 1Gbps NIC down to 45Mbps throughput with IOMMU and VT-d enabled. With the patch, this old e1000 goes back up to ~900Mbps. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2017-04-12switchtec: Add IOCTLs to the Switchtec driverLogan Gunthorpe
Add a couple of special IOCTLs to: * Inform userspace of firmware partition locations * Pass event counts and allow userspace to wait on events * Translate PFF numbers used by the switch to port numbers [Dan Carpenter <dan.carpenter@oracle.com>: fix off-by-one in ioctl_event_ctl()] Tested-by: Krishna Dhulipala <krishnad@fb.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Stephen Bates <stephen.bates@microsemi.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wei Zhang <wzhang@fb.com> Reviewed-by: Jens Axboe <axboe@fb.com>
2017-04-12switchtec: Add sysfs attributes to the Switchtec driverLogan Gunthorpe
Add a few read-only sysfs attributes which provide some device information that is exposed from the devices, primarily component and device names and versions. These are documented in Documentation/ABI/testing/sysfs-class-switchtec. Tested-by: Krishna Dhulipala <krishnad@fb.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Stephen Bates <stephen.bates@microsemi.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wei Zhang <wzhang@fb.com> Reviewed-by: Jens Axboe <axboe@fb.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12PCI: hisi: Fix DT binding (hisi-pcie-almost-ecam)Dongdong Liu
The "hisilicon,pcie-almost-ecam" binding goes against the usual DT conventions, and is non-sensical in that it describes the IP based on what it isn't. Fix the DT binding with "hisilicon,hip06-pcie-ecam" and "hisilicon,hip07-pcie-ecam". Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-11PCI: rockchip: Set PCI_EXP_LNKSTA_SLC in the Root PortShawn Lin
All platforms using Rockchip use a common clock for the Root Port and the slot connected to it. Indicate this by setting the Slot Clock Configuration (PCI_EXP_LNKSTA_SLC) bit in the Root Port's Link Status. Per the Implementation Note in the spec (PCIe r3.1, sec 7.8.7), if the downstream component also sets PCI_EXP_LNKSTA_SLC, software may set the Common Clock Configuration (PCI_EXP_LNKCTL_CCC) bits on both ends of the Link. This is done by pcie_aspm_configure_common_clock(). Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: Brian Norris <briannorris@chromium.org> Cc: jeffy.chen <jeffy.chen@rock-chips.com>
2017-04-11PCI: endpoint: functions: Add an EP function to test PCIKishon Vijay Abraham I
Adds a new endpoint function driver (to program the virtual test device) making use of the EP-core library. [bhelgaas: fold in pci_epf_test_probe() -ENOMEM test from Wei Yongjun <weiyongjun1@huawei.com>] Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-11PCI: endpoint: Create configfs entry for EPC device and EPF driverKishon Vijay Abraham I
Invoke APIs provided by pci-ep-cfs to create configfs entry for every EPC device and EPF driver to help users in creating EPF device and binding the EPF device to the EPC device. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-11PCI: endpoint: Introduce configfs entry for configuring EP functionsKishon Vijay Abraham I
Introduce a new configfs entry to configure the EP function (like configuring the standard configuration header entries) and to bind the EP function with EP controller. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-11PCI: endpoint: Add EP core layer to enable EP controller and EP functionsKishon Vijay Abraham I
Introduce a new EP core layer in order to support endpoint functions in linux kernel. This comprises the EPC library (Endpoint Controller Library) and EPF library (Endpoint Function Library). EPC library implements functions specific to an endpoint controller and EPF library implements functions specific to an endpoint function. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Acked-by: Joao Pinto <jpinto@synopsys.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>