summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2020-01-23powernv/pci: Move pnv_pci_dma_bus_setup() to pci-ioda.cOliver O'Halloran
This is only used in pci-ioda.c so move it there and rename it to match. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200110070207.439-6-oohall@gmail.com
2020-01-23powernv/pci: Fold pnv_pci_dma_dev_setup() into the pci-ioda.c versionOliver O'Halloran
pnv_pci_dma_dev_setup() does nothing but call the phb->dma_dev_setup() callback, if one exists. That callback is only set for normal PCIe PHBs so we can remove the layer of indirection and use the ioda version in the pci_controller_ops. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200110070207.439-5-oohall@gmail.com
2020-01-23powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov()Oliver O'Halloran
An ioda_pe for each VF is allocated in pnv_pci_sriov_enable() before the pci_dev for the VF is created. We need to set the pe->pdev pointer at some point after the pci_dev is created. Currently we do that in: pcibios_bus_add_device() pnv_pci_dma_dev_setup() (via phb->ops.dma_dev_setup) /* fixup is done here */ pnv_pci_ioda_dma_dev_setup() (via pnv_phb->dma_dev_setup) The fixup needs to be done before setting up DMA for for the VF's PE, but there's no real reason to delay it until this point. Move the fixup into pnv_pci_ioda_fixup_iov() so the ordering is: pcibios_add_device() pnv_pci_ioda_fixup_iov() (via ppc_md.pcibios_fixup_sriov) pcibios_bus_add_device() ... This isn't strictly required, but it's a slightly more logical place to do the fixup and it simplifies pnv_pci_dma_dev_setup(). Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200110070207.439-4-oohall@gmail.com
2020-01-23powernv/pci: Remove dma_dev_setup() for NPU PHBsOliver O'Halloran
The pnv_pci_dma_dev_setup() only does something when: 1) There PHB contains VFs, or 2) The PHB defines a dma_dev_setup() callback in the pnv_phb structure. Neither is true for NPU PHBs so there's no reason to set the callback. Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200110070207.439-3-oohall@gmail.com
2020-01-23powerpc/pci: Fold pcibios_setup_device() into pcibios_bus_add_device()Oliver O'Halloran
pcibios_bus_add_device() is the only caller of pcibios_setup_device(). Fold them together since there's no real reason to keep them separate. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200110070207.439-2-oohall@gmail.com
2020-01-23powerpc/powernv: Allow manually invoking special rebootsOliver O'Halloran
OPAL provides several different kinds of reboot for the kernel to use, namely forcing a full reboot, platform error reboot and MPIPL. Right now triggering the alternative resets requires some ad-hoc method such as triggering a kernel crash and hoping the stars align. It's sometimes handy to be able to trigger one of these resets directly, so add a way to do that. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191101085522.3055-2-oohall@gmail.com
2020-01-23powerpc/xmon: Allow passing an argument to ppc_md.restart()Oliver O'Halloran
On PowerNV a few different kinds of reboot are supported. We'd like to be able to exercise these from xmon so allow 'zr' to take an argument, and pass that to the ppc_md.restart() function. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191101085522.3055-1-oohall@gmail.com
2020-01-23powerpc/powernv: Use common code for the symbol_map exportOliver O'Halloran
Long before we had a generic way for firmware to export memory ranges of interest we added a special case for the skiboot symbol map. The code is pretty much identical to the generic export so re-use the code. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191101062611.32610-2-oohall@gmail.com
2020-01-23powerpc/powernv: Rework exports to support subnodesOliver O'Halloran
Originally we only had a handful of exported memory ranges, but we'd to export the per-core trace buffers. This results in a lot of files in the exports directory which is a but unfortunate. We can clean things up a bit by turning subnodes into subdirectories of the exports directory. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191101062611.32610-1-oohall@gmail.com
2020-01-23powerpc/eeh: Only dump stack once if an MMIO loop is detectedOliver O'Halloran
Many drivers don't check for errors when they get a 0xFFs response from an MMIO load. As a result after an EEH event occurs a driver can get stuck in a polling loop unless it some kind of internal timeout logic. Currently EEH tries to detect and report stuck drivers by dumping a stack trace after eeh_dev_check_failure() is called EEH_MAX_FAILS times on an already frozen PE. The value of EEH_MAX_FAILS was chosen so that a dump would occur every few seconds if the driver was spinning in a loop. This results in a lot of spurious stack traces in the kernel log. Fix this by limiting it to printing one stack trace for each PE freeze. If the driver is truely stuck the kernel's hung task detector is better suited to reporting the probelm anyway. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Tested-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191016012536.22588-1-oohall@gmail.com
2020-01-23powernv/pci: Add a debugfs entry to dump PHB's IODA PE stateOliver O'Halloran
Add a debugfs entry to dump the state of the active IODA PEs. The IODA PE state reflects how the PHB's internal concept of a PE is configured. This is separate to the EEH PE state and is managed power the PowerNV PCI backend rather than the EEH core. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> [mpe: Use DEFINE_DEBUGFS_ATTRIBUTE] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190912052945.12589-3-oohall@gmail.com
2020-01-23powernv/pci: Allow any write trigger the diag dumpOliver O'Halloran
Make the dump trigger off any input rather than just '1'. This allows you to write "echo 1> dump_diag_data" and it'll do what you want rather than erroring out pointlessly. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190912052945.12589-2-oohall@gmail.com
2020-01-23powernv/pci: Use pnv_phb as the private data for debugfs entriesOliver O'Halloran
Use the pnv_phb structure as the private data pointer for the debugfs files. This lets us delete some code and an open-coded use of hose->private_data. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190912052945.12589-1-oohall@gmail.com
2020-01-23powerpc/pcidn: Warn when sriov pci_dn management is used incorrectlyOliver O'Halloran
These functions can only be used on a SR-IOV capable physical function and they're only called in pcibios_sriov_enable / disable. Make them emit a warning in the future if they're used incorrectly and remove the dead code that checks if the device is a VF. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190821062655.19735-3-oohall@gmail.com
2020-01-23powerpc/pcidn: Make VF pci_dn management CONFIG_PCI_IOV specificOliver O'Halloran
The powerpc PCI code requires that a pci_dn structure exists for all devices in the system. This is fine for real devices since at boot a pci_dn is created for each PCI device in the DT and it's fine for hotplugged devices since the hotplug slot driver will manage the pci_dn's devices in hotplug slots. For SR-IOV, we need the platform / pcibios to manage the pci_dn for virtual functions since firmware is unaware of VFs, and they aren't "hot plugged" in the traditional sense. Management of the pci_dn is handled by the, poorly named, functions: add_pci_dev_data() and remove_pci_dev_data(). The entire body of these functions is #ifdef`ed around CONFIG_PCI_IOV and they cannot be used in any other context, so make them only available when CONFIG_PCI_IOV is selected, and rename them to reflect their actual usage rather than having them masquerade as generic code. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190821062655.19735-2-oohall@gmail.com
2020-01-23powerpc/sriov: Remove VF eeh_dev state when disabling SR-IOVOliver O'Halloran
When disabling virtual functions on an SR-IOV adapter we currently do not correctly remove the EEH state for the now-dead virtual functions. When removing the pci_dn that was created for the VF when SR-IOV was enabled we free the corresponding eeh_dev without removing it from the child device list of the eeh_pe that contained it. This can result in crashes due to the use-after-free. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Tested-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190821062655.19735-1-oohall@gmail.com
2020-01-23powerpc/eeh_sysfs: Make clearing EEH_DEV_SYSFS sanerOliver O'Halloran
The eeh_sysfs_remove_device() function is supposed to clear the EEH_DEV_SYSFS flag since it indicates the EEH sysfs entries have been added for a pci_dev. When the sysfs files are removed eeh_remove_device() the eeh_dev and the pci_dev have already been de-associated. This then causes the pci_dev_to_eeh_dev() call in eeh_sysfs_remove_device() to return NULL so the flag can't be cleared from the still-live eeh_dev. This problem is worked around in the caller by clearing the flag manually. However, this behaviour doesn't make a whole lot of sense, so this patch fixes it by: a) Re-ordering eeh_remove_device() so that eeh_sysfs_remove_device() is called before de-associating the pci_dev and eeh_dev. b) Making eeh_sysfs_remove_device() emit a warning if there's no corresponding eeh_dev for a pci_dev. The paths where the sysfs files are only reachable if EEH was setup for the device for the device in the first place so hitting this warning indicates a programming error. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Tested-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190715085612.8802-6-oohall@gmail.com
2020-01-23powerpc/eeh_sysfs: Remove double pci_dn lookup.Oliver O'Halloran
In eeh_notify_resume_show() the pci_dn for the device is looked up once in the declaration block and then once after checking for a NULL eeh_dev. Remove the second lookup since it's pointless. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Tested-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190715085612.8802-5-oohall@gmail.com
2020-01-23powerpc/eeh_sysfs: ifdef pseries sr-iov sysfs propertiesOliver O'Halloran
There are several EEH sysfs properties that only exists when the "ibm,is-open-sriov-pf" property appears in the device tree node of the PCI device. This used on pseries to indicate to the guest that the hypervisor allows the guest to configure the SR-IOV capability. Doing this requires some handshaking between the guest, hypervisor and userspace when a VF is EEH frozen which is why these properties exist. This is all dead code on non-pseries platforms so wrap it in an #ifdef CONFIG_PPC_PSERIES to make the dependency clearer. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Tested-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190715085612.8802-4-oohall@gmail.com
2020-01-23powerpc/eeh_sysfs: Fix incorrect commentOliver O'Halloran
The EEH_ATTR_SHOW() helper is used to display fields from struct eeh_dev not struct pci_dn. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190715085612.8802-3-oohall@gmail.com
2020-01-23powerpc/eeh_cache: Don't use pci_dn when inserting new rangesOliver O'Halloran
At the point where we start inserting ranges into the EEH address cache the binding between pci_dev and eeh_dev has already been set up. Instead of consulting the pci_dn tree we can retrieve the eeh_dev directly using pci_dev_to_eeh_dev(). Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Tested-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190715085612.8802-2-oohall@gmail.com
2020-01-23powerpc/powernv/ioda: Find opencapi slot for a device nodeFrederic Barrat
Unlike real PCI slots, opencapi slots are directly associated to the (virtual) opencapi PHB, there's no intermediate bridge. So when looking for a slot ID, we must start the search from the device node itself and not its parent. Also, the slot ID is not attached to a specific bdfn, so let's build it from the PHB ID, like skiboot. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191121134918.7155-6-fbarrat@linux.ibm.com
2020-01-23powerpc/powernv/ioda: Release opencapi deviceFrederic Barrat
With hotplug, an opencapi device can now go away. It needs to be released, mostly to clean up its PE state. We were previously not defining any device callback. We can reuse the standard PCI release callback, it does a bit too much for an opencapi device, but it's harmless, and only needs minor tuning. Also separate the undo of the PELT-V code in a separate function, it is not needed for NPU devices and it improves a bit the readability of the code. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191121134918.7155-5-fbarrat@linux.ibm.com
2020-01-23powerpc/powernv/ioda: set up PE on opencapi device when enablingFrederic Barrat
The PE for an opencapi device was set as part of a late PHB fixup operation, when creating the PHB. To use the PCI hotplug framework, this is not going to work, as the PHB stays the same, it's only the devices underneath which are updated. For regular PCI devices, it is done as part of the reconfiguration of the bridge, but for opencapi PHBs, we don't have an intermediate bridge. So let's define the PE when the device is enabled. PEs are meaningless for opencapi, the NPU doesn't define them and opal is not doing anything with them. Reviewed-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191121134918.7155-4-fbarrat@linux.ibm.com
2020-01-23powerpc/powernv/ioda: Protect PE listFrederic Barrat
Protect the PHB's list of PE. Probably not needed as long as it was populated during PHB creation, but it feels right and will become required once we can add/remove opencapi devices on hotplug. Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191121134918.7155-3-fbarrat@linux.ibm.com
2020-01-23powerpc/powernv/ioda: Fix ref count for devices with their own PEFrederic Barrat
The pci_dn structure used to store a pointer to the struct pci_dev, so taking a reference on the device was required. However, the pci_dev pointer was later removed from the pci_dn structure, but the reference was kept for the npu device. See commit 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn"). We don't need to take a reference on the device when assigning the PE as the struct pnv_ioda_pe is cleaned up at the same time as the (physical) device is released. Doing so prevents the device from being released, which is a problem for opencapi devices, since we want to be able to remove them through PCI hotplug. Now the ugly part: nvlink npu devices are not meant to be released. Because of the above, we've always leaked a reference and simply removing it now is dangerous and would likely require more work. There's currently no release device callback for nvlink devices for example. So to be safe, this patch leaks a reference on the npu device, but only for nvlink and not opencapi. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191121134918.7155-2-fbarrat@linux.ibm.com
2020-01-23powerpc/vdso32: miscellaneous optimisationsChristophe Leroy
Various optimisations by inverting branches and removing redundant instructions. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b4e79f963845545bcce1459cd6fcfe46bdde7863.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/vdso32: implement clock_getres entirelyChristophe Leroy
clock_getres returns hrtimer_res for all clocks but coarse ones for which it returns KTIME_LOW_RES. return EINVAL for unknown clocks. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/37f94e47c91070b7606fb3ec3fe6fd2302a475a0.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/vdso32: use LOAD_REG_IMMEDIATE()Christophe Leroy
Use LOAD_REG_IMMEDIATE() to load registers with immediate value. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/36f111437e66e601929308f5d5dce230e1ce472f.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/vdso32: Don't read cache line size from the datapage on PPC32.Christophe Leroy
On PPC32, the cache lines have a fixed size known at build time. Don't read it from the datapage. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/dfa7b35e27e01964fcda84bf1ed8b2b31cf93826.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/vdso32: inline __get_datapage()Christophe Leroy
__get_datapage() is only a few instructions to retrieve the address of the page where the kernel stores data to the VDSO. By inlining this function into its users, a bl/blr pair and a mflr/mtlr pair is avoided, plus a few reg moves. The improvement is noticeable (about 55 nsec/call on an 8xx) vdsotest before the patch: gettimeofday: vdso: 731 nsec/call clock-gettime-realtime-coarse: vdso: 668 nsec/call clock-gettime-monotonic-coarse: vdso: 745 nsec/call vdsotest after the patch: gettimeofday: vdso: 677 nsec/call clock-gettime-realtime-coarse: vdso: 613 nsec/call clock-gettime-monotonic-coarse: vdso: 690 nsec/call Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c39ef7f3dfa25356b01e211d539671f279086c09.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/vdso32: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSEChristophe Leroy
This is copied and adapted from commit 5c929885f1bb ("powerpc/vdso64: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE") from Santosh Sivaraj <santosh@fossix.org> Benchmark from vdsotest-all: clock-gettime-realtime: syscall: 3601 nsec/call clock-gettime-realtime: libc: 1072 nsec/call clock-gettime-realtime: vdso: 931 nsec/call clock-gettime-monotonic: syscall: 4034 nsec/call clock-gettime-monotonic: libc: 1213 nsec/call clock-gettime-monotonic: vdso: 1076 nsec/call clock-gettime-realtime-coarse: syscall: 2722 nsec/call clock-gettime-realtime-coarse: libc: 805 nsec/call clock-gettime-realtime-coarse: vdso: 668 nsec/call clock-gettime-monotonic-coarse: syscall: 2949 nsec/call clock-gettime-monotonic-coarse: libc: 882 nsec/call clock-gettime-monotonic-coarse: vdso: 745 nsec/call Additional test passed with: vdsotest -d 30 clock-gettime-monotonic-coarse verify Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/linuxppc/issues/issues/41 Link: https://lore.kernel.org/r/d1d24a376e396540194eeb85a2efe481e92ade24.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/32: Add VDSO version of getcpu on non SMPChristophe Leroy
Commit 18ad51dd342a ("powerpc: Add VDSO version of getcpu") added getcpu() for PPC64 only, by making use of a user readable general purpose SPR. PPC32 doesn't have any such SPR. For non SMP, just return CPU id 0 from the VDSO directly. PPC32 doesn't support CONFIG_NUMA so NUMA node is always 0. Before the patch, vdsotest reported: getcpu: syscall: 1572 nsec/call getcpu: libc: 1787 nsec/call getcpu: vdso: not tested Now, vdsotest reports: getcpu: syscall: 1582 nsec/call getcpu: libc: 502 nsec/call getcpu: vdso: 187 nsec/call Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eaac4b6494ecff1811220fccc895bf282aab884a.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/devicetrees: Change 'gpios' to 'cs-gpios' on fsl, spi nodesChristophe Leroy
Since commit 0f0581b24bd0 ("spi: fsl: Convert to use CS GPIO descriptors"), the prefered way to define chipselect GPIOs is using 'cs-gpios' property instead of the legacy 'gpios' property. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7556683b57d8ce100855857f03d1cd3d2903d045.1574943062.git.christophe.leroy@c-s.fr
2020-01-23powerpc/hw_breakpoints: Rewrite 8xx breakpoints to allow any address range size.Christophe Leroy
Unlike standard powerpc, Powerpc 8xx doesn't have SPRN_DABR, but it has a breakpoint support based on a set of comparators which allow more flexibility. Commit 4ad8622dc548 ("powerpc/8xx: Implement hw_breakpoint") implemented breakpoints by emulating the DABR behaviour. It did this by setting one comparator the match 4 bytes at breakpoint address and the other comparator to match 4 bytes at breakpoint address + 4. Rewrite 8xx hw_breakpoint to make breakpoints match all addresses defined by the breakpoint address and length by making full use of comparators. Now, comparator E is set to match any address greater than breakpoint address minus one. Comparator F is set to match any address lower than breakpoint address plus breakpoint length. Addresses are aligned to 32 bits. When the breakpoint range starts at address 0, the breakpoint is set to match comparator F only. When the breakpoint range end at address 0xffffffff, the breakpoint is set to match comparator E only. Otherwise the breakpoint is set to match comparator E and F. At the same time, use registers bit names instead of hardcoded values. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/05105deeaf63bc02151aea2cdeaf525534e0e9d4.1574790198.git.christophe.leroy@c-s.fr
2020-01-23powerpc/8xx: Fix permanently mapped IMMR region.Christophe Leroy
When not using large TLBs, the IMMR region is still mapped as a whole block in the FIXMAP area. Properly report that the IMMR region is block-mapped even when not using large TLBs. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/45f4f414bcd7198b0755cf4287ff216fbfc24b9d.1574774187.git.christophe.leroy@c-s.fr
2020-01-23powerpc/ptdump: Only enable PPC_CHECK_WX with STRICT_KERNEL_RWXChristophe Leroy
ptdump_check_wx() is called from mark_rodata_ro() which only exists when CONFIG_STRICT_KERNEL_RWX is selected. Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/922d4939c735c6b52b4137838bcc066fffd4fc33.1578989545.git.christophe.leroy@c-s.fr
2020-01-23powerpc/ptdump: Fix W+X verificationChristophe Leroy
Verification cannot rely on simple bit checking because on some platforms PAGE_RW is 0, checking that a page is not W means checking that PAGE_RO is set instead of checking that PAGE_RW is not set. Use pte helpers instead of checking bits. Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0d894839fdbb19070f0e1e4140363be4f2bb62fc.1578989540.git.christophe.leroy@c-s.fr
2020-01-23powerpc/ptdump: Fix W+X verification call in mark_rodata_ro()Christophe Leroy
ptdump_check_wx() also have to be called when pages are mapped by blocks. Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/37517da8310f4457f28921a4edb88fb21d27b62a.1578989531.git.christophe.leroy@c-s.fr
2020-01-23powerpc/ptdump: don't entirely rebuild kernel when selecting CONFIG_PPC_DEBUG_WXChristophe Leroy
Selecting CONFIG_PPC_DEBUG_WX only impacts ptdump and pgtable_32/64 init calls. Declaring related functions in asm/pgtable.h implies rebuilding almost everything. Move ptdump_check_wx() declaration in mm/mmu_decl.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bf34fd9dca61eadf9a134a9f89ebbc162cfd5f86.1578986011.git.christophe.leroy@c-s.fr
2020-01-23powerpc/mm/hash: Fix sharing context ids between kernel & userspaceAneesh Kumar K.V
Commit 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range") has a bug in the definition of MIN_USER_CONTEXT. The result is that the context id used for the vmemmap and the lowest context id handed out to userspace are the same. The context id is essentially the process identifier as far as the first stage of the MMU translation is concerned. This can result in multiple SLB entries with the same VSID (Virtual Segment ID), accessible to the kernel and some random userspace process that happens to get the overlapping id, which is not expected eg: 07 c00c000008000000 40066bdea7000500 1T ESID= c00c00 VSID= 66bdea7 LLP:100 12 0002000008000000 40066bdea7000d80 1T ESID= 200 VSID= 66bdea7 LLP:100 Even though the user process and the kernel use the same VSID, the permissions in the hash page table prevent the user process from reading or writing to any kernel mappings. It can also lead to SLB entries with different base page size encodings (LLP), eg: 05 c00c000008000000 00006bde0053b500 256M ESID=c00c00000 VSID= 6bde0053b LLP:100 09 0000000008000000 00006bde0053bc80 256M ESID= 0 VSID= 6bde0053b LLP: 0 Such SLB entries can result in machine checks, eg. as seen on a G5: Oops: Machine check, sig: 7 [#1] BE PAGE SIZE=64K MU-Hash SMP NR_CPUS=4 NUMA Power Mac NIP: c00000000026f248 LR: c000000000295e58 CTR: 0000000000000000 REGS: c0000000erfd3d70 TRAP: 0200 Tainted: G M (5.5.0-rcl-gcc-8.2.0-00010-g228b667d8ea1) MSR: 9000000000109032 <SF,HV,EE,ME,IR,DR,RI> CR: 24282048 XER: 00000000 DAR: c00c000000612c80 DSISR: 00000400 IRQMASK: 0 ... NIP [c00000000026f248] .kmem_cache_free+0x58/0x140 LR [c088000008295e58] .putname 8x88/0xa Call Trace: .putname+0xB8/0xa .filename_lookup.part.76+0xbe/0x160 .do_faccessat+0xe0/0x380 system_call+0x5c/ex68 This happens with 256MB segments and 64K pages, as the duplicate VSID is hit with the first vmemmap segment and the first user segment, and older 32-bit userspace maps things in the first user segment. On other CPUs a machine check is not seen. Instead the userspace process can get stuck continuously faulting, with the fault never properly serviced, due to the kernel not understanding that there is already a HPTE for the address but with inaccessible permissions. On machines with 1T segments we've not seen the bug hit other than by deliberately exercising it. That seems to be just a matter of luck though, due to the typical layout of the user virtual address space and the ranges of vmemmap that are typically populated. To fix it we add 2 to MIN_USER_CONTEXT. This ensures the lowest context given to userspace doesn't overlap with the VMEMMAP context, or with the context for INVALID_REGION_ID. Fixes: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Christian Marillat <marillat@debian.org> Reported-by: Romain Dolbeau <romain@dolbeau.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Account for INVALID_REGION_ID, mostly rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200123102547.11623-1-mpe@ellerman.id.au
2020-01-23KVM: x86: list MSR_IA32_UCODE_REV as an emulated MSRPaolo Bonzini
Even if it's read-only, it can still be written to by userspace. Let them know by adding it to KVM_GET_MSR_INDEX_LIST. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-23KVM: x86: fix overlap between SPTE_MMIO_MASK and generationPaolo Bonzini
The SPTE_MMIO_MASK overlaps with the bits used to track MMIO generation number. A high enough generation number would overwrite the SPTE_SPECIAL_MASK region and cause the MMIO SPTE to be misinterpreted. Likewise, setting bits 52 and 53 would also cause an incorrect generation number to be read from the PTE, though this was partially mitigated by the (useless if it weren't for the bug) removal of SPTE_SPECIAL_MASK from the spte in get_mmio_spte_generation. Drop that removal, and replace it with a compile-time assertion. Fixes: 6eeb4ef049e7 ("KVM: x86: assign two bits to track SPTE kinds") Reported-by: Ben Gardon <bgardon@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-01-22 The following pull-request contains BPF updates for your *net-next* tree. We've added 92 non-merge commits during the last 16 day(s) which contain a total of 320 files changed, 7532 insertions(+), 1448 deletions(-). The main changes are: 1) function by function verification and program extensions from Alexei. 2) massive cleanup of selftests/bpf from Toke and Andrii. 3) batched bpf map operations from Brian and Yonghong. 4) tcp congestion control in bpf from Martin. 5) bulking for non-map xdp_redirect form Toke. 6) bpf_send_signal_thread helper from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-22Merge tag 'samsung-soc-5.6-2' of ↵Olof Johansson
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/soc Samsung mach/soc changes for v5.6, part 2 1. Switch from legacy to atomic pwm API in rx1950 (s3c24xx), 2. Cleanups of unneeded selects in Kconfig. * tag 'samsung-soc-5.6-2' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: ARM: s3c64xx: Drop unneeded select of TIMER_OF ARM: exynos: Drop unneeded select of MIGHT_HAVE_CACHE_L2X0 ARM: s3c24xx: Switch to atomic pwm API in rx1950 Link: https://lore.kernel.org/r/20200122172649.3143-1-krzk@kernel.org Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-22MIPS: Loongson64: Select mac2008 only featureJiaxun Yang
Some Loongson-64 processor didn't set MAC2008 bit in fcsr, but actually all Loongson64 processors are MAC2008 only. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: linux-mips@vger.kernel.org Cc: chenhc@lemote.com Cc: paul.burton@mips.com Cc: linux-kernel@vger.kernel.org
2020-01-22MIPS: Add MAC2008 SupportJiaxun Yang
MAC2008 means the processor implemented IEEE754 style Fused MADD instruction. It was introduced in Release3 but removed in Release5. The toolchain support of MAC2008 have never landed except for Loongson processors. This patch aimed to disabled the MAC2008 if it's optional. For MAC2008 only processors, we corrected math-emu behavior to align with actual hardware behavior. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> [paulburton@kernel.org: Fixup MIPSr2-r5 check in cpu_set_fpu_2008.] Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: linux-mips@vger.kernel.org Cc: chenhc@lemote.com Cc: paul.burton@mips.com Cc: linux-kernel@vger.kernel.org
2020-01-22riscv: keep 32-bit kernel to 32-bit phys_addr_tOlof Johansson
While rv32 technically has 34-bit physical addresses, no current platforms use it and it's likely to shake out driver bugs. Let's keep 64-bit phys_addr_t off on 32-bit builds until one shows up, since other work will be needed to make such a system useful anyway. PHYS_ADDR_T_64BIT is def_bool 64BIT, so just remove the select. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-01-22riscv: Add KASAN supportNick Hu
This patch ports the feature Kernel Address SANitizer (KASAN). Note: The start address of shadow memory is at the beginning of kernel space, which is 2^64 - (2^39 / 2) in SV39. The size of the kernel space is 2^38 bytes so the size of shadow memory should be 2^38 / 8. Thus, the shadow memory would not overlap with the fixmap area. There are currently two limitations in this port, 1. RV64 only: KASAN need large address space for extra shadow memory region. 2. KASAN can't debug the modules since the modules are allocated in VMALLOC area. We mapped the shadow memory, which corresponding to VMALLOC area, to the kasan_early_shadow_page because we don't have enough physical space for all the shadow memory corresponding to VMALLOC area. Signed-off-by: Nick Hu <nickhu@andestech.com> Reported-by: Greentime Hu <green.hu@gmail.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-01-22ARM: 8955/1: virt: Relax arch timer version check during early bootVladimir Murzin
Updates to the Generic Timer architecture allow ID_PFR1.GenTimer to have values other than 0 or 1 while still preserving backward compatibility. At the moment, Linux is quite strict in the way it handles this field at early boot and will not configure arch timer if it doesn't find the value 1. Since here use ubfx for arch timer version extraction (hyb-stub build with -march=armv7-a, so it is safe) To help backports (even though the code was correct at the time of writing) Fixes: 8ec58be9f3ff ("ARM: virt: arch_timers: enable access to physical timers") Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>