summaryrefslogtreecommitdiff
path: root/arch/powerpc/sysdev
AgeCommit message (Collapse)Author
2021-08-10powerpc/xive: Add support for IRQ domain hierarchyCédric Le Goater
This adds handlers to allocate/free IRQs in a domain hierarchy. We could try to use xive_irq_domain_map() in xive_irq_domain_alloc() but we rely on xive_irq_alloc_data() to set the IRQ handler data and duplicating the code is simpler. xive_irq_free_data() needs to be called when IRQ are freed to clear the MMIO mappings and free the XIVE handler data, xive_irq_data structure. This is going to be a problem with MSI domains which we will address later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-4-clg@kaod.org
2021-07-05powerpc/xive: Fix error handling when allocating an IPICédric Le Goater
This is a smatch warning: arch/powerpc/sysdev/xive/common.c:1161 xive_request_ipi() warn: unsigned 'xid->irq' is never less than zero. Fixes: fd6db2892eba ("powerpc/xive: Modernize XIVE-IPI domain with an 'alloc' handler") Cc: stable@vger.kernel.org # v5.13 Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701152412.1507612-1-clg@kaod.org
2021-07-02Merge tag 'powerpc-5.14-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - A big series refactoring parts of our KVM code, and converting some to C. - Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on some CPUs. - Support for the Microwatt soft-core. - Optimisations to our interrupt return path on 64-bit. - Support for userspace access to the NX GZIP accelerator on PowerVM on Power10. - Enable KUAP and KUEP by default on 32-bit Book3S CPUs. - Other smaller features, fixes & cleanups. Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand, Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe, Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei. * tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits) powerpc: Only build restart_table.c for 64s powerpc/64s: move ret_from_fork etc above __end_soft_masked powerpc/64s/interrupt: clean up interrupt return labels powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols powerpc/64: enable MSR[EE] in irq replay pt_regs powerpc/64s/interrupt: preserve regs->softe for NMI interrupts powerpc/64s: add a table of implicit soft-masked addresses powerpc/64e: remove implicit soft-masking and interrupt exit restart logic powerpc/64e: fix CONFIG_RELOCATABLE build warnings powerpc/64s: fix hash page fault interrupt handler powerpc/4xx: Fix setup_kuep() on SMP powerpc/32s: Fix setup_{kuap/kuep}() on SMP powerpc/interrupt: Use names in check_return_regs_valid() powerpc/interrupt: Also use exit_must_hard_disable() on PPC32 powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE powerpc/ptrace: Refactor regs_set_return_{msr/ip} powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip} powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi() powerpc/pseries/vas: Include irqdomain.h powerpc: mark local variables around longjmp as volatile ...
2021-06-25powerpc/64s: avoid reloading (H)SRR registers if they are still validNicholas Piggin
When an interrupt is taken, the SRR registers are set to return to where it left off. Unless they are modified in the meantime, or the return address or MSR are modified, there is no need to reload these registers when returning from interrupt. Introduce per-CPU flags that track the validity of SRR and HSRR registers. These are cleared when returning from interrupt, when using the registers for something else (e.g., OPAL calls), when adjusting the return address or MSR of a context, and when context switching (which changes the return address and MSR). This improves the performance of interrupt returns. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fold in fixup patch from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-5-npiggin@gmail.com
2021-06-21powerpc/xics: Add a native ICS backend for microwattBenjamin Herrenschmidt
This is a simple native ICS backend that matches the layout of the Microwatt implementation of ICS. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> [mpe: Add empty ics_native_init() to unbreak non-microwatt builds] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> fixup-ics Link: https://lore.kernel.org/r/YMwW8cxrwB2W5EUN@thinks.paulus.ozlabs.org
2021-06-10powerpc: Move the use of irq_domain_add_nomap() behind a config optionMarc Zyngier
Only a handful of old PPC systems are still using the old 'nomap' variant of the irqdomain library. Move the associated definitions behind a configuration option, which will allow us to make some more radical changes. Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-06-10powerpc: Convert irq_domain_add_legacy_isa use to irq_domain_add_legacyMarc Zyngier
irq_domain_add_legacy_isa is a pain. It only exists for the benefit of two PPC-specific drivers, and creates an ugly dependency between asm/irq.h and linux/irqdomain.h Instead, let's convert these two drivers to irq_domain_add_legacy(), stop using NUM_ISA_INTERRUPTS by directly setting NR_IRQS_LEGACY. The dependency cannot be broken yet as there is a lot of PPC-related code that depends on it, but that's the first step towards it. A followup patch will remove irq_domain_add_legacy_isa. Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-06-10powerpc: Add missing linux/{of.h,irqdomain.h} include directivesMarc Zyngier
A bunch of PPC files are missing the inclusion of linux/of.h and linux/irqdomain.h, relying on transitive inclusion from another file. As we are about to break this dependency, make sure these dependencies are explicit. Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-30Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: "A few misc subsystems and some of MM. 175 patches. Subsystems affected by this patch series: ia64, kbuild, scripts, sh, ocfs2, kfifo, vfs, kernel/watchdog, and mm (slab-generic, slub, kmemleak, debug, pagecache, msync, gup, memremap, memcg, pagemap, mremap, dma, sparsemem, vmalloc, documentation, kasan, initialization, pagealloc, and memory-failure)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (175 commits) mm/memory-failure: unnecessary amount of unmapping mm/mmzone.h: fix existing kernel-doc comments and link them to core-api mm: page_alloc: ignore init_on_free=1 for debug_pagealloc=1 net: page_pool: use alloc_pages_bulk in refill code path net: page_pool: refactor dma_map into own function page_pool_dma_map SUNRPC: refresh rq_pages using a bulk page allocator SUNRPC: set rq_page_end differently mm/page_alloc: inline __rmqueue_pcplist mm/page_alloc: optimize code layout for __alloc_pages_bulk mm/page_alloc: add an array-based interface to the bulk page allocator mm/page_alloc: add a bulk page allocator mm/page_alloc: rename alloced to allocated mm/page_alloc: duplicate include linux/vmalloc.h mm, page_alloc: avoid page_to_pfn() in move_freepages() mm/Kconfig: remove default DISCONTIGMEM_MANUAL mm: page_alloc: dump migrate-failed pages mm/mempolicy: fix mpol_misplaced kernel-doc mm/mempolicy: rewrite alloc_pages_vma documentation mm/mempolicy: rewrite alloc_pages documentation mm/mempolicy: rename alloc_pages_current to alloc_pages ...
2021-04-30Merge tag 'powerpc-5.13-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Enable KFENCE for 32-bit. - Implement EBPF for 32-bit. - Convert 32-bit to do interrupt entry/exit in C. - Convert 64-bit BookE to do interrupt entry/exit in C. - Changes to our signal handling code to use user_access_begin/end() more extensively. - Add support for time namespaces (CONFIG_TIME_NS) - A series of fixes that allow us to reenable STRICT_KERNEL_RWX. - Other smaller features, fixes & cleanups. Thanks to Alexey Kardashevskiy, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Bixuan Cui, Cédric Le Goater, Chen Huang, Chris Packham, Christophe Leroy, Christopher M. Riedl, Colin Ian King, Dan Carpenter, Daniel Axtens, Daniel Henrique Barboza, David Gibson, Davidlohr Bueso, Denis Efremov, dingsenjie, Dmitry Safonov, Dominic DeMarco, Fabiano Rosas, Ganesh Goudar, Geert Uytterhoeven, Geetika Moolchandani, Greg Kurz, Guenter Roeck, Haren Myneni, He Ying, Jiapeng Chong, Jordan Niethe, Laurent Dufour, Lee Jones, Leonardo Bras, Li Huafei, Madhavan Srinivasan, Mahesh Salgaonkar, Masahiro Yamada, Nathan Chancellor, Nathan Lynch, Nicholas Piggin, Oliver O'Halloran, Paul Menzel, Pu Lehui, Randy Dunlap, Ravi Bangoria, Rosen Penev, Russell Currey, Santosh Sivaraj, Sebastian Andrzej Siewior, Segher Boessenkool, Shivaprasad G Bhat, Srikar Dronamraju, Stephen Rothwell, Thadeu Lima de Souza Cascardo, Thomas Gleixner, Tony Ambardar, Tyrel Datwyler, Vaibhav Jain, Vincenzo Frascino, Xiongwei Song, Yang Li, Yu Kuai, and Zhang Yunkai. * tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (302 commits) powerpc/signal32: Fix erroneous SIGSEGV on RT signal return powerpc: Avoid clang uninitialized warning in __get_user_size_allowed powerpc/papr_scm: Mark nvdimm as unarmed if needed during probe powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n powerpc/kasan: Fix shadow start address with modules powerpc/kernel/iommu: Use largepool as a last resort when !largealloc powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs powerpc/44x: fix spelling mistake in Kconfig "varients" -> "variants" powerpc/iommu: Annotate nested lock for lockdep powerpc/iommu: Do not immediately panic when failed IOMMU table allocation powerpc/iommu: Allocate it_map by vmalloc selftests/powerpc: remove unneeded semicolon powerpc/64s: remove unneeded semicolon powerpc/eeh: remove unneeded semicolon powerpc/selftests: Add selftest to test concurrent perf/ptrace events powerpc/selftests/perf-hwbreak: Add testcases for 2nd DAWR powerpc/selftests/perf-hwbreak: Coalesce event creation code powerpc/selftests/ptrace-hwbreak: Add testcases for 2nd DAWR powerpc/configs: Add IBMVNIC to some 64-bit configs selftests/powerpc: Add uaccess flush test ...
2021-04-30powerpc/xive: remove unnecessary unmap_kernel_rangeNicholas Piggin
iounmap will remove ptes. Link: https://lkml.kernel.org/r/20210322021806.892164-4-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Cédric Le Goater <clg@kaod.org> Cc: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-23powerpc/iommu: Do not immediately panic when failed IOMMU table allocationAlexey Kardashevskiy
Most platforms allocate IOMMU table structures (specifically it_map) at the boot time and when this fails - it is a valid reason for panic(). However the powernv platform allocates it_map after a device is returned to the host OS after being passed through and this happens long after the host OS booted. It is quite possible to trigger the it_map allocation panic() and kill the host even though it is not necessary - the host OS can still use the DMA bypass mode (requires a tiny fraction of it_map's memory) and even if that fails, the host OS is runnnable as it was without the device for which allocating it_map causes the panic. Instead of immediately crashing in a powernv/ioda2 system, this prints an error and continues. All other platforms still call panic(). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Leonardo Bras <leobras.c@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210216033307.69863-3-aik@ozlabs.ru
2021-04-17powerpc/xive: Use the "ibm, chip-id" property only under PowerNVCédric Le Goater
The 'chip_id' field of the XIVE CPU structure is used to choose a target for a source located on the same chip. For that, the XIVE driver queries the chip identifier from the "ibm,chip-id" property and compares it to a 'src_chip' field identifying the chip of a source. This information is only available on the PowerNV platform, 'src_chip' being assigned to XIVE_INVALID_CHIP_ID under pSeries. The "ibm,chip-id" property is also not available on all platforms. It was first introduced on PowerNV and later, under QEMU for pSeries/KVM. However, the property is not part of PAPR and does not exist under pSeries/PowerVM. Assign 'chip_id' to XIVE_INVALID_CHIP_ID by default and let the PowerNV platform override the value with the "ibm,chip-id" property. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210413130352.1183267-1-clg@kaod.org
2021-04-14powerpc/xive: Modernize XIVE-IPI domain with an 'alloc' handlerCédric Le Goater
Instead of calling irq_create_mapping() to map the IPI for a node, introduce an 'alloc' handler. This is usually an extension to support hierarchy irq_domains which is not exactly the case for XIVE-IPI domain. However, we can now use the irq_domain_alloc_irqs() routine which allocates the IRQ descriptor on the specified node, even better for cache performance on multi node machines. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-10-clg@kaod.org
2021-04-14powerpc/xive: Map one IPI interrupt per nodeCédric Le Goater
ipistorm [*] can be used to benchmark the raw interrupt rate of an interrupt controller by measuring the number of IPIs a system can sustain. When applied to the XIVE interrupt controller of POWER9 and POWER10 systems, a significant drop of the interrupt rate can be observed when crossing the second node boundary. This is due to the fact that a single IPI interrupt is used for all CPUs of the system. The structure is shared and the cache line updates impact greatly the traffic between nodes and the overall IPI performance. As a workaround, the impact can be reduced by deactivating the IRQ lockup detector ("noirqdebug") which does a lot of accounting in the Linux IRQ descriptor structure and is responsible for most of the performance penalty. As a fix, this proposal allocates an IPI interrupt per node, to be shared by all CPUs of that node. It solves the scaling issue, the IRQ lockup detector still has an impact but the XIVE interrupt rate scales linearly. It also improves the "noirqdebug" case as showed in the tables below. * P9 DD2.2 - 2s * 64 threads "noirqdebug" Mint/s Mint/s chips cpus IPI/sys IPI/chip IPI/chip IPI/sys -------------------------------------------------------------- 1 0-15 4.984023 4.875405 4.996536 5.048892 0-31 10.879164 10.544040 10.757632 11.037859 0-47 15.345301 14.688764 14.926520 15.310053 0-63 17.064907 17.066812 17.613416 17.874511 2 0-79 11.768764 21.650749 22.689120 22.566508 0-95 10.616812 26.878789 28.434703 28.320324 0-111 10.151693 31.397803 31.771773 32.388122 0-127 9.948502 33.139336 34.875716 35.224548 * P10 DD1 - 4s (not homogeneous) 352 threads "noirqdebug" Mint/s Mint/s chips cpus IPI/sys IPI/chip IPI/chip IPI/sys -------------------------------------------------------------- 1 0-15 2.409402 2.364108 2.383303 2.395091 0-31 6.028325 6.046075 6.089999 6.073750 0-47 8.655178 8.644531 8.712830 8.724702 0-63 11.629652 11.735953 12.088203 12.055979 0-79 14.392321 14.729959 14.986701 14.973073 0-95 12.604158 13.004034 17.528748 17.568095 2 0-111 9.767753 13.719831 19.968606 20.024218 0-127 6.744566 16.418854 22.898066 22.995110 0-143 6.005699 19.174421 25.425622 25.417541 0-159 5.649719 21.938836 27.952662 28.059603 0-175 5.441410 24.109484 31.133915 31.127996 3 0-191 5.318341 24.405322 33.999221 33.775354 0-207 5.191382 26.449769 36.050161 35.867307 0-223 5.102790 29.356943 39.544135 39.508169 0-239 5.035295 31.933051 42.135075 42.071975 0-255 4.969209 34.477367 44.655395 44.757074 4 0-271 4.907652 35.887016 47.080545 47.318537 0-287 4.839581 38.076137 50.464307 50.636219 0-303 4.786031 40.881319 53.478684 53.310759 0-319 4.743750 43.448424 56.388102 55.973969 0-335 4.709936 45.623532 59.400930 58.926857 0-351 4.681413 45.646151 62.035804 61.830057 [*] https://github.com/antonblanchard/ipistorm Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-9-clg@kaod.org
2021-04-14powerpc/xive: Fix xmon command "dxi"Cédric Le Goater
When under xmon, the "dxi" command dumps the state of the XIVE interrupts. If an interrupt number is specified, only the state of the associated XIVE interrupt is dumped. This form of the command lacks an irq_data parameter which is nevertheless used by xmon_xive_get_irq_config(), leading to an xmon crash. Fix that by doing a lookup in the system IRQ mapping to query the IRQ descriptor data. Invalid interrupt numbers, or not belonging to the XIVE IRQ domain, OPAL event interrupt number for instance, should be caught by the previous query done at the firmware level. Fixes: 97ef27507793 ("powerpc/xive: Fix xmon support on the PowerNV platform") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-8-clg@kaod.org
2021-04-14powerpc/xive: Simplify the dump of XIVE interrupts under xmonCédric Le Goater
Move the xmon routine under XIVE subsystem and rework the loop on the interrupts taking into account the xive_irq_domain to filter out IPIs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-7-clg@kaod.org
2021-04-14powerpc/xive: Drop check on irq_data in xive_core_debug_show()Cédric Le Goater
When looping on IRQ descriptor, irq_data is always valid. Fixes: 930914b7d528 ("powerpc/xive: Add a debugfs file to dump internal XIVE state") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-6-clg@kaod.org
2021-04-14powerpc/xive: Simplify xive_core_debug_show()Cédric Le Goater
Now that the IPI interrupt has its own domain, the checks on the HW interrupt number XIVE_IPI_HW_IRQ and on the chip can be replaced by a check on the domain. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-5-clg@kaod.org
2021-04-14powerpc/xive: Remove useless check on XIVE_IPI_HW_IRQCédric Le Goater
The IPI interrupt has its own domain now. Testing the HW interrupt number is not needed anymore. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-4-clg@kaod.org
2021-04-14powerpc/xive: Introduce an IPI interrupt domainCédric Le Goater
The IPI interrupt is a special case of the XIVE IRQ domain. When mapping and unmapping the interrupts in the Linux interrupt number space, the HW interrupt number 0 (XIVE_IPI_HW_IRQ) is checked to distinguish the IPI interrupt from other interrupts of the system. Simplify the XIVE interrupt domain by introducing a specific domain for the IPI. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-3-clg@kaod.org
2021-04-13of: net: pass the dst buffer to of_get_mac_address()Michael Walle
of_get_mac_address() returns a "const void*" pointer to a MAC address. Lately, support to fetch the MAC address by an NVMEM provider was added. But this will only work with platform devices. It will not work with PCI devices (e.g. of an integrated root complex) and esp. not with DSA ports. There is an of_* variant of the nvmem binding which works without devices. The returned data of a nvmem_cell_read() has to be freed after use. On the other hand the return of_get_mac_address() points to some static data without a lifetime. The trick for now, was to allocate a device resource managed buffer which is then returned. This will only work if we have an actual device. Change it, so that the caller of of_get_mac_address() has to supply a buffer where the MAC address is written to. Unfortunately, this will touch all drivers which use the of_get_mac_address(). Usually the code looks like: const char *addr; addr = of_get_mac_address(np); if (!IS_ERR(addr)) ether_addr_copy(ndev->dev_addr, addr); This can then be simply rewritten as: of_get_mac_address(np, ndev->dev_addr); Sometimes is_valid_ether_addr() is used to test the MAC address. of_get_mac_address() already makes sure, it just returns a valid MAC address. Thus we can just test its return code. But we have to be careful if there are still other sources for the MAC address before the of_get_mac_address(). In this case we have to keep the is_valid_ether_addr() call. The following coccinelle patch was used to convert common cases to the new style. Afterwards, I've manually gone over the drivers and fixed the return code variable: either used a new one or if one was already available use that. Mansour Moufid, thanks for that coccinelle patch! <spml> @a@ identifier x; expression y, z; @@ - x = of_get_mac_address(y); + x = of_get_mac_address(y, z); <... - ether_addr_copy(z, x); ...> @@ identifier a.x; @@ - if (<+... x ...+>) {} @@ identifier a.x; @@ if (<+... x ...+>) { ... } - else {} @@ identifier a.x; expression e; @@ - if (<+... x ...+>@e) - {} - else + if (!(e)) {...} @@ expression x, y, z; @@ - x = of_get_mac_address(y, z); + of_get_mac_address(y, z); ... when != x </spml> All drivers, except drivers/net/ethernet/aeroflex/greth.c, were compile-time tested. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29powerpc/fsl-pci: Fix section mismatch warningMichael Ellerman
Section mismatch in reference from the function .fsl_add_bridge() to the function .init.text:.setup_pci_cmd() fsl_add_bridge() is not __init, and can't be, and is the only caller of setup_pci_cmd(). Fix it by making setup_pci_cmd() non-init. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210314093341.132986-1-mpe@ellerman.id.au
2021-03-29powerpc/xive: use true and false for bool variableYang Li
fixed the following coccicheck: ./arch/powerpc/sysdev/xive/spapr.c:552:8-9: WARNING: return of 0/1 in function 'xive_spapr_match' with return type bool Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1615793096-83758-1-git-send-email-yang.lee@linux.alibaba.com
2020-12-11powerpc/xive: Improve error reporting of OPAL callsCédric Le Goater
Introduce a vp_err() macro to standardize error reporting. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-13-clg@kaod.org
2020-12-11powerpc/xive: Simplify xive_do_source_eoi()Cédric Le Goater
Previous patches removed the need of the first argument which was a hack for Firwmware EOI. Remove it and flatten the routine which has became simpler. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-12-clg@kaod.org
2020-12-11powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FWCédric Le Goater
This flag was used to support the P9 DD1 and we have stopped supporting this CPU when DD2 came out. See skiboot commit: https://github.com/open-power/skiboot/commit/0b0d15e3c170 Also, remove eoi handler which is now unused. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-11-clg@kaod.org
2020-12-11powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FWCédric Le Goater
This flag was used to support the PHB4 LSIs on P9 DD1 and we have stopped supporting this CPU when DD2 came out. See skiboot commit: https://github.com/open-power/skiboot/commit/0b0d15e3c170 Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-10-clg@kaod.org
2020-12-11powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUGCédric Le Goater
This flag was used to support the PHB4 LSIs on P9 DD1 and we have stopped supporting this CPU when DD2 came out. See skiboot commit: https://github.com/open-power/skiboot/commit/0b0d15e3c170 Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-9-clg@kaod.org
2020-12-11powerpc/xive: Add a debug_show handler to the XIVE irq_domainCédric Le Goater
Full state of the Linux interrupt descriptors can be dumped under debugfs when compiled with CONFIG_GENERIC_IRQ_DEBUGFS. Add support for the XIVE interrupt controller. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-7-clg@kaod.org
2020-12-11powerpc/xive: Add a name to the IRQ domainCédric Le Goater
We hope one day to handle multiple irq_domain in the XIVE driver. Start simple by setting the name using the DT node. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-6-clg@kaod.org
2020-12-11powerpc/xive: Introduce XIVE_IPI_HW_IRQCédric Le Goater
The XIVE driver deals with CPU IPIs in a peculiar way. Each CPU has its own XIVE IPI interrupt allocated at the HW level, for PowerNV, or at the hypervisor level for pSeries. In practice, these interrupts are not always used. pSeries/PowerVM prefers local doorbells for local threads since they are faster. On PowerNV, global doorbells are also preferred for the same reason. The mapping in the Linux is reduced to a single interrupt using HW interrupt number 0 and a custom irq_chip to handle EOI. This can cause performance issues in some benchmark (ipistorm) on multichip systems. Clarify the use of the 0 value, it will help in improving multichip support. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-4-clg@kaod.org
2020-12-11powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flagCédric Le Goater
This is a simple cleanup to identify easily all flags of the XIVE interrupt structure. The interrupts flagged with XIVE_IRQ_FLAG_NO_EOI are the escalations used to wake up vCPUs in KVM. They are handled very differently from the rest. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-3-clg@kaod.org
2020-11-19powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe()Qinglang Miao
I noticed that iounmap() of msgr_block_addr before return from mpic_msgr_probe() in the error handling case is missing. So use devm_ioremap() instead of just ioremap() when remapping the message register block, so the mapping will be automatically released on probe failure. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201028091551.136400-1-miaoqinglang@huawei.com
2020-09-18powerpc/xive: Make debug routines staticCédric Le Goater
This fixes a compile error with W=1. CC arch/powerpc/sysdev/xive/common.o ../arch/powerpc/sysdev/xive/common.c:1568:6: error: no previous prototype for ‘xive_debug_show_cpu’ [-Werror=missing-prototypes] void xive_debug_show_cpu(struct seq_file *m, int cpu) ^~~~~~~~~~~~~~~~~~~ ../arch/powerpc/sysdev/xive/common.c:1602:6: error: no previous prototype for ‘xive_debug_show_irq’ [-Werror=missing-prototypes] void xive_debug_show_irq(struct seq_file *m, u32 hw_irq, struct irq_data *d) ^~~~~~~~~~~~~~~~~~~ Fixes: 930914b7d528 ("powerpc/xive: Add a debugfs file to dump internal XIVE state") Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200914211007.2285999-5-clg@kaod.org
2020-08-25powerpc/icp-hv: Fix missing of_node_put() in success pathNicholas Mc Guire
Both of_find_compatible_node() and of_find_node_by_type() will return a refcounted node on success - thus for the success path the node must be explicitly released with a of_node_put(). Fixes: 0b05ac6e2480 ("powerpc/xics: Rewrite XICS driver") Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1530691407-3991-1-git-send-email-hofrat@osadl.org
2020-07-30powerpc: fix function annotations to avoid section mismatch warnings with gcc-10Vladis Dronov
Certain warnings are emitted for powerpc code when building with a gcc-10 toolset: WARNING: modpost: vmlinux.o(.text.unlikely+0x377c): Section mismatch in reference from the function remove_pmd_table() to the function .meminit.text:split_kernel_mapping() The function remove_pmd_table() references the function __meminit split_kernel_mapping(). This is often because remove_pmd_table lacks a __meminit annotation or the annotation of split_kernel_mapping is wrong. Add the appropriate __init and __meminit annotations to make modpost not complain. In all the cases there are just a single callsite from another __init or __meminit function: __meminit remove_pagetable() -> remove_pud_table() -> remove_pmd_table() __init prom_init() -> setup_secure_guest() __init xive_spapr_init() -> xive_spapr_disabled() Signed-off-by: Vladis Dronov <vdronov@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200729133741.62789-1-vdronov@redhat.com
2020-06-22powerpc/xive: Ignore kmemleak false positivesAlexey Kardashevskiy
xive_native_provision_pages() allocates memory and passes the pointer to OPAL so kmemleak cannot find the pointer usage in the kernel memory and produces a false positive report (below) (even if the kernel did scan OPAL memory, it is unable to deal with __pa() addresses anyway). This silences the warning. unreferenced object 0xc000200350c40000 (size 65536): comm "qemu-system-ppc", pid 2725, jiffies 4294946414 (age 70776.530s) hex dump (first 32 bytes): 02 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00 ....P........... 01 00 08 07 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000081ff046c>] xive_native_alloc_vp_block+0x120/0x250 [<00000000d555d524>] kvmppc_xive_compute_vp_id+0x248/0x350 [kvm] [<00000000d69b9c9f>] kvmppc_xive_connect_vcpu+0xc0/0x520 [kvm] [<000000006acbc81c>] kvm_arch_vcpu_ioctl+0x308/0x580 [kvm] [<0000000089c69580>] kvm_vcpu_ioctl+0x19c/0xae0 [kvm] [<00000000902ae91e>] ksys_ioctl+0x184/0x1b0 [<00000000f3e68bd7>] sys_ioctl+0x48/0xb0 [<0000000001b2c127>] system_call_exception+0x124/0x1f0 [<00000000d2b2ee40>] system_call_common+0xe8/0x214 Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200612043303.84894-1-aik@ozlabs.ru
2020-06-18maccess: rename probe_kernel_address to get_kernel_nofaultChristoph Hellwig
Better describe what this helper does, and match the naming of copy_from_kernel_nofault. Also switch the argument order around, so that it acts and looks like get_user(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-17maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofaultChristoph Hellwig
Better describe what these functions do. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: reorder includes after introduction of linux/pgtable.hMike Rapoport
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: introduce include/linux/pgtable.hMike Rapoport
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: don't include asm/pgtable.h if linux/mm.h is already includedMike Rapoport
Patch series "mm: consolidate definitions of page table accessors", v2. The low level page table accessors (pXY_index(), pXY_offset()) are duplicated across all architectures and sometimes more than once. For instance, we have 31 definition of pgd_offset() for 25 supported architectures. Most of these definitions are actually identical and typically it boils down to, e.g. static inline unsigned long pmd_index(unsigned long address) { return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1); } static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address) { return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address); } These definitions can be shared among 90% of the arches provided XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined. For architectures that really need a custom version there is always possibility to override the generic version with the usual ifdefs magic. These patches introduce include/linux/pgtable.h that replaces include/asm-generic/pgtable.h and add the definitions of the page table accessors to the new header. This patch (of 12): The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the functions involving page table manipulations, e.g. pte_alloc() and pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h> in the files that include <linux/mm.h>. The include statements in such cases are remove with a simple loop: for f in $(git grep -l "include <linux/mm.h>") ; do sed -i -e '/include <asm\/pgtable.h>/ d' $f done Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02powerpc/rtas: Implement reentrant rtas callLeonardo Bras
Implement rtas_call_reentrant() for reentrant rtas-calls: "ibm,int-on", "ibm,int-off",ibm,get-xive" and "ibm,set-xive". On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4, items 2 and 3 say: 2 - For the PowerPC External Interrupt option: The * call must be reentrant to the number of processors on the platform. 3 - For the PowerPC External Interrupt option: The * argument call buffer for each simultaneous call must be physically unique. So, these rtas-calls can be called in a lockless way, if using a different buffer for each cpu doing such rtas call. For this, it was suggested to add the buffer (struct rtas_args) in the PACA struct, so each cpu can have it's own buffer. The PACA struct received a pointer to rtas buffer, which is allocated in the memory range available to rtas 32-bit. Reentrant rtas calls are useful to avoid deadlocks in crashing, where rtas-calls are needed, but some other thread crashed holding the rtas.lock. This is a backtrace of a deadlock from a kdump testing environment: #0 arch_spin_lock #1 lock_rtas () #2 rtas_call (token=8204, nargs=1, nret=1, outputs=0x0) #3 ics_rtas_mask_real_irq (hw_irq=4100) #4 machine_kexec_mask_interrupts #5 default_machine_crash_shutdown #6 machine_crash_shutdown #7 __crash_kexec #8 crash_kexec #9 oops_end Signed-off-by: Leonardo Bras <leobras.c@gmail.com> [mpe: Move under #ifdef PSERIES to avoid build breakage] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200518234245.200672-3-leobras.c@gmail.com
2020-05-28powerpc/xive: Share the event-queue page with the Hypervisor.Ram Pai
XIVE interrupt controller uses an Event Queue (EQ) to enqueue event notifications when an exception occurs. The EQ is a single memory page provided by the O/S defining a circular buffer, one per server and priority couple. On baremetal, the EQ page is configured with an OPAL call. On pseries, an extra hop is necessary and the guest OS uses the hcall H_INT_SET_QUEUE_CONFIG to configure the XIVE interrupt controller. The XIVE controller being Hypervisor privileged, it will not be allowed to enqueue event notifications for a Secure VM unless the EQ pages are shared by the Secure VM. Hypervisor/Ultravisor still requires support for the TIMA and ESB page fault handlers. Until this is complete, QEMU can use the emulated XIVE device for Secure VMs, option "kernel_irqchip=off" on the QEMU pseries machine. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Cedric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200426020518.GC5853@oc0525413822.ibm.com
2020-05-28powerpc: Remove Xilinx PPC405/PPC440 supportMichal Simek
The latest Xilinx design tools called ISE and EDK has been released in October 2013. New tool doesn't support any PPC405/PPC440 new designs. These platforms are no longer supported and tested. PowerPC 405/440 port is orphan from 2013 by commit cdeb89943bfc ("MAINTAINERS: Fix incorrect status tag") and commit 19624236cce1 ("MAINTAINERS: Update Grant's email address and maintainership") that's why it is time to remove the support fot these platforms. Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8c593895e2cb57d232d85ce4d8c3a1aa7f0869cc.1590079968.git.christophe.leroy@csgroup.eu
2020-05-28powerpc/xive: Do not expose a debugfs file when XIVE is disabledCédric Le Goater
The XIVE interrupt mode can be disabled with the "xive=off" kernel parameter, in which case there is nothing to present to the user in the associated /sys/kernel/debug/powerpc/xive file. Fixes: 930914b7d528 ("powerpc/xive: Add a debugfs file to dump internal XIVE state") Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200429075122.1216388-4-clg@kaod.org
2020-05-26powerpc/xive: Clear the page tables for the ESB IO mappingCédric Le Goater
Commit 1ca3dec2b2df ("powerpc/xive: Prevent page fault issues in the machine crash handler") fixed an issue in the FW assisted dump of machines using hash MMU and the XIVE interrupt mode under the POWER hypervisor. It forced the mapping of the ESB page of interrupts being mapped in the Linux IRQ number space to make sure the 'crash kexec' sequence worked during such an event. But it didn't handle the un-mapping. This mapping is now blocking the removal of a passthrough IO adapter under the POWER hypervisor because it expects the guest OS to have cleared all page table entries related to the adapter. If some are still present, the RTAS call which isolates the PCI slot returns error 9001 "valid outstanding translations". Remove these mapping in the IRQ data cleanup routine. Under KVM, this cleanup is not required because the ESB pages for the adapter interrupts are un-mapped from the guest by the hypervisor in the KVM XIVE native device. This is now redundant but it's harmless. Fixes: 1ca3dec2b2df ("powerpc/xive: Prevent page fault issues in the machine crash handler") Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200429075122.1216388-2-clg@kaod.org
2020-05-26Merge "Use hugepages to map kernel mem on 8xx" into nextMichael Ellerman
Merge Christophe's large series to use huge pages for the linear mapping on 8xx. From his cover letter: The main purpose of this big series is to: - reorganise huge page handling to avoid using mm_slices. - use huge pages to map kernel memory on the 8xx. The 8xx supports 4 page sizes: 4k, 16k, 512k and 8M. It uses 2 Level page tables, PGD having 1024 entries, each entry covering 4M address space. Then each page table has 1024 entries. At the time being, page sizes are managed in PGD entries, implying the use of mm_slices as it can't mix several pages of the same size in one page table. The first purpose of this series is to reorganise things so that standard page tables can also handle 512k pages. This is done by adding a new _PAGE_HUGE flag which will be copied into the Level 1 entry in the TLB miss handler. That done, we have 2 types of pages: - PGD entries to regular page tables handling 4k/16k and 512k pages - PGD entries to hugepd tables handling 8M pages. There is no need to mix 8M pages with other sizes, because a 8M page will use more than what a single PGD covers. Then comes the second purpose of this series. At the time being, the 8xx has implemented special handling in the TLB miss handlers in order to transparently map kernel linear address space and the IMMR using huge pages by building the TLB entries in assembly at the time of the exception. As mm_slices is only for user space pages, and also because it would anyway not be convenient to slice kernel address space, it was not possible to use huge pages for kernel address space. But after step one of the series, it is now more flexible to use huge pages. This series drop all assembly 'just in time' handling of huge pages and use huge pages in page tables instead. Once the above is done, then comes icing on the cake: - Use huge pages for KASAN shadow mapping - Allow pinned TLBs with strict kernel rwx - Allow pinned TLBs with debug pagealloc Then, last but not least, those modifications for the 8xx allows the following improvement on book3s/32: - Mapping KASAN shadow with BATs - Allowing BATs with debug pagealloc All this allows to considerably simplify TLB miss handlers and associated initialisation. The overhead of reading page tables is negligible compared to the reduction of the miss handlers. While we were at touching pte_update(), some cleanup was done there too. Tested widely on 8xx and 832x. Boot tested on QEMU MAC99.
2020-05-26powerpc/8xx: Don't set IMMR map anymore at bootChristophe Leroy
Only early debug requires IMMR to be mapped early. No need to set it up and pin it in assembly. Map it through page tables at udbg init when necessary. If CONFIG_PIN_TLB_IMMR is selected, pin it once we don't need the 32 Mb pinned RAM anymore. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/13c1e8539fdf363d3146f4884e5c3c76c6c308b5.1589866984.git.christophe.leroy@csgroup.eu