summaryrefslogtreecommitdiff
path: root/arch/powerpc/include/asm/fadump-internal.h
AgeCommit message (Collapse)Author
2024-05-10powerpc/fadump: setup additional parameters for dump capture kernelHari Bathini
For fadump case, passing additional parameters to dump capture kernel helps in minimizing the memory footprint for it and also provides the flexibility to disable components/modules, like hugepages, that are hindering the boot process of the special dump capture environment. Set up a dedicated parameter area to be passed to the capture kernel. This area type is defined as RTAS_FADUMP_PARAM_AREA. Sysfs attribute '/sys/kernel/fadump/bootargs_append' is exported to the userspace to specify the additional parameters to be passed to the capture kernel Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240509115755.519982-3-hbathini@linux.ibm.com
2024-05-10powerpc/pseries/fadump: add support for multiple boot memory regionsHari Bathini
Currently, fadump on pseries assumes a single boot memory region even though f/w supports more than one boot memory region. Add support for more boot memory regions to make the implementation flexible for any enhancements that introduce other region types. For this, rtas memory structure for fadump is updated to have multiple boot memory regions instead of just one. Additionally, methods responsible for creating the fadump memory structure during both the first and second kernel boot have been modified to take these multiple boot memory regions into account. Also, a new callback has been added to the fadump_ops structure to get the maximum boot memory regions supported by the platform. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240509115755.519982-2-hbathini@linux.ibm.com
2024-04-29powerpc: make fadump resilient with memory add/remove eventsSourabh Jain
Due to changes in memory resources caused by either memory hotplug or online/offline events, the elfcorehdr, which describes the CPUs and memory of the crashed kernel to the kernel that collects the dump (known as second/fadump kernel), becomes outdated. Consequently, attempting dump collection with an outdated elfcorehdr can lead to failed or inaccurate dump collection. Memory hotplug or online/offline events is referred as memory add/remove events in reset of the commit message. The current solution to address the aforementioned issue is as follows: Monitor memory add/remove events in userspace using udev rules, and re-register fadump whenever there are changes in memory resources. This leads to the creation of a new elfcorehdr with updated system memory information. There are several notable issues associated with re-registering fadump for every memory add/remove events. 1. Bulk memory add/remove events with udev-based fadump re-registration can lead to race conditions and, more importantly, it creates a wide window during which fadump is inactive until all memory add/remove events are settled. 2. Re-registering fadump for every memory add/remove event is inefficient. 3. The memory for elfcorehdr is allocated based on the memblock regions available during early boot and remains fixed thereafter. However, if elfcorehdr is later recreated with additional memblock regions, its size will increase, potentially leading to memory corruption. Address the aforementioned challenges by shifting the creation of elfcorehdr from the first kernel (also referred as the crashed kernel), where it was created and frequently recreated for every memory add/remove event, to the fadump kernel. As a result, the elfcorehdr only needs to be created once, thus eliminating the necessity to re-register fadump during memory add/remove events. At present, the first kernel prepares fadump header and stores it in the fadump reserved area. The fadump header includes the start address of the elfcorehdr, crashing CPU details, and other relevant information. In the event of a crash in the first kernel, the second/fadump boots and accesses the fadump header prepared by the first kernel. It then performs the following steps in a platform-specific function [rtas|opal]_fadump_process: 1. Sanity check for fadump header 2. Update CPU notes in elfcorehdr Along with the above, update the setup_fadump()/fadump.c to create elfcorehdr and set its address to the global variable elfcorehdr_addr for the vmcore module to process it in the second/fadump kernel. Section below outlines the information required to create the elfcorehdr and the changes made to make it available to the fadump kernel if it's not already. To create elfcorehdr, the following crashed kernel information is required: CPU notes, vmcoreinfo, and memory ranges. At present, the CPU notes are already prepared in the fadump kernel, so no changes are needed in that regard. The fadump kernel has access to all crashed kernel memory regions, including boot memory regions that are relocated by firmware to fadump reserved areas, so no changes for that either. However, it is necessary to add new members to the fadump header, i.e., the 'fadump_crash_info_header' structure, in order to pass the crashed kernel's vmcoreinfo address and its size to fadump kernel. In addition to the vmcoreinfo address and size, there are a few other attributes also added to the fadump_crash_info_header structure. 1. version: It stores the fadump header version, which is currently set to 1. This provides flexibility to update the fadump crash info header in the future without changing the magic number. For each change in the fadump header, the version will be increased. This will help the updated kernel determine how to handle kernel dumps from older kernels. The magic number remains relevant for checking fadump header corruption. 2. pt_regs_sz/cpu_mask_sz: Store size of pt_regs and cpu_mask structure of first kernel. These attributes are used to prevent dump processing if the sizes of pt_regs or cpu_mask structure differ between the first and fadump kernels. Note: if either first/crashed kernel or second/fadump kernel do not have the changes introduced here then kernel fail to collect the dump and prints relevant error message on the console. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240422195932.1583833-2-sourabhjain@linux.ibm.com
2022-04-26powerpc/fadump: save CPU reg data in vmcore when PHYP terminates LPARHari Bathini
An LPAR can be terminated by the POWER Hypervisor (PHYP) for various reasons. If FADump was configured when PHYP terminates the LPAR, platform-assisted dump is initiated to save the kernel dump. But CPU register data would not be processed/saved in the vmcore in such case because CPU mask is set in crash_fadump() at the time of kernel crash and it remains unset in this case with LPAR being terminated by PHYP abruptly. To get around the problem, initialize cpu_mask to cpu_possible_mask so as to ensure all possible CPUs' register data is processed for the vmcore generated on PHYP terminated LPAR. Also, rename the crash info member variable from online_mask to cpu_mask as it doesn't necessarily have to be online CPU mask always. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220404182137.59231-1-hbathini@linux.ibm.com
2022-03-22cma: factor out minimum alignment requirementDavid Hildenbrand
Patch series "mm: enforce pageblock_order < MAX_ORDER". Having pageblock_order >= MAX_ORDER seems to be able to happen in corner cases and some parts of the kernel are not prepared for it. For example, Aneesh has shown [1] that such kernels can be compiled on ppc64 with 64k base pages by setting FORCE_MAX_ZONEORDER=8, which will run into a WARN_ON_ONCE(order >= MAX_ORDER) in comapction code right during boot. We can get pageblock_order >= MAX_ORDER when the default hugetlb size is bigger than the maximum allocation granularity of the buddy, in which case we are no longer talking about huge pages but instead gigantic pages. Having pageblock_order >= MAX_ORDER can only make alloc_contig_range() of such gigantic pages more likely to succeed. Reliable use of gigantic pages either requires boot time allcoation or CMA, no need to overcomplicate some places in the kernel to optimize for corner cases that are broken in other areas of the kernel. This patch (of 2): Let's enforce pageblock_order < MAX_ORDER and simplify. Especially patch #1 can be regarded a cleanup before: [PATCH v5 0/6] Use pageblock_order for cma and alloc_contig_range alignment. [2] [1] https://lkml.kernel.org/r/87r189a2ks.fsf@linux.ibm.com [2] https://lkml.kernel.org/r/20220211164135.1803616-1-zi.yan@sent.com Link: https://lkml.kernel.org/r/20220214174132.219303-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Rob Herring <robh@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: John Garry via iommu <iommu@lists.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-23powerpc/kernel: Add __init attribute to eligible functionsNick Child
Some functions defined in `arch/powerpc/kernel` (and one in `arch/powerpc/ kexec`) are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-2-nick.child@ibm.com
2020-05-04powerpc/fadump: use static allocation for reserved memory rangesHari Bathini
At times, memory ranges have to be looked up during early boot, when kernel couldn't be initialized for dynamic memory allocation. In fact, reserved-ranges look up is needed during FADump memory reservation. Without accounting for reserved-ranges in reserving memory for FADump, MPIPL boot fails with memory corruption issues. So, extend memory ranges handling to support static allocation and populate reserved memory ranges during early boot. Fixes: dda9dbfeeb7a ("powerpc/fadump: consider reserved ranges while releasing memory") Cc: stable@vger.kernel.org Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/158737294432.26700.4830263187856221314.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: support holes in kernel boot memory areaHari Bathini
With support to copy multiple kernel boot memory regions owing to copy size limitation, also handle holes in the memory area to be preserved. Support as many as 128 kernel boot memory regions. This allows having an adequate FADump capture kernel size for different scenarios. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821385448.5656.6124791213910877759.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: remove RMA_START and RMA_END macrosHari Bathini
RMA_START is defined as '0' and there is even a BUILD_BUG_ON() to make sure it is never anything else. Remove this macro and use '0' instead as code change is needed anyway when it has to be something else. Also, remove unused RMA_END macro. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821384096.5656.15026984053970204652.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: consider f/w load areaHari Bathini
OPAL loads kernel & initrd at 512MB offset (256MB size), also exported as ibm,opal/dump/fw-load-area. So, if boot memory size of FADump is less than 768MB, kernel memory to be exported as '/proc/vmcore' would be overwritten by f/w while loading kernel & initrd. To avoid such a scenario, enforce a minimum boot memory size of 768MB on OPAL platform and skip using FADump if a newer F/W version loads kernel & initrd above 768MB. Also, irrespective of RMA size, set the minimum boot memory size expected on pseries platform at 320MB. This is to avoid inflating the minimum memory requirements on systems with 512M/1024M RMA size. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821381414.5656.1592867278535469652.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: add support to preserve crash data on FADUMP disabled kernelHari Bathini
Add a new kernel config option, CONFIG_PRESERVE_FA_DUMP that ensures that crash data, from previously crash'ed kernel, is preserved. This helps in cases where FADump is not enabled but the subsequent memory preserving kernel boot is likely to process this crash data. One typical usecase for this config option is petitboot kernel. As OPAL allows registering address with it in the first kernel and retrieving it after MPIPL, use it to store the top of boot memory. A kernel that intends to preserve crash data retrieves it and avoids using memory beyond this address. Move arch_reserved_kernel_pages() function as it is needed for both FA_DUMP and PRESERVE_FA_DUMP configurations. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821375751.5656.11459483669542541602.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: make crash memory ranges array allocation genericHari Bathini
Make allocate_crash_memory_ranges() and free_crash_memory_ranges() functions generic to reuse them for memory management of all types of dynamic memory range arrays. This change helps in memory management of reserved ranges array to be added later. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821369863.5656.4375667005352155892.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: process architected register state data provided by firmwareHari Bathini
Firmware provides architected register state data at the time of crash. Process this data and build CPU notes to append to ELF core. In case this data is missing or in unsupported format, at least append crashing CPU's register data, to have something to work with in the vmcore file. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821367702.5656.5546683836236508389.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: reset metadata address during clean upHari Bathini
During kexec boot, metadata address needs to be reset to avoid running into errors interpreting stale metadata address, in case the kexec'ed kernel crashes before metadata address could be setup again. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821346629.5656.10783321582005237813.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: register kernel metadata address with opalHari Bathini
OPAL allows registering address with it in the first kernel and retrieving it after MPIPL. Setup kernel metadata and register its address with OPAL to use it for processing the crash dump. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821345011.5656.13567765019032928471.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: add fadump support on powernvHari Bathini
Add basic callback functions for FADump on PowerNV platform. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821342072.5656.4346362203141486452.stgit@hbathini.in.ibm.com
2019-09-14pseries/fadump: define RTAS register/un-register callback functionsHari Bathini
Move platform specific register/un-register code, the RTAS calls, to register/un-register callback functions. This would also mean moving code that initializes and prints the platform specific FADump data. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821332856.5656.16380417702046411631.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: introduce callbacks for platform specific operationsHari Bathini
Introduce callback functions for platform specific operations like register, unregister, invalidate & such. Also, define place-holders for the same on pSeries platform. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821330286.5656.15538934400074110770.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: move rtas specific definitions to platform codeHari Bathini
Currently, FADump is only supported on pSeries but that is going to change soon with FADump support being added on PowerNV platform. So, move rtas specific definitions to platform code to allow FADump to have multiple platforms support. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821328494.5656.16219929140866195511.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: declare helper functions in internal header fileHari Bathini
Declare helper functions, that can be reused by multiple platforms, in the internal header file. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821320487.5656.2660730464212209984.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: add helper functionsHari Bathini
Add helper functions to setup & free CPU notes buffer and to find if a given memory area is contiguous. Also, use boolean as return type for the function that finds if boot memory area is contiguous. While at it, save the virtual address of CPU notes buffer instead of physical address as virtual address is used often. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821318971.5656.9281936950510635858.stgit@hbathini.in.ibm.com
2019-09-14powerpc/fadump: move internal macros/definitions to a new headerHari Bathini
Though asm/fadump.h is meant to be used by other components dealing with FADump, it also has macros/definitions internal to FADump code. Move them to a new header file used within FADump code. This also makes way for refactoring platform specific FADump code. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821313134.5656.6597770626574392140.stgit@hbathini.in.ibm.com