summaryrefslogtreecommitdiff
path: root/arch/x86/xen/setup.c
AgeCommit message (Collapse)Author
2019-02-18x86/xen: dont add memory above max allowed allocationJuergen Gross
Don't allow memory to be added above the allowed maximum allocation limit set by Xen. Trying to do so would result in cases like the following: [ 584.559652] ------------[ cut here ]------------ [ 584.564897] WARNING: CPU: 2 PID: 1 at ../arch/x86/xen/multicalls.c:129 xen_alloc_pte+0x1c7/0x390() [ 584.575151] Modules linked in: [ 584.578643] Supported: Yes [ 584.581750] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.4.120-92.70-default #1 [ 584.590000] Hardware name: Cisco Systems Inc UCSC-C460-M4/UCSC-C460-M4, BIOS C460M4.4.0.1b.0.0629181419 06/29/2018 [ 584.601862] 0000000000000000 ffffffff813175a0 0000000000000000 ffffffff8184777c [ 584.610200] ffffffff8107f4e1 ffff880487eb7000 ffff8801862b79c0 ffff88048608d290 [ 584.618537] 0000000000487eb7 ffffea0000000201 ffffffff81009de7 ffffffff81068561 [ 584.626876] Call Trace: [ 584.629699] [<ffffffff81019ad9>] dump_trace+0x59/0x340 [ 584.635645] [<ffffffff81019eaa>] show_stack_log_lvl+0xea/0x170 [ 584.642391] [<ffffffff8101ac51>] show_stack+0x21/0x40 [ 584.648238] [<ffffffff813175a0>] dump_stack+0x5c/0x7c [ 584.654085] [<ffffffff8107f4e1>] warn_slowpath_common+0x81/0xb0 [ 584.660932] [<ffffffff81009de7>] xen_alloc_pte+0x1c7/0x390 [ 584.667289] [<ffffffff810647f0>] pmd_populate_kernel.constprop.6+0x40/0x80 [ 584.675241] [<ffffffff815ecfe8>] phys_pmd_init+0x210/0x255 [ 584.681587] [<ffffffff815ed207>] phys_pud_init+0x1da/0x247 [ 584.687931] [<ffffffff815edb3b>] kernel_physical_mapping_init+0xf5/0x1d4 [ 584.695682] [<ffffffff815e9bdd>] init_memory_mapping+0x18d/0x380 [ 584.702631] [<ffffffff81064699>] arch_add_memory+0x59/0xf0 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-12-03x86: Fix various typos in commentsIngo Molnar
Go over arch/x86/ and fix common typos in comments, and a typo in an actual function argument name. No change in functionality intended. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-29Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"Igor Druzhinin
This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f. That commit unintentionally broke Xen balloon memory hotplug with "hotplug_unpopulated" set to 1. As long as "System RAM" resource got assigned under a new "Unusable memory" resource in IO/Mem tree any attempt to online this memory would fail due to general kernel restrictions on having "System RAM" resources as 1st level only. The original issue that commit has tried to workaround fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to avoid conflict") which made the original fix to Xen ballooning unnecessary. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-08-20x86/xen: remove unused function xen_auto_xlated_memory_setup()Juergen Gross
xen_auto_xlated_memory_setup() is a leftover from PVH V1. Remove it. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-12-22Merge tag 'for-linus-4.15-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "This contains two fixes for running under Xen: - a fix avoiding resource conflicts between adding mmio areas and memory hotplug - a fix setting NX bits in page table entries copied from Xen when running a PV guest" * tag 'for-linus-4.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: Mark unallocated host memory as UNUSABLE x86-64/Xen: eliminate W+X mappings
2017-12-20xen/balloon: Mark unallocated host memory as UNUSABLEBoris Ostrovsky
Commit f5775e0b6116 ("x86/xen: discard RAM regions above the maximum reservation") left host memory not assigned to dom0 as available for memory hotplug. Unfortunately this also meant that those regions could be used by others. Specifically, commit fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") may try to map those addresses as MMIO. To prevent this mark unallocated host memory as E820_TYPE_UNUSABLE (thus effectively reverting f5775e0b6116) and keep track of that region as a hostmem resource that can be used for the hotplug. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-31xen: remove tests for pvh mode in pure pv pathsJuergen Gross
Remove the last tests for XENFEAT_auto_translated_physmap in pure PV-domain specific paths. PVH V1 is gone and the feature will always be "false" in PV guests. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-07-03x86: xen: remove unnecessary variable in xen_foreach_remap_area()Gustavo A. R. Silva
Remove unnecessary variable mfn in function xen_foreach_remap_area() and, refactor the code. Variable mfn at line 518:mfn = xen_remap_buf.mfns[i]; is only being used to store a value to be passed as an argument to the xen_update_mem_tables() function. This value can be passed directly, which makes variable mfn unnecessary. Also, value assigned to variable mfn at line 534:mfn = xen_remap_mfn; is never used. Addresses-Coverity-ID: 1260110 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2017-01-28x86/boot/e820: Simplify the e820__update_table() interfaceIngo Molnar
The e820__update_table() parameters are pretty complex: arch/x86/include/asm/e820/api.h:extern int e820__update_table(struct e820_entry *biosmap, int max_nr_map, u32 *pnr_map); But 90% of the usage is trivial: arch/x86/kernel/e820.c: if (e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries)) arch/x86/kernel/e820.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/kernel/e820.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/kernel/e820.c: if (e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries) < 0) arch/x86/kernel/e820.c: e820__update_table(boot_params.e820_table, ARRAY_SIZE(boot_params.e820_table), &new_nr); arch/x86/kernel/early-quirks.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/kernel/setup.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/kernel/setup.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/platform/efi/efi.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/xen/setup.c: e820__update_table(xen_e820_table.entries, ARRAY_SIZE(xen_e820_table.entries), arch/x86/xen/setup.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/xen/setup.c: e820__update_table(xen_e820_table.entries, ARRAY_SIZE(xen_e820_table.entries), as it only uses an exiting struct e820_table's entries array, its size and its current number of entries as input and output arguments. Only one use is non-trivial: arch/x86/kernel/e820.c: e820__update_table(boot_params.e820_table, ARRAY_SIZE(boot_params.e820_table), &new_nr); ... which call updates the E820 table in the zeropage in-situ, and the layout there does not match that of 'struct e820_table' (in particular nr_entries is at a different offset, hardcoded by the boot protocol). Simplify all this by introducing a low level __e820__update_table() API that the zeropage update call can use, and simplifying the main e820__update_table() call signature down to: int e820__update_table(struct e820_table *table); This visibly simplifies all the call sites: arch/x86/include/asm/e820/api.h:extern int e820__update_table(struct e820_table *table); arch/x86/include/asm/e820/types.h: * call to e820__update_table() to remove duplicates. The allowance arch/x86/kernel/e820.c: * The return value from e820__update_table() is zero if it arch/x86/kernel/e820.c:int __init e820__update_table(struct e820_table *table) arch/x86/kernel/e820.c: if (e820__update_table(e820_table)) arch/x86/kernel/e820.c: e820__update_table(e820_table_firmware); arch/x86/kernel/e820.c: e820__update_table(e820_table); arch/x86/kernel/e820.c: e820__update_table(e820_table); arch/x86/kernel/e820.c: if (e820__update_table(e820_table) < 0) arch/x86/kernel/early-quirks.c: e820__update_table(e820_table); arch/x86/kernel/setup.c: e820__update_table(e820_table); arch/x86/kernel/setup.c: e820__update_table(e820_table); arch/x86/platform/efi/efi.c: e820__update_table(e820_table); arch/x86/xen/setup.c: e820__update_table(&xen_e820_table); arch/x86/xen/setup.c: e820__update_table(e820_table); arch/x86/xen/setup.c: e820__update_table(&xen_e820_table); No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28xen, x86/boot/e820: Simplify Xen's xen_e820_table constructIngo Molnar
The Xen guest memory setup code has: static struct e820_entry xen_e820_table[E820_MAX_ENTRIES] __initdata; static u32 xen_e820_table_entries __initdata; ... which is really a 'struct e820_table', open-coded. Convert the Xen code over to use a single struct e820_table, as this will allow the simplification of the e820__update_table() API. No intended change in functionality, but not runtime tested. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Clean up the E820 table size define namesIngo Molnar
We've got a number of defines related to the E820 table and its size: E820MAP E820NR E820_X_MAX E820MAX The first two denote byte offsets into the zeropage (struct boot_params), and can are not used in the kernel and can be removed. The E820_*_MAX values have an inconsistent structure and it's unclear in any case what they mean. 'X' presuably goes for extended - but it's not very expressive altogether. Change these over to: E820_MAX_ENTRIES_ZEROPAGE E820_MAX_ENTRIES ... which are self-explanatory names. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Prefix the E820_* type names with "E820_TYPE_"Ingo Molnar
So there's a number of constants that start with "E820" but which are not types - these create a confusing mixture when seen together with 'enum e820_type' values: E820MAP E820NR E820_X_MAX E820MAX To better differentiate the 'enum e820_type' values prefix them with E820_TYPE_. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Create coherent API function names for E820 range operationsIngo Molnar
We have these three related functions: extern void e820_add_region(u64 start, u64 size, int type); extern u64 e820_update_range(u64 start, u64 size, unsigned old_type, unsigned new_type); extern u64 e820_remove_range(u64 start, u64 size, unsigned old_type, int checktype); But it's not clear from the naming that they are 3 operations based around the same 'memory range' concept. Rename them to better signal this, and move the prototypes next to each other: extern void e820__range_add (u64 start, u64 size, int type); extern u64 e820__range_update(u64 start, u64 size, unsigned old_type, unsigned new_type); extern u64 e820__range_remove(u64 start, u64 size, unsigned old_type, int checktype); Note that this improved organization of the functions shows another problem that was easy to miss before: sometimes the E820 entry type is 'int', sometimes 'unsigned int' - but this will be fixed in a separate patch. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename sanitize_e820_table() to e820__update_table()Ingo Molnar
sanitize_e820_table() is a minor misnomer in that it suggests that the E820 table requires sanitizing - which implies that it will only do anything if the E820 table is irregular (not sane). That is wrong, because sanitize_e820_table() also does a very regular sorting of the E820 table, which is a necessity in the basic append-only flow of E820 updates the kernel is allowed to perform to it. So rename it to e820__update_table() to include that purpose as well. This also lines up all the table-update functions into a coherent naming family: int e820__update_table(struct e820_entry *biosmap, int max_nr_map, u32 *pnr_map); void e820__update_table_print(void); void e820__update_table_firmware(void); No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Harmonize the 'struct e820_table' fieldsIngo Molnar
So the e820_table->map and e820_table->nr_map names are a bit confusing, because it's not clear what a 'map' really means (it could be a bitmap, or some other data structure), nor is it clear what nr_map means (is it a current index, or some other count). Rename the fields from: e820_table->map => e820_table->entries e820_table->nr_map => e820_table->nr_entries which makes it abundantly clear that these are entries of the table, and that the size of the table is ->nr_entries. Propagate the changes to all affected files. Where necessary, adjust local variable names to better reflect the new field names. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename everything to e820_tableIngo Molnar
No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename 'e820_map' variables to 'e820_array'Ingo Molnar
In line with the rename to 'struct e820_array', harmonize the naming of common e820 table variable names as well: e820 => e820_array e820_saved => e820_array_saved e820_map => e820_array initial_e820 => e820_array_init This makes the variable names more consistent and easier to grep for. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename the basic e820 data types to 'struct e820_entry' and ↵Ingo Molnar
'struct e820_array' The 'e820entry' and 'e820map' names have various annoyances: - the missing underscore departs from the usual kernel style and makes the code look weird, - in the past I kept confusing the 'map' with the 'entry', because a 'map' is ambiguous in that regard, - it's not really clear from the 'e820map' that this is a regular C array. Rename them to 'struct e820_entry' and 'struct e820_array' accordingly. ( Leave the legacy UAPI header alone but do the rename in the bootparam.h and e820/types.h file - outside tools relying on these defines should either adjust their code, or should use the legacy header, or should create their private copies for the definitions. ) No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Move asm/e820.h to asm/e820/api.hIngo Molnar
In line with asm/e820/types.h, move the e820 API declarations to asm/e820/api.h and update all usage sites. This is just a mechanical, obviously correct move & replace patch, there will be subsequent changes to clean up the code and to make better use of the new header organization. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-22xen/setup: Don't relocate p2m over existing oneRoss Lagerwall
When relocating the p2m, take special care not to relocate it so that is overlaps with the current location of the p2m/initrd. This is needed since the full extent of the current location is not marked as a reserved region in the e820. This was seen to happen to a dom0 with a large initial p2m and a small reserved region in the middle of the initial p2m. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-09xen/x86: Increase xen_e820_map to E820_X_MAX possible entriesAlex Thorlton
On systems with sufficiently large e820 tables, and several IOAPICs, it is possible for the XENMEM_machine_memory_map callback (and its counterpart, XENMEM_memory_map) to attempt to return an e820 table with more than 128 entries. This callback adds entries to the BIOS-provided e820 table to account for IOAPIC registers, which, on sufficiently large systems, can result in an e820 table that is too large to copy back into xen_e820_map. This change simply increases the size of xen_e820_map to E820_X_MAX to ensure that there is enough room to store the entire e820 map returned from this callback. Signed-off-by: Alex Thorlton <athorlton@sgi.com> Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-09-21x86/e820: Prepare e280 code for switch to dynamic storageDenys Vlasenko
This patch turns e820 and e820_saved into pointers to e820 tables, of the same size as before. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20160917213927.1787-2-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14x86/xen: Audit and remove any unnecessary uses of module.hPaul Gortmaker
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance for the presence of either and replace as needed. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20160714001901.31603-7-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-24xen: use same main loop for counting and remapping pagesJuergen Gross
Instead of having two functions for cycling through the E820 map in order to count to be remapped pages and remap them later, just use one function with a caller supplied sub-function called for each region to be processed. This eliminates the possibility of a mismatch between both loops which showed up in certain configurations. Suggested-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Juergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-11-04Merge tag 'for-linus-4.4-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: - Improve balloon driver memory hotplug placement. - Use unpopulated hotplugged memory for foreign pages (if supported/enabled). - Support 64 KiB guest pages on arm64. - CPU hotplug support on arm/arm64. * tag 'for-linus-4.4-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (44 commits) xen: fix the check of e_pfn in xen_find_pfn_range x86/xen: add reschedule point when mapping foreign GFNs xen/arm: don't try to re-register vcpu_info on cpu_hotplug. xen, cpu_hotplug: call device_offline instead of cpu_down xen/arm: Enable cpu_hotplug.c xenbus: Support multiple grants ring with 64KB xen/grant-table: Add an helper to iterate over a specific number of grants xen/xenbus: Rename *RING_PAGE* to *RING_GRANT* xen/arm: correct comment in enlighten.c xen/gntdev: use types from linux/types.h in userspace headers xen/gntalloc: use types from linux/types.h in userspace headers xen/balloon: Use the correct sizeof when declaring frame_list xen/swiotlb: Add support for 64KB page granularity xen/swiotlb: Pass addresses rather than frame numbers to xen_arch_need_swiotlb arm/xen: Add support for 64KB page granularity xen/privcmd: Add support for Linux 64KB page granularity net/xen-netback: Make it running on 64KB page granularity net/xen-netfront: Make it running on 64KB page granularity block/xen-blkback: Make it running on 64KB page granularity block/xen-blkfront: Make it running on 64KB page granularity ...
2015-11-02xen: fix the check of e_pfn in xen_find_pfn_rangeZhenzhong Duan
On some NUMA system, after dom0 up, we see below warning even if there are enough pfn ranges that could be used for remapping: "Unable to find available pfn range, not remapping identity pages" Fix it to avoid getting a memory region of zero size in xen_find_pfn_range. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23x86/xen: discard RAM regions above the maximum reservationDavid Vrabel
During setup, discard RAM regions that are above the maximum reservation (instead of marking them as E820_UNUSABLE). This allows hotplug memory to be placed at these addresses. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2015-10-07x86/vdso: Remove runtime 32-bit vDSO selectionAndy Lutomirski
32-bit userspace will now always see the same vDSO, which is exactly what used to be the int80 vDSO. Subsequent patches will clean it up and make it support SYSENTER and SYSCALL using alternatives. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/e7e6b3526fa442502e6125fe69486aab50813c32.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-28x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing mapMalcolm Crossley
Sanitizing the e820 map may produce extra E820 entries which would result in the topmost E820 entries being removed. The removed entries would typically include the top E820 usable RAM region and thus result in the domain having signicantly less RAM available to it. Fix by allowing sanitize_e820_map to use the full size of the allocated E820 array. Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-28xen: use correct type for HYPERVISOR_memory_op()Juergen Gross
HYPERVISOR_memory_op() is defined to return an "int" value. This is wrong, as the Xen hypervisor will return "long". The sub-function XENMEM_maximum_reservation returns the maximum number of pages for the current domain. An int will overflow for a domain configured with 8TB of memory or more. Correct this by using the correct type. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen: switch extra memory accounting to use pfnsJuergen Gross
Instead of using physical addresses for accounting of extra memory areas available for ballooning switch to pfns as this is much less error prone regarding partial pages. Reported-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen: limit memory to architectural maximumJuergen Gross
When a pv-domain (including dom0) is started it tries to size it's p2m list according to the maximum possible memory amount it ever can achieve. Limit the initial maximum memory size to the architectural limit of the hardware in order to avoid overflows during remapping of memory. This problem will occur when dom0 is started with an initial memory size being a multiple of 1GB, but without specifying it's maximum memory size. The kernel must be configured without CONFIG_XEN_BALLOON_MEMORY_HOTPLUG for the problem to happen. Reported-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen: avoid another early crash of memory limited dom0Juergen Gross
Commit b1c9f169047b ("xen: split counting of extra memory pages...") introduced an error when dom0 was started with limited memory occurring only on some hardware. The problem arises in case dom0 is started with initial memory and maximum memory being the same. The kernel must be configured without CONFIG_XEN_BALLOON_MEMORY_HOTPLUG for the problem to happen. If all of this is true and the E820 map of the machine is sparse (some areas are not covered) then the machine might crash early in the boot process. An example E820 map triggering the problem looks like this: [ 0.000000] e820: BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000cf7fafff] usable [ 0.000000] BIOS-e820: [mem 0x00000000cf7fb000-0x00000000cf95ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cf960000-0x00000000cfb62fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfb63000-0x00000000cfd14fff] usable [ 0.000000] BIOS-e820: [mem 0x00000000cfd15000-0x00000000cfd61fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfd62000-0x00000000cfd6cfff] ACPI data [ 0.000000] BIOS-e820: [mem 0x00000000cfd6d000-0x00000000cfd6ffff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfd70000-0x00000000cfd70fff] usable [ 0.000000] BIOS-e820: [mem 0x00000000cfd71000-0x00000000cfea8fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfea9000-0x00000000cfeb9fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfeba000-0x00000000cfecafff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfecb000-0x00000000cfecbfff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfecc000-0x00000000cfedbfff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfedc000-0x00000000cfedcfff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfedd000-0x00000000cfeddfff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfede000-0x00000000cfee3fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfee4000-0x00000000cfef6fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfef7000-0x00000000cfefffff] usable [ 0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fec10000-0x00000000fec10fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed40000-0x00000000fed44fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed61000-0x00000000fed70fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed8ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100001000-0x000000020effffff] usable In this case the area a0000-dffff isn't present in the map. This will confuse the memory setup of the domain when remapping the memory from such holes to populated areas. To avoid the problem the accounting of to be remapped memory has to count such holes in the E820 map as well. Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen: avoid early crash of memory limited dom0Juergen Gross
Commit b1c9f169047b ("xen: split counting of extra memory pages...") introduced an error when dom0 was started with limited memory. The problem arises in case dom0 is started with initial memory and maximum memory being the same and exactly a multiple of 1 GB. The kernel must be configured without CONFIG_XEN_BALLOON_MEMORY_HOTPLUG for the problem to happen. In this case it will crash very early during boot due to the virtual mapped p2m list not being large enough to be able to remap any memory: (XEN) Freed 304kB init memory. mapping kernel into physical memory about to get started... (XEN) traps.c:459:d0v0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000] (XEN) domain_crash_sync called from entry.S: fault at ffff82d080229a93 create_bounce_frame+0x12b/0x13a (XEN) Domain 0 (vcpu#0) crashed on cpu#0: (XEN) ----[ Xen-4.5.2-pre x86_64 debug=n Not tainted ]---- (XEN) CPU: 0 (XEN) RIP: e033:[<ffffffff81d120cb>] (XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest (d0v0) (XEN) rax: ffffffff81db2000 rbx: 000000004d000000 rcx: 0000000000000000 (XEN) rdx: 000000004d000000 rsi: 0000000000063000 rdi: 000000004d063000 (XEN) rbp: ffffffff81c03d78 rsp: ffffffff81c03d28 r8: 0000000000023000 (XEN) r9: 00000001040ff000 r10: 0000000000007ff0 r11: 0000000000000000 (XEN) r12: 0000000000063000 r13: 000000000004d000 r14: 0000000000000063 (XEN) r15: 0000000000000063 cr0: 0000000080050033 cr4: 00000000000006f0 (XEN) cr3: 0000000105c0f000 cr2: ffffc90000268000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033 (XEN) Guest stack trace from rsp=ffffffff81c03d28: (XEN) 0000000000000000 0000000000000000 ffffffff81d120cb 000000010000e030 (XEN) 0000000000010006 ffffffff81c03d68 000000000000e02b ffffffffffffffff (XEN) 0000000000000063 000000000004d063 ffffffff81c03de8 ffffffff81d130a7 (XEN) ffffffff81c03de8 000000000004d000 00000001040ff000 0000000000105db1 (XEN) 00000001040ff001 000000000004d062 ffff8800092d6ff8 0000000002027000 (XEN) ffff8800094d8340 ffff8800092d6ff8 00003ffffffff000 ffff8800092d7ff8 (XEN) ffffffff81c03e48 ffffffff81d13c43 ffff8800094d8000 ffff8800094d9000 (XEN) 0000000000000000 ffff8800092d6000 00000000092d6000 000000004cfbf000 (XEN) 00000000092d6000 00000000052d5442 0000000000000000 0000000000000000 (XEN) ffffffff81c03ed8 ffffffff81d185c1 0000000000000000 0000000000000000 (XEN) ffffffff81c03e78 ffffffff810f8ca4 ffffffff81c03ed8 ffffffff8171a15d (XEN) 0000000000000010 ffffffff81c03ee8 0000000000000000 0000000000000000 (XEN) ffffffff81f0e402 ffffffffffffffff ffffffff81dae900 0000000000000000 (XEN) 0000000000000000 0000000000000000 ffffffff81c03f28 ffffffff81d0cf0f (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81db82e0 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) ffffffff81c03f38 ffffffff81d0c603 ffffffff81c03ff8 ffffffff81d11c86 (XEN) 0300000100000032 0000000000000005 0000000000000020 0000000000000000 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) Domain 0 crashed: rebooting machine in 5 seconds. This can be avoided by allocating aneough space for the p2m to cover the maximum memory of dom0 plus the identity mapped holes required for PCI space, BIOS etc. Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: remove no longer needed p2m.hJuergen Gross
Cleanup by removing arch/x86/xen/p2m.h as it isn't needed any more. Most definitions in this file are used in p2m.c only. Move those into p2m.c. set_phys_range_identity() is already declared in arch/x86/include/asm/xen/page.h, add __init annotation there. MAX_REMAP_RANGES isn't used at all, just delete it. The only define left is P2M_PER_PAGE which is moved to page.h as well. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: allow more than 512 GB of RAM for 64 bit pv-domainsJuergen Gross
64 bit pv-domains under Xen are limited to 512 GB of RAM today. The main reason has been the 3 level p2m tree, which was replaced by the virtual mapped linear p2m list. Parallel to the p2m list which is being used by the kernel itself there is a 3 level mfn tree for usage by the Xen tools and eventually for crash dump analysis. For this tree the linear p2m list can serve as a replacement, too. As the kernel can't know whether the tools are capable of dealing with the p2m list instead of the mfn tree, the limit of 512 GB can't be dropped in all cases. This patch replaces the hard limit by a kernel parameter which tells the kernel to obey the 512 GB limit or not. The default is selected by a configuration parameter which specifies whether the 512 GB limit should be active per default for domUs (domain save/restore/migration and crash dump analysis are affected). Memory above the domain limit is returned to the hypervisor instead of being identity mapped, which was wrong anyway. The kernel configuration parameter to specify the maximum size of a domain can be deleted, as it is not relevant any more. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: move p2m list if conflicting with e820 mapJuergen Gross
Check whether the hypervisor supplied p2m list is placed at a location which is conflicting with the target E820 map. If this is the case relocate it to a new area unused up to now and compliant to the E820 map. As the p2m list might by huge (up to several GB) and is required to be mapped virtually, set up a temporary mapping for the copied list. For pvh domains just delete the p2m related information from start info instead of reserving the p2m memory, as we don't need it at all. For 32 bit kernels adjust the memblock_reserve() parameters in order to cover the page tables only. This requires to memblock_reserve() the start_info page on it's own. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: check for initrd conflicting with e820 mapJuergen Gross
Check whether the initrd is placed at a location which is conflicting with the target E820 map. If this is the case relocate it to a new area unused up to now and compliant to the E820 map. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: check pre-allocated page tables for conflict with memory mapJuergen Gross
Check whether the page tables built by the domain builder are at memory addresses which are in conflict with the target memory map. If this is the case just panic instead of running into problems later. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: check for kernel memory conflicting with memory layoutJuergen Gross
Checks whether the pre-allocated memory of the loaded kernel is in conflict with the target memory map. If this is the case, just panic instead of run into problems later, as there is nothing we can do to repair this situation. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: find unused contiguous memory areaJuergen Gross
For being able to relocate pre-allocated data areas like initrd or p2m list it is mandatory to find a contiguous memory area which is not yet in use and doesn't conflict with the memory map we want to be in effect. In case such an area is found reserve it at once as this will be required to be done in any case. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: check memory area against e820 mapJuergen Gross
Provide a service routine to check a physical memory area against the E820 map. The routine will return false if the complete area is RAM according to the E820 map and true otherwise. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: split counting of extra memory pages from remappingJuergen Gross
Memory pages in the initial memory setup done by the Xen hypervisor conflicting with the target E820 map are remapped. In order to do this those pages are counted and remapped in xen_set_identity_and_remap(). Split the counting from the remapping operation to be able to setup the needed memory sizes in time but doing the remap operation at a later time. This enables us to simplify the interface to xen_set_identity_and_remap() as the number of remapped and released pages is no longer needed here. Finally move the remapping further down to prepare relocating conflicting memory contents before the memory might be clobbered by xen_set_identity_and_remap(). This requires to not destroy the Xen E820 map when the one for the system is being constructed. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: move static e820 map to global scopeJuergen Gross
Instead of using a function local static e820 map in xen_memory_setup() and calling various functions in the same source with the map as a parameter use a map directly accessible by all functions in the source. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen: eliminate scalability issues from initial mapping setupJuergen Gross
Direct Xen to place the initial P->M table outside of the initial mapping, as otherwise the 1G (implementation) / 2G (theoretical) restriction on the size of the initial mapping limits the amount of memory a domain can be handed initially. As the initial P->M table is copied rather early during boot to domain private memory and it's initial virtual mapping is dropped, the easiest way to avoid virtual address conflicts with other addresses in the kernel is to use a user address area for the virtual address of the initial P->M table. This allows us to just throw away the page tables of the initial mapping after the copy without having to care about address invalidation. It should be noted that this patch won't enable a pv-domain to USE more than 512 GB of RAM. It just enables it to be started with a P->M table covering more memory. This is especially important for being able to boot a Dom0 on a system with more than 512 GB memory. Signed-off-by: Juergen Gross <jgross@suse.com> Based-on-patch-by: Jan Beulich <jbeulich@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-01-28x86/xen: add some __init and static annotations in arch/x86/xen/setup.cJuergen Gross
Some more functions in arch/x86/xen/setup.c can be made "__init". xen_ignore_unusable() can be made "static". Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-01-28x86/xen: use correct types for addresses in arch/x86/xen/setup.cJuergen Gross
In many places in arch/x86/xen/setup.c wrong types are used for physical addresses (u64 or unsigned long long). Use phys_addr_t instead. Use macros already defined instead of open coding them. Correct some other type mismatches. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-01-28x86/xen: cleanup arch/x86/xen/setup.cJuergen Gross
Remove extern declarations in arch/x86/xen/setup.c which are either not used or redundant. Move needed other extern declarations to xen-ops.h Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-01-12xen: check for zero sized area when invalidating memoryJuergen Gross
With the introduction of the linear mapped p2m list setting memory areas to "invalid" had to be delayed. When doing the invalidation make sure no zero sized areas are processed. Signed-off-by: Juegren Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>