summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2017-02-01x86: Convert obsolete cputime type to nsecsFrederic Weisbecker
Use the new nsec based cputime accessors as part of the whole cputime conversion from cputime_t to nsecs. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1485832191-26889-10-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01sched/cputime: Convert task/group cputime to nsecsFrederic Weisbecker
Now that most cputime readers use the transition API which return the task cputime in old style cputime_t, we can safely store the cputime in nsecs. This will eventually make cputime statistics less opaque and more granular. Back and forth convertions between cputime_t and nsecs in order to deal with cputime_t random granularity won't be needed anymore. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1485832191-26889-8-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01sched/cputime: Introduce special task_cputime_t() API to return old-typed ↵Frederic Weisbecker
cputime This API returns a task's cputime in cputime_t in order to ease the conversion of cputime internals to use nsecs units instead. Blindly converting all cputime readers to use this API now will later let us convert more smoothly and step by step all these places to use the new nsec based cputime. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1485832191-26889-7-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01Merge branch 'linus' into sched/core, to pick up fixes and refresh the branchIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01efi/x86: Add debug code to print cooked memmapDave Young
It is not obvious if the reserved boot area are added correctly, add a efi_print_memmap() call to print the new memmap. Tested-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Nicolai Stange <nicstange@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-10-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01efi/x86: Move the EFI BGRT init code to early init codeDave Young
Before invoking the arch specific handler, efi_mem_reserve() reserves the given memory region through memblock. efi_bgrt_init() will call efi_mem_reserve() after mm_init(), at which time memblock is dead and should not be used anymore. The EFI BGRT code depends on ACPI initialization to get the BGRT ACPI table, so move parsing of the BGRT table to ACPI early boot code to ensure that efi_mem_reserve() in EFI BGRT code still use memblock safely. Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-acpi@vger.kernel.org Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-9-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01x86/efi: Add support for EFI_MEMORY_ATTRIBUTES_TABLESai Praneeth
UEFI v2.6 introduces EFI_MEMORY_ATTRIBUTES_TABLE which describes memory protections that may be applied to the EFI Runtime code and data regions by the kernel. This enables the kernel to map these regions more strictly thereby increasing security. Presently, the only valid bits for the attribute field of a memory descriptor are EFI_MEMORY_RO and EFI_MEMORY_XP, hence use these bits to update the mappings in efi_pgd. The UEFI specification recommends to use this feature instead of EFI_PROPERTIES_TABLE and hence while updating EFI mappings we first check for EFI_MEMORY_ATTRIBUTES_TABLE and if it's present we update the mappings according to this table and hence disregarding EFI_PROPERTIES_TABLE even if it's published by the firmware. We consider EFI_PROPERTIES_TABLE only when EFI_MEMORY_ATTRIBUTES_TABLE is absent. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Lee, Chun-Yi <jlee@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-6-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01x86/efi: Deduplicate efi_char16_printk()Lukas Wunner
Eliminate the separate 32-bit and 64x- bit code paths by way of the shiny new efi_call_proto() macro. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-3-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01efi: Deduplicate efi_file_size() / _read() / _close()Lukas Wunner
There's one ARM, one x86_32 and one x86_64 version which can be folded into a single shared version by masking their differences with the shiny new efi_call_proto() macro. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-2-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01perf/x86/intel/uncore: Make package handling more robustThomas Gleixner
The package management code in uncore relies on package mapping being available before a CPU is started. This changed with: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") because the ACPI/BIOS information turned out to be unreliable, but that left uncore in broken state. This was not noticed because on a regular boot all CPUs are online before uncore is initialized. Move the allocation to the CPU online callback and simplify the hotplug handling. At this point the package mapping is established and correct. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") Link: http://lkml.kernel.org/r/20170131230141.377156255@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01perf/x86/intel/uncore: Clean up hotplug conversion falloutThomas Gleixner
The recent conversion to the hotplug state machine kept two mechanisms from the original code: 1) The first_init logic which adds the number of online CPUs in a package to the refcount. That's wrong because the callbacks are executed for all online CPUs. Remove it so the refcounting is correct. 2) The on_each_cpu() call to undo box->init() in the error handling path. That's bogus because when the prepare callback fails no box has been initialized yet. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Fixes: 1a246b9f58c6 ("perf/x86/intel/uncore: Convert to hotplug state machine") Link: http://lkml.kernel.org/r/20170131230141.298032324@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01perf/x86/intel/rapl: Make package handling more robustThomas Gleixner
The package management code in RAPL relies on package mapping being available before a CPU is started. This changed with: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") because the ACPI/BIOS information turned out to be unreliable, but that left RAPL in broken state. This was not noticed because on a regular boot all CPUs are online before RAPL is initialized. A possible fix would be to reintroduce the mess which allocates a package data structure in CPU prepare and when it turns out to already exist in starting throw it away later in the CPU online callback. But that's a horrible hack and not required at all because RAPL becomes functional for perf only in the CPU online callback. That's correct because user space is not yet informed about the CPU being onlined, so nothing caan rely on RAPL being available on that particular CPU. Move the allocation to the CPU online callback and simplify the hotplug handling. At this point the package mapping is established and correct. This also adds a missing check for available package data in the event_init() function. Reported-by: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") Link: http://lkml.kernel.org/r/20170131230141.212593966@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-31x86/mce: Make timer handling more robustThomas Gleixner
Erik reported that on a preproduction hardware a CMCI storm triggers the BUG_ON in add_timer_on(). The reason is that the per CPU MCE timer is started by the CMCI logic before the MCE CPU hotplug callback starts the timer with add_timer_on(). So the timer is already queued which triggers the BUG. Using add_timer_on() is pretty pointless in this code because the timer is strictlty per CPU, initialized as pinned and all operations which arm the timer happen on the CPU to which the timer belongs. Simplify the whole machinery by using mod_timer() instead of add_timer_on() which avoids the problem because mod_timer() can handle already queued timers. Use __start_timer() everywhere so the earliest armed expiry time is preserved. Reported-by: Erik Veijola <erik.veijola@intel.com> Tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701310936080.3457@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-31x86/irq: Make irq activate operations symmetricThomas Gleixner
The recent commit which prevents double activation of interrupts unearthed interesting code in x86. The code (ab)uses irq_domain_activate_irq() to reconfigure an already activated interrupt. That trips over the prevention code now. Fix it by deactivating the interrupt before activating the new configuration. Fixes: 08d85f3ea99f1 "irqdomain: Avoid activating interrupts more than once" Reported-and-tested-by: Mike Galbraith <efault@gmx.de> Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701311901580.3457@nanos
2017-01-31Drivers: hv: restore TSC page cleanup before kexecVitaly Kuznetsov
We need to cleanup the TSC page before doing kexec/kdump or the new kernel may crash if it tries to use it. Fixes: 63ed4e0c67df ("Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-31Drivers: hv: restore hypervcall page cleanup before kexecVitaly Kuznetsov
We need to cleanup the hypercall page before doing kexec/kdump or the new kernel may crash if it tries to use it. Reuse the now-empty hv_cleanup function renaming it to hyperv_cleanup and moving to the arch specific code. Fixes: 8730046c1498 ("Drivers: hv vmbus: Move Hypercall page setup out of common code") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-31Merge branch 'x86/urgent' into x86/microcode, to resolve conflictsIngo Molnar
Conflicts: arch/x86/kernel/cpu/microcode/amd.c arch/x86/kernel/cpu/microcode/core.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-31x86/mm: Remove CONFIG_DEBUG_NX_TESTKees Cook
CONFIG_DEBUG_NX_TEST has been broken since CONFIG_DEBUG_SET_MODULE_RONX=y was added in v2.6.37 via: 84e1c6bb38eb ("x86: Add RO/NX protection for loadable kernel modules") since the exception table was then made read-only. Additionally, the manually constructed extables were never fixed when relative extables were introduced in v3.5 via: 706276543b69 ("x86, extable: Switch to relative exception table entries") However, relative extables won't work for test_nx.c, since test instruction memory areas may be more than INT_MAX away from an executable fixup (e.g. stack and heap too far away from executable memory with the fixup). Since clearly no one has been using this code for a while now, and similar tests exist in LKDTM, this should just be removed entirely. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jinbum Park <jinb.park7@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170131003711.GA74048@beast Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-30x86/mm/cpa: Avoid wbinvd() for PREEMPTJohn Ogness
Although wbinvd() is faster than flushing many individual pages, it blocks the memory bus for "long" periods of time (>100us), thus directly causing unusually large latencies on all CPUs, regardless of any CPU isolation features that may be active. This is an unpriviledged operatation as it is exposed to user space via the graphics subsystem. For 1024 pages, flushing those pages individually can take up to 2200us, but the task remains fully preemptible during that time. Signed-off-by: John Ogness <john.ogness@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-rt-users <linux-rt-users@vger.kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-30perf/x86/events: Add an AMD-specific MakefileBorislav Petkov
Move the AMD pieces from the generic Makefile so that $ make arch/x86/events/amd/<file>.s can work too. Otherwise you get: $ make arch/x86/events/amd/ibs.s scripts/Makefile.build:44: arch/x86/events/amd/Makefile: No such file or directory make[1]: *** No rule to make target 'arch/x86/events/amd/Makefile'. Stop. Makefile:1636: recipe for target 'arch/x86/events/amd/ibs.s' failed make: *** [arch/x86/events/amd/ibs.s] Error 2 Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20170126080819.417-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-30perf/x86/amd/uncore: Update sysfs attributes for Family17h processorsJanakarajan Natarajan
This patch updates the sysfs attributes for AMD Family17h processors. In Family17h, the event bit position is changed for both the NorthBridge and Last level cache counters. The sysfs attributes are assigned based on the family and the type of the counter. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/617570ed3634e804991f95db62c3cf3856a9d2a7.1484598705.git.Janakarajan.Natarajan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-30perf/x86/amd/uncore: Update the number of uncore countersJanakarajan Natarajan
This patch updates the AMD uncore driver to support AMD Family17h processors. In Family17h, there are two extra last level cache counters. The maximum available counters is increased and the number of counters for each uncore type is now based on the family. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/799f9c5be8963cc209d9169a08f4a2643b748dc7.1484598705.git.Janakarajan.Natarajan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-30perf/x86/amd/uncore: Rename 'L2' to 'LLC'Janakarajan Natarajan
This patch renames L2 counters to LLC counters. In AMD Family17h processors, L3 cache counter is supported. Since older families have at most L2 counters, last level cache (LLC) indicates L2/L3 based on the family. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/5d8cd8736d8d578354597a548e64ff16210c319b.1484598705.git.Janakarajan.Natarajan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-30x86/boot/e820: Simplify e820__update_table()Ingo Molnar
- Remove the now unnecessary __e820__update_table() wrappery - Move statics out from function scope, to make the logic clearer - Rename local variables to be more in line with the rest of 820.c - Remove unnecessary local variables: old_nr, *nr_entries No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-30x86/microcode: Do not access the initrd after it has been freedBorislav Petkov
When we look for microcode blobs, we first try builtin and if that doesn't succeed, we fallback to the initrd supplied to the kernel. However, at some point doing boot, that initrd gets jettisoned and we shouldn't access it anymore. But we do, as the below KASAN report shows. That's because find_microcode_in_initrd() doesn't check whether the initrd is still valid or not. So do that. ================================================================== BUG: KASAN: use-after-free in find_cpio_data Read of size 1 by task swapper/1/0 page:ffffea0000db9d40 count:0 mapcount:0 mapping: (null) index:0x1 flags: 0x100000000000000() raw: 0100000000000000 0000000000000000 0000000000000001 00000000ffffffff raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 4.10.0-rc5-debug-00075-g2dbde22 #3 Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.2.3 12/01/2016 Call Trace: dump_stack ? _atomic_dec_and_lock ? __dump_page kasan_report_error ? pointer ? find_cpio_data __asan_report_load1_noabort ? find_cpio_data find_cpio_data ? vsprintf ? dump_stack ? get_ucode_user ? print_usage_bug find_microcode_in_initrd __load_ucode_intel ? collect_cpu_info_early ? debug_check_no_locks_freed load_ucode_intel_ap ? collect_cpu_info ? trace_hardirqs_on ? flat_send_IPI_mask_allbutself load_ucode_ap ? get_builtin_firmware ? flush_tlb_func ? do_raw_spin_trylock ? cpumask_weight cpu_init ? trace_hardirqs_off ? play_dead_common ? native_play_dead ? hlt_play_dead ? syscall_init ? arch_cpu_idle_dead ? do_idle start_secondary start_cpu Memory state around the buggy address: ffff880036e74f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff880036e74f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff880036e75000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff880036e75080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff880036e75100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Tested-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170126165833.evjemhbqzaepirxo@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-29x86/xen: Fix APIC id mismatch warning on IntelMohit Gambhir
This patch fixes the following warning message seen when booting the kernel as Dom0 with Xen on Intel machines. [0.003000] [Firmware Bug]: CPU1: APIC id mismatch. Firmware: 0 APIC: 1] The code generating the warning in validate_apic_and_package_id() matches cpu_data(cpu).apicid (initialized in init_intel()-> detect_extended_topology() using cpuid) against the apicid returned from xen_apic_read(). Now, xen_apic_read() makes a hypercall to retrieve apicid for the boot cpu but returns 0 otherwise. Hence the warning gets thrown for all but the boot cpu. The idea behind xen_apic_read() returning 0 for apicid is that the guests (even Dom0) should not need to know what physical processor their vcpus are running on. This is because we currently do not have topology information in Xen and also because xen allows more vcpus than physical processors. However, boot cpu's apicid is required for loading xen-acpi-processor driver on AMD machines. Look at following patch for details: commit 558daa289a40 ("xen/apic: Return the APIC ID (and version) for CPU 0.") So to get rid of the warning, this patch modifies xen_cpu_present_to_apicid() to return cpu_data(cpu).apicid instead of calling xen_apic_read(). The warning is not seen on AMD machines because init_amd() populates cpu_data(cpu).apicid by calling hard_smp_processor_id()->xen_apic_read() as opposed to using apicid from cpuid as is done on Intel machines. Signed-off-by: Mohit Gambhir <mohit.gambhir@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-01-29x86/boot/e820: Separate the E820 ABI structures from the in-kernel structuresIngo Molnar
Linus pointed out that relying on the compiler to pack structures with enums is fragile not just for the kernel, but for external tooling as well which might rely on our UAPI headers. So separate the two from each other: introduce 'struct boot_e820_entry', which is the boot protocol entry format. This actually simplifies the code, as e820__update_table() is now never called directly with boot protocol table entries - we can rely on append_e820_table() and do a e820__update_table() call afterwards. ( This will allow further simplifications of __e820__update_table(), but that will be done in a separate patch. ) This change also has the side effect of not modifying the bootparams structure anymore - which might be useful for debugging. In theory we could even constify the boot_params structure - at least from the E820 code's point of view. Remove the uapi/asm/e820/types.h file, as it's not used anymore - all kernel side E820 types are defined in asm/e820/types.h. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-29x86/boot/e820: Fix and clean up e820_type switch() statementsIngo Molnar
A test-build of e820.o with -Wswitch-enum shows the following warnings: arch/x86/kernel/e820.c: In function ‘e820_type_to_string’: arch/x86/kernel/e820.c:965:2: warning: enumeration value ‘E820_TYPE_RESERVED’ not handled in switch [-Wswitch-enum] switch (entry->type) { ^ arch/x86/kernel/e820.c: In function ‘e820_type_to_iomem_type’: arch/x86/kernel/e820.c:979:2: warning: enumeration value ‘E820_TYPE_RESERVED’ not handled in switch [-Wswitch-enum] switch (entry->type) { ^ arch/x86/kernel/e820.c: In function ‘e820_type_to_iores_desc’: arch/x86/kernel/e820.c:993:2: warning: enumeration value ‘E820_TYPE_RESERVED’ not handled in switch [-Wswitch-enum] switch (entry->type) { ^ arch/x86/kernel/e820.c: In function ‘do_mark_busy’: arch/x86/kernel/e820.c:1015:2: warning: enumeration value ‘E820_TYPE_RAM’ not handled in switch [-Wswitch-enum] switch (type) { ^ Here's the four warnings: - The one in e820_type_to_string() is a borderline bug, we should differentiate known-reserved E820 types from unknown types. Fix it by printing a separate message for unknown E820 types. - The ones in e820_type_to_iomem_type(), e820_type_to_iores_desc() and do_mark_busy() are worth documenting, at least to the extent of enumerating them explicitly. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename the remaining E820 APIs to the e820__*() prefixIngo Molnar
Three more renames left: e820_end_of_ram_pfn() => e820__end_of_ram_pfn() e820_end_of_low_ram_pfn() => e820__end_of_low_ram_pfn() e820_reallocate_tables() => e820__reallocate_tables() After this all E820 API calls are prefixed with "e820__", making it much easier to grep for E820 functionality in the kernel. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Remove unnecessary #include'sIngo Molnar
A number of headers were included into e820.c unnecessarily - remove them. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename e820_mark_nosave_regions() to ↵Ingo Molnar
e820__register_nosave_regions() This function is a minor misnomer: it is talking about 'marking' regions as nosave - while the hibernation API is called register_nosave_region() and the e820_mark_nosave_regions() is a wrapper around that functionality. So name it to be in line with the API it is derived from. ( Rename e820_mark_nvs_memory() to e820__register_nvs_regions(), for similar reasons. ) No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename e820_reserve_resources*() to e820__reserve_resources*()Ingo Molnar
Also do some minor cleanups. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Use bool in query APIsIngo Molnar
Change e820__mapped_any() and e820__mapped_all()'s return type and e820__range_remove()'s check_type parameter to bool. Propagate it into arch/x86/pci/mmconfig-shared.c as this change affects a function signature there too. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Document e820__reserve_setup_data()Ingo Molnar
Also clean it up a bit. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Clean up __e820__update_table() et alIngo Molnar
The __e820__update_table() function has various weirdly named variables, such as 'pbios', 'biosmap' and 'pnr_map' which are pretty confusing and actively misleading at times. This weird naming found its way into other functions as well, such as __append_e820_table() and append_e820_table(). Standardize the naming to make it all much easier to read: biosmap -> entries pbios -> entry nr_map -> nr_entries pnr_map -> nr_entries ... Also clean up the types used: entry indices routinely mixed u32 and int, standardize on u32 thoughout. Update the comments as well, while at it. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Simplify the e820__update_table() interfaceIngo Molnar
The e820__update_table() parameters are pretty complex: arch/x86/include/asm/e820/api.h:extern int e820__update_table(struct e820_entry *biosmap, int max_nr_map, u32 *pnr_map); But 90% of the usage is trivial: arch/x86/kernel/e820.c: if (e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries)) arch/x86/kernel/e820.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/kernel/e820.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/kernel/e820.c: if (e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries) < 0) arch/x86/kernel/e820.c: e820__update_table(boot_params.e820_table, ARRAY_SIZE(boot_params.e820_table), &new_nr); arch/x86/kernel/early-quirks.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/kernel/setup.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/kernel/setup.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/platform/efi/efi.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/xen/setup.c: e820__update_table(xen_e820_table.entries, ARRAY_SIZE(xen_e820_table.entries), arch/x86/xen/setup.c: e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries); arch/x86/xen/setup.c: e820__update_table(xen_e820_table.entries, ARRAY_SIZE(xen_e820_table.entries), as it only uses an exiting struct e820_table's entries array, its size and its current number of entries as input and output arguments. Only one use is non-trivial: arch/x86/kernel/e820.c: e820__update_table(boot_params.e820_table, ARRAY_SIZE(boot_params.e820_table), &new_nr); ... which call updates the E820 table in the zeropage in-situ, and the layout there does not match that of 'struct e820_table' (in particular nr_entries is at a different offset, hardcoded by the boot protocol). Simplify all this by introducing a low level __e820__update_table() API that the zeropage update call can use, and simplifying the main e820__update_table() call signature down to: int e820__update_table(struct e820_table *table); This visibly simplifies all the call sites: arch/x86/include/asm/e820/api.h:extern int e820__update_table(struct e820_table *table); arch/x86/include/asm/e820/types.h: * call to e820__update_table() to remove duplicates. The allowance arch/x86/kernel/e820.c: * The return value from e820__update_table() is zero if it arch/x86/kernel/e820.c:int __init e820__update_table(struct e820_table *table) arch/x86/kernel/e820.c: if (e820__update_table(e820_table)) arch/x86/kernel/e820.c: e820__update_table(e820_table_firmware); arch/x86/kernel/e820.c: e820__update_table(e820_table); arch/x86/kernel/e820.c: e820__update_table(e820_table); arch/x86/kernel/e820.c: if (e820__update_table(e820_table) < 0) arch/x86/kernel/early-quirks.c: e820__update_table(e820_table); arch/x86/kernel/setup.c: e820__update_table(e820_table); arch/x86/kernel/setup.c: e820__update_table(e820_table); arch/x86/platform/efi/efi.c: e820__update_table(e820_table); arch/x86/xen/setup.c: e820__update_table(&xen_e820_table); arch/x86/xen/setup.c: e820__update_table(e820_table); arch/x86/xen/setup.c: e820__update_table(&xen_e820_table); No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28xen, x86/boot/e820: Simplify Xen's xen_e820_table constructIngo Molnar
The Xen guest memory setup code has: static struct e820_entry xen_e820_table[E820_MAX_ENTRIES] __initdata; static u32 xen_e820_table_entries __initdata; ... which is really a 'struct e820_table', open-coded. Convert the Xen code over to use a single struct e820_table, as this will allow the simplification of the e820__update_table() API. No intended change in functionality, but not runtime tested. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Clean up and standardize sizeof() usesIngo Molnar
There's various sizeof() uses in e820.c - standardize on the shortest and least error prone one, along the pattern of: - memset(entry, 0, sizeof(struct e820_entry)); + memset(entry, 0, sizeof(*entry)); ... because with this pattern in most cases it's immediately clear that we have used the right type - and the pattern is robust against changing the type as well. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Clean up the E820 table size define namesIngo Molnar
We've got a number of defines related to the E820 table and its size: E820MAP E820NR E820_X_MAX E820MAX The first two denote byte offsets into the zeropage (struct boot_params), and can are not used in the kernel and can be removed. The E820_*_MAX values have an inconsistent structure and it's unclear in any case what they mean. 'X' presuably goes for extended - but it's not very expressive altogether. Change these over to: E820_MAX_ENTRIES_ZEROPAGE E820_MAX_ENTRIES ... which are self-explanatory names. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Prefix the E820_* type names with "E820_TYPE_"Ingo Molnar
So there's a number of constants that start with "E820" but which are not types - these create a confusing mixture when seen together with 'enum e820_type' values: E820MAP E820NR E820_X_MAX E820MAX To better differentiate the 'enum e820_type' values prefix them with E820_TYPE_. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Use 'enum e820_type' when handling the e820 region typeIngo Molnar
The E820 region type is put into four different types (!) when used in function parameters or local variables: unsigned type; int type; unsigned long current_type; u32 type; Use 'enum e820_type' in all these cases instead. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Use 'enum e820_type' in 'struct e820_entry'Ingo Molnar
Use a stricter type for struct e820_entry. Add a build-time check to make sure the compiler won't ever pack the enum into a field smaller than 'int'. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Introduce 'enum e820_type'Ingo Molnar
Use an enum instead of CPP #define. Also fix various small annoyances in the descriptions of the various E820 types. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Simplify e820_reserve_resources()Ingo Molnar
Remove unnecessary duplications of "e820_table->entries[i]." via a local variable, plus pass in 'entry' to the type_to_*() functions which further improves the readability of the code - and other small tweaks. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Reorder the function prototypes in api.hIngo Molnar
Reorder the function prototypes in api.h a bit, to group related functions together. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename e820_print_map() to e820__print_table()Ingo Molnar
All other table-level methods are already named 'table' in some way, to change this one over to the (now consistent) nomenclature. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Create coherent API function names for E820 range operationsIngo Molnar
We have these three related functions: extern void e820_add_region(u64 start, u64 size, int type); extern u64 e820_update_range(u64 start, u64 size, unsigned old_type, unsigned new_type); extern u64 e820_remove_range(u64 start, u64 size, unsigned old_type, int checktype); But it's not clear from the naming that they are 3 operations based around the same 'memory range' concept. Rename them to better signal this, and move the prototypes next to each other: extern void e820__range_add (u64 start, u64 size, int type); extern u64 e820__range_update(u64 start, u64 size, unsigned old_type, unsigned new_type); extern u64 e820__range_remove(u64 start, u64 size, unsigned old_type, int checktype); Note that this improved organization of the functions shows another problem that was easy to miss before: sometimes the E820 entry type is 'int', sometimes 'unsigned int' - but this will be fixed in a separate patch. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename e820_setup_gap() to e820__setup_pci_gap()Ingo Molnar
The e820_setup_gap() function name is unnecessarily silent about what kind of gap it sets up. Make it clear that it's about the PCI gap. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename e820_any_mapped()/e820_all_mapped() to ↵Ingo Molnar
e820__mapped_any()/e820__mapped_all() The 'any' and 'all' are modified to the 'mapped' concept, so move them last in the name. No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Rename sanitize_e820_table() to e820__update_table()Ingo Molnar
sanitize_e820_table() is a minor misnomer in that it suggests that the E820 table requires sanitizing - which implies that it will only do anything if the E820 table is irregular (not sane). That is wrong, because sanitize_e820_table() also does a very regular sorting of the E820 table, which is a necessity in the basic append-only flow of E820 updates the kernel is allowed to perform to it. So rename it to e820__update_table() to include that purpose as well. This also lines up all the table-update functions into a coherent naming family: int e820__update_table(struct e820_entry *biosmap, int max_nr_map, u32 *pnr_map); void e820__update_table_print(void); void e820__update_table_firmware(void); No change in functionality. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>