summaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)Author
2009-08-31x86: Distangle ioapic and i8259Thomas Gleixner
The proposed Moorestown support patches use an extra feature flag mechanism to make the ioapic work w/o an i8259. There is a much simpler solution. Most i8259 specific functions are already called dependend on the irq number less than NR_IRQS_LEGACY. Replacing that constant by a read_mostly variable which can be set to 0 by the platform setup code allows us to achieve the same without any special feature flags. That trivial change allows us to proceed with MRST w/o doing a full blown overhaul of the ioapic code which would delay MRST unduly. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Add Moorestown early detectionThomas Gleixner
Moorestown MID devices need to be detected early in the boot process to setup and do not call x86_default_early_setup as there is no EBDA region to reserve. [ Copied the minimal code from Jacobs latest MRST series ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jacob Pan <jacob.jun.pan@intel.com>
2009-08-31x86: Add hardware_subarch ID for MoorestownPan, Jacob jun
x86 bootprotocol 2.07 has introduced hardware_subarch ID in the boot parameters provided by FW. We use it to identify Moorestown platforms. [ tglx: Cleanup and paravirt fix ] Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Add early platform detectionThomas Gleixner
Platforms like Moorestown require early setup and want to avoid the call to reserve_ebda_region. The x86_init override is too late when the MRST detection happens in setup_arch. Move the default i386 x86_init overrides and the call to reserve_ebda_region into a separate function which is called as the default of a switch case depending on the hardware_subarch id in boot params. This allows us to add a case for MRST and let MRST have its own early setup function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move tsc_init to late_time_initThomas Gleixner
We do not need the TSC before late_time_init. Move the tsc_init to the late time init code so we can also utilize HPET for calibration (which we claimed to do but never did except in some older kernel version). This also helps Moorestown to calibrate the TSC with the AHBT timer which needs to be initialized in late_time_init like HPET. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move tsc_calibration to x86_init_opsThomas Gleixner
TSC calibration is modified by the vmware hypervisor and paravirt by separate means. Moorestown wants to add its own calibration routine as well. So make calibrate_tsc a proper x86_init_ops function and override it by paravirt or by the early setup of the vmware hypervisor. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Replace the now identical time_32/64.c by time.cThomas Gleixner
Remove the redundant copy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: time_32/64.c unify profile_pcThomas Gleixner
The code is identical except for the formatting and a useless #ifdef. Make it the same. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move calibrate_cpu to tsc.cThomas Gleixner
Move the code where it's only user is. Also we need to look whether this hardwired hackery might interfere with perfcounters. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Make timer setup and global variables the same in time_32/64.cThomas Gleixner
The timer and timer irq setup code is identical in 32 and 64 bit. Make it the same formatting as well. Also add the global variables under the necessary ifdefs to both files. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Remove mca bus ifdef from timer interruptThomas Gleixner
MCA_bus is constant 0 when CONFIG_MCA=n. So the compiler removes that code w/o needing an extra #ifdef Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Simplify timer_ack magic in time_32.cThomas Gleixner
Let the compiler optimize the timer_ack magic away in the 32bit timer interrupt and put the same code into time_64.c. It's optimized out for CONFIG_X86_IO_APIC on 32bit and for 64bit because timer_ack is const 0 in both cases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Prepare unification of time_32/64.cThomas Gleixner
Unify the top comment and the includes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Remove do_timer hookThomas Gleixner
This is a left over of the old x86 sub arch support. Remove it and open code it like we do in time_64.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Add timer_init to x86_init_opsThomas Gleixner
The timer init code is convoluted with several quirks and the paravirt timer chooser. Figuring out which code path is actually taken is not for the faint hearted. Move the numaq TSC quirk to tsc_pre_init x86_init_ops function and replace the paravirt time chooser and the remaining x86 quirk with a simple x86_init_ops function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move percpu clockevents setup to x86_init_opsThomas Gleixner
paravirt overrides the setup of the default apic timers as per cpu timers. Moorestown needs to override that as well. Move it to x86_init_ops setup and create a separate x86_cpuinit struct which holds the function for the secondary evtl. hotplugabble CPUs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move xen_post_allocator_init into xen_pagetable_setup_doneThomas Gleixner
We really do not need two paravirt/x86_init_ops functions which are called in two consecutive source lines. Move the only user of post_allocator_init into the already existing pagetable_setup_done function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move paravirt pagetable_setup to x86_init_opsThomas Gleixner
Replace more paravirt hackery by proper x86_init_ops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move paravirt banner printout to x86_init_opsThomas Gleixner
Replace another obscure paravirt magic and move it to x86_init_ops. Such a hook is also useful for embedded and special hardware. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Replace ARCH_SETUP by a proper x86_init_opsThomas Gleixner
ARCH_SETUP is a horrible leftover from the old arch/i386 mach support code. It still has a lonely user in xen. Move it to x86_init_ops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move traps_init to x86_init_opsThomas Gleixner
Replace the quirks by a simple x86_init_ops function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move irq_init to x86_init_opsThomas Gleixner
irq_init is overridden by x86_quirks and by paravirts. Unify the whole mess and make it an unconditional x86_init_ops function which defaults to the standard function and can be overridden by the early platform code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move pre_intr_init to x86_init_opsThomas Gleixner
Replace the quirk machinery by a x86_init_ops function which defaults to the standard implementation. This is also a preparatory patch for Moorestown support which needs to replace the default init_ISA_irqs as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31x86: Move get/find_smp_config to x86_init_opsThomas Gleixner
Replace the quirk machinery by a x86_init_ops function which defaults to the standard implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-29x86: Fix earlyprintk=dbgp for machines without NXJan Beulich
Since parse_early_param() may (e.g. for earlyprintk=dbgp) involve calls to page table manipulation functions (here set_fixmap_nocache()), NX hardware support must be determined before calling that function (so that __supported_pte_mask gets properly set up). But the call after parse_early_param() can also not go away, as that will honor eventual command line specified disabling of the NX functionality. ( This will then just result in whatever mappings got established during parse_early_param() having the NX bit set despite it being disabled on the command line, but I think that's tolerable). Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> LKML-Reference: <4A97F3BD02000078000121B9@vpn.id2.novell.com> [ merged to x86/pat to resolve a conflict. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-29Merge branch 'for-ingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-sfi-2.6 into x86/apic Merge reason: the SFI (Simple Firmware Interface) feature in the ACPI tree needs this cleanup, pull it into the APIC branch as well so that there's no interactions. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28x86: add arch-specific SFI supportFeng Tang
arch/x86/kernel/sfi.c serves the dual-purpose of supporting the SFI core with arch specific code, as well as a home for the arch-specific code that uses SFI. analogous to ACPI, drivers/sfi/Kconfig is pulled in by arch/x86/Kconfig Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: x86@kernel.org
2009-08-28ACPI, x86: expose some IO-APIC routines when CONFIG_ACPI=nFeng Tang
Some IO-APIC routines are ACPI specific now, but need to be exposed when CONFIG_ACPI=n for the benefit of SFI. Remove #ifdef ACPI around these routines: io_apic_get_unique_id(int ioapic, int apic_id); io_apic_get_version(int ioapic); io_apic_get_redir_entries(int ioapic); Move these routines from ACPI-specific boot.c to io_apic.c: uniq_ioapic_id(u8 id) mp_find_ioapic() mp_find_ioapic_pin() mp_register_ioapic() Also, since uniq_ioapic_id() is now no longer static, re-name it to io_apic_unique_id() for consistency with the other public io_apic routines. For simplicity, do not #ifdef the resulting code ACPI || SFI, thought that could be done in the future if it is important to optimize the !ACPI !SFI IO-APIC x86 kernel for size. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: x86@kernel.org
2009-08-28clocksource: Resolve cpu hotplug dead lock with TSC unstableThomas Gleixner
Martin Schwidefsky analyzed it: To register a clocksource the clocksource_mutex is acquired and if necessary timekeeping_notify is called to install the clocksource as the timekeeper clock. timekeeping_notify uses stop_machine which needs to take cpu_add_remove_lock mutex. Starting a new cpu is done with the cpu_add_remove_lock mutex held. native_cpu_up checks the tsc of the new cpu and if the tsc is no good clocksource_change_rating is called. Which needs the clocksource_mutex and the deadlock is complete. The solution is to replace the TSC via the clocksource watchdog mechanism. Mark the TSC as unstable and schedule the watchdog work so it gets removed in the watchdog thread context. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: John Stultz <johnstul@us.ibm.com>
2009-08-27x86: Move oem_bus_info to x86_init_opsThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Move mpc_oem_pci_bus to x86_init_opsThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Move smp_read_mpc_oem to x86_init_ops.Thomas Gleixner
Move smp_read_mpc_oem from quirks to x86_init. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Move mpc_apic_id to x86_init_opsThomas Gleixner
The mpc_apic_id setup is handled by a x86_quirk. Make it a x86_init_ops function with a default implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Move ioapic_ids_setup to x86_init_opsThomas Gleixner
32bit and also the numaq code have special requirements on the ioapic_id setup. Convert it to a x86_init_ops function and get rid of the quirks and #ifdefs Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Sanitize smp_record and move it to x86_init_opsThomas Gleixner
The x86 quirkification introduced an extra ugly hackery with a variable pointer in the mpparse code. If the pointer is initialized then it is dereferenced and the variable set to 0 or incremented. Create a x86_init_ops function and let the affected numaq code hold the function. Default init is a setup noop. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Move memory_setup to x86_init_opsThomas Gleixner
memory_setup is overridden by x86_quirks and by paravirts with weak functions and quirks. Unify the whole mess and make it an unconditional x86_init_ops function which defaults to the standard function and can be overridden by the early platform code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Add reserve_ebda_region to x86_init_opsThomas Gleixner
reserve_ebda_region needs to be called befor start_kernel. Moorestown needs to override it. Make it a x86_init_ops function and initialize it with the default reserve_ebda_region. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Add request_standard_resources to x86_initThomas Gleixner
The 32bit and the 64bit code are slighty different in the reservation of standard resources. Also the upcoming Moorestown support needs its own version of that. Add it to x86_init_ops and initialize it with the 64bit default. 32bit overrides it in early boot. Now moorestown can add it's own override w/o sprinkling the code with more #ifdefs Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Add probe_roms to x86_initThomas Gleixner
probe_roms is only used on 32bit. Add it to the x86_init ops and remove the #ifdefs. Default initializer is x86_init_noop() which is overridden in the 32bit boot code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27x86: Add x86_init infrastructureThomas Gleixner
The upcoming Moorestown support brings the embedded world to x86. The setup code of x86 has already a couple of hooks which are either x86_quirks or paravirt ops. Some of those setup hooks are pretty convoluted like the timer setup and the tsc calibration code. But there are other places which could do with a cleanup. Instead of having inline functions/macros which are modified at compile time I decided to introduce x86_init ops which are unconditional in the code and make it clear that they can be changed either during compile time or in the early boot process. The function pointers are initialized by default functions which can be noops so that the pointer can be called unconditionally in the most cases. This also allows us to remove 32bit/64bit, paravirt and other #ifdeffery. paravirt guests are just a hardware platform in the setup code, so we should treat them as such and not hide all behind multiple layers of indirection and compile time dependencies. It's more obvious that x86_init.timers.timer_init() is a function pointer than the late_time_init = choose_time_init() obscurity. It's also way simpler to grep for x86_init.timers.timer_init and find all the places which modify that function pointer instead of analyzing weak functions, macros and paravirt indirections. Note. This is not a general paravirt_ops replacement. It just will move setup related hooks which are potentially useful for other platform setup purposes as well out of the paravirt domain. Add the base infrastructure without any functionality. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-27Merge branch 'sched/clock' into x86/cleanupsThomas Gleixner
Reason: The tsc init cleanup depends on sched_clock_init moving past late_time_init. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-26Merge branch 'x86/urgent' into x86/patH. Peter Anvin
Reason: Change to is_new_memtype_allowed() in x86/urgent Resolved semantic conflicts in: arch/x86/mm/pat.c arch/x86/mm/ioremap.c Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-26tracing: Convert event tracing code to use NR_syscallsJason Baron
Convert the syscalls event tracing code to use NR_syscalls, instead of FTRACE_SYSCALL_MAX. NR_syscalls is standard accross most arches, and reduces code confusion/complexity. Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Josh Stone <jistone@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anwin <hpa@zytor.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> LKML-Reference: <9b4f1a84ecae57cc6599412772efa36f0d2b815b.1251146513.git.jbaron@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-26tracing: Define NR_syscalls for x86_64Jason Baron
Express the available number of syscalls in a standard way by defining NR_syscalls. The common way to define it is to place its definition in asm/unistd.h However, the number of syscalls is defined using __NR_syscall_max in x86-64 after building a dynamic header file "asm-offsets.h" The source file that generates this header, asm-offsets-64.c includes unistd.h, then if we want to express NR_syscalls from __NR_syscall_max in unistd.h only after generating the dynamic header file, we need a watchguard. If unistd.h is included from asm-offsets-64.c, then we are generating asm-offset.h which defines __NR_syscall_max. At this time, we don't want to (we can't) define NR_syscalls, then we do nothing. Otherwise we define NR_syscalls because we know asm-offsets.h has been generated. Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Josh Stone <jistone@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anwin <hpa@zytor.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> LKML-Reference: <20090826160910.GB2658@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-26x86, apic: Slim down stack usage in early_init_lapic_mapping()Cyrill Gorcunov
As far as I see there is no external poking of mp_lapic_addr in this procedure which could lead to unpredited changes and require local storage unit for it. Lets use it plain forward. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20090826171324.GC4548@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-26x86, mce: CE in last bank prevents panic by unknown MCEHidetoshi Seto
If MCE handler is called but none of mces_seen have machine check event which might signal the MCE (i.e. event higher than MCE_KEEP_SEVERITY), panic with "Machine check from unknown source" will be taken since the MCE is assumed to be signaled from external agent or so. Usually mces_seen never point MCE_KEEP_SEVERITY event such as CE. But it can happen because initial value of mces_seen is accidentally modified by mce_no_way_out() - in case if mce_no_way_out() run through all banks and the last bank has the CE, mces_seen points the CE and the "panic by unknown" will not be taken. This patch fixes this undesired behavior, and clarifies the logic. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Dongming <jin.dongming@np.css.fujitsu.com> LKML-Reference: <4A94E244.3020301@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reported-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
2009-08-26x86: Fix vSMP boot crashYinghai Lu
2.6.31-rc7 does not boot on vSMP systems: [ 8.501108] CPU31: Thermal monitoring enabled (TM1) [ 8.501127] CPU 31 MCA banks SHD:2 SHD:3 SHD:5 SHD:6 SHD:8 [ 8.650254] CPU31: Intel(R) Xeon(R) CPU E5540 @ 2.53GHz stepping 04 [ 8.710324] Brought up 32 CPUs [ 8.713916] Total of 32 processors activated (162314.96 BogoMIPS). [ 8.721489] ERROR: parent span is not a superset of domain->span [ 8.727686] ERROR: domain->groups does not contain CPU0 [ 8.733091] ERROR: groups don't span domain->span [ 8.737975] ERROR: domain->cpu_power not set [ 8.742416] Ravikiran Thirumalai bisected it to: | commit 2759c3287de27266e06f1f4e82cbd2d65f6a044c | x86: don't call read_apic_id if !cpu_has_apic The problem is that on vSMP systems the CPUID derived initial-APICIDs are overlapping - so we need to fall back on hard_smp_processor_id() which reads the local APIC. Both come from the hardware (influenced by firmware though) so it's a tough call which one to trust. Doing the quirk expresses the vSMP property properly and also does not affect other systems, so we go for this solution instead of a revert. Reported-and-Tested-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Shai Fultheim <shai@scalex86.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <4A944D3C.5030100@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-26x86, e820: Guard against array overflowed in __e820_add_region()Cyrill Gorcunov
Better to be paranoid against unpredicted nr_map modifications. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <20090824175551.146070377@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-26x86, ioapic: Get rid of needless check and simplify ioapic_setup_resources()Cyrill Gorcunov
alloc_bootmem() already panics on allocation failure. There is no need to check the result. Also there is a way to unbind global variable from its body and use it as a parameter which allow us to simplify ioapic_init_mappings as well -- "for" cycle already uses nr_ioapics as a conditional variable and there is no need to check if ioapic_setup_resources was returning NULL again. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20090824175551.493629148@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-26x86, ioapic: Define IO_APIC_DEFAULT_PHYS_BASE constantCyrill Gorcunov
We already have APIC_DEFAULT_PHYS_BASE so just to be consistent. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <20090824175550.927946757@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>