Age  Commit message  Author
2008-07-12x86: fix ldt limit for 64 bitMichael Karcher
Fix size of LDT entries. On x86-64, ldt_desc is a double-sized descriptor. Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
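A minimal sketch of the arithmetic behind this fix, assuming 8-byte LDT entries and a 16-byte descriptor for the LDT itself on x86-64. The constant and function names are illustrative only and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes: one LDT entry is 8 bytes, while the descriptor that
 * describes the LDT itself is 16 bytes on x86-64 ("double-sized"). */
enum { LDT_ENTRY_BYTES = 8, LDT_DESC_BYTES_64 = 16 };

/* A segment limit is the offset of the last valid byte of the table. */
static uint32_t ldt_limit(uint32_t entries, uint32_t bytes_per_entry)
{
    assert(entries > 0);
    return entries * bytes_per_entry - 1;
}
```

Sizing a 4-entry table by the entry size gives a limit of 31; sizing it by the size of the descriptor would give 63, covering twice the real table.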
2008-07-12x86_64: fix delayed signalsRoland McGrath
On three of the several paths in entry_64.S that call do_notify_resume() on the way back to user mode, we fail to properly check again for newly-arrived work that requires another call to do_notify_resume() before going to user mode. These paths set the mask to check only _TIF_NEED_RESCHED, but this is wrong. The other paths that lead to do_notify_resume() do this correctly already, and entry_32.S does it correctly in all cases.

All paths back to user mode have to check all the _TIF_WORK_MASK flags at the last possible stage, with interrupts disabled. Otherwise, we miss any flags (TIF_SIGPENDING for example) that were set any time after we entered do_notify_resume(). More work flags can be set (or left set) synchronously inside do_notify_resume(), as TIF_SIGPENDING can be, or asynchronously by interrupts or other CPUs (which then send an asynchronous interrupt).

There are many different scenarios that could hit this bug, most of them races. The simplest one to demonstrate does not require any race: when one signal has done handler setup at the check before returning from a syscall, and there is another signal pending that should be handled. The second signal's handler should interrupt the first signal handler before it actually starts (so the interrupted PC is still at the handler's entry point). Instead, it runs away until the next kernel entry (next syscall, tick, etc).

This test behaves correctly on 32-bit kernels, and fails on 64-bit (either 32-bit or 64-bit test binary). With this fix, it works.

#define _GNU_SOURCE
#include <stdio.h>
#include <signal.h>
#include <string.h>
#include <sys/ucontext.h>

#ifndef REG_RIP
#define REG_RIP REG_EIP
#endif

static sig_atomic_t hit1, hit2;

static void
handler (int sig, siginfo_t *info, void *ctx)
{
  ucontext_t *uc = ctx;

  if ((void *) uc->uc_mcontext.gregs[REG_RIP] == &handler)
    {
      if (sig == SIGUSR1)
        hit1 = 1;
      else
        hit2 = 1;
    }

  printf ("%s at %#lx\n", strsignal (sig),
          uc->uc_mcontext.gregs[REG_RIP]);
}

int
main (void)
{
  struct sigaction sa;
  sigset_t set;

  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO;
  sa.sa_sigaction = &handler;

  if (sigaction (SIGUSR1, &sa, NULL)
      || sigaction (SIGUSR2, &sa, NULL))
    return 2;

  sigemptyset (&set);
  sigaddset (&set, SIGUSR1);
  sigaddset (&set, SIGUSR2);
  if (sigprocmask (SIG_BLOCK, &set, NULL))
    return 3;

  printf ("main at %p, handler at %p\n", &main, &handler);

  raise (SIGUSR1);
  raise (SIGUSR2);

  if (sigprocmask (SIG_UNBLOCK, &set, NULL))
    return 4;

  if (hit1 + hit2 == 1)
    {
      puts ("PASS");
      return 0;
    }

  puts ("FAIL");
  return 1;
}

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
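The re-check rule described in this commit can be modelled in plain C. This is only a sketch under stated assumptions: the flag names, struct task, notify_resume() and return_to_user() are hypothetical stand-ins, not the entry_64.S code, and the "interrupts disabled" part is not modelled at all.

```c
#include <assert.h>

/* Hypothetical work bits, loosely named after the kernel's TIF_* flags. */
#define WORK_NEED_RESCHED 0x1u
#define WORK_SIGPENDING   0x2u
#define WORK_MASK         (WORK_NEED_RESCHED | WORK_SIGPENDING)

struct task {
    unsigned flags;       /* pending work bits */
    int queued_signals;   /* signals still waiting for delivery */
    int handled;          /* signals whose handler setup has been done */
};

/* Stand-in for do_notify_resume(): sets up one handler and leaves
 * WORK_SIGPENDING set when another signal is still queued. */
static void notify_resume(struct task *t)
{
    assert(t->queued_signals > 0);
    t->queued_signals--;
    t->handled++;
    if (t->queued_signals == 0)
        t->flags &= ~WORK_SIGPENDING;
}

/* Exit path: the first check uses the full mask, every later check
 * uses recheck_mask. Returns how many handlers were set up. */
static int return_to_user(struct task *t, unsigned recheck_mask)
{
    unsigned mask = WORK_MASK;

    while (t->flags & mask) {
        if (t->flags & WORK_SIGPENDING)
            notify_resume(t);
        t->flags &= ~WORK_NEED_RESCHED;  /* pretend we rescheduled */
        mask = recheck_mask;
    }
    return t->handled;
}
```

With two queued signals, re-checking only WORK_NEED_RESCHED leaves the second signal undelivered on this return to user mode, while re-checking the full WORK_MASK delivers both.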
2008-07-12x86: remove conflicting nx6325 and nx6125 quirksRafael J. Wysocki
We have two conflicting DMA-based quirks in there for the same set of boxes (HP nx6325 and nx6125) and one of them actually breaks my box. So remove the extra code. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Török Edwin <edwintorok@gmail.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdogLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  [PATCH] IPMI: return correct value from ipmi_write
2008-07-11IB/umad: BKL is not needed for ib_umad_open()Roland Dreier
Remove explicit lock_kernel() calls and document why the code is safe. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2008-07-11[PATCH] IPMI: return correct value from ipmi_writeMark Rustad
This patch corrects the handling of write operations to the IPMI watchdog to work as intended by returning the number of characters actually processed. Without this patch, an "echo V >/dev/watchdog" enables the watchdog if IPMI is providing the watchdog function. Signed-off-by: Mark Rustad <MRustad@gmail.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
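Why the return value of a write handler matters can be sketched as follows. This is a hypothetical model, not the IPMI driver: struct wdog, wdog_write() and write_all() are invented for illustration, and the model assumes that the caller retries the unwritten remainder after a short write.

```c
#include <assert.h>
#include <stddef.h>

struct wdog {
    int expect_close;  /* set when the magic 'V' was seen in the last write */
    int pings;         /* how many times a write kicked the watchdog */
};

/* Hypothetical write handler. It scans the whole buffer for 'V' and
 * kicks the watchdog once. report_all selects the return value:
 * the number of bytes processed (len), or a constant 1. */
static size_t wdog_write(struct wdog *w, const char *buf, size_t len,
                         int report_all)
{
    assert(len > 0);
    w->expect_close = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == 'V')
            w->expect_close = 1;
    w->pings++;
    return report_all ? len : 1;
}

/* Caller that keeps writing until the whole buffer is accepted. */
static void write_all(struct wdog *w, const char *buf, size_t len,
                      int report_all)
{
    while (len > 0) {
        size_t n = wdog_write(w, buf, len, report_all);
        buf += n;
        len -= n;
    }
}
```

In this model, writing "V\n" with the short return value causes a second write of "\n" alone, which kicks the watchdog again and forgets that 'V' was seen; returning len avoids the second call.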
2008-07-11Merge branch 'x86/generalize-visws' into x86/coreIngo Molnar
2008-07-11x86: Recover timer_ack lost in the merge of the NMI watchdogMaciej W. Rozycki
In the course of the recent unification of the NMI watchdog an assignment to timer_ack to switch off unnecessary POLL commands to the 8259A in the case of a watchdog failure has been accidentally removed. The statement used to be limited to the 32-bit variation as since the rewrite of the timer code it has been relevant for the 82489DX only. This change brings it back. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: I/O APIC: Never configure IRQ2Maciej W. Rozycki
There is no such entity as ISA IRQ2. The ACPI spec does not make it explicitly clear, but does not preclude it either -- all it says is ISA legacy interrupts are identity mapped by default (subject to overrides), but it does not state whether IRQ2 exists or not. As a result if there is no IRQ0 override, then IRQ2 is normally initialised as an ISA interrupt, which implies an edge-triggered line, which is unmasked by default as this is what we do for edge-triggered I/O APIC interrupts so as not to miss an edge.

To the best of my knowledge it is useless, as IRQ2 has not been in use since the PC/AT as back then it was taken by the 8259A cascade interrupt to the slave, with the line position in the slot rerouted to newly-created IRQ9. No device could thus make use of this line with the pair of 8259A chips.

Now in theory INTIN2 of the I/O APIC may be usable, but the interrupt of the device wired to it would not be available in the PIC mode at all, so I seriously doubt if anybody decided to reuse it for a regular device.

However there are two common uses of INTIN2. One is for IRQ0, with an ACPI interrupt override (or its equivalent in the MP table). But in this case IRQ2 is gone entirely with INTIN0 left vacant. The other one is for an 8259A ExtINTA cascade. In this case IRQ0 goes to INTIN0 and if ACPI is used INTIN2 is assumed to be IRQ2 (there is no override and ACPI has no way to report ExtINTA interrupts). This is where a problem happens.

The problem is INTIN2 is configured as a native APIC interrupt, with a vector assigned and the mask cleared. And the line may indeed get active and inject interrupts if the master 8259A has its timer interrupt enabled (it might happen for other interrupts too, but they are normally masked in the process of rerouting them to the I/O APIC). There are two cases where it will happen:

* When the I/O APIC NMI watchdog is enabled. This is actually a misnomer as the watchdog pulses are delivered through the 8259A to the LINT0 inputs of all the local APICs in the system. The implication is the output of the master 8259A goes high and low repeatedly, signalling interrupts to INTIN2 which is enabled too! [The origin of the name is I think for a brief period during the development we had a capability in our code to configure the watchdog to use an I/O APIC input; that would be INTIN2 in this scenario.]

* When the native route of IRQ0 via INTIN0 fails for whatever reason -- as it happens with the system considered here. In this scenario the timer pulse is delivered through the 8259A to LINT0 input of the local APIC of the bootstrap processor, quite similarly to how it is done for the watchdog described above. The result is, again, INTIN2 receives these pulses too. Rafael's system used to escape this scenario, because an incorrect IRQ0 override would occupy INTIN2 and prevent it from being unmasked.

My conclusion is IRQ2 should be excluded from configuration in all the cases and the current exception for ACPI systems should be lifted. The reason is that the exception is not only useless, but harmful as well.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: L-APIC: Always fully configure IRQ0Maciej W. Rozycki
Unlike the 32-bit one, the 64-bit variation of the LVT0 setup code for the "8259A Virtual Wire" through the local APIC timer configuration does not fully configure the relevant irq_chip structure. Instead it relies on the preceding I/O APIC code to have set it up, which does not happen if the I/O APIC variants have not been tried. The patch includes corresponding changes to the 32-bit variation too which make them both the same, barring a small syntactic difference involving sequence of functions in the source. That should work as an aid with the upcoming merge. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: L-APIC: Set IRQ0 as edge-triggeredMaciej W. Rozycki
IRQ0 is edge-triggered, but the "8259A Virtual Wire" through the local APIC configuration in the 32-bit version uses the "fasteoi" handler suitable for level-triggered APIC interrupt. Rewrite code so that the "edge" handler is used. The 64-bit version uses different code and is unaffected. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: merge dwarf2 headersGlauber Costa
Merge dwarf2_32.h and dwarf2_64.h into dwarf2.h. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: use AS_CFI instead of UNWIND_INFOGlauber Costa
In dwarf2_32.h, test for CONFIG_AS_CFI instead of CONFIG_UNWIND_INFO. Turns out that searching for UNWIND_INFO returns no match in any Kconfig or Makefile, so we're really just throwing everything away regarding dwarf frames for i386. The test that generates CONFIG_AS_CFI does not have anything x86_64-specific, and right now, checking V=1 builds shows me that the flag is there anyway, although unused. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: use ignore macro instead of hash commentGlauber Costa
In the dwarf2_64.h header, use the "ignore" macro the way i386 does. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: use matching CFI_ENDPROCGlauber Costa
The RING0_INT_FRAME macro defines a CFI_STARTPROC. So we should really be using CFI_ENDPROC after it. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11[SCSI] ipr: Fix HDIO_GET_IDENTITY oops for SATA devicesBrian King
Currently, ipr does not support HDIO_GET_IDENTITY to SATA devices. An oops occurs if userspace attempts to send the command. Since hald issues the command, ensure we fail the ioctl in ipr. This is a temporary solution to the oops. Once the ipr libata EH conversion is upstream, ipr will fully support HDIO_GET_IDENTITY. Tested-by: Milton Miller <miltonm@bga.com> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-11Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-devLinus Torvalds
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata-acpi: don't call sleeping function from invalid context
  Added Targa Visionary 1000 IDE adapter to pata_sis.c
  libata-acpi: filter out DIPM enable
2008-07-11Fix reference counting race on log buffersDave Chinner
When we release the iclog, we do an atomic_dec_and_lock to determine if we are the last reference and need to trigger update of log headers and writeout. However, in xlog_state_get_iclog_space() we also need to check if we have the last reference count there. If we do, we release the log buffer, otherwise we decrement the reference count. But the compare and decrement in xlog_state_get_iclog_space() is not atomic, so both places can see a reference count of 2 and neither will release the iclog. That leads to a filesystem hang. Close the race by replacing the atomic_read() and atomic_dec() pair with atomic_add_unless() to ensure that they are executed atomically. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Tim Shimmin <tes@sgi.com> Tested-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
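The "decrement unless this is the last reference" step can be sketched with C11 atomics. The helper below mirrors the semantics of the kernel's atomic_add_unless() (add a to v unless v equals u); the names add_unless() and drop_ref_or_claim_last() are illustrative and do not come from the XFS code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Add a to *v unless *v == u. Returns true if the add happened.
 * The check and the update form one atomic step via compare-exchange. */
static bool add_unless(atomic_int *v, int a, int u)
{
    int cur = atomic_load(v);

    while (cur != u) {
        /* On failure, cur is reloaded with the current value. */
        if (atomic_compare_exchange_weak(v, &cur, cur + a))
            return true;
    }
    return false;
}

/* Drops one reference unless the caller holds the last one.
 * Returns true when the count was 1, i.e. the caller must do the
 * final release itself. */
static bool drop_ref_or_claim_last(atomic_int *refcnt)
{
    assert(atomic_load(refcnt) >= 1);
    return !add_unless(refcnt, -1, 1);
}
```

Because the comparison and the decrement happen in a single compare-exchange, two concurrent callers can no longer both observe a count of 2 and both skip the release.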
2008-07-11x86: fix savesegment() bug causing crashes on 64-bitIngo Molnar
I spent a fair amount of time chasing a 64-bit bootup crash that manifested itself as bootup segfaults:

  S10network[1825]: segfault at 7f3e2b5d16b8 ip 00000031108748c9 sp 00007fffb9c14c70 error 4 in libc-2.7.so[3110800000+14d000]

eventually causing init to die and panic the system:

  Kernel panic - not syncing: Attempted to kill init!
  Pid: 1, comm: init Not tainted 2.6.26-rc9-tip #13878

After a marathon bisection session, the bad commit turned out to be:

 | b7675791859075418199c7af86a116ea34eaf5bd is first bad commit
 | commit b7675791859075418199c7af86a116ea34eaf5bd
 | Author: Jeremy Fitzhardinge <jeremy@goop.org>
 | Date:   Wed Jun 25 00:19:00 2008 -0400
 |
 |     x86: remove open-coded save/load segment operations
 |
 |     This removes a pile of buggy open-coded implementations of savesegment
 |     and loadsegment.

After some more bisection of this patch itself, it turns out that what makes the difference are the savesegment() changes to __switch_to().

Taking a look at this portion of arch/x86/kernel/process_64.o revealed this crucial difference:

 | good:    99c:       8c e0                   mov    %fs,%eax
 |          99e:       89 45 cc                mov    %eax,-0x34(%rbp)
 |
 | bad:     99c:       8c 65 cc                mov    %fs,-0x34(%rbp)

which is due to:

 |         unsigned fsindex;
 | -       asm volatile("movl %%fs,%0" : "=r" (fsindex));
 | +       savesegment(fs, fsindex);

savesegment() is implemented as:

 #define savesegment(seg, value) \
          asm("mov %%" #seg ",%0":"=rm" (value) : : "memory")

Note the "m" modifier - it allows GCC to generate the segment move into a memory operand as well.

But regarding segment operands there's a subtle detail in the x86 instruction set: the above 16-bit moves are zero-extended, but only if they go to a register. If the move goes to a memory operand, -0x34(%rbp) in the above case, there's no zero-extension to 32 bits and the instruction will only save 16 bits instead of the intended 32 bits.

The other 16 bits are random data - which can cause problems when that value is used later on.

The solution is to only allow segment operands to go to registers.

This fix allows my test-system to boot up without crashing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
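The difference between the two store forms can be imitated in portable C, without inline assembly. This is a model only: store_to_memory() and store_via_register() are hypothetical helpers that mimic the effect of the two instruction sequences on a 32-bit stack slot holding stale data.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uint32_t) == 2 * sizeof(uint16_t),
              "the model assumes a 32-bit slot and a 16-bit selector");

/* Models "mov %fs, mem": only 16 bits are written, so the other two
 * bytes of the 32-bit slot keep whatever was there before. */
static uint32_t store_to_memory(uint32_t slot, uint16_t selector)
{
    memcpy(&slot, &selector, sizeof selector);
    return slot;
}

/* Models "mov %fs, %eax" followed by a 32-bit store: the selector is
 * zero-extended in the register, so the whole slot is well defined. */
static uint32_t store_via_register(uint32_t slot, uint16_t selector)
{
    (void)slot;  /* the previous contents are fully overwritten */
    return (uint32_t)selector;
}
```

Starting from a slot that holds 0xdeadbeef, the register path yields exactly the selector value, while the memory path leaves half of the stale bytes in place.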
2008-07-11ftrace: build fix for ftraced_suspendIngo Molnar
fix:

  kernel/trace/ftrace.c:1615: error: 'ftraced_suspend' undeclared (first use in this function)
  kernel/trace/ftrace.c:1615: error: (Each undeclared identifier is reported only once
  kernel/trace/ftrace.c:1615: error: for each function it appears in.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11sched_clock: and multiplier for TSC to gtod driftSteven Rostedt
The sched_clock code currently tries to keep all CPU clocks of all CPUs somewhat in sync. At every clock tick it records the gtod clock and uses that and jiffies and the TSC to calculate a CPU clock that tries to stay in sync with all the other CPUs.

ftrace depends heavily on this timer and it detects when this timer "jumps". One problem is that the TSC and the gtod also drift. When the TSC is 0.1% faster or slower than the gtod it is very noticeable in ftrace. To help compensate for this, I've added a multiplier that tries to keep the CPU clock updating at the same rate as the gtod.

I've tried various ways to get it to be in sync and this ended up being the most reliable. At every scheduler tick we calculate the new multiplier:

  multi = delta_gtod / delta_TSC

This means we perform a 64 bit divide at the tick (once per tick, i.e. HZ times a second). A shift is used to handle the accuracy.

Other methods that failed due to dynamic HZ are:

  (not used) multi += (gtod - tsc) / delta_gtod
  (not used) multi += (gtod - (last_tsc + delta_tsc)) / delta_gtod

as well as other variants.

This code still allows for a slight drift between TSC and gtod, but it keeps the damage down to a minimum.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
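The multiplier with a shift can be sketched in fixed-point arithmetic. The shift width MULTI_SHIFT and the helper names calc_multi() and scale_delta() are assumptions made for illustration; the commit message does not state the values the patch uses.

```c
#include <assert.h>
#include <stdint.h>

#define MULTI_SHIFT 15  /* assumed fixed-point precision */

/* multi = delta_gtod / delta_tsc, scaled by 2^MULTI_SHIFT so that the
 * fraction survives the integer divide. One 64-bit divide per tick. */
static uint64_t calc_multi(uint64_t delta_gtod, uint64_t delta_tsc)
{
    assert(delta_tsc != 0);
    return (delta_gtod << MULTI_SHIFT) / delta_tsc;
}

/* Convert a raw TSC-based delta into gtod-rate time. */
static uint64_t scale_delta(uint64_t delta_tsc, uint64_t multi)
{
    return (delta_tsc * multi) >> MULTI_SHIFT;
}
```

For a TSC that runs 0.1% fast (1001000 raw units per 1000000 gtod units), scaling the raw delta by the computed multiplier brings it back to within a few units of the gtod delta. The residual error comes from integer truncation and shrinks as the shift grows.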
2008-07-11sched_clock: record TSC after gtodSteven Rostedt
To read the gtod we need to grab the xtime lock for read. Reading the TSC before the gtod can cause a bigger gap if the xtime lock is contended. This patch simply reverses the order to read the TSC after the gtod. The locking in the reading of the gtod handles any barriers one might think are needed. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11sched_clock: only update deltas with local reads.Steven Rostedt
Reading the CPU clock should try to stay accurate within the CPU. Reading the CPU clock from another CPU and updating the deltas can cause unneeded jumps when reading from the local CPU. This patch changes the code to update the last read TSC only when read from the local CPU. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11sched_clock: fix calculation of other CPUSteven Rostedt
The algorithm to calculate the 'now' of another CPU is not correct. At each scheduler tick, each CPU records the last sched_clock and gtod (tick_raw and tick_gtod respectively). If the TSC is somewhat the same in speed between two clocks the algorithm would be:

  tick_gtod1 + (now1 - tick_raw1) = tick_gtod2 + (now2 - tick_raw2)

To calculate now2 we would have:

  now2 = (tick_gtod1 - tick_gtod2) + (tick_raw2 - tick_raw1) + now1

Currently the algorithm is:

  now2 = (tick_gtod1 - tick_gtod2) + (tick_raw1 - tick_raw2) + now1

This solves most of the rest of the issues I've had with timestamps in ftrace.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
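The corrected formula can be checked with a small worked example. The struct and function names below are illustrative; only tick_raw, tick_gtod and the formula itself come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Per-CPU values recorded at the last scheduler tick. */
struct cpu_clock {
    int64_t tick_raw;   /* sched_clock value at the last tick */
    int64_t tick_gtod;  /* gtod value at the last tick */
};

/* gtod-based time implied by a raw clock reading on one CPU. */
static int64_t gtod_estimate(const struct cpu_clock *c, int64_t now)
{
    assert(c != 0);
    return c->tick_gtod + (now - c->tick_raw);
}

/* Corrected formula: the raw reading on CPU 2 that corresponds to the
 * same instant as raw reading now1 on CPU 1. */
static int64_t remote_now(const struct cpu_clock *c1, int64_t now1,
                          const struct cpu_clock *c2)
{
    assert(c1 != 0 && c2 != 0);
    return (c1->tick_gtod - c2->tick_gtod)
         + (c2->tick_raw - c1->tick_raw)
         + now1;
}
```

Take CPU 1 with tick_raw 100 and tick_gtod 5000, CPU 2 with tick_raw 700 and tick_gtod 5200, and now1 = 150. The corrected formula gives now2 = 550, and both CPUs then agree on a gtod-based time of 5050. The old formula, with the tick_raw terms swapped, gives -650.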
2008-07-11sched_clock: stop maximum check on NO HZSteven Rostedt
Working with ftrace I would get large jumps of 11 millisecs or more with the clock tracer. This killed the latency timings of ftrace and also caused the irqoff self tests to fail.

What was happening is with NO_HZ the idle would stop the jiffy counter and before the jiffy counter was updated the sched_clock would have a bad delta jiffies to compare with the gtod with the maximum.

The jiffies would stop and the last sched_tick would record the last gtod. On wakeup, the sched clock update would compare the gtod + delta jiffies (which would be zero) and compare it to the TSC. The TSC would have correctly (with a stable TSC) moved forward several jiffies. But because the jiffies has not been updated yet the clock would be prevented from moving forward because it would appear that the TSC jumped too far ahead.

The clock would then virtually stop, until the jiffies are updated. Then the next sched clock update would see that the clock was very much behind since the delta jiffies is now correct. This would then jump the clock forward by several jiffies.

This caused ftrace to report several milliseconds of interrupts off latency at every resume from NO_HZ idle.

This patch adds hooks into the nohz code to disable the checking of the maximum clock update when nohz is in effect. It resumes the max check when nohz has updated the jiffies again.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11sched_clock: widen the max and min timeSteven Rostedt
Keeping the max and min sched time within one jiffy of the gtod clock was too tight. Just before a schedule tick the max could easily be hit, as well as just after a schedule_tick the min could be hit. This caused the clock to jump around by a jiffy.

This patch widens the minimum to

  last gtod + (delta_jiffies ? delta_jiffies - 1 : 0) * TICK_NSECS

and the maximum to

  last gtod + (2 + delta_jiffies) * TICK_NSECS

This keeps the minimum at gtod, or at one jiffy less than delta jiffies beyond it, and the maximum 2 jiffies ahead of gtod plus delta jiffies.

This may cause unstable TSCs to be a bit more sporadic, but it helps keep a clock with a stable TSC working well.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
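The widened window can be written as a small clamp function. This is a sketch: the value of TICK_NSECS assumes HZ=1000, and clamp_clock() is an illustrative name rather than a function from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_NSECS 1000000ULL  /* assumed: nanoseconds per jiffy at HZ=1000 */

/* Clamp a candidate clock value into the widened [min, max] window
 * around the gtod value recorded at the last tick. */
static uint64_t clamp_clock(uint64_t clock, uint64_t last_gtod,
                            uint64_t delta_jiffies)
{
    uint64_t min = last_gtod
        + (delta_jiffies ? delta_jiffies - 1 : 0) * TICK_NSECS;
    uint64_t max = last_gtod + (2 + delta_jiffies) * TICK_NSECS;

    assert(min <= max);
    if (clock < min)
        return min;
    if (clock > max)
        return max;
    return clock;
}
```

With no jiffies elapsed since the tick, a clock up to two ticks ahead of gtod passes through unchanged, so a reading taken just before the next tick no longer hits the ceiling.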
2008-07-11sched_clock: record from last tickSteven Rostedt
The sched_clock code tries to keep within the gtod time by one tick (jiffy). The current code mistakenly keeps track of the delta jiffies between updates of the clock, where the delta is used to compare with the number of jiffies that have passed since an update of the gtod. The gtod is updated at each schedule tick, not at each sched_clock update. After one jiffy passes the clock is updated fine. But the delta is taken from the last update so if the next update happens before the next tick the delta jiffies used will be incorrect. This patch changes the code to check the delta of jiffies between ticks and not updates to match the comparison of the updates with the gtod. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11ftrace: separate out the function enabled variableSteven Rostedt
Currently the function tracer uses the global tracer_enabled variable that is used to keep track of whether the tracer is enabled or not. The function tracing startup needs to be separated out, otherwise the internal happenings of the tracer startup are also recorded. This patch creates a ftrace_function_enabled variable to allow the starting of the function traces to happen after everything has been started. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11ftrace: add ftrace_kill_atomicSteven Rostedt
It has been suggested that I add a way to disable the function tracer on an oops. This code adds a ftrace_kill_atomic. It is not meant to be used in normal situations. It will disable the ftrace tracer, but will not perform the nice shutdown that requires scheduling. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11ftrace: use current CPU for function startupSteven Rostedt
This is more of a clean up. Currently the function tracer initializes the tracer with whichever CPU was last used for tracing. This value isn't really useful for function tracing, but at least it should be something other than a random number. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11ftrace: start wakeup tracing after setting function tracerSteven Rostedt
Enabling the wakeup tracer before enabling the function tracing causes some strange results due to the dynamic enabling of the functions. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11ftrace: check proper config for preempt typeSteven Rostedt
There is no CONFIG_PREEMPT_DESKTOP. Use the proper entry CONFIG_PREEMPT. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11ftrace: trace scheduleSteven Rostedt
After the sched_clock code has been removed from sched.c we can now trace the scheduler. The scheduler has a lot of functions that would be worth tracing. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11ftrace: define function trace nopSteven Rostedt
When CONFIG_FTRACE is not enabled, the tracing_start_function_trace and tracing_stop_function_trace should be nops. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
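The usual pattern for such nops is a pair of empty stubs selected at compile time. The sketch below uses static inline functions and a hypothetical trace_active variable; the actual patch may well use macros instead, and the bodies of the enabled variants are invented for illustration.

```c
#include <assert.h>

static int trace_active;  /* hypothetical state, for illustration only */

#ifdef CONFIG_FTRACE
/* Tracing built in: the calls do real work. */
static inline void tracing_start_function_trace(void)
{
    assert(!trace_active);  /* starting twice would be a caller bug */
    trace_active = 1;
}

static inline void tracing_stop_function_trace(void)
{
    trace_active = 0;
}
#else
/* Tracing not built in: empty stubs, so callers need no #ifdef. */
static inline void tracing_start_function_trace(void) { }
static inline void tracing_stop_function_trace(void) { }
#endif
```

Callers can invoke both functions unconditionally; without CONFIG_FTRACE the calls compile down to nothing and no state changes.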
2008-07-11ftrace: move sched_switch enable after markersSteven Rostedt
We have two markers now that are enabled on sched_switch. One that records the context switching and the other that records task wake ups. Currently we enable the tracing first and then set the markers. This causes some confusing traces:

# tracer: sched_switch
#
#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
#              | |      |          |         |
       trace-cmd-3973  [00]   115.834817:   3973:120:R   +     3:  0:S
       trace-cmd-3973  [01]   115.834910:   3973:120:R   +     6:  0:S
       trace-cmd-3973  [02]   115.834910:   3973:120:R   +     9:  0:S
       trace-cmd-3973  [03]   115.834910:   3973:120:R   +    12:  0:S
       trace-cmd-3973  [02]   115.834910:   3973:120:R   +     9:  0:S
          <idle>-0     [02]   115.834910:      0:140:R ==>  3973:120:R

Here we see that trace-cmd with PID 3973 wakes up task 9 but the next line shows the idle task doing a context switch to task 3973.

Moving the enabling of tracing to _after_ the markers are set creates a much saner output:

# tracer: sched_switch
#
#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
#              | |      |          |         |
          <idle>-0     [02]  7922.634225:      0:140:R ==>  4790:120:R
       trace-cmd-4789  [03]  7922.634225:      0:140:R   +  4790:120:R

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86_64: vdso32 cleanup using feature flagsJeremy Fitzhardinge
Use the X86_FEATURE_SYSENTER32 to remove hard-coded CPU vendor check. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86_64: add pseudo-features for 32-bit compat syscallJeremy Fitzhardinge
Add pseudo-feature bits to describe whether the CPU supports sysenter and/or syscall from ia32-compat userspace. This removes a hardcoded test in vdso32-setup. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11libata-acpi: don't call sleeping function from invalid contextZhang Rui
The problem is introduced by commit 664d080c41463570b95717b5ad86e79dc1be0877. acpi_evaluate_integer is a sleeping function, and it should not be called with spin_lock_irqsave. https://bugzilla.redhat.com/show_bug.cgi?id=451399 Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-11Added Targa Visionary 1000 IDE adapter to pata_sis.cKai Krakow
This enables short 40-wire detection for my laptop thus enabling UDMA/100. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-11libata-acpi: filter out DIPM enableTejun Heo
Some BIOSen enable DIPM via _GTF which causes command timeouts under certain configuration. This didn't occur on 2.6.25 because 2.6.25 defaulted to SRST, so _GTF wasn't executed during boot probe, so ahci host reset disabled DIPM and as _GTF wasn't executed after SRST, DIPM wasn't enabled. On 2.6.26, hardreset is used during probe and after probe _GTF is executed enabling DIPM and thus the failures. This patch could theoretically disable DIPM on machines which used to have it enabled on 2.6.25 but AFAIK ahci is currently the only driver which uses SATA ACPI hierarchy (_SDD) and as the host reset would have always disabled DIPM, this shouldn't happen. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-11x86: fix tsc unification buglet with ftrace and stackprotectorIngo Molnar
Yinghai Lu reported crashes on 64-bit x86:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  IP: [<ffffffff80253b17>] hrtick_start_fair+0x89/0x173
  [...]

And with a long session of debugging and a lot of difficulty, tracked it down to this commit:

--------------->
8fbbc4b45ce3e4c0eeb15004c79c72b6896a79c2 is first bad commit
commit 8fbbc4b45ce3e4c0eeb15004c79c72b6896a79c2
Author: Alok Kataria <akataria@vmware.com>
Date:   Tue Jul 1 11:43:34 2008 -0700

    x86: merge tsc_init and clocksource code
<--------------

The problem is that the TSC unification missed these Makefile rules in arch/x86/kernel/Makefile:

  # Do not profile debug and lowlevel utilities
  CFLAGS_REMOVE_tsc_64.o = -pg
  CFLAGS_REMOVE_tsc_32.o = -pg
  ...
  CFLAGS_tsc_64.o := $(nostackp)
  ...

which rules make sure that various instrumentation and debugging facilities are disabled for code that might end up in a VDSO - such as the TSC code.

Reported-and-bisected-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: introduce max_low_pfn_mapped for 64-bitYinghai Lu
When more than 4 GB of memory is installed, don't map the big hole below 4 GB. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: reserve SLITYinghai Lu
Save the SLIT, in case we are using a fixmap to read it and that fixmap is cleared by others. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11x86: e820: user-defined memory maps: remove the range instead of update it to reservedYinghai Lu
Also let mem= print out the modified e820 map. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10rtc: fix reported IRQ rate for when HPET is enabledPaul Gortmaker
The IRQ rate reported back by the RTC is incorrect when HPET is enabled. Newer hardware that has HPET to emulate the legacy RTC device gets this value wrong since after it sets the rate, it returns before setting the variable used to report the IRQ rate back to users of the device -- so the set rate and the reported rate get out of sync. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Brownell <david-b@pacbell.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-10Fix name of Russell King in various commentsUwe Kleine-König
This patch was created by

  git grep -E -l 'Rus(el|s?e)l King' | xargs -r -t perl -p -i -e 's/Rus(el|s?e)l King/Russell King/g'

Signed-off-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Most-Definitely-Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-10rapidio: fix device reference countingEugene Surovegin
Fix RapidIO device reference counting. Signed-off-by: Eugene Surovegin <ebs@ebshome.net> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-10tpm: add Intel TPM TIS device HIDMarcin Obara
This patch adds Intel TPM TIS device HID: ICO0102 Signed-off-by: Marcin Obara <marcin_obara@users.sourceforge.net> Acked-by: Marcel Selhorst <tpm@selhorst.net> Acked-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
  tun: Persistent devices can get stuck in xoff state
  xfrm: Add a XFRM_STATE_AF_UNSPEC flag to xfrm_usersa_info
  ipv6: missed namespace context in ipv6_rthdr_rcv
  netlabel: netlink_unicast calls kfree_skb on error path by itself
  ipv4: fib_trie: Fix lookup error return
  tcp: correct kcalloc usage
  ip: sysctl documentation cleanup
  Documentation: clarify tcp_{r,w}mem sysctl docs
  netfilter: nf_nat_snmp_basic: fix a range check in NAT for SNMP
  netfilter: nf_conntrack_tcp: fix endless loop
  libertas: fix memory alignment problems on the blackfin
  zd1211rw: stop beacons on remove_interface
  rt2x00: Disable synchronization during initialization
  rc80211_pid: Fix fast_start parameter handling
  sctp: Add documentation for sctp sysctl variable
  ipv6: fix race between ipv6_del_addr and DAD timer
  irda: Fix netlink error path return value
  irda: New device ID for nsc-ircc
  irda: via-ircc proper dma freeing
  sctp: Mark the tsn as received after all allocations finish
  ...
2008-07-10tun: Persistent devices can get stuck in xoff stateMax Krasnyansky
The scenario goes like this. App stops reading from tun/tap. TX queue gets full and driver does netif_stop_queue(). App closes fd and TX queue gets flushed as part of the cleanup. Next time the app opens tun/tap and starts reading from it but the xoff state is not cleared. We're stuck. Normally xoff state is cleared when netdev is brought up. But in the case of persistent devices this happens only during initial setup. The fix is trivial. If device is already up when an app opens it we clear xoff state and that gets things moving again. Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
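The stuck-queue scenario and the fix can be walked through with a toy model. Everything here is hypothetical: struct netdev and the helper functions only mimic the state transitions described above and are not the tun driver's code.

```c
#include <assert.h>
#include <stdbool.h>

struct netdev {
    bool up;        /* device has been brought up */
    bool xoff;      /* TX queue stopped */
    bool attached;  /* an application holds the device open */
};

static void stop_queue(struct netdev *d) { d->xoff = true; }
static void wake_queue(struct netdev *d) { d->xoff = false; }

/* Bring-up clears xoff, but for a persistent device this happens
 * only during initial setup. */
static void dev_up(struct netdev *d)
{
    d->up = true;
    wake_queue(d);
}

/* Close flushes the TX queue but leaves the xoff state untouched. */
static void app_close(struct netdev *d)
{
    d->attached = false;
}

/* Open: with the fix, an already-up device gets its queue woken. */
static void app_open(struct netdev *d, bool apply_fix)
{
    assert(!d->attached);
    d->attached = true;
    if (apply_fix && d->up)
        wake_queue(d);
}
```

Replaying the sequence from the commit message (up, open, queue fills, close, reopen) leaves xoff set without the fix and cleared with it.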