linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2009-09-14	sh: multi-evt support for SH-X3 proto CPU.	Paul Mundt
	This adds support for multiple vectors per unique IRQ masking source on the SH-X3 proto CPU. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-09-14	perf tools: Add an option to multiplex counters in a single channel	Frederic Weisbecker
	Add an option to multiplex counters output in the channel of the group leader, ie: the first counter opened: -M --multiplex The effect is better serialized samples. This is especially useful for tracepoint samples that need to be well serialized for their post-processing. Also make use of this option in 'perf sched'. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-14	block: use blkdev_issue_discard in blk_ioctl_discard	Christoph Hellwig
	blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard, the only difference between the two is that blkdev_issue_discard needs to send a barrier discard request and blk_ioctl_discard a non-barrier one, and blk_ioctl_discard needs to wait on the request. To facilitates this add a flags argument to blkdev_issue_discard to control both aspects of the behaviour. This will be very useful later on for using the waiting funcitonality for other callers. Based on an earlier patch from Matthew Wilcox <matthew@wil.cx>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14	Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads	David Woodhouse
	The commands are conceptually writes, and in the case of IDE and SCSI commands actually are writes. They were only reads because we thought that would interact better with the elevators. Now the elevators know about discard requests, that advantage no longer exists. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14	block: don't assume device has a request list backing in nr_requests store	Jens Axboe
	Stacked devices do not. For now, just error out with -EINVAL. Later we could make the limit apply on stacked devices too, for throttling reasons. This fixes 5a54cd13353bb3b88887604e2c980aa01e314309 and should go into 2.6.31 stable as well. Cc: stable@kernel.org Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14	block: Optimal I/O limit wrapper	Martin K. Petersen
	Implement blk_limits_io_opt() and make blk_queue_io_opt() a wrapper around it. DM needs this to avoid poking at the queue_limits directly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14	cfq: choose a new next_req when a request is dispatched	Jeff Moyer
	This patch addresses http://bugzilla.kernel.org/show_bug.cgi?id=13401, a regression introduced in 2.6.30. From the bug report: Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14	Seperate read and write statistics of in_flight requests	Nikanth Karthikesan
	Currently, there is a single in_flight counter measuring the number of requests in the request_queue. But some monitoring tools would like to know how many read requests and write requests are in progress. Split the current in_flight counter into two seperate counters for read and write. This information is exported as a sysfs attribute, as changing the currently available stat files would break the existing tools. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14	aoe: end barrier bios with EOPNOTSUPP	Ed Cashin
	BugLink: http://bugzilla.kernel.org/show_bug.cgi?id=13942 Bruno Premont noticed that aoe throws a BUG during umount of an XFS in 2.6.31: [ 5259.349897] aoe: bi_io_vec is NULL [ 5259.349940] ------------[ cut here ]------------ [ 5259.349958] kernel BUG at /usr/src/linux-2.6/drivers/block/aoe/aoeblk.c:177! [ 5259.349990] invalid opcode: 0000 [#1] The bio in question is a barrier. Jens Axboe suggested that such bios need to be recognized and ended with -EOPNOTSUPP by any driver that provides its own ->make_request_fn handler and does not handle barriers. In testing the changes below eliminate the BUG. (Better would be real barrier support, something that Ed says he'll add for later in the .32 cycle. For now, this at least gets rid of a bug with crashing on an empty barrier. Jens) Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14	drm/radeon/kms: cleanup - remove radeon_share.h	Jerome Glisse
	radeon_share.h was begining to give problem with include order in respect of radeon.h. It's easier and also i think cleaner to move what was in radeon_share.h into radeon.h. At the same time use the extern keyword for function shared accross the module. Signed-off-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-09-14	drm/radeon/kms: move mtrr range add and memory information	Jerome Glisse
	Move mtrr range and memory information printing to radeon_object_init, this are memory information and initialization common to all GPU and they better fit in this function. Will also prevent code duplication with upcoming init path changes. airlied: fixed warning introduced Signed-off-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-09-14	drm/radeon/kms: convert r4xx to new init path	Jerome Glisse
	This convert r4xx to new init path it also fix few bugs. Signed-off-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-09-14	sh: clkfwk: remove bogus set_bus_parent() from SH7709.	Rafael Ignacio Zurita
	This fixes up broken clock re-parenting undertaken by the SH7709 clock framework code, which is currently in conflict with the legacy CPG framework. With this change in place, the legacy CPG ancestry is used, and we manage to avoid contending on the clock_list_sem mutex, which is already held under the legacy registration path, resulting in livelock. In order for SH7709 to fully support the varying clock modes, it needs to implement a more complete clock framework. After this change it is in sync with legacy CPG mode, which ends up being the default configuration for this CPU anyways. Signed-off-by: Rafael Ignacio Zurita <rizurita@yahoo.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-09-14	slub: fix slab_pad_check()	Eric Dumazet
	When SLAB_POISON is used and slab_pad_check() finds an overwrite of the slab padding, we call restore_bytes() on the whole slab, not only on the padding. Acked-by: Christoph Lameer <cl@linux-foundation.org> Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-09-13	Merge branch 'next' into for-linus	Dmitry Torokhov

2009-09-14	SELinux: flush the avc before disabling SELinux	Eric Paris
	Before SELinux is disabled at boot it can create AVC entries. This patch will flush those entries before disabling SELinux. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14	SELinux: seperate avc_cache flushing	Eric Paris
	Move the avc_cache flushing into it's own function so it can be reused when disabling SELinux. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14	Creds: creds->security can be NULL is selinux is disabled	Eric Paris
	__validate_process_creds should check if selinux is actually enabled before running tests on the selinux portion of the credentials struct. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2009-09-13	nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition	Benny Halevy
	Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-13	nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel	Alexandros Batsakis
	[sunrpc: change idle timeout value for the backchannel] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-13	Merge branches 'upstream', 'upstream-fixes' and 'debugfs' into for-linus	Jiri Kosina

2009-09-13	perf_counter, sched: Add sched_stat_runtime tracepoint	Ingo Molnar
	This allows more precise tracking of how the scheduler accounts (and acts upon) a task having spent N nanoseconds of CPU time. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	Input: bcm5974 - silence uninitialized variables warnings	Henrik Rydberg
	Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-09-13	Input: wistron_btns - add keymap for AOpen 1557	Dmitry Torokhov
	This one does not have anything useful in DMI either so again we need to use: force=1 keymap=aopen1557 Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-09-13	perf sched: Add 'perf sched trace', improve documentation	Ingo Molnar
	Alias 'perf sched trace' to 'perf trace', for workflow completeness. Add a bit of documentation for perf sched. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf_counter: Allow mmap if paranoid checks are turned off	Ingo Molnar
	Before: $ perf sched record -f sleep 1 Error: failed to mmap with 1 (Operation not permitted) After: $ perf sched record -f sleep 1 [ perf record: Captured and wrote 0.095 MB perf.data (~4161 samples) ] Note, this is only allowed if perfcounter_paranoid is set to the most permissive (non-default) value of -1. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Implement the 'perf sched record' subcommand	Ingo Molnar
	Implement the 'perf sched record' subcommand that adds a default list of events, turns on raw sampling and system-wide tracing and passes off the rest of the command to perf record. This is more convenient than having to specify the events all the time. Before: $ perf record -a -R -e sched:sched_switch:r -e sched:sched_stat_wait:r -e sched:sched_stat_sleep:r -e sched:sched_stat_iowait:r -e sched:sched_process_exit:r -e sched:sched_process_fork:r -e sched:sched_wakeup:r -e sched:sched_migrate_task:r -c 1 sleep 1 After: $ perf sched record -f sleep 1 Also fix an assumption in the event string parser that assumed that strings passed in can be modified. (In this case they wont be as they come from a readonly constant section.) Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Clean up PID sorting logic	Ingo Molnar
	Use a sort list for thread atoms insertion as well - instead of hardcoded for PID. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Finish latency => atom rename and misc cleanups	Ingo Molnar
	- Rename 'latency' field/variable names to the better 'atom' ones - Reduce the number of #include lines and consolidate them - Gather file scope variables at the top of the file - Remove unused bits No change in functionality. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Add 'perf sched latency' and 'perf sched replay'	Ingo Molnar
	Separate the option parsing cleanly and add two variants: - 'perf sched latency' (can be abbreviated via 'perf sched lat') - 'perf sched replay' (can be abbreviated via 'perf sched rep') Also add a repeat count option to replay and add a separation set of options for replay. Do the sorting setup only in the latency sub-command. Display separate help screens for 'perf sched' and 'perf sched replay -h' - i.e. further separation of the sub-commands. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Implement multidimensional sorting	Frederic Weisbecker
	Implement multidimensional sorting on perf sched so that you can sort either by number of switches, latency average, latency maximum, runtime. perf sched -l -s avg,max (this is the default) ----------------------------------------------------------------------------------- Task \| Runtime ms \| Switches \| Average delay ms \| Maximum delay ms \| ----------------------------------------------------------------------------------- gnome-power-man \| 0.113 ms \| 1 \| avg: 4998.531 ms \| max: 4998.531 ms \| xfdesktop \| 1.190 ms \| 7 \| avg: 136.475 ms \| max: 940.933 ms \| xfce-mcs-manage \| 2.194 ms \| 22 \| avg: 38.534 ms \| max: 735.174 ms \| notification-da \| 2.749 ms \| 31 \| avg: 27.436 ms \| max: 731.791 ms \| xfce4-session \| 3.343 ms \| 28 \| avg: 26.796 ms \| max: 734.891 ms \| xfwm4 \| 3.159 ms \| 22 \| avg: 12.406 ms \| max: 241.333 ms \| xchat \| 42.789 ms \| 214 \| avg: 11.886 ms \| max: 100.349 ms \| xfce4-terminal \| 5.386 ms \| 22 \| avg: 11.414 ms \| max: 241.611 ms \| firefox \| 151.992 ms \| 123 \| avg: 9.543 ms \| max: 153.717 ms \| xfce4-panel \| 24.324 ms \| 47 \| avg: 8.189 ms \| max: 242.352 ms \| :5090 \| 6.932 ms \| 111 \| avg: 8.131 ms \| max: 102.665 ms \| events/0 \| 0.758 ms \| 12 \| avg: 1.964 ms \| max: 21.879 ms \| Xorg \| 280.558 ms \| 340 \| avg: 1.864 ms \| max: 99.526 ms \| geany \| 63.391 ms \| 295 \| avg: 1.099 ms \| max: 9.334 ms \| reiserfs/0 \| 0.039 ms \| 2 \| avg: 0.854 ms \| max: 1.487 ms \| kondemand/0 \| 8.251 ms \| 245 \| avg: 0.691 ms \| max: 34.372 ms \| Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Fix nsec to msec conversion	Frederic Weisbecker
	We are dividing a time in ns by 1e9. This is a nsec to sec conversion. What we want is msecs. Fix it by dividing by 1e6. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Export the total, max latency and total runtime to thread atoms list	Frederic Weisbecker
	Add a field in the thread atom list that keeps track of the total and max latencies and also the total runtime. This makes a faster output and also prepares for sorting. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Add involuntarily sleeping task in work atoms	Frederic Weisbecker
	Currently in perf sched, we are measuring the scheduler wakeup latencies. Now we also want measure the time a task wait to be scheduled after it gets preempted. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Rename struct lat_snapshot to struct work atoms	Frederic Weisbecker
	To measures the latencies, we capture the sched atoms data into a specific structure named struct lat_snapshot. As this structure can be used for other purposes of scheduler profiling and mirrors what happens in a thread work atom, lets rename it to struct work_atom and propagate this renaming in other functions and structures names to keep it coherent. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Output runtime and context switch totals	Ingo Molnar
	After: ----------------------------------------------------------------------------------- Task \| Runtime ms \| Switches \| Average delay ms \| Maximum delay ms \| ----------------------------------------------------------------------------------- make \| 0.678 ms \| 13 \| avg: 0.018 ms \| max: 0.050 ms \| gcc \| 0.014 ms \| 2 \| avg: 0.320 ms \| max: 0.627 ms \| gcc \| 0.000 ms \| 2 \| avg: 0.185 ms \| max: 0.369 ms \| ... ----------------------------------------------------------------------------------- TOTAL: \| 21.316 ms \| 63 \| --------------------------------------------- Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Add runtime stats	Ingo Molnar
	Extend the latency tracking structure with scheduling atom runtime info - and sum it up during per task display. (Also clean up a few details.) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Display time in milliseconds, reorganize output	Ingo Molnar
	After: ----------------------------------------------------------------------------------- Task \| runtime ms \| switches \| average delay ms \| maximum delay ms \| ----------------------------------------------------------------------------------- migration/0 \| 0.000 ms \| 1 \| avg: 0.047 ms \| max: 0.047 ms \| ksoftirqd/0 \| 0.000 ms \| 1 \| avg: 0.039 ms \| max: 0.039 ms \| migration/1 \| 0.000 ms \| 3 \| avg: 0.013 ms \| max: 0.016 ms \| migration/3 \| 0.000 ms \| 2 \| avg: 0.003 ms \| max: 0.004 ms \| migration/4 \| 0.000 ms \| 1 \| avg: 0.022 ms \| max: 0.022 ms \| distccd \| 0.000 ms \| 1 \| avg: 0.004 ms \| max: 0.004 ms \| distccd \| 0.000 ms \| 1 \| avg: 0.014 ms \| max: 0.014 ms \| distccd \| 0.000 ms \| 2 \| avg: 0.000 ms \| max: 0.000 ms \| distccd \| 0.000 ms \| 2 \| avg: 0.012 ms \| max: 0.019 ms \| distccd \| 0.000 ms \| 1 \| avg: 0.002 ms \| max: 0.002 ms \| as \| 0.000 ms \| 2 \| avg: 0.019 ms \| max: 0.019 ms \| as \| 0.000 ms \| 3 \| avg: 0.015 ms \| max: 0.017 ms \| as \| 0.000 ms \| 1 \| avg: 0.009 ms \| max: 0.009 ms \| perf \| 0.000 ms \| 1 \| avg: 0.001 ms \| max: 0.001 ms \| gcc \| 0.000 ms \| 1 \| avg: 0.021 ms \| max: 0.021 ms \| run-mozilla.sh \| 0.000 ms \| 2 \| avg: 0.010 ms \| max: 0.017 ms \| mozilla-plugin- \| 0.000 ms \| 1 \| avg: 0.006 ms \| max: 0.006 ms \| gcc \| 0.000 ms \| 2 \| avg: 0.013 ms \| max: 0.013 ms \| ----------------------------------------------------------------------------------- (The runtime ms column is not filled in yet.) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Clean up latency and replay sub-commands	Ingo Molnar
	- Separate the latency and the replay commands more cleanly - Use consistent naming - Display help page on 'perf sched' outlining comments, instead of aborting Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Add sched latency profiling	Frederic Weisbecker
	Add the -l --latency option that reports statistics about the scheduler latencies. For now, the latencies are measured in the following sequence scope: - task A is sleeping (D or S state) - task B wakes up A ^ \| \| latency timeframe \| \| v - task A is scheduled in Start by recording every scheduler events: perf record -e sched:* and then fetch the results: perf sched -l Tasks count total avg max migration/0 2 39849 19924 28826 ksoftirqd/0 7 756383 108054 373014 migration/1 5 45391 9078 10452 ksoftirqd/1 2 399055 199527 359130 events/0 8 4780110 597513 4500250 events/1 9 6353057 705895 2986012 kblockd/0 42 37805097 900121 5077684 The snapshot are in nanoseconds. - Count: number of snapshots taken for the given task - Total: total latencies in nanosec - Avg : average of latency between wake up and sched in - Max : max snapshot latency Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Make it easier to plug in new sub profilers	Frederic Weisbecker
	Create a sched event structure of handlers in which various sched events reader can plug their own callbacks. This makes easier the addition of new perf sched sub commands. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Fix bad event alignment	Frederic Weisbecker
	perf sched raises the following error when it meets a sched switch event: perf: builtin-sched.c:286: register_pid: Assertion `!(pid >= 65536)' failed. Abandon Currently in x86-64, the sched switch events have a hole in the middle of the structure: u16 common_type; u8 common_flags; u8 common_preempt_count; u32 common_pid; u32 common_tgid; char prev_comm[16]; u32 prev_pid; u32 prev_prio; <--- there u64 prev_state; char next_comm[16]; u32 next_pid; u32 next_prio; Gcc inserts a 4 bytes hole there for prev_state to be u64 aligned. And the events are exported to userspace with this hole. But in userspace, from perf sched, we fetch it using a structure that has a new field in the beginning: u32 size. This is because our trace is exported with its size as a field. But now that we have this new field, the hole in the middle disappears because it makes prev_state becoming well aligned. And since we are using a pointer to the raw trace using this struct, instead of reading prev_state, we are reading the hole. We could fix it by keeping the size seperate from the struct but actually there a lot of other potential problems: some fields may be saved as long in a 64 bits system and later read as long in a 32 bits system. Also this direct cast doesn't care about the endianness differences between the host traced machine and the machine in which we do the post processing. So instead of using such dangerous direct casts, fetch the values using the trace parsing API that already takes care of all these problems. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf tools: Allow the specification of all tracepoints at once	Frederic Weisbecker
	Currently, when one wants to activate every tracepoint counters of a subsystem from perf record, the current sequence is needed: perf record -e subsys:ev1 -e subsys:ev2 -e subsys:ev3 This may annoy the most patient of us. Now we can just do: perf record -e subsys:* Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Tighten up the code	Ingo Molnar
	Various small cleanups - removal of debug printks and dead functions, etc. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Implement the scheduling workload replay engine	Ingo Molnar
	Integrate the schedbench.c bits with the raw trace events that we get from the perf machinery, and activate the workload replayer/simulator. Example of a captured 'make -j' workload: $ perf sched run measurement overhead: 90 nsecs sleep measurement overhead: 2724743 nsecs the run test took 1000081 nsecs the sleep test took 2981111 nsecs version = 0.5 ... nr_run_events: 70 nr_sleep_events: 66 nr_wakeup_events: 9 target-less wakeups: 71 multi-target wakeups: 47 run events optimized: 139 task 0 ( perf: 6607), nr_events: 2 task 1 ( perf: 6608), nr_events: 6 task 2 ( : 0), nr_events: 1 task 3 ( make: 6609), nr_events: 5 task 4 ( sh: 6610), nr_events: 4 task 5 ( make: 6611), nr_events: 6 task 6 ( sh: 6612), nr_events: 4 task 7 ( make: 6613), nr_events: 5 task 8 ( migration/11: 25), nr_events: 1 task 9 ( migration/13: 29), nr_events: 1 task 10 ( migration/15: 33), nr_events: 1 task 11 ( migration/9: 21), nr_events: 1 task 12 ( sh: 6614), nr_events: 4 task 13 ( make: 6615), nr_events: 5 task 14 ( sh: 6616), nr_events: 4 task 15 ( make: 6617), nr_events: 7 task 16 ( migration/3: 9), nr_events: 1 task 17 ( migration/5: 13), nr_events: 1 task 18 ( migration/7: 17), nr_events: 1 task 19 ( migration/1: 5), nr_events: 1 task 20 ( sh: 6618), nr_events: 4 task 21 ( make: 6619), nr_events: 5 task 22 ( sh: 6620), nr_events: 4 task 23 ( make: 6621), nr_events: 10 task 24 ( sh: 6623), nr_events: 3 task 25 ( gcc: 6624), nr_events: 4 task 26 ( gcc: 6625), nr_events: 4 task 27 ( gcc: 6626), nr_events: 5 task 28 ( collect2: 6627), nr_events: 5 task 29 ( sh: 6622), nr_events: 1 task 30 ( make: 6628), nr_events: 7 task 31 ( sh: 6630), nr_events: 4 task 32 ( gcc: 6631), nr_events: 4 task 33 ( sh: 6629), nr_events: 1 task 34 ( gcc: 6632), nr_events: 4 task 35 ( gcc: 6633), nr_events: 4 task 36 ( collect2: 6634), nr_events: 4 task 37 ( make: 6635), nr_events: 8 task 38 ( sh: 6637), nr_events: 4 task 39 ( sh: 6636), nr_events: 1 task 40 ( gcc: 6638), nr_events: 4 task 41 ( gcc: 6639), nr_events: 4 task 42 ( gcc: 6640), nr_events: 4 task 43 ( collect2: 6641), nr_events: 4 task 44 ( make: 6642), nr_events: 6 task 45 ( sh: 6643), nr_events: 5 task 46 ( sh: 6644), nr_events: 3 task 47 ( sh: 6645), nr_events: 4 task 48 ( make: 6646), nr_events: 6 task 49 ( sh: 6647), nr_events: 3 task 50 ( make: 6648), nr_events: 5 task 51 ( sh: 6649), nr_events: 5 task 52 ( sh: 6650), nr_events: 6 task 53 ( make: 6651), nr_events: 4 task 54 ( make: 6652), nr_events: 5 task 55 ( make: 6653), nr_events: 4 task 56 ( make: 6654), nr_events: 4 task 57 ( make: 6655), nr_events: 5 task 58 ( sh: 6656), nr_events: 4 task 59 ( gcc: 6657), nr_events: 9 task 60 ( ksoftirqd/3: 10), nr_events: 1 task 61 ( gcc: 6658), nr_events: 4 task 62 ( make: 6659), nr_events: 5 task 63 ( sh: 6660), nr_events: 3 task 64 ( gcc: 6661), nr_events: 5 task 65 ( collect2: 6662), nr_events: 4 ------------------------------------------------------------ #1 : 256.745, ravg: 256.74, cpu: 0.00 / 0.00 #2 : 439.372, ravg: 275.01, cpu: 0.00 / 0.00 #3 : 411.971, ravg: 288.70, cpu: 0.00 / 0.00 #4 : 385.500, ravg: 298.38, cpu: 0.00 / 0.00 #5 : 366.526, ravg: 305.20, cpu: 0.00 / 0.00 #6 : 381.281, ravg: 312.81, cpu: 0.00 / 0.00 #7 : 410.756, ravg: 322.60, cpu: 0.00 / 0.00 #8 : 368.009, ravg: 327.14, cpu: 0.00 / 0.00 #9 : 408.098, ravg: 335.24, cpu: 0.00 / 0.00 #10 : 368.582, ravg: 338.57, cpu: 0.00 / 0.00 I.e. we successfully analyzed the trace, replayed it via real threads and measured the replayed workload's scheduling properties. This is how it looked like in 'top' output: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 7164 mingo 20 0 1434m 8080 888 R 57.0 0.1 0:02.04 :perf 7165 mingo 20 0 1434m 8080 888 R 41.8 0.1 0:01.52 :perf 7228 mingo 20 0 1434m 8080 888 R 39.8 0.1 0:01.44 :gcc 7225 mingo 20 0 1434m 8080 888 R 33.8 0.1 0:01.26 :gcc 7202 mingo 20 0 1434m 8080 888 R 31.2 0.1 0:01.16 :sh 7222 mingo 20 0 1434m 8080 888 R 25.2 0.1 0:00.96 :sh 7211 mingo 20 0 1434m 8080 888 R 21.9 0.1 0:00.82 :sh 7213 mingo 20 0 1434m 8080 888 D 19.2 0.1 0:00.74 :sh 7194 mingo 20 0 1434m 8080 888 D 18.6 0.1 0:00.72 :make There's still various kinks in it - more patches to come. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf sched: Import schedbench.c	Ingo Molnar
	Import the schedbench.c tool that i wrote some time ago to simulate scheduler behavior but never finished. It's a good basis for perf sched nevertheless. Most of its guts are not hooked up to the perf event loop yet - that will be done in the patches to come. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13	perf: Add 'perf sched' tool	Ingo Molnar
	This turn-key tool allows scheduler measurements to be conducted and the results be displayed numerically. First baby step towards that goal: clone the new command off of perf trace. Fix a few other details along the way: - add (minimal) perf trace documentation - reorder a few places - list perf trace in the mainporcelain list as well as it's a very useful utility. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-12	tracing: add filter event logic to special, mmiotrace and boot tracers	Steven Rostedt
	Now that the pluging tracers use macros to create the structures and automate the exporting of their formats to the format files, they also automatically get a filter file. This patch adds the code to implement the filter logic in the trace recordings. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-09-12	tracing: remove trace_event_types.h	Steven Rostedt
	The macros in trace_entries.h have made the code in trace_event_types.h obsolete. The file is no longer used, so this patch removes it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-09-12	tracing: use the new trace_entries.h to create format files	Steven Rostedt
	This patch changes the way the format files in debugfs/tracing/events/ftrace/*/format are created. It uses the new trace_entries.h file to automate the creation of the format files to ensure that they are always in sync with the actual structures. This is the same methodology used to create the format files for the TRACE_EVENT macro. This also updates the filter creation that was built on the creation of the format files. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>