summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2013-03-15tracing: Add snapshot_raw to extract the raw data from snapshotSteven Rostedt (Red Hat)
Add a 'snapshot_raw' per_cpu file that allows tools to read the raw binary data of the snapshot buffer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Add config option to allow snapshot to swap per cpuSteven Rostedt (Red Hat)
When the preempt or irq latency tracers are enabled, they require the ring buffer to be able to swap the per cpu sub buffers between two main buffers. This adds a slight overhead to tracing as the trace recording needs to perform some checks to synchronize between recording and swaps that might be happening on other CPUs. The config RING_BUFFER_ALLOW_SWAP is set when a user of the ring buffer needs the "swap cpu" feature, otherwise the extra checks are not implemented and removed from the tracing overhead. The snapshot feature will swap per CPU if the RING_BUFFER_ALLOW_SWAP config is set. But that only gets set by things like OPROFILE and the irqs and preempt latency tracers. This config is added to let the user decide to include this feature with the snapshot agnostic from whether or not another user of the ring buffer sets this config. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Add snapshot in the per_cpu trace directoriesSteven Rostedt (Red Hat)
Add the snapshot file into the per_cpu tracing directories to allow them to be read for an individual cpu. This also allows to clear an individual cpu from the snapshot buffer. If the kernel allows it (CONFIG_RING_BUFFER_ALLOW_SWAP is set), then echoing in '1' into one of the per_cpu snapshot files will do an individual cpu buffer swap instead of the entire file. Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Consolidate max_tr into main trace_array structureSteven Rostedt (Red Hat)
Currently, the way the latency tracers and snapshot feature works is to have a separate trace_array called "max_tr" that holds the snapshot buffer. For latency tracers, this snapshot buffer is used to swap the running buffer with this buffer to save the current max latency. The only items needed for the max_tr is really just a copy of the buffer itself, the per_cpu data pointers, the time_start timestamp that states when the max latency was triggered, and the cpu that the max latency was triggered on. All other fields in trace_array are unused by the max_tr, making the max_tr mostly bloat. This change removes the max_tr completely, and adds a new structure called trace_buffer, that holds the buffer pointer, the per_cpu data pointers, the time_start timestamp, and the cpu where the latency occurred. The trace_array, now has two trace_buffers, one for the normal trace and one for the max trace or snapshot. By doing this, not only do we remove the bloat from the max_trace but the instances of traces can now use their own snapshot feature and not have just the top level global_trace have the snapshot feature and latency tracers for itself. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Enable snapshot when any latency tracer is enabledSteven Rostedt (Red Hat)
The snapshot utility is extremely useful, and does not add any more overhead in memory when another latency tracer is enabled. They use the snapshot underneath. There's no reason to hide the snapshot file when a latency tracer has been enabled in the kernel. If any of the latency tracers (irq, preempt or wakeup) is enabled then also select the snapshot facility. Note, snapshot can be enabled without the latency tracers enabled. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Clear all trace buffers when unloaded module event was usedSteven Rostedt (Red Hat)
Currently we do not know what buffer a module event was enabled in. On unload, it is safest to clear all buffer instances, not just the top level buffer. Todo: Clear only the buffer that the event was used in. The infrastructure is there to do this, but it makes the code a bit more complex. Lets get the current code vetted before we add that. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Only clear trace buffer on module unload if event was tracedSteven Rostedt (Red Hat)
Currently, when a module with events is unloaded, the trace buffer is cleared. This is just a safety net in case the module might have some strange callback when its event is outputted. But there's no reason to reset the buffer if the module didn't have any of its events traced. Add a flag to the event "call" structure called WAS_ENABLED and gets set when the event is ever enabled, and this flag never gets cleared. When a module gets unloaded, if any of its events have this flag set, then the trace buffer will get cleared. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15ring-buffer: Init waitqueue for blocked readersSteven Rostedt (Red Hat)
The move of blocked readers to the ring buffer left out the init of the wait queue that is used. Tests missed this due to running stress tests against the buffers, which didn't allow for any readers to end up waiting. Running a simple read and wait triggered a bug. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Fix some section mismatch warningsLi Zefan
As we've added __init annotation to field-defining functions, we should add __refdata annotation to event_call variables, which reference those functions. Link: http://lkml.kernel.org/r/51343C1F.2050502@huawei.com Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Fix trace events build without modulesSteven Rostedt (Red Hat)
The new multi-buffers added a descriptor that kept track of module events, and the directories they use, with struct ftace_module_file_ops. This is used to add a ref count to keep modules from unloading while their files are being accessed. As the descriptor is only needed when CONFIG_MODULES is enabled, it is only declared when the config is enabled. But that struct is dereferenced in a few areas outside the #ifdef CONFIG_MODULES. By adding some helper routines and moving code around a little, events can be compiled again without modules. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Add __per_cpu annotation to trace array percpu data pointerSteven Rostedt (Red Hat)
With the conversion of the data array to per cpu, sparse now complains about the use of per_cpu_ptr() on the variable. But The variable is allocated with alloc_percpu() and is fine to use. But since the structure that contains the data variable does not annotate it as such, sparse gives out a lot of false warnings. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing/syscalls: Annotate field-defining functions with __initLi Zefan
These two functions are called during kernel boot only. Link: http://lkml.kernel.org/r/51258796.7020704@huawei.com Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Annotate event field-defining functions with __initLi Zefan
Those functions are called either during kernel boot or module init. Before: $ dmesg | grep 'Freeing unused kernel memory' Freeing unused kernel memory: 1208k freed Freeing unused kernel memory: 1360k freed Freeing unused kernel memory: 1960k freed After: $ dmesg | grep 'Freeing unused kernel memory' Freeing unused kernel memory: 1236k freed Freeing unused kernel memory: 1388k freed Freeing unused kernel memory: 1960k freed Link: http://lkml.kernel.org/r/5125877D.5000201@huawei.com Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Add a helper function for event print functionsLi Zefan
Move duplicate code in event print functions to a helper function. This shrinks the size of the kernel by ~13K. text data bss dec hex filename 6596137 1743966 10138672 18478775 119f6b7 vmlinux.o.old 6583002 1743849 10138672 18465523 119c2f3 vmlinux.o.new Link: http://lkml.kernel.org/r/51258746.2060304@huawei.com Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing/ring-buffer: Move poll wake ups into ring buffer codeSteven Rostedt (Red Hat)
Move the logic to wake up on ring buffer data into the ring buffer code itself. This simplifies the tracing code a lot and also has the added benefit that waiters on one of the instance buffers can be woken only when data is added to that instance instead of data added to any instance. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Fix read blocking on trace_pipe_rawSteven Rostedt
If the ring buffer is empty, a read to trace_pipe_raw wont block. The tracing code has the infrastructure to wake up waiting readers, but the trace_pipe_raw doesn't take advantage of that. When a read is done to trace_pipe_raw without the O_NONBLOCK flag set, have the read block until there's data in the requested buffer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Fix polling on trace_pipe_rawSteven Rostedt
The trace_pipe_raw never implemented polling and this was casing issues for several utilities. This is now implemented. Blocked reads still are on the TODO list. Reported-by: Mauro Carvalho Chehab <mchehab@redhat.com> Tested-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Do not block on splice if either file or splice NONBLOCK flag is setSteven Rostedt (Red Hat)
Currently only the splice NONBLOCK flag is checked to determine if the splice read should block or not. But the file descriptor NONBLOCK flag also needs to be checked. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Use direct field, type and system namesSteven Rostedt
The names used to display the field and type in the event format files are copied, as well as the system name that is displayed. All these names are created by constant values passed in. If one of theses values were to be removed by a module, the module would also be required to remove any event it created. By using the strings directly, we can save over 100K of memory. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Use kmem_cache_alloc instead of kmalloc in trace_events.cSteven Rostedt
The event structures used by the trace events are mostly persistent, but they are also allocated by kmalloc, which is not the best at allocating space for what is used. By converting these kmallocs into kmem_cache_allocs, we can save over 50K of space that is permanently allocated. After boot we have: slab name active allocated size --------- ------ --------- ---- ftrace_event_file 979 1005 56 67 1 ftrace_event_field 2301 2310 48 77 1 The ftrace_event_file has at boot up 979 active objects out of 1005 allocated in the slabs. Each object is 56 bytes. In a normal kmalloc, that would allocate 64 bytes for each object. 1005 - 979 = 26 objects not used 26 * 56 = 1456 bytes wasted But if we used kmalloc: 64 - 56 = 8 bytes unused per allocation 8 * 979 = 7832 bytes wasted 7832 - 1456 = 6376 bytes in savings Doing the same for ftrace_event_field where there's 2301 objects allocated in a slab that can hold 2310 with 48 bytes each we have: 2310 - 2301 = 9 objects not used 9 * 48 = 432 bytes wasted A kmalloc would also use 64 bytes per object: 64 - 48 = 16 bytes unused per allocation 16 * 2301 = 36816 bytes wasted! 36816 - 432 = 36384 bytes in savings This change gives us a total of 42760 bytes in savings. At least on my machine, but as there's a lot of these persistent objects for all configurations that use trace points, this is a net win. Thanks to Ezequiel Garcia for his trace_analyze presentation which pointed out the wasted space in my code. Cc: Ezequiel Garcia <elezegarcia@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Get trace_events kernel command line working againSteven Rostedt
With the new descriptors used to allow multiple buffers in the tracing directory added, the kernel command line parameter trace_events=... no longer works. This is because the top level (global) trace array now has a list of descriptors associated with the events and the files in the debugfs directory. But in early bootup, when the command line is processed and the events enabled, the trace array list of events has not been set up yet. Without the list of events in the trace array, the setting of events to record will fail because it would not match any events. The solution is to set up the top level array in two stages. The first is to just add the ftrace file descriptors that just point to the events. This will allow events to be enabled and start tracing. The second stage is called after the filesystem is set up, and this stage will create the debugfs event files and directories associated with the trace array events. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Add rmdir to remove multibuffer instancesSteven Rostedt
Add a method to the hijacked dentry descriptor of the "instances" directory to allow for rmdir to remove an instance of a multibuffer. Example: cd /debug/tracing/instances mkdir hello ls hello/ rmdir hello ls Like the mkdir method, the i_mutex is dropped for the instances directory. The instances directory is created at boot up and can not be renamed or removed. The trace_types_lock mutex is used to synchronize adding and removing of instances. I've run several stress tests with different threads trying to create and delete directories of the same name, and it has stood up fine. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Add interface to allow multiple trace buffersSteven Rostedt
Add the interface ("instances" directory) to add multiple buffers to ftrace. To create a new instance, simply do a mkdir in the instances directory: This will create a directory with the following: # cd instances # mkdir foo # ls foo buffer_size_kb free_buffer trace_clock trace_pipe buffer_total_size_kb set_event trace_marker tracing_enabled events/ trace trace_options tracing_on Currently only events are able to be set, and there isn't a way to delete a buffer when one is created (yet). Note, the i_mutex lock is dropped from the parent "instances" directory during the mkdir operation. As the "instances" directory can not be renamed or deleted (created on boot), I do not see any harm in dropping the lock. The creation of the sub directories is protected by trace_types_lock mutex, which only lets one instance get into the code path at a time. If two tasks try to create or delete directories of the same name, only one will occur and the other will fail with -EEXIST. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Make syscall events suitable for multiple buffersSteven Rostedt
Currently the syscall events record into the global buffer. But if multiple buffers are in place, then we need to have syscall events record in the proper buffers. By adding descriptors to pass to the syscall event functions, the syscall events can now record into the buffers that have been assigned to them (one event may be applied to mulitple buffers). This will allow tracing high volume syscalls along with seldom occurring syscalls without losing the seldom syscall events. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Replace the static global per_cpu arrays with allocated per_cpuSteven Rostedt
The global and max-tr currently use static per_cpu arrays for the CPU data descriptors. But in order to get new allocated trace_arrays, they need to be allocated per_cpu arrays. Instead of using the static arrays, switch the global and max-tr to use allocated data. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Pass the ftrace_file to the buffer lock reserve codeSteven Rostedt
Pass the struct ftrace_event_file *ftrace_file to the trace_event_buffer_lock_reserve() (new function that replaces the trace_current_buffer_lock_reserver()). The ftrace_file holds a pointer to the trace_array that is in use. In the case of multiple buffers with different trace_arrays, this allows different events to be recorded into different buffers. Also fixed some of the stale comments in include/trace/ftrace.h Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Encapsulate global_trace and remove dependencies on global varsSteven Rostedt
The global_trace variable in kernel/trace/trace.c has been kept 'static' and local to that file so that it would not be used too much outside of that file. This has paid off, even though there were lots of changes to make the trace_array structure more generic (not depending on global_trace). Removal of a lot of direct usages of global_trace is needed to be able to create more trace_arrays such that we can add multiple buffers. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Use RING_BUFFER_ALL_CPUS for TRACE_PIPE_ALL_CPUSteven Rostedt
Both RING_BUFFER_ALL_CPUS and TRACE_PIPE_ALL_CPU are defined as -1 and used to say that all the ring buffers are to be modified or read (instead of just a single cpu, which would be >= 0). There's no reason to keep TRACE_PIPE_ALL_CPU as it is also started to be used for more than what it was created for, and now that the ring buffer code added a generic RING_BUFFER_ALL_CPUS define, we can clean up the trace code to use that instead and remove the TRACE_PIPE_ALL_CPU macro. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Separate out trace events from global variablesSteven Rostedt
The trace events for ftrace are all defined via global variables. The arrays of events and event systems are linked to a global list. This prevents multiple users of the event system (what to enable and what not to). By adding descriptors to represent the event/file relation, as well as to which trace_array descriptor they are associated with, allows for more than one set of events to be defined. Once the trace events files have a link between the trace event and the trace_array they are associated with, we can create multiple trace_arrays that can record separate events in separate buffers. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-14tracing: Prevent buffer overwrite disabled for latency tracersSteven Rostedt (Red Hat)
The latency tracers require the buffers to be in overwrite mode, otherwise they get screwed up. Force the buffers to stay in overwrite mode when latency tracers are enabled. Added a flag_changed() method to the tracer structure to allow the tracers to see what flags are being changed, and also be able to prevent the change from happing. Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-14tracing: Keep overwrite in sync between regular and snapshot buffersSteven Rostedt (Red Hat)
Changing the overwrite mode for the ring buffer via the trace option only sets the normal buffer. But the snapshot buffer could swap with it, and then the snapshot would be in non overwrite mode and the normal buffer would be in overwrite mode, even though the option flag states otherwise. Keep the two buffers overwrite modes in sync. Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-14tracing: Protect tracer flags with trace_types_lockSteven Rostedt (Red Hat)
Seems that the tracer flags have never been protected from synchronous writes. Luckily, admins don't usually modify the tracing flags via two different tasks. But if scripts were to be used to modify them, then they could get corrupted. Move the trace_types_lock that protects against tracers changing to also protect the flags being set. Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-14watchdog: Add comments to explain the watchdog_disabled variableanish kumar
The watchdog_disabled flag is a bit cryptic. However it's usefulness is multifold. Uses are: 1. Check if smpboot_register_percpu_thread function passed. 2. Makes sure that user enables and disables the watchdog in sequence i.e. enable watchdog->disable watchdog->enable watchdog Unlike enable watchdog->enable watchdog which is wrong. Signed-off-by: anish kumar <anish198519851985@gmail.com> [small text cleanups] Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: chuansheng.liu@intel.com Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1363113848-18344-1-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-14sched: Fix variable name misnomer, add commentsAndrei Epure
The min_vruntime variable actually stores the maximum value. The added comment was taken from place_entity function. Signed-off-by: Andrei Epure <epure.andrei@gmail.com> Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/1363115544-1964-1-git-send-email-epure.andrei@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-14Merge branch 'tip/perf/urgent-2' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent Pull tracing fixes from Steven Rostedt. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-13workqueue: rename workqueue_lock to wq_mayday_lockTejun Heo
With the recent locking updates, the only thing protected by workqueue_lock is workqueue->maydays list. Rename workqueue_lock to wq_mayday_lock. This patch is pure rename. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: separate out pool_workqueue locking into pwq_lockTejun Heo
This patch continues locking cleanup from the previous patch. It breaks out pool_workqueue synchronization from workqueue_lock into a new spinlock - pwq_lock. The followings are protected by pwq_lock. * workqueue->pwqs * workqueue->saved_max_active The conversion is straight-forward. workqueue_lock usages which cover the above two are converted to pwq_lock. New locking label PW added for things protected by pwq_lock and FR is updated to mean flush_mutex + pwq_lock + sched-RCU. This patch shouldn't introduce any visible behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: separate out pool and workqueue locking into wq_mutexTejun Heo
Currently, workqueue_lock protects most shared workqueue resources - the pools, workqueues, pool_workqueues, draining, ID assignments, mayday handling and so on. The coverage has grown organically and there is no identified bottleneck coming from workqueue_lock, but it has grown a bit too much and scheduled rebinding changes need the pools and workqueues to be protected by a mutex instead of a spinlock. This patch breaks out pool and workqueue synchronization from workqueue_lock into a new mutex - wq_mutex. The followings are protected by wq_mutex. * worker_pool_idr and unbound_pool_hash * pool->refcnt * workqueues list * workqueue->flags, ->nr_drainers Most changes are mostly straight-forward. workqueue_lock is replaced with wq_mutex where applicable and workqueue_lock lock/unlocks are added where wq_mutex conversion leaves data structures not protected by wq_mutex without locking. irq / preemption flippings were added where the conversion affects them. Things worth noting are * New WQ and WR locking lables added along with assert_rcu_or_wq_mutex(). * worker_pool_assign_id() now expects to be called under wq_mutex. * create_mutex is removed from get_unbound_pool(). It now just holds wq_mutex. This patch shouldn't introduce any visible behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: relocate global variable defs and function decls in workqueue.cTejun Heo
They're split across debugobj code for some reason. Collect them. This patch is pure relocation. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: better define locking rules around worker creation / destructionTejun Heo
When a manager creates or destroys workers, the operations are always done with the manager_mutex held; however, initial worker creation or worker destruction during pool release don't grab the mutex. They are still correct as initial worker creation doesn't require synchronization and grabbing manager_arb provides enough exclusion for pool release path. Still, let's make everyone follow the same rules for consistency and such that lockdep annotations can be added. Update create_and_start_worker() and put_unbound_pool() to grab manager_mutex around thread creation and destruction respectively and add lockdep assertions to create_worker() and destroy_worker(). This patch doesn't introduce any visible behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: factor out initial worker creation into create_and_start_worker()Tejun Heo
get_unbound_pool(), workqueue_cpu_up_callback() and init_workqueues() have similar code pieces to create and start the initial worker factor those out into create_and_start_worker(). This patch doesn't introduce any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: rename worker_pool->assoc_mutex to ->manager_mutexTejun Heo
Manager operations are currently governed by two mutexes - pool->manager_arb and ->assoc_mutex. The former is used to decide who gets to be the manager and the latter to exclude the actual manager operations including creation and destruction of workers. Anyone who grabs ->manager_arb must perform manager role; otherwise, the pool might stall. Grabbing ->assoc_mutex blocks everyone else from performing manager operations but doesn't require the holder to perform manager duties as it's merely blocking manager operations without becoming the manager. Because the blocking was necessary when [dis]associating per-cpu workqueues during CPU hotplug events, the latter was named assoc_mutex. The mutex is scheduled to be used for other purposes, so this patch gives it a more fitting generic name - manager_mutex - and updates / adds comments to explain synchronization around the manager role and operations. This patch is pure rename / doc update. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: inline trivial wrappersTejun Heo
There's no reason to make these trivial wrappers full (exported) functions. Inline the followings. queue_work() queue_delayed_work() mod_delayed_work() schedule_work_on() schedule_work() schedule_delayed_work_on() schedule_delayed_work() keventd_up() Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: rename @id to @pi in for_each_each_pool()Tejun Heo
Rename @id argument of for_each_pool() to @pi so that it doesn't get reused accidentally when for_each_pool() is used in combination with other iterators. This patch is purely cosmetic. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: update comments and a warning messageTejun Heo
* Update incorrect and add missing synchronization labels. * Update incorrect or misleading comments. Add new comments where clarification is necessary. Reformat / rephrase some comments. * drain_workqueue() can be used separately from destroy_workqueue() but its warning message was incorrectly referring to destruction. Other than the warning message change, this patch doesn't make any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: fix max_active handling in init_and_link_pwq()Tejun Heo
Since 9e8cd2f589 ("workqueue: implement apply_workqueue_attrs()"), init_and_link_pwq() may be called to initialize a new pool_workqueue for a workqueue which is already online, but the function was setting pwq->max_active to wq->saved_max_active without proper synchronization. Fix it by calling pwq_adjust_max_active() under proper locking instead of manually setting max_active. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: implement and use pwq_adjust_max_active()Tejun Heo
Rename pwq_set_max_active() to pwq_adjust_max_active() and move pool_workqueue->max_active synchronization and max_active determination logic into it. The new function should be called with workqueue_lock held for stable workqueue->saved_max_active, determines the current max_active value the target pool_workqueue should be using from @wq->saved_max_active and the state of the associated pool, and applies it with proper synchronization. The current two users - workqueue_set_max_active() and thaw_workqueues() - are updated accordingly. In addition, the manual freezing handling in __alloc_workqueue_key() and freeze_workqueues_begin() are replaced with calls to pwq_adjust_max_active(). This centralizes max_active handling so that it's less error-prone. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13workqueue: relocate pwq_set_max_active()Tejun Heo
pwq_set_max_active() is gonna be modified and used during pool_workqueue init. Move it above init_and_link_pwq(). This patch is pure code reorganization and doesn't introduce any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-13Merge branch 'akpm' (fixes from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: - A bunch of fixes - Finish off the idr API conversions before someone starts to use the old interfaces again. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: idr: idr_alloc() shouldn't trigger lowmem warning when preloaded UAPI: fix endianness conditionals in M32R's asm/stat.h UAPI: fix endianness conditionals in linux/raid/md_p.h UAPI: fix endianness conditionals in linux/acct.h UAPI: fix endianness conditionals in linux/aio_abi.h decompressors: fix typo "POWERPC" mm/fremap.c: fix oops on error path idr: deprecate idr_pre_get() and idr_get_new[_above]() tidspbridge: convert to idr_alloc() zcache: convert to idr_alloc() mlx4: remove leftover idr_pre_get() call workqueue: convert to idr_alloc() nfsd: convert to idr_alloc() nfsd: remove unused get_new_stid() kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER signal: always clear sa_restorer on execve mm: remove_memory(): fix end_pfn setting include/linux/res_counter.h needs errno.h
2013-03-13workqueue: convert to idr_alloc()Tejun Heo
idr_get_new*() and friends are about to be deprecated. Convert to the new idr_alloc() interface. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>