Age | Commit message (Collapse) | Author |
|
When we change cpuset.memory_spread_{page,slab}, cpuset will flip
PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset.
This should be done using atomic bitops, but currently we don't,
which is broken.
Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened
when one thread tried to clear PF_USED_MATH while at the same time another
thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on
the same task.
Here's the full report:
https://lkml.org/lkml/2014/9/19/230
To fix this, we make PF_SPREAD_PAGE and PF_SPREAD_SLAB atomic flags.
v4:
- updated mm/slab.c. (Fengguang Wu)
- updated Documentation.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Kees Cook <keescook@chromium.org>
Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time")
Cc: <stable@vger.kernel.org> # 2.6.31+
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Also adds a class type PM_QOS_SUM that aggregates the values by summing them.
It can be used by memory controllers to calculate the optimum clock frequency
based on the bandwidth needs of the different memory clients.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Revert commit 6efde38f0769 (PM / Hibernate: Iterate over set bits
instead of PFNs in swsusp_free()) that introduced a NULL pointer
dereference during system resume from hibernation:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff810a8cc1>] swsusp_free+0x21/0x190
PGD b39c2067 PUD b39c1067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: <irrelevant list of modules>
CPU: 1 PID: 4898 Comm: s2disk Tainted: G C 3.17-rc5-amd64 #1 Debian 3.17~rc5-1~exp1
Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
task: ffff88023155ea40 ti: ffff8800b3b14000 task.ti: ffff8800b3b14000
RIP: 0010:[<ffffffff810a8cc1>] [<ffffffff810a8cc1>]
swsusp_free+0x21/0x190
RSP: 0018:ffff8800b3b17ea8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8800b39bab00 RCX: 0000000000000001
RDX: ffff8800b39bab10 RSI: ffff8800b39bab00 RDI: 0000000000000000
RBP: 0000000000000010 R08: 0000000000000000 R09: 0000000000000000
R10: ffff8800b39bab10 R11: 0000000000000246 R12: ffffea0000000000
R13: ffff880232f485a0 R14: ffff88023ac27cd8 R15: ffff880232927590
FS: 00007f406d83b700(0000) GS:ffff88023bc80000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000b3a62000 CR4: 00000000000007e0
Stack:
ffff8800b39bab00 0000000000000010 ffff880232927590 ffffffff810acb4a
ffff8800b39bab00 ffffffff811a955a ffff8800b39bab10 0000000000000000
ffff88023155f098 ffffffff81a6b8c0 ffff88023155ea40 0000000000000007
Call Trace:
[<ffffffff810acb4a>] ? snapshot_release+0x2a/0xb0
[<ffffffff811a955a>] ? __fput+0xca/0x1d0
[<ffffffff81080627>] ? task_work_run+0x97/0xd0
[<ffffffff81012d89>] ? do_notify_resume+0x69/0xa0
[<ffffffff8151452a>] ? int_signal+0x12/0x17
Code: 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 41 54 48 8b 05 ba 62 9c 00 49 bc 00 00 00 00 00 ea ff ff 48 8b 3d a1 62 9c 00 55 53 <48> 8b 10 48 89 50 18 48 8b 52 20 48 c7 40 28 00 00 00 00 c7 40
RIP [<ffffffff810a8cc1>] swsusp_free+0x21/0x190
RSP <ffff8800b3b17ea8>
CR2: 0000000000000000
---[ end trace f02be86a1ec0cccb ]---
due to forbidden_pages_map being NULL in swsusp_free().
Fixes: 6efde38f0769 "PM / Hibernate: Iterate over set bits instead of PFNs in swsusp_free()"
Reported-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
|
|
With the recent addition of percpu_ref_reinit(), percpu_ref now can be
used as a persistent switch which can be turned on and off repeatedly
where turning off maps to killing the ref and waiting for it to drain;
however, there currently isn't a way to initialize a percpu_ref in its
off (killed and drained) state, which can be inconvenient for certain
persistent switch use cases.
Similarly, percpu_ref_switch_to_atomic/percpu() allow dynamic
selection of operation mode; however, currently a newly initialized
percpu_ref is always in percpu mode making it impossible to avoid the
latency overhead of switching to atomic mode.
This patch adds @flags to percpu_ref_init() and implements the
following flags.
* PERCPU_REF_INIT_ATOMIC : start ref in atomic mode
* PERCPU_REF_INIT_DEAD : start ref killed and drained
These flags should be able to serve the above two use cases.
v2: target_core_tpg.c conversion was missing. Fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block into for-3.18
This is to receive 0a30288da1ae ("blk-mq, percpu_ref: implement a
kludge for SCSI blk-mq stall during probe") which implements
__percpu_ref_kill_expedited() to work around SCSI blk-mq stall. The
commit reverted and patches to implement proper fix will be added.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
|
|
This reverts commit 1f9a7268c67f0290837aada443d28fd953ddca90.
With the fix of the initial state for the cloned event we now correctly
handle the error described in:
1f9a7268c67f perf: Do not allow optimized switch for non-cloned events
so we can revert it.
I made an automated test for this, but its not suitable for automated
perf tests framework. It needs to be customized for each machine (the
more cpu the higher numbers for GROUPS/WORKERS/BYTES) and it could take
longer time to hit the issue.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140910143535.GD2409@krava.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Currently we initialize the child event based on the original
parent state. This is wrong, because the original parent event
(and its state) is not related to current fork and also could
be already gone.
We need to initialize the child state based on the immediate
parent event state.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1410520708-19275-2-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Currently we return POLLHUP in event polling if the monitored
process is done, but we didn't consider possible children,
that might be still running and producing data.
Before returning POLLHUP making sure that:
1) the monitored task has exited and that
2) we don't have any children to monitor
Also adding parent wakeup when the child event is gone.
Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1410520708-19275-1-git-send-email-jolsa@kernel.org
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Some time ago PREEMPT_NEED_RESCHED was implemented,
so reschedule technics is a little more difficult now.
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140922183642.11015.66039.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Probability of use-after-free isn't zero in this place.
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # v3.14+
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140922183636.11015.83611.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Nothing is locked there, so label's name only confuses a reader.
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140922183630.11015.59500.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
dl_bw_of() dereferences rq->rd which has to have RCU read lock held.
Probability of use-after-free isn't zero here.
Also add lockdep assert into dl_bw_cpus().
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # v3.14+
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140922183624.11015.71558.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Combine two branches which do the same.
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140922183612.11015.64200.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Kirill found that there's a subtle race in the
__ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the
entire exception because neither arch that uses it seems to actually
still require it.
Boot tested on mips64el (qemu) only.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kirill Tkhai <tkhai@yandex.ru>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Qais Yousef <qais.yousef@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: oleg@redhat.com
Cc: linux@roeck-us.net
Cc: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20140923150641.GH3312@worktop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
read_lock_irqsave(tasklist_lock) in print_rq() looks strange. We do
not need to disable irqs, and they are already disabled by the caller.
And afaics this lock buys nothing, we can rely on rcu_read_lock().
In this case it makes sense to also move rcu_read_lock/unlock from
the caller to print_rq().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140921193341.GA28628@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
task_rq_lock()
1. read_lock(tasklist_lock) does not need to disable irqs.
2. ->mm != NULL is a common mistake, use PF_KTHREAD.
3. The second ->mm check can be simply removed.
4. task_rq_lock() looks better than raw_spin_lock(&p->pi_lock) +
__task_rq_lock().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140921193338.GA28621@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
tg_has_rt_tasks() wants to find an RT task in this task_group, but
task_rq(p)->rt.tg wrongly checks the root rt_rq.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Link: http://lkml.kernel.org/r/20140921193336.GA28618@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The code in find_idlest_cpu() looks for the CPU with the smallest load.
However, if multiple CPUs are idle, the first idle CPU is selected
irrespective of the depth of its idle state.
Among the idle CPUs we should pick the one with with the shallowest idle
state, or the latest to have gone idle if all idle CPUs are in the same
state. The later applies even when cpuidle is configured out.
This patch doesn't cover the following issues:
- The idle exit latency of a CPU might be larger than the time needed
to migrate the waking task to an already running CPU with sufficient
capacity, and therefore performance would benefit from task packing
in such case (in most cases task packing is about power saving).
- Some idle states have a non negligible and non abortable entry latency
which needs to run to completion before the exit latency can start.
A concurrent patch series is making this info available to the cpuidle
core. Once available, the entry latency with the idle timestamp could
determine when the exit latency may be effective.
Those issues will be handled in due course. In the mean time, what
is implemented here should improve things already compared to the current
state of affairs.
Based on an initial patch from Daniel Lezcano.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-pm@vger.kernel.org
Cc: linaro-kernel@lists.linaro.org
Link: http://lkml.kernel.org/n/tip-@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When the cpu enters idle, it stores the cpuidle state pointer in its
struct rq instance which in turn could be used to make a better decision
when balancing tasks.
As soon as the cpu exits its idle state, the struct rq reference is
cleared.
There are a couple of situations where the idle state pointer could be changed
while it is being consulted:
1. For x86/acpi with dynamic c-states, when a laptop switches from battery
to AC that could result on removing the deeper idle state. The acpi driver
triggers:
'acpi_processor_cst_has_changed'
'cpuidle_pause_and_lock'
'cpuidle_uninstall_idle_handler'
'kick_all_cpus_sync'.
All cpus will exit their idle state and the pointed object will be set to
NULL.
2. The cpuidle driver is unloaded. Logically that could happen but not
in practice because the drivers are always compiled in and 95% of them are
not coded to unregister themselves. In any case, the unloading code must
call 'cpuidle_unregister_device', that calls 'cpuidle_pause_and_lock'
leading to 'kick_all_cpus_sync' as mentioned above.
A race can happen if we use the pointer and then one of these two scenarios
occurs at the same moment.
In order to be safe, the idle state pointer stored in the rq must be
used inside a rcu_read_lock section where we are protected with the
'rcu_barrier' in the 'cpuidle_uninstall_idle_handler' function. The
idle_get_state() and idle_put_state() accessors should be used to that
effect.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: linux-pm@vger.kernel.org
Cc: linaro-kernel@lists.linaro.org
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Users can perform clustered scheduling using the cpuset facility.
After an exclusive cpuset is created, task migrations happen only
between CPUs belonging to the same cpuset. Inter- cpuset migrations
can only happen when the user requires so, moving a task between
different cpusets. This behaviour is broken in SCHED_DEADLINE, as
currently spurious inter- cpuset migration may happen without user
intervention.
This patch fix the problem (and shuffles the code a bit to improve
clarity).
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: raistlin@linux.it
Cc: michael@amarulasolutions.com
Cc: fchecconi@gmail.com
Cc: daniel.wagner@bmw-carit.de
Cc: vincent@legout.info
Cc: luca.abeni@unitn.it
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1411118561-26323-4-git-send-email-juri.lelli@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When a task is using SCHED_DEADLINE and the user setschedules it to a
different class its sched_dl_entity static parameters are not cleaned
up. This causes a bug if the user sets it back to SCHED_DEADLINE with
the same parameters again. The problem resides in the check we
perform at the very beginning of dl_overflow():
if (new_bw == p->dl.dl_bw)
return 0;
This condition is met in the case depicted above, so the function
returns and dl_b->total_bw is not updated (the p->dl.dl_bw is not
added to it). After this, admission control is broken.
This patch fixes the thing, properly clearing static parameters for a
task that ceases to use SCHED_DEADLINE.
Reported-by: Daniele Alessandrelli <daniele.alessandrelli@gmail.com>
Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Reported-by: Vincent Legout <vincent@legout.info>
Tested-by: Luca Abeni <luca.abeni@unitn.it>
Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Tested-by: Vincent Legout <vincent@legout.info>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Fabio Checconi <fchecconi@gmail.com>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Michael Trimarchi <michael@amarulasolutions.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1411118561-26323-2-git-send-email-juri.lelli@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
current->state == TASK_DEAD means that the task is doing its
last schedule(), page fault is obviously impossible at this
stage.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140921194743.GA30114@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When task->comm is passed directly to audit_log_untrustedstring() without
getting a copy or using the task_lock, there is a race that could happen that
would output a NULL (\0) in the output string that would effectively truncate
the rest of the report text after the comm= field in the audit, losing fields.
Use get_task_comm() to get a copy while acquiring the task_lock to prevent
this and to prevent the result from being a mixture of old and new values of
comm.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
open_arg() was added in commit 55669bfa "audit: AUDIT_PERM support"
and never used. Remove it.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
When an AUDIT_GET_FEATURE message is sent from userspace to the kernel, it
should reply with a message tagged as an AUDIT_GET_FEATURE type with a struct
audit_feature. The current reply is a message tagged as an AUDIT_GET
type with a struct audit_feature.
This appears to have been a cut-and-paste-eo in commit b0fed40.
Reported-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
Report:
Looking at your example code in
http://people.redhat.com/rbriggs/audit-multicast-listen/audit-multicast-listen.c,
it seems that nlmsg_len field in the received messages is supposed to
contain the length of the header + payload, but it is always set to the
size of the header only, i.e. 16. The example program works, because
the printf format specifies the minimum width, not "precision", so it
simply prints out the payload until the first zero byte. This isn't too
much of a problem, but precludes the use of recvmmsg, iiuc?
(gdb) p *(struct nlmsghdr*)nlh
$14 = {nlmsg_len = 16, nlmsg_type = 1100, nlmsg_flags = 0, nlmsg_seq = 0, nlmsg_pid = 9910}
The only time nlmsg_len would have been updated was at audit_buffer_alloc()
inside audit_log_start() and never updated after. It should arguably be done
in audit_log_vformat(), but would be more efficient in audit_log_end().
Reported-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
Since only one of val, uid, gid and lsm* are used at any given time, combine
them to reduce the size of the struct audit_field.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
Various audit events dealing with adding, removing and updating rules result in
invalid values set for the op keys which result in embedded spaces in op=
values.
The invalid values are
op="add rule" set in kernel/auditfilter.c
op="remove rule" set in kernel/auditfilter.c
op="remove rule" set in kernel/audit_tree.c
op="updated rules" set in kernel/audit_watch.c
op="remove rule" set in kernel/audit_watch.c
Replace the space in the above values with an underscore character ('_').
Coded-by: Burn Alting <burn@swtf.dyndns.org>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
Since there is already a primitive to do this operation in the atomic_t, use it
to simplify audit_serial().
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
Use kernel.h definition.
Cc: Eric Paris <eparis@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
audit_log_fcaps() isn't used outside kernel/audit.c. Reduce its scope.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
audit_net_id isn't used outside kernel/audit.c. Reduce its scope.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
|
|
Since the arch is found locally in __audit_syscall_entry(), there is no need to
pass it in as a parameter. Delete it from the parameter list.
x86* was the only arch to call __audit_syscall_entry() directly and did so from
assembly code.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-audit@redhat.com
Signed-off-by: Eric Paris <eparis@redhat.com>
---
As this patch relies on changes in the audit tree, I think it
appropriate to send it through my tree rather than the x86 tree.
|
|
The AUDIT_SECCOMP record looks something like this:
type=SECCOMP msg=audit(1373478171.953:32775): auid=4325 uid=4325 gid=4325 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0 pid=12381 comm="test" sig=31 syscall=231 compat=0 ip=0x39ea8bca89 code=0x0
In order to determine what syscall 231 maps to, we need to have the arch= field right before it.
To see the event, compile this test.c program:
=====
int main(void)
{
return seccomp_load(seccomp_init(SCMP_ACT_KILL));
}
=====
gcc -g test.c -o test -lseccomp
After running the program, find the record by: ausearch --start recent -m SECCOMP -i
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
signed-off-by: Eric Paris <eparis@redhat.com>
|
|
directly
Since every arch should have syscall_get_arch() defined, stop using the
function argument and just collect this ourselves. We do not drop the
argument as fixing some code paths (in assembly) to not pass this first
argument is non-trivial. The argument will be dropped when that is
fixed.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
|
|
Conflicts:
arch/mips/net/bpf_jit.c
drivers/net/can/flexcan.c
Both the flexcan and MIPS bpf_jit conflicts were cases of simple
overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
"One late fix for cgroup.
I was waiting for another set of fixes for a long-standing obscure
cpuset bug but am not sure whether they'll be ready before v3.17
release. This one is a simple fix for a mutex unlock balance bug in
an allocation failure path in pidlist_array_load().
The bug was introduced in v3.14 and the fix is tagged for -stable"
* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix unbalanced locking
|
|
'tiny/no-advice', 'tiny/tinyconfig' and 'tiny/x86-boot-compressed-use-yn' into tiny/next
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull the v3.18 RCU changes from Paul E. McKenney:
"
* Update RCU documentation. These were posted to LKML at
https://lkml.org/lkml/2014/8/28/378.
* Miscellaneous fixes. These were posted to LKML at
https://lkml.org/lkml/2014/8/28/386. An additional fix that
eliminates a documented (but now inconvenient) deadlock between
RCU hotplug and expedited grace periods was posted at
https://lkml.org/lkml/2014/8/28/573.
* Changes related to No-CBs CPUs and NO_HZ_FULL. These were posted
to LKML at https://lkml.org/lkml/2014/8/28/412.
* Torture-test updates. These were posted to LKML at
https://lkml.org/lkml/2014/8/28/546 and at
https://lkml.org/lkml/2014/9/11/1114.
* RCU-tasks implementation. These were posted to LKML at
https://lkml.org/lkml/2014/8/28/540.
"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This patch moves Exynos PM domain code to use the new generic PM domain
look-up framework introduced in previous patches, thus also allowing
the new code to be compiled with CONFIG_ARCH_EXYNOS.
This patch was originally submitted by Tomasz Figa when he was employed
by Samsung.
Link: http://marc.info/?l=linux-pm&m=139955336002083&w=2
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch introduces generic code to perform PM domain look-up using
device tree and automatically bind devices to their PM domains.
Generic device tree bindings are introduced to specify PM domains of
devices in their device tree nodes.
Backwards compatibility with legacy Samsung-specific PM domain bindings
is provided, but for now the new code is not compiled when
CONFIG_ARCH_EXYNOS is selected to avoid collision with legacy code.
This will change as soon as the Exynos PM domain code gets converted to
use the generic framework in further patch.
This patch was originally submitted by Tomasz Figa when he was employed
by Samsung.
Link: http://marc.info/?l=linux-pm&m=139955349702152&w=2
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch adds another suspend_resume trace event for analyze_suspend
to capture. The resume_console call can take several hundred milliseconds
if the printk buffer is full of debug info. The tool will now inform
testers of the wasted time and encourage them to disable it in
production builds.
Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Signed-off-by: Zhihui Zhang <zzhsuny@gmail.com>
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/1411262676-19928-1-git-send-email-zzhsuny@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Both pinned_sb and new_sb indicate if a new superblock is needed,
so we can just remove new_sb.
Note now we must check if kernfs_tryget_sb() returns NULL, because
when it returns NULL, kernfs_mount() may still re-use an existing
superblock, which is just allocated by another concurent mount.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The patch 971ff4935538: "cgroup: use a per-cgroup work for release
agent" from Sep 18, 2014, leads to the following static checker
warning:
kernel/cgroup.c:5310 cgroup_release_agent()
warn: 'mutex:&cgroup_mutex' is sometimes locked here and sometimes unlocked.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Two kernel side fixes: a kprobes fix and a perf_remove_from_context()
fix (which does not yet fix the migration bug which is WIP)"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix a race condition in perf_remove_from_context()
kprobes/x86: Free 'optinsn' cache when range check fails
|
|
We call put_css_set() after setting CGRP_RELEASABLE flag in
cgroup_task_migrate(), but in other places we call it without setting
the flag. I don't see the necessity of this flag.
Moreover once the flag is set, it will never be cleared, unless writing
to the notify_on_release control file, so it can be quite confusing
if we look at the output of debug.releasable.
# mount -t cgroup -o debug xxx /cgroup
# mkdir /cgroup/child
# cat /cgroup/child/debug.releasable
0 <-- shows 0 though the cgroup is empty
# echo $$ > /cgroup/child/tasks
# cat /cgroup/child/debug.releasable
0
# echo $$ > /cgroup/tasks && echo $$ > /cgroup/child/tasks
# cat /proc/child/debug.releasable
1 <-- shows 1 though the cgroup is not empty
This patch removes the flag, and now debug.releasable shows if the
cgroup is empty or not.
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Commit 5a17f543ed68 ("cgroup: improve css_from_dir() into css_tryget_from_dir()")
removed perf_tryget_cgroup(), so let's also remove perf_put_cgroup().
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Currently the task always wakes affine on this_cpu if the latter is idle.
Before waking up the task on this_cpu, we check that this_cpu capacity is not
significantly reduced because of RT tasks or irq activity.
Use case where the number of irq and/or the time spent under irq is important
will take benefit of this because the task that is woken up by irq or softirq
will not use the same CPU than irq (and softirq) but a idle one.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: preeti@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: Morten.Rasmussen@arm.com
Cc: efault@gmx.de
Cc: nicolas.pitre@linaro.org
Cc: daniel.lezcano@linaro.org
Cc: dietmar.eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1409051215-16788-8-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|