From d559db086ff5be9bcc259e5aa50bf3d881eaf1d1 Mon Sep 17 00:00:00 2001 From: KAMEZAWA Hiroyuki Date: Fri, 5 Mar 2010 13:41:39 -0800 Subject: mm: clean up mm_counter Presently, per-mm statistics counter is defined by macro in sched.h This patch modifies it to - defined in mm.h as inlinf functions - use array instead of macro's name creation. This patch is for reducing patch size in future patch to modify implementation of per-mm counter. Signed-off-by: KAMEZAWA Hiroyuki Reviewed-by: Minchan Kim Cc: Christoph Lameter Cc: Lee Schermerhorn Cc: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/oom_kill.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'mm/oom_kill.c') diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 237050478f28..35755a4156d6 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -401,8 +401,8 @@ static void __oom_kill_task(struct task_struct *p, int verbose) "vsz:%lukB, anon-rss:%lukB, file-rss:%lukB\n", task_pid_nr(p), p->comm, K(p->mm->total_vm), - K(get_mm_counter(p->mm, anon_rss)), - K(get_mm_counter(p->mm, file_rss))); + K(get_mm_counter(p->mm, MM_ANONPAGES)), + K(get_mm_counter(p->mm, MM_FILEPAGES))); task_unlock(p); /* -- cgit From daaf1e68874c078a15ae6ae827751839c4d81739 Mon Sep 17 00:00:00 2001 From: KAMEZAWA Hiroyuki Date: Wed, 10 Mar 2010 15:22:32 -0800 Subject: memcg: handle panic_on_oom=always case Presently, if panic_on_oom=2, the whole system panics even if the oom happend in some special situation (as cpuset, mempolicy....). Then, panic_on_oom=2 means painc_on_oom_always. Now, memcg doesn't check panic_on_oom flag. This patch adds a check. BTW, how it's useful ? kdump+panic_on_oom=2 is the last tool to investigate what happens in oom-ed system. When a task is killed, the sysytem recovers and there will be few hint to know what happnes. In mission critical system, oom should never happen. Then, panic_on_oom=2+kdump is useful to avoid next OOM by knowing precise information via snapshot. TODO: - For memcg, it's for isolate system's memory usage, oom-notiifer and freeze_at_oom (or rest_at_oom) should be implemented. Then, management daemon can do similar jobs (as kdump) or taking snapshot per cgroup. Signed-off-by: KAMEZAWA Hiroyuki Cc: Balbir Singh Cc: David Rientjes Cc: Nick Piggin Reviewed-by: Daisuke Nishimura Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/oom_kill.c | 2 ++ 1 file changed, 2 insertions(+) (limited to 'mm/oom_kill.c') diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 35755a4156d6..71d10bf52dc8 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -473,6 +473,8 @@ void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask) unsigned long points = 0; struct task_struct *p; + if (sysctl_panic_on_oom == 2) + panic("out of memory(memcg). panic_on_oom is selected.\n"); read_lock(&tasklist_lock); retry: p = select_bad_process(&points, mem); -- cgit From 867578cbccb0893cc14fc29c670f7185809c90d6 Mon Sep 17 00:00:00 2001 From: KAMEZAWA Hiroyuki Date: Wed, 10 Mar 2010 15:22:39 -0800 Subject: memcg: fix oom kill behavior In current page-fault code, handle_mm_fault() -> ... -> mem_cgroup_charge() -> map page or handle error. -> check return code. If page fault's return code is VM_FAULT_OOM, page_fault_out_of_memory() is called. But if it's caused by memcg, OOM should have been already invoked. Then, I added a patch: a636b327f731143ccc544b966cfd8de6cb6d72c6. That patch records last_oom_jiffies for memcg's sub-hierarchy and prevents page_fault_out_of_memory from being invoked in near future. But Nishimura-san reported that check by jiffies is not enough when the system is terribly heavy. This patch changes memcg's oom logic as. * If memcg causes OOM-kill, continue to retry. * remove jiffies check which is used now. * add memcg-oom-lock which works like perzone oom lock. * If current is killed(as a process), bypass charge. Something more sophisticated can be added but this pactch does fundamental things. TODO: - add oom notifier - add permemcg disable-oom-kill flag and freezer at oom. - more chances for wake up oom waiter (when changing memory limit etc..) Reviewed-by: Daisuke Nishimura Tested-by: Daisuke Nishimura Signed-off-by: KAMEZAWA Hiroyuki Cc: Balbir Singh Cc: David Rientjes Signed-off-by: Daisuke Nishimura Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/oom_kill.c | 8 -------- 1 file changed, 8 deletions(-) (limited to 'mm/oom_kill.c') diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 71d10bf52dc8..9b223af6a147 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -603,13 +603,6 @@ void pagefault_out_of_memory(void) /* Got some memory back in the last second. */ return; - /* - * If this is from memcg, oom-killer is already invoked. - * and not worth to go system-wide-oom. - */ - if (mem_cgroup_oom_called(current)) - goto rest_and_return; - if (sysctl_panic_on_oom) panic("out of memory from page fault. panic_on_oom is selected.\n"); @@ -621,7 +614,6 @@ void pagefault_out_of_memory(void) * Give "p" a good chance of killing itself before we * retry to allocate memory. */ -rest_and_return: if (!test_thread_flag(TIF_MEMDIE)) schedule_timeout_uninterruptible(1); } -- cgit