mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes
Since the 2.6 kernel, the oom killer has slightly biased away from CAP_SYS_ADMIN processes by discounting some of its memory usage in comparison to other processes. This has always been implicit and nothing exactly relies on the behavior. Gaurav notices that __task_cred() can dereference a potentially freed pointer if the task under consideration is exiting because a reference to the task_struct is not held. Remove the CAP_SYS_ADMIN bias so that all processes are treated equally. If any CAP_SYS_ADMIN process would like to be biased against, it is always allowed to adjust /proc/pid/oom_score_adj. Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803071548510.6996@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Gaurav Kohli <gkohli@codeaurora.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
parent
5ecd9d403a
commit
d46078b288
|
@ -226,13 +226,6 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
|
|||
mm_pgtables_bytes(p->mm) / PAGE_SIZE;
|
||||
task_unlock(p);
|
||||
|
||||
/*
|
||||
* Root processes get 3% bonus, just like the __vm_enough_memory()
|
||||
* implementation used by LSMs.
|
||||
*/
|
||||
if (has_capability_noaudit(p, CAP_SYS_ADMIN))
|
||||
points -= (points * 3) / 100;
|
||||
|
||||
/* Normalize to oom_score_adj units */
|
||||
adj *= totalpages / 1000;
|
||||
points += adj;
|
||||
|
|
Loading…
Reference in New Issue