pids_can_fork() is special in that the css association is guaranteed
to be stable throughout the function and thus doesn't need RCU
protection around task_css access. When determining the css to charge
the pid, task_css_check() is used to override the RCU sanity check.
While adding a warning message on fork rejection from pids limit,
135b8b37bd ("cgroup: Add pids controller event when fork fails
because of pid limit") incorrectly added a task_css access which is
neither RCU protected or explicitly annotated. This triggers the
following suspicious RCU usage warning when RCU debugging is enabled.
cgroup: fork rejected by pids controller in
===============================
[ ERR: suspicious RCU usage. ]
4.10.0-work+ #1 Not tainted
-------------------------------
./include/linux/cgroup.h:435 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 0
1 lock held by bash/1748:
#0: (&cgroup_threadgroup_rwsem){+++++.}, at: [<ffffffff81052c96>] _do_fork+0xe6/0x6e0
stack backtrace:
CPU: 3 PID: 1748 Comm: bash Not tainted 4.10.0-work+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
Call Trace:
dump_stack+0x68/0x93
lockdep_rcu_suspicious+0xd7/0x110
pids_can_fork+0x1c7/0x1d0
cgroup_can_fork+0x67/0xc0
copy_process.part.58+0x1709/0x1e90
_do_fork+0xe6/0x6e0
SyS_clone+0x19/0x20
do_syscall_64+0x5c/0x140
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f7853fab93a
RSP: 002b:00007ffc12d05c90 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7853fab93a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007ffc12d05cc0 R08: 0000000000000000 R09: 00007f78548db700
R10: 00007f78548db9d0 R11: 0000000000000246 R12: 00000000000006d4
R13: 0000000000000001 R14: 0000000000000000 R15: 000055e3ebe2c04d
/asdf
There's no reason to dereference task_css again here when the
associated css is already available. Fix it by replacing the
task_cgroup() call with css->cgroup.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Galbraith <efault@gmx.de>
Fixes: 135b8b37bd ("cgroup: Add pids controller event when fork fails because of pid limit")
Cc: Kenny Yu <kennyyu@fb.com>
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Tejun Heo <tj@kernel.org>
threadgroup_change_begin()/end() is a pointless wrapper around
cgroup_threadgroup_change_begin()/end(), minus a might_sleep()
in the !CONFIG_CGROUPS=y case.
Remove the wrappery, move the might_sleep() (the down_read()
already has a might_sleep() check).
This debloats <linux/sched.h> a bit and simplifies this API.
Update all call sites.
No change in functionality.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
They're growing to be too many and planned to get split further. Move
them under their own directory.
kernel/cgroup.c -> kernel/cgroup/cgroup.c
kernel/cgroup_freezer.c -> kernel/cgroup/freezer.c
kernel/cgroup_pids.c -> kernel/cgroup/pids.c
kernel/cpuset.c -> kernel/cgroup/cpuset.c
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Acked-by: Zefan Li <lizefan@huawei.com>