cgroup: add delegation section to unified hierarchy documentation
v2: Rearranged paragraphs as suggested by Johannes Weiner. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
This commit is contained in:
parent
187fe84067
commit
8a0792ef8e
|
@ -17,15 +17,18 @@ CONTENTS
|
|||
3. Structural Constraints
|
||||
3-1. Top-down
|
||||
3-2. No internal tasks
|
||||
4. Other Changes
|
||||
4-1. [Un]populated Notification
|
||||
4-2. Other Core Changes
|
||||
4-3. Per-Controller Changes
|
||||
4-3-1. blkio
|
||||
4-3-2. cpuset
|
||||
4-3-3. memory
|
||||
5. Planned Changes
|
||||
5-1. CAP for resource control
|
||||
4. Delegation
|
||||
4-1. Model of delegation
|
||||
4-2. Common ancestor rule
|
||||
5. Other Changes
|
||||
5-1. [Un]populated Notification
|
||||
5-2. Other Core Changes
|
||||
5-3. Per-Controller Changes
|
||||
5-3-1. blkio
|
||||
5-3-2. cpuset
|
||||
5-3-3. memory
|
||||
6. Planned Changes
|
||||
6-1. CAP for resource control
|
||||
|
||||
|
||||
1. Background
|
||||
|
@ -245,9 +248,72 @@ cgroup must create children and transfer all its tasks to the children
|
|||
before enabling controllers in its "cgroup.subtree_control" file.
|
||||
|
||||
|
||||
4. Other Changes
|
||||
4. Delegation
|
||||
|
||||
4-1. [Un]populated Notification
|
||||
4-1. Model of delegation
|
||||
|
||||
A cgroup can be delegated to a less privileged user by granting write
|
||||
access of the directory and its "cgroup.procs" file to the user. Note
|
||||
that the resource control knobs in a given directory concern the
|
||||
resources of the parent and thus must not be delegated along with the
|
||||
directory.
|
||||
|
||||
Once delegated, the user can build sub-hierarchy under the directory,
|
||||
organize processes as it sees fit and further distribute the resources
|
||||
it got from the parent. The limits and other settings of all resource
|
||||
controllers are hierarchical and regardless of what happens in the
|
||||
delegated sub-hierarchy, nothing can escape the resource restrictions
|
||||
imposed by the parent.
|
||||
|
||||
Currently, cgroup doesn't impose any restrictions on the number of
|
||||
cgroups in or nesting depth of a delegated sub-hierarchy; however,
|
||||
this may in the future be limited explicitly.
|
||||
|
||||
|
||||
4-2. Common ancestor rule
|
||||
|
||||
On the unified hierarchy, to write to a "cgroup.procs" file, in
|
||||
addition to the usual write permission to the file and uid match, the
|
||||
writer must also have write access to the "cgroup.procs" file of the
|
||||
common ancestor of the source and destination cgroups. This prevents
|
||||
delegatees from smuggling processes across disjoint sub-hierarchies.
|
||||
|
||||
Let's say cgroups C0 and C1 have been delegated to user U0 who created
|
||||
C00, C01 under C0 and C10 under C1 as follows.
|
||||
|
||||
~~~~~~~~~~~~~ - C0 - C00
|
||||
~ cgroup ~ \ C01
|
||||
~ hierarchy ~
|
||||
~~~~~~~~~~~~~ - C1 - C10
|
||||
|
||||
C0 and C1 are separate entities in terms of resource distribution
|
||||
regardless of their relative positions in the hierarchy. The
|
||||
resources the processes under C0 are entitled to are controlled by
|
||||
C0's ancestors and may be completely different from C1. It's clear
|
||||
that the intention of delegating C0 to U0 is allowing U0 to organize
|
||||
the processes under C0 and further control the distribution of C0's
|
||||
resources.
|
||||
|
||||
On traditional hierarchies, if a task has write access to "tasks" or
|
||||
"cgroup.procs" file of a cgroup and its uid agrees with the target, it
|
||||
can move the target to the cgroup. In the above example, U0 will not
|
||||
only be able to move processes in each sub-hierarchy but also across
|
||||
the two sub-hierarchies, effectively allowing it to violate the
|
||||
organizational and resource restrictions implied by the hierarchical
|
||||
structure above C0 and C1.
|
||||
|
||||
On the unified hierarchy, let's say U0 wants to write the pid of a
|
||||
process which has a matching uid and is currently in C10 into
|
||||
"C00/cgroup.procs". U0 obviously has write access to the file and
|
||||
migration permission on the process; however, the common ancestor of
|
||||
the source cgroup C10 and the destination cgroup C00 is above the
|
||||
points of delegation and U0 would not have write access to its
|
||||
"cgroup.procs" and thus be denied with -EACCES.
|
||||
|
||||
|
||||
5. Other Changes
|
||||
|
||||
5-1. [Un]populated Notification
|
||||
|
||||
cgroup users often need a way to determine when a cgroup's
|
||||
subhierarchy becomes empty so that it can be cleaned up. cgroup
|
||||
|
@ -289,7 +355,7 @@ supported and the interface files "release_agent" and
|
|||
"notify_on_release" do not exist.
|
||||
|
||||
|
||||
4-2. Other Core Changes
|
||||
5-2. Other Core Changes
|
||||
|
||||
- None of the mount options is allowed.
|
||||
|
||||
|
@ -306,14 +372,14 @@ supported and the interface files "release_agent" and
|
|||
- The "cgroup.clone_children" file is removed.
|
||||
|
||||
|
||||
4-3. Per-Controller Changes
|
||||
5-3. Per-Controller Changes
|
||||
|
||||
4-3-1. blkio
|
||||
5-3-1. blkio
|
||||
|
||||
- blk-throttle becomes properly hierarchical.
|
||||
|
||||
|
||||
4-3-2. cpuset
|
||||
5-3-2. cpuset
|
||||
|
||||
- Tasks are kept in empty cpusets after hotplug and take on the masks
|
||||
of the nearest non-empty ancestor, instead of being moved to it.
|
||||
|
@ -322,7 +388,7 @@ supported and the interface files "release_agent" and
|
|||
masks of the nearest non-empty ancestor.
|
||||
|
||||
|
||||
4-3-3. memory
|
||||
5-3-3. memory
|
||||
|
||||
- use_hierarchy is on by default and the cgroup file for the flag is
|
||||
not created.
|
||||
|
@ -407,9 +473,9 @@ supported and the interface files "release_agent" and
|
|||
memory.low, memory.high, and memory.max will use the string "max" to
|
||||
indicate and set the highest possible value.
|
||||
|
||||
5. Planned Changes
|
||||
6. Planned Changes
|
||||
|
||||
5-1. CAP for resource control
|
||||
6-1. CAP for resource control
|
||||
|
||||
Unified hierarchy will require one of the capabilities(7), which is
|
||||
yet to be decided, for all resource control related knobs. Process
|
||||
|
|
Loading…
Reference in New Issue