Since this is a runC-specific feature, this belongs here rather than in
opencontainers/ocitools (which is for generic OCI runtimes).
In addition, we don't create a new network namespace. This is because
currently if you want to set up a veth bridge you need CAP_NET_ADMIN in
both network namespaces' pinned user namespace to create the necessary
interfaces in each network namespace.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This enables support for rootless container mode. There are many
restrictions on what rootless containers can do, so many different runC
commands have been disabled:
* runc checkpoint
* runc events
* runc pause
* runc ps
* runc restore
* runc resume
* runc update
The following commands work:
* runc create
* runc delete
* runc exec
* runc kill
* runc list
* runc run
* runc spec
* runc state
In addition, any specification options that imply joining cgroups have
also been disabled. This is due to support for unprivileged subtree
management not being available from Linux upstream.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
For example, the /sys/firmware directory should be masked because it can contain some sensitive files:
- /sys/firmware/acpi/tables/{SLIC,MSDM}: Windows license information
- /sys/firmware/ibft/target0/chap-secret: iSCSI CHAP secret
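As a minimal sketch of what masking by directory means for the files listed above: the helper below checks whether a path equals a masked entry or lies beneath one. The names `maskedPaths` and `isMasked` are hypothetical and are not taken from runc; the list contains only the one directory named in this message.

```go
package main

import (
	"fmt"
	"strings"
)

// maskedPaths is an illustrative list. Only /sys/firmware comes from the
// commit message; the real list in runc may differ.
var maskedPaths = []string{"/sys/firmware"}

// isMasked reports whether path equals a masked entry or lies beneath one.
// The "/" suffix keeps "/sys/firmwarex" from matching "/sys/firmware".
func isMasked(path string) bool {
	for _, m := range maskedPaths {
		if path == m || strings.HasPrefix(path, m+"/") {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{
		"/sys/firmware/acpi/tables/SLIC",
		"/sys/firmware/ibft/target0/chap-secret",
		"/sys/fs/cgroup",
	} {
		fmt.Printf("%-45s masked=%v\n", p, isMasked(p))
	}
}
```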
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
runc currently fails to build against the upstream version of
runtime-spec/specs-go.
```
# github.com/opencontainers/runc
./spec.go:189: cannot use specs.Linux literal (type specs.Linux) as type *specs.Linux in field value
```
on account of 63231576ec (diff-7f24d60f0cbb9c433e165467e3d34838R25)
This commit updates the dependency to current runtime-spec master and
fixes the type mismatch.
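The kind of mismatch reported by the compiler can be illustrated with simplified stand-in types. `Linux` and `Spec` below are hypothetical and much smaller than the real specs-go structs; the sketch only shows that a pointer field needs the address of a literal rather than the literal itself.

```go
package main

import "fmt"

// Linux and Spec are simplified stand-ins, not the real specs-go types.
type Linux struct {
	Namespaces []string
}

type Spec struct {
	Linux *Linux // pointer field, as the compiler error implies
}

// defaultSpec builds a Spec with the Linux section populated.
func defaultSpec() Spec {
	return Spec{
		// Writing `Linux: Linux{...}` here would fail to compile with
		// "cannot use Linux literal (type Linux) as type *Linux".
		Linux: &Linux{Namespaces: []string{"pid", "mount"}},
	}
}

func main() {
	s := defaultSpec()
	fmt.Println(len(s.Linux.Namespaces))
}
```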
Fixes #1035
Signed-off-by: Adam Thomason <ad@mthomason.net>
/proc/timer_list seems to leak information about the host. Here is
an example from a busybox container running on docker+kubernetes.
# cat /proc/timer_list | grep -i -e kube
<ffff8800b8cc3db0>, hrtimer_wakeup, S:01, futex_wait_queue_me, kubelet/2497
<ffff880129ac3db0>, hrtimer_wakeup, S:01, futex_wait_queue_me, kube-proxy/3478
<ffff8800b1b77db0>, hrtimer_wakeup, S:01, futex_wait_queue_me, kube-proxy/3470
<ffff8800bb6abdb0>, hrtimer_wakeup, S:01, futex_wait_queue_me, kubelet/2499
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Users of libcontainer other than runc may also require parsing and
converting specification configuration files.
Since runc cannot be imported, move the relevant functions and
definitions to a separate package, libcontainer/specconv.
Signed-off-by: Ido Yariv <ido@wizery.com>
systemd expects cgroupsPath to be of the form "slice:prefix:name".
So don't call cleanPath on it anymore.
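A minimal sketch of splitting a value of that form into its three parts. `parseSystemdCgroupsPath` is a hypothetical helper and not the function runc uses; the example input is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSystemdCgroupsPath splits "slice:prefix:name" into its components.
// Hypothetical helper for illustration only.
func parseSystemdCgroupsPath(p string) (slice, prefix, name string, err error) {
	parts := strings.Split(p, ":")
	if len(parts) != 3 {
		return "", "", "", fmt.Errorf("expected \"slice:prefix:name\", got %q", p)
	}
	return parts[0], parts[1], parts[2], nil
}

func main() {
	slice, prefix, name, err := parseSystemdCgroupsPath("system.slice:runc:mycontainer")
	fmt.Println(slice, prefix, name, err)
}
```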
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
This updates runc and libcontainer to handle rlimits per process and set
them correctly for the container.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
The error handling on the runc cli is currently pretty messy because
messages to the user are split between regular stderr format and logrus
message format. This changes all the error reporting to the cli to only
output on stderr and exit(1) for consumers of the api.
By default logrus logs to /dev/null so that it is not seen by the user.
If the user wants extra and/or structured logging/errors from runc they
can use the `--log` flag to provide a path to the file where they want
this information. This allows a consistent behavior on the cli but
extra power and information when debugging with logs.
This also includes a change to enable the same logging information
inside the container's init by adding an init cli command that can share
the existing flags for all other runc commands.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This commit adds support to libcontainer for setting caps, no new privs,
apparmor, and selinux process label on the process struct so that they can
be used together with, or override, the base settings on the container
config per individual process.
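One way to picture a per-process override is shown below. This is a sketch under stated assumptions: `Config`, `Process`, and the `effective*` helpers are hypothetical and simplified, and the "nil means inherit" rule is an illustrative choice, not necessarily how libcontainer resolves the settings.

```go
package main

import "fmt"

// Config and Process are simplified, hypothetical stand-ins for the
// libcontainer structs; field names are assumptions for illustration.
type Config struct {
	Capabilities    []string
	NoNewPrivileges bool
}

type Process struct {
	Capabilities    []string // nil means "inherit from Config"
	NoNewPrivileges *bool    // nil means "inherit from Config"
}

// effectiveCaps returns the process capabilities if set, else the base ones.
func effectiveCaps(c Config, p Process) []string {
	if p.Capabilities != nil {
		return p.Capabilities
	}
	return c.Capabilities
}

// effectiveNoNewPrivs returns the process setting if set, else the base one.
func effectiveNoNewPrivs(c Config, p Process) bool {
	if p.NoNewPrivileges != nil {
		return *p.NoNewPrivileges
	}
	return c.NoNewPrivileges
}

func main() {
	base := Config{Capabilities: []string{"CAP_CHOWN", "CAP_KILL"}}
	nnp := true
	proc := Process{Capabilities: []string{"CAP_KILL"}, NoNewPrivileges: &nnp}
	fmt.Println(effectiveCaps(base, proc), effectiveNoNewPrivs(base, proc))
	fmt.Println(effectiveCaps(base, Process{}), effectiveNoNewPrivs(base, Process{}))
}
```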
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This bump of the spec includes a change to the device type to be a
string so that it is more readable in the json serialization.
It also includes the change where caps, no new privs, and process
labeling features are moved from the container config onto the process.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This updates the current list to what we have now in docker and also
ensures these entries are always added so that they are masked out. Privileged
containers can always unmount these if they want to read from kcore or
something like that.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This saves and returns the bundle path for the container in the
container's config and state. It also returns the information via runc
list.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
The prior fix to set "-1" explicitly was lost, and it is simpler to use
the same pointer type from the OCI spec to handle the nil pointer == -1 ==
unset case.
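The nil pointer == -1 == unset convention can be sketched as follows. `swappinessOrUnset` is a hypothetical helper, and the `*uint64` parameter type is an assumption about the spec field; the point is only that a pointer distinguishes "not set" from an explicit zero.

```go
package main

import "fmt"

// swappinessOrUnset maps a nil pointer to -1 ("unset") and otherwise
// returns the pointed-to value. Hypothetical helper for illustration.
func swappinessOrUnset(v *uint64) int64 {
	if v == nil {
		return -1
	}
	return int64(*v)
}

func main() {
	zero := uint64(0)
	fmt.Println(swappinessOrUnset(nil))   // unset
	fmt.Println(swappinessOrUnset(&zero)) // explicit zero, distinct from unset
}
```

With a plain integer field, the zero value would be indistinguishable from an explicit request for swappiness 0, which is the failure mode described in this message.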
Also, as a nearly humorous aside, there was a test for MemorySwappiness
that was actually setting Memory, and it was passing because of this
bug (as it was always setting everyone's MemorySwappiness to zero!)
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
When CgroupsPath code was introduced with #497 it was mistakenly made
to act as the equivalent of docker CgroupsParent. This ensures that it
is taken as the final cgroup path.
A couple of unit tests have been added to prevent future regression.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>