OpenCloudOS-Kernel/include/uapi/linux/landlock.h

138 lines
4.6 KiB
C
Raw Normal View History

landlock: Support filesystem access-control Using Landlock objects and ruleset, it is possible to tag inodes according to a process's domain. To enable an unprivileged process to express a file hierarchy, it first needs to open a directory (or a file) and pass this file descriptor to the kernel through landlock_add_rule(2). When checking if a file access request is allowed, we walk from the requested dentry to the real root, following the different mount layers. The access to each "tagged" inodes are collected according to their rule layer level, and ANDed to create access to the requested file hierarchy. This makes possible to identify a lot of files without tagging every inodes nor modifying the filesystem, while still following the view and understanding the user has from the filesystem. Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not keep the same struct inodes for the same inodes whereas these inodes are in use. This commit adds a minimal set of supported filesystem access-control which doesn't enable to restrict all file-related actions. This is the result of multiple discussions to minimize the code of Landlock to ease review. Thanks to the Landlock design, extending this access-control without breaking user space will not be a problem. Moreover, seccomp filters can be used to restrict the use of syscall families which may not be currently handled by Landlock. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Kees Cook <keescook@chromium.org> Cc: Richard Weinberger <richard@nod.at> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-8-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 23:41:17 +08:00
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
* Landlock - User space API
*
* Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
* Copyright © 2018-2020 ANSSI
*/
#ifndef _UAPI_LINUX_LANDLOCK_H
#define _UAPI_LINUX_LANDLOCK_H
landlock: Add syscall implementations These 3 system calls are designed to be used by unprivileged processes to sandbox themselves: * landlock_create_ruleset(2): Creates a ruleset and returns its file descriptor. * landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a ruleset, identified by the dedicated file descriptor. * landlock_restrict_self(2): Enforces a ruleset on the calling thread and its future children (similar to seccomp). This syscall has the same usage restrictions as seccomp(2): the caller must have the no_new_privs attribute set or have CAP_SYS_ADMIN in the current user namespace. All these syscalls have a "flags" argument (not currently used) to enable extensibility. Here are the motivations for these new syscalls: * A sandboxed process may not have access to file systems, including /dev, /sys or /proc, but it should still be able to add more restrictions to itself. * Neither prctl(2) nor seccomp(2) (which was used in a previous version) fit well with the current definition of a Landlock security policy. All passed structs (attributes) are checked at build time to ensure that they don't contain holes and that they are aligned the same way for each architecture. See the user and kernel documentation for more details (provided by a following commit): * Documentation/userspace-api/landlock.rst * Documentation/security/landlock.rst Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Acked-by: Serge Hallyn <serge@hallyn.com> Link: https://lore.kernel.org/r/20210422154123.13086-9-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 23:41:18 +08:00
#include <linux/types.h>
/**
* struct landlock_ruleset_attr - Ruleset definition
*
* Argument of sys_landlock_create_ruleset(). This structure can grow in
* future versions.
*/
struct landlock_ruleset_attr {
/**
* @handled_access_fs: Bitmask of actions (cf. `Filesystem flags`_)
* that is handled by this ruleset and should then be forbidden if no
* rule explicitly allow them. This is needed for backward
* compatibility reasons.
*/
__u64 handled_access_fs;
};
landlock: Enable user space to infer supported features Add a new flag LANDLOCK_CREATE_RULESET_VERSION to landlock_create_ruleset(2). This enables to retreive a Landlock ABI version that is useful to efficiently follow a best-effort security approach. Indeed, it would be a missed opportunity to abort the whole sandbox building, because some features are unavailable, instead of protecting users as much as possible with the subset of features provided by the running kernel. This new flag enables user space to identify the minimum set of Landlock features supported by the running kernel without relying on a filesystem interface (e.g. /proc/version, which might be inaccessible) nor testing multiple syscall argument combinations (i.e. syscall bisection). New Landlock features will be documented and tied to a minimum version number (greater than 1). The current version will be incremented for each new kernel release supporting new Landlock features. User space libraries can leverage this information to seamlessly restrict processes as much as possible while being compatible with newer APIs. This is a much more lighter approach than the previous landlock_get_features(2): the complexity is pushed to user space libraries. This flag meets similar needs as securityfs versions: selinux/policyvers, apparmor/features/*/version* and tomoyo/version. Supporting this flag now will be convenient for backward compatibility. Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-14-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 23:41:23 +08:00
/*
* sys_landlock_create_ruleset() flags:
*
* - %LANDLOCK_CREATE_RULESET_VERSION: Get the highest supported Landlock ABI
* version.
*/
#define LANDLOCK_CREATE_RULESET_VERSION (1U << 0)
landlock: Add syscall implementations These 3 system calls are designed to be used by unprivileged processes to sandbox themselves: * landlock_create_ruleset(2): Creates a ruleset and returns its file descriptor. * landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a ruleset, identified by the dedicated file descriptor. * landlock_restrict_self(2): Enforces a ruleset on the calling thread and its future children (similar to seccomp). This syscall has the same usage restrictions as seccomp(2): the caller must have the no_new_privs attribute set or have CAP_SYS_ADMIN in the current user namespace. All these syscalls have a "flags" argument (not currently used) to enable extensibility. Here are the motivations for these new syscalls: * A sandboxed process may not have access to file systems, including /dev, /sys or /proc, but it should still be able to add more restrictions to itself. * Neither prctl(2) nor seccomp(2) (which was used in a previous version) fit well with the current definition of a Landlock security policy. All passed structs (attributes) are checked at build time to ensure that they don't contain holes and that they are aligned the same way for each architecture. See the user and kernel documentation for more details (provided by a following commit): * Documentation/userspace-api/landlock.rst * Documentation/security/landlock.rst Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Acked-by: Serge Hallyn <serge@hallyn.com> Link: https://lore.kernel.org/r/20210422154123.13086-9-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 23:41:18 +08:00
/**
* enum landlock_rule_type - Landlock rule type
*
* Argument of sys_landlock_add_rule().
*/
enum landlock_rule_type {
/**
* @LANDLOCK_RULE_PATH_BENEATH: Type of a &struct
* landlock_path_beneath_attr .
*/
LANDLOCK_RULE_PATH_BENEATH = 1,
};
/**
* struct landlock_path_beneath_attr - Path hierarchy definition
*
* Argument of sys_landlock_add_rule().
*/
struct landlock_path_beneath_attr {
/**
* @allowed_access: Bitmask of allowed actions for this file hierarchy
* (cf. `Filesystem flags`_).
*/
__u64 allowed_access;
/**
* @parent_fd: File descriptor, open with ``O_PATH``, which identifies
* the parent directory of a file hierarchy, or just a file.
*/
__s32 parent_fd;
/*
* This struct is packed to avoid trailing reserved members.
* Cf. security/landlock/syscalls.c:build_check_abi()
*/
} __attribute__((packed));
landlock: Support filesystem access-control Using Landlock objects and ruleset, it is possible to tag inodes according to a process's domain. To enable an unprivileged process to express a file hierarchy, it first needs to open a directory (or a file) and pass this file descriptor to the kernel through landlock_add_rule(2). When checking if a file access request is allowed, we walk from the requested dentry to the real root, following the different mount layers. The access to each "tagged" inodes are collected according to their rule layer level, and ANDed to create access to the requested file hierarchy. This makes possible to identify a lot of files without tagging every inodes nor modifying the filesystem, while still following the view and understanding the user has from the filesystem. Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not keep the same struct inodes for the same inodes whereas these inodes are in use. This commit adds a minimal set of supported filesystem access-control which doesn't enable to restrict all file-related actions. This is the result of multiple discussions to minimize the code of Landlock to ease review. Thanks to the Landlock design, extending this access-control without breaking user space will not be a problem. Moreover, seccomp filters can be used to restrict the use of syscall families which may not be currently handled by Landlock. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Kees Cook <keescook@chromium.org> Cc: Richard Weinberger <richard@nod.at> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-8-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 23:41:17 +08:00
/**
* DOC: fs_access
*
* A set of actions on kernel objects may be defined by an attribute (e.g.
* &struct landlock_path_beneath_attr) including a bitmask of access.
*
* Filesystem flags
* ~~~~~~~~~~~~~~~~
*
* These flags enable to restrict a sandboxed process to a set of actions on
* files and directories. Files or directories opened before the sandboxing
* are not subject to these restrictions.
*
* A file can only receive these access rights:
*
* - %LANDLOCK_ACCESS_FS_EXECUTE: Execute a file.
* - %LANDLOCK_ACCESS_FS_WRITE_FILE: Open a file with write access.
* - %LANDLOCK_ACCESS_FS_READ_FILE: Open a file with read access.
*
* A directory can receive access rights related to files or directories. The
* following access right is applied to the directory itself, and the
* directories beneath it:
*
* - %LANDLOCK_ACCESS_FS_READ_DIR: Open a directory or list its content.
*
* However, the following access rights only apply to the content of a
* directory, not the directory itself:
*
* - %LANDLOCK_ACCESS_FS_REMOVE_DIR: Remove an empty directory or rename one.
* - %LANDLOCK_ACCESS_FS_REMOVE_FILE: Unlink (or rename) a file.
* - %LANDLOCK_ACCESS_FS_MAKE_CHAR: Create (or rename or link) a character
* device.
* - %LANDLOCK_ACCESS_FS_MAKE_DIR: Create (or rename) a directory.
* - %LANDLOCK_ACCESS_FS_MAKE_REG: Create (or rename or link) a regular file.
* - %LANDLOCK_ACCESS_FS_MAKE_SOCK: Create (or rename or link) a UNIX domain
* socket.
* - %LANDLOCK_ACCESS_FS_MAKE_FIFO: Create (or rename or link) a named pipe.
* - %LANDLOCK_ACCESS_FS_MAKE_BLOCK: Create (or rename or link) a block device.
* - %LANDLOCK_ACCESS_FS_MAKE_SYM: Create (or rename or link) a symbolic link.
*
* .. warning::
*
* It is currently not possible to restrict some file-related actions
* accessible through these syscall families: :manpage:`chdir(2)`,
* :manpage:`truncate(2)`, :manpage:`stat(2)`, :manpage:`flock(2)`,
* :manpage:`chmod(2)`, :manpage:`chown(2)`, :manpage:`setxattr(2)`,
* :manpage:`utime(2)`, :manpage:`ioctl(2)`, :manpage:`fcntl(2)`,
* :manpage:`access(2)`.
* Future Landlock evolutions will enable to restrict them.
*/
#define LANDLOCK_ACCESS_FS_EXECUTE (1ULL << 0)
#define LANDLOCK_ACCESS_FS_WRITE_FILE (1ULL << 1)
#define LANDLOCK_ACCESS_FS_READ_FILE (1ULL << 2)
#define LANDLOCK_ACCESS_FS_READ_DIR (1ULL << 3)
#define LANDLOCK_ACCESS_FS_REMOVE_DIR (1ULL << 4)
#define LANDLOCK_ACCESS_FS_REMOVE_FILE (1ULL << 5)
#define LANDLOCK_ACCESS_FS_MAKE_CHAR (1ULL << 6)
#define LANDLOCK_ACCESS_FS_MAKE_DIR (1ULL << 7)
#define LANDLOCK_ACCESS_FS_MAKE_REG (1ULL << 8)
#define LANDLOCK_ACCESS_FS_MAKE_SOCK (1ULL << 9)
#define LANDLOCK_ACCESS_FS_MAKE_FIFO (1ULL << 10)
#define LANDLOCK_ACCESS_FS_MAKE_BLOCK (1ULL << 11)
#define LANDLOCK_ACCESS_FS_MAKE_SYM (1ULL << 12)
#endif /* _UAPI_LINUX_LANDLOCK_H */