2017-11-15 03:35:15 +08:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0 */
|
2017-01-25 02:58:06 +08:00
|
|
|
/*
|
2017-10-10 03:15:34 +08:00
|
|
|
* fscrypt.h: declarations for per-file encryption
|
|
|
|
*
|
2018-12-12 17:50:12 +08:00
|
|
|
* Filesystems that implement per-file encryption must include this header
|
|
|
|
* file.
|
2017-01-25 02:58:06 +08:00
|
|
|
*
|
|
|
|
* Copyright (C) 2015, Google, Inc.
|
|
|
|
*
|
|
|
|
* Written by Michael Halcrow, 2015.
|
|
|
|
* Modified by Jaegeuk Kim, 2015.
|
|
|
|
*/
|
2017-10-10 03:15:34 +08:00
|
|
|
#ifndef _LINUX_FSCRYPT_H
|
|
|
|
#define _LINUX_FSCRYPT_H
|
2017-01-25 02:58:06 +08:00
|
|
|
|
|
|
|
#include <linux/fs.h>
|
2018-12-12 17:50:12 +08:00
|
|
|
#include <linux/mm.h>
|
|
|
|
#include <linux/slab.h>
|
2019-08-05 10:35:43 +08:00
|
|
|
#include <uapi/linux/fscrypt.h>
|
2017-01-25 02:58:06 +08:00
|
|
|
|
|
|
|
#define FS_CRYPTO_BLOCK_SIZE 16
|
|
|
|
|
|
|
|
struct fscrypt_info;
|
|
|
|
|
|
|
|
struct fscrypt_str {
|
|
|
|
unsigned char *name;
|
|
|
|
u32 len;
|
|
|
|
};
|
|
|
|
|
|
|
|
struct fscrypt_name {
|
|
|
|
const struct qstr *usr_fname;
|
|
|
|
struct fscrypt_str disk_name;
|
|
|
|
u32 hash;
|
|
|
|
u32 minor_hash;
|
|
|
|
struct fscrypt_str crypto_buf;
|
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
->lookup() in an encrypted directory begins as follows:
1. fscrypt_prepare_lookup():
a. Try to load the directory's encryption key.
b. If the key is unavailable, mark the dentry as a ciphertext name
via d_flags.
2. fscrypt_setup_filename():
a. Try to load the directory's encryption key.
b. If the key is available, encrypt the name (treated as a plaintext
name) to get the on-disk name. Otherwise decode the name
(treated as a ciphertext name) to get the on-disk name.
But if the key is concurrently added, it may be found at (2a) but not at
(1a). In this case, the dentry will be wrongly marked as a ciphertext
name even though it was actually treated as plaintext.
This will cause the dentry to be wrongly invalidated on the next lookup,
potentially causing problems. For example, if the racy ->lookup() was
part of sys_mount(), then the new mount will be detached when anything
tries to access it. This is despite the mountpoint having a plaintext
path, which should remain valid now that the key was added.
Of course, this is only possible if there's a userspace race. Still,
the additional kernel-side race is confusing and unexpected.
Close the kernel-side race by changing fscrypt_prepare_lookup() to also
set the on-disk filename (step 2b), consistent with the d_flags update.
Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-21 02:39:13 +08:00
|
|
|
bool is_ciphertext_name;
|
2017-01-25 02:58:06 +08:00
|
|
|
};
|
|
|
|
|
|
|
|
#define FSTR_INIT(n, l) { .name = n, .len = l }
|
|
|
|
#define FSTR_TO_QSTR(f) QSTR_INIT((f)->name, (f)->len)
|
|
|
|
#define fname_name(p) ((p)->disk_name.name)
|
|
|
|
#define fname_len(p) ((p)->disk_name.len)
|
|
|
|
|
2017-07-06 12:01:59 +08:00
|
|
|
/* Maximum value for the third parameter of fscrypt_operations.set_context(). */
|
fscrypt: v2 encryption policy support
Add a new fscrypt policy version, "v2". It has the following changes
from the original policy version, which we call "v1" (*):
- Master keys (the user-provided encryption keys) are only ever used as
input to HKDF-SHA512. This is more flexible and less error-prone, and
it avoids the quirks and limitations of the AES-128-ECB based KDF.
Three classes of cryptographically isolated subkeys are defined:
- Per-file keys, like used in v1 policies except for the new KDF.
- Per-mode keys. These implement the semantics of the DIRECT_KEY
flag, which for v1 policies made the master key be used directly.
These are also planned to be used for inline encryption when
support for it is added.
- Key identifiers (see below).
- Each master key is identified by a 16-byte master_key_identifier,
which is derived from the key itself using HKDF-SHA512. This prevents
users from associating the wrong key with an encrypted file or
directory. This was easily possible with v1 policies, which
identified the key by an arbitrary 8-byte master_key_descriptor.
- The key must be provided in the filesystem-level keyring, not in a
process-subscribed keyring.
The following UAPI additions are made:
- The existing ioctl FS_IOC_SET_ENCRYPTION_POLICY can now be passed a
fscrypt_policy_v2 to set a v2 encryption policy. It's disambiguated
from fscrypt_policy/fscrypt_policy_v1 by the version code prefix.
- A new ioctl FS_IOC_GET_ENCRYPTION_POLICY_EX is added. It allows
getting the v1 or v2 encryption policy of an encrypted file or
directory. The existing FS_IOC_GET_ENCRYPTION_POLICY ioctl could not
be used because it did not have a way for userspace to indicate which
policy structure is expected. The new ioctl includes a size field, so
it is extensible to future fscrypt policy versions.
- The ioctls FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY,
and FS_IOC_GET_ENCRYPTION_KEY_STATUS now support managing keys for v2
encryption policies. Such keys are kept logically separate from keys
for v1 encryption policies, and are identified by 'identifier' rather
than by 'descriptor'. The 'identifier' need not be provided when
adding a key, since the kernel will calculate it anyway.
This patch temporarily keeps adding/removing v2 policy keys behind the
same permission check done for adding/removing v1 policy keys:
capable(CAP_SYS_ADMIN). However, the next patch will carefully take
advantage of the cryptographically secure master_key_identifier to allow
non-root users to add/remove v2 policy keys, thus providing a full
replacement for v1 policies.
(*) Actually, in the API fscrypt_policy::version is 0 while on-disk
fscrypt_context::format is 1. But I believe it makes the most sense
to advance both to '2' to have them be in sync, and to consider the
numbering to start at 1 except for the API quirk.
Reviewed-by: Paul Crowley <paulcrowley@google.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:47 +08:00
|
|
|
#define FSCRYPT_SET_CONTEXT_MAX_SIZE 40
|
2017-07-06 12:01:59 +08:00
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
#ifdef CONFIG_FS_ENCRYPTION
|
|
|
|
/*
|
|
|
|
* fscrypt superblock flags
|
|
|
|
*/
|
|
|
|
#define FS_CFLG_OWN_PAGES (1U << 1)
|
|
|
|
|
|
|
|
/*
|
|
|
|
* crypto operations for filesystems
|
|
|
|
*/
|
|
|
|
struct fscrypt_operations {
|
|
|
|
unsigned int flags;
|
|
|
|
const char *key_prefix;
|
2020-05-12 03:13:57 +08:00
|
|
|
int (*get_context)(struct inode *inode, void *ctx, size_t len);
|
|
|
|
int (*set_context)(struct inode *inode, const void *ctx, size_t len,
|
|
|
|
void *fs_data);
|
|
|
|
bool (*dummy_context)(struct inode *inode);
|
|
|
|
bool (*empty_dir)(struct inode *inode);
|
2018-12-12 17:50:12 +08:00
|
|
|
unsigned int max_namelen;
|
fscrypt: add support for IV_INO_LBLK_64 policies
Inline encryption hardware compliant with the UFS v2.1 standard or with
the upcoming version of the eMMC standard has the following properties:
(1) Per I/O request, the encryption key is specified by a previously
loaded keyslot. There might be only a small number of keyslots.
(2) Per I/O request, the starting IV is specified by a 64-bit "data unit
number" (DUN). IV bits 64-127 are assumed to be 0. The hardware
automatically increments the DUN for each "data unit" of
configurable size in the request, e.g. for each filesystem block.
Property (1) makes it inefficient to use the traditional fscrypt
per-file keys. Property (2) precludes the use of the existing
DIRECT_KEY fscrypt policy flag, which needs at least 192 IV bits.
Therefore, add a new fscrypt policy flag IV_INO_LBLK_64 which causes the
encryption to modified as follows:
- The encryption keys are derived from the master key, encryption mode
number, and filesystem UUID.
- The IVs are chosen as (inode_number << 32) | file_logical_block_num.
For filenames encryption, file_logical_block_num is 0.
Since the file nonces aren't used in the key derivation, many files may
share the same encryption key. This is much more efficient on the
target hardware. Including the inode number in the IVs and mixing the
filesystem UUID into the keys ensures that data in different files is
nevertheless still encrypted differently.
Additionally, limiting the inode and block numbers to 32 bits and
placing the block number in the low bits maintains compatibility with
the 64-bit DUN convention (property (2) above).
Since this scheme assumes that inode numbers are stable (which may
preclude filesystem shrinking) and that inode and file logical block
numbers are at most 32-bit, IV_INO_LBLK_64 will only be allowed on
filesystems that meet these constraints. These are acceptable
limitations for the cases where this format would actually be used.
Note that IV_INO_LBLK_64 is an on-disk format, not an implementation.
This patch just adds support for it using the existing filesystem layer
encryption. A later patch will add support for inline encryption.
Reviewed-by: Paul Crowley <paulcrowley@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-10-25 05:54:36 +08:00
|
|
|
bool (*has_stable_inodes)(struct super_block *sb);
|
|
|
|
void (*get_ino_and_lblk_bits)(struct super_block *sb,
|
|
|
|
int *ino_bits_ret, int *lblk_bits_ret);
|
2018-12-12 17:50:12 +08:00
|
|
|
};
|
|
|
|
|
|
|
|
static inline bool fscrypt_has_encryption_key(const struct inode *inode)
|
|
|
|
{
|
2019-04-12 05:32:15 +08:00
|
|
|
/* pairs with cmpxchg_release() in fscrypt_get_encryption_info() */
|
|
|
|
return READ_ONCE(inode->i_crypt_info) != NULL;
|
2018-12-12 17:50:12 +08:00
|
|
|
}
|
|
|
|
|
2019-12-10 04:50:21 +08:00
|
|
|
/**
|
|
|
|
* fscrypt_needs_contents_encryption() - check whether an inode needs
|
|
|
|
* contents encryption
|
2020-05-12 03:13:56 +08:00
|
|
|
* @inode: the inode to check
|
2019-12-10 04:50:21 +08:00
|
|
|
*
|
|
|
|
* Return: %true iff the inode is an encrypted regular file and the kernel was
|
|
|
|
* built with fscrypt support.
|
|
|
|
*
|
|
|
|
* If you need to know whether the encrypt bit is set even when the kernel was
|
|
|
|
* built without fscrypt support, you must use IS_ENCRYPTED() directly instead.
|
|
|
|
*/
|
|
|
|
static inline bool fscrypt_needs_contents_encryption(const struct inode *inode)
|
|
|
|
{
|
|
|
|
return IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode);
|
|
|
|
}
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
static inline bool fscrypt_dummy_context_enabled(struct inode *inode)
|
|
|
|
{
|
|
|
|
return inode->i_sb->s_cop->dummy_context &&
|
|
|
|
inode->i_sb->s_cop->dummy_context(inode);
|
|
|
|
}
|
|
|
|
|
2019-03-21 02:39:11 +08:00
|
|
|
/*
|
|
|
|
* When d_splice_alias() moves a directory's encrypted alias to its decrypted
|
|
|
|
* alias as a result of the encryption key being added, DCACHE_ENCRYPTED_NAME
|
|
|
|
* must be cleared. Note that we don't have to support arbitrary moves of this
|
|
|
|
* flag because fscrypt doesn't allow encrypted aliases to be the source or
|
|
|
|
* target of a rename().
|
|
|
|
*/
|
|
|
|
static inline void fscrypt_handle_d_move(struct dentry *dentry)
|
|
|
|
{
|
|
|
|
dentry->d_flags &= ~DCACHE_ENCRYPTED_NAME;
|
|
|
|
}
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
/* crypto.c */
|
|
|
|
extern void fscrypt_enqueue_decrypt_work(struct work_struct *);
|
2019-05-21 00:29:44 +08:00
|
|
|
|
|
|
|
extern struct page *fscrypt_encrypt_pagecache_blocks(struct page *page,
|
|
|
|
unsigned int len,
|
|
|
|
unsigned int offs,
|
|
|
|
gfp_t gfp_flags);
|
2019-05-21 00:29:43 +08:00
|
|
|
extern int fscrypt_encrypt_block_inplace(const struct inode *inode,
|
|
|
|
struct page *page, unsigned int len,
|
|
|
|
unsigned int offs, u64 lblk_num,
|
|
|
|
gfp_t gfp_flags);
|
2019-05-21 00:29:47 +08:00
|
|
|
|
|
|
|
extern int fscrypt_decrypt_pagecache_blocks(struct page *page, unsigned int len,
|
|
|
|
unsigned int offs);
|
2019-05-21 00:29:46 +08:00
|
|
|
extern int fscrypt_decrypt_block_inplace(const struct inode *inode,
|
|
|
|
struct page *page, unsigned int len,
|
|
|
|
unsigned int offs, u64 lblk_num);
|
2018-12-12 17:50:12 +08:00
|
|
|
|
2019-05-21 00:29:39 +08:00
|
|
|
static inline bool fscrypt_is_bounce_page(struct page *page)
|
2018-12-12 17:50:12 +08:00
|
|
|
{
|
2019-05-21 00:29:39 +08:00
|
|
|
return page->mapping == NULL;
|
2018-12-12 17:50:12 +08:00
|
|
|
}
|
|
|
|
|
2019-05-21 00:29:39 +08:00
|
|
|
static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
|
|
|
|
{
|
|
|
|
return (struct page *)page_private(bounce_page);
|
|
|
|
}
|
|
|
|
|
|
|
|
extern void fscrypt_free_bounce_page(struct page *bounce_page);
|
2018-12-12 17:50:12 +08:00
|
|
|
|
|
|
|
/* policy.c */
|
2020-05-12 03:13:57 +08:00
|
|
|
extern int fscrypt_ioctl_set_policy(struct file *filp, const void __user *arg);
|
|
|
|
extern int fscrypt_ioctl_get_policy(struct file *filp, void __user *arg);
|
|
|
|
extern int fscrypt_ioctl_get_policy_ex(struct file *filp, void __user *arg);
|
2020-03-15 04:50:49 +08:00
|
|
|
extern int fscrypt_ioctl_get_nonce(struct file *filp, void __user *arg);
|
2020-05-12 03:13:57 +08:00
|
|
|
extern int fscrypt_has_permitted_context(struct inode *parent,
|
|
|
|
struct inode *child);
|
|
|
|
extern int fscrypt_inherit_context(struct inode *parent, struct inode *child,
|
|
|
|
void *fs_data, bool preload);
|
|
|
|
|
fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl
Add a new fscrypt ioctl, FS_IOC_ADD_ENCRYPTION_KEY. This ioctl adds an
encryption key to the filesystem's fscrypt keyring ->s_master_keys,
making any files encrypted with that key appear "unlocked".
Why we need this
~~~~~~~~~~~~~~~~
The main problem is that the "locked/unlocked" (ciphertext/plaintext)
status of encrypted files is global, but the fscrypt keys are not.
fscrypt only looks for keys in the keyring(s) the process accessing the
filesystem is subscribed to: the thread keyring, process keyring, and
session keyring, where the session keyring may contain the user keyring.
Therefore, userspace has to put fscrypt keys in the keyrings for
individual users or sessions. But this means that when a process with a
different keyring tries to access encrypted files, whether they appear
"unlocked" or not is nondeterministic. This is because it depends on
whether the files are currently present in the inode cache.
Fixing this by consistently providing each process its own view of the
filesystem depending on whether it has the key or not isn't feasible due
to how the VFS caches work. Furthermore, while sometimes users expect
this behavior, it is misguided for two reasons. First, it would be an
OS-level access control mechanism largely redundant with existing access
control mechanisms such as UNIX file permissions, ACLs, LSMs, etc.
Encryption is actually for protecting the data at rest.
Second, almost all users of fscrypt actually do need the keys to be
global. The largest users of fscrypt, Android and Chromium OS, achieve
this by having PID 1 create a "session keyring" that is inherited by
every process. This works, but it isn't scalable because it prevents
session keyrings from being used for any other purpose.
On general-purpose Linux distros, the 'fscrypt' userspace tool [1] can't
similarly abuse the session keyring, so to make 'sudo' work on all
systems it has to link all the user keyrings into root's user keyring
[2]. This is ugly and raises security concerns. Moreover it can't make
the keys available to system services, such as sshd trying to access the
user's '~/.ssh' directory (see [3], [4]) or NetworkManager trying to
read certificates from the user's home directory (see [5]); or to Docker
containers (see [6], [7]).
By having an API to add a key to the *filesystem* we'll be able to fix
the above bugs, remove userspace workarounds, and clearly express the
intended semantics: the locked/unlocked status of an encrypted directory
is global, and encryption is orthogonal to OS-level access control.
Why not use the add_key() syscall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We use an ioctl for this API rather than the existing add_key() system
call because the ioctl gives us the flexibility needed to implement
fscrypt-specific semantics that will be introduced in later patches:
- Supporting key removal with the semantics such that the secret is
removed immediately and any unused inodes using the key are evicted;
also, the eviction of any in-use inodes can be retried.
- Calculating a key-dependent cryptographic identifier and returning it
to userspace.
- Allowing keys to be added and removed by non-root users, but only keys
for v2 encryption policies; and to prevent denial-of-service attacks,
users can only remove keys they themselves have added, and a key is
only really removed after all users who added it have removed it.
Trying to shoehorn these semantics into the keyrings syscalls would be
very difficult, whereas the ioctls make things much easier.
However, to reuse code the implementation still uses the keyrings
service internally. Thus we get lockless RCU-mode key lookups without
having to re-implement it, and the keys automatically show up in
/proc/keys for debugging purposes.
References:
[1] https://github.com/google/fscrypt
[2] https://goo.gl/55cCrI#heading=h.vf09isp98isb
[3] https://github.com/google/fscrypt/issues/111#issuecomment-444347939
[4] https://github.com/google/fscrypt/issues/116
[5] https://bugs.launchpad.net/ubuntu/+source/fscrypt/+bug/1770715
[6] https://github.com/google/fscrypt/issues/128
[7] https://askubuntu.com/questions/1130306/cannot-run-docker-on-an-encrypted-filesystem
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
/* keyring.c */
|
|
|
|
extern void fscrypt_sb_free(struct super_block *sb);
|
|
|
|
extern int fscrypt_ioctl_add_key(struct file *filp, void __user *arg);
|
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl
Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY. This ioctl
removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY.
It wipes the secret key itself, then "locks" the encrypted files and
directories that had been unlocked using that key -- implemented by
evicting the relevant dentries and inodes from the VFS caches.
The problem this solves is that many fscrypt users want the ability to
remove encryption keys, causing the corresponding encrypted directories
to appear "locked" (presented in ciphertext form) again. Moreover,
users want removing an encryption key to *really* remove it, in the
sense that the removed keys cannot be recovered even if kernel memory is
compromised, e.g. by the exploit of a kernel security vulnerability or
by a physical attack. This is desirable after a user logs out of the
system, for example. In many cases users even already assume this to be
the case and are surprised to hear when it's not.
It is not sufficient to simply unlink the master key from the keyring
(or to revoke or invalidate it), since the actual encryption transform
objects are still pinned in memory by their inodes. Therefore, to
really remove a key we must also evict the relevant inodes.
Currently one workaround is to run 'sync && echo 2 >
/proc/sys/vm/drop_caches'. But, that evicts all unused inodes in the
system rather than just the inodes associated with the key being
removed, causing severe performance problems. Moreover, it requires
root privileges, so regular users can't "lock" their encrypted files.
Another workaround, used in Chromium OS kernels, is to add a new
VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of
drop_caches that operates on a single super_block. It does:
shrink_dcache_sb(sb);
invalidate_inodes(sb, false);
But it's still a hack. Yet, the major users of filesystem encryption
want this feature badly enough that they are actually using these hacks.
To properly solve the problem, start maintaining a list of the inodes
which have been "unlocked" using each master key. Originally this
wasn't possible because the kernel didn't keep track of in-use master
keys at all. But, with the ->s_master_keys keyring it is now possible.
Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY. It finds the specified
master key in ->s_master_keys, then wipes the secret key itself, which
prevents any additional inodes from being unlocked with the key. Then,
it syncs the filesystem and evicts the inodes in the key's list. The
normal inode eviction code will free and wipe the per-file keys (in
->i_crypt_info). Note that freeing ->i_crypt_info without evicting the
inodes was also considered, but would have been racy.
Some inodes may still be in use when a master key is removed, and we
can't simply revoke random file descriptors, mmap's, etc. Thus, the
ioctl simply skips in-use inodes, and returns -EBUSY to indicate that
some inodes weren't evicted. The master key *secret* is still removed,
but the fscrypt_master_key struct remains to keep track of the remaining
inodes. Userspace can then retry the ioctl to evict the remaining
inodes. Alternatively, if userspace adds the key again, the refreshed
secret will be associated with the existing list of inodes so they
remain correctly tracked for future key removals.
The ioctl doesn't wipe pagecache pages. Thus, we tolerate that after a
kernel compromise some portions of plaintext file contents may still be
recoverable from memory. This can be solved by enabling page poisoning
system-wide, which security conscious users may choose to do. But it's
very difficult to solve otherwise, e.g. note that plaintext file
contents may have been read in other places than pagecache pages.
Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is
initially restricted to privileged users only. This is sufficient for
some use cases, but not all. A later patch will relax this restriction,
but it will require introducing key hashes, among other changes.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
extern int fscrypt_ioctl_remove_key(struct file *filp, void __user *arg);
|
2019-08-05 10:35:47 +08:00
|
|
|
extern int fscrypt_ioctl_remove_key_all_users(struct file *filp,
|
|
|
|
void __user *arg);
|
fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl
Add a new fscrypt ioctl, FS_IOC_GET_ENCRYPTION_KEY_STATUS. Given a key
specified by 'struct fscrypt_key_specifier' (the same way a key is
specified for the other fscrypt key management ioctls), it returns
status information in a 'struct fscrypt_get_key_status_arg'.
The main motivation for this is that applications need to be able to
check whether an encrypted directory is "unlocked" or not, so that they
can add the key if it is not, and avoid adding the key (which may
involve prompting the user for a passphrase) if it already is.
It's possible to use some workarounds such as checking whether opening a
regular file fails with ENOKEY, or checking whether the filenames "look
like gibberish" or not. However, no workaround is usable in all cases.
Like the other key management ioctls, the keyrings syscalls may seem at
first to be a good fit for this. Unfortunately, they are not. Even if
we exposed the keyring ID of the ->s_master_keys keyring and gave
everyone Search permission on it (note: currently the keyrings
permission system would also allow everyone to "invalidate" the keyring
too), the fscrypt keys have an additional state that doesn't map cleanly
to the keyrings API: the secret can be removed, but we can be still
tracking the files that were using the key, and the removal can be
re-attempted or the secret added again.
After later patches, some applications will also need a way to determine
whether a key was added by the current user vs. by some other user.
Reserved fields are included in fscrypt_get_key_status_arg for this and
other future extensions.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
extern int fscrypt_ioctl_get_key_status(struct file *filp, void __user *arg);
|
fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl
Add a new fscrypt ioctl, FS_IOC_ADD_ENCRYPTION_KEY. This ioctl adds an
encryption key to the filesystem's fscrypt keyring ->s_master_keys,
making any files encrypted with that key appear "unlocked".
Why we need this
~~~~~~~~~~~~~~~~
The main problem is that the "locked/unlocked" (ciphertext/plaintext)
status of encrypted files is global, but the fscrypt keys are not.
fscrypt only looks for keys in the keyring(s) the process accessing the
filesystem is subscribed to: the thread keyring, process keyring, and
session keyring, where the session keyring may contain the user keyring.
Therefore, userspace has to put fscrypt keys in the keyrings for
individual users or sessions. But this means that when a process with a
different keyring tries to access encrypted files, whether they appear
"unlocked" or not is nondeterministic. This is because it depends on
whether the files are currently present in the inode cache.
Fixing this by consistently providing each process its own view of the
filesystem depending on whether it has the key or not isn't feasible due
to how the VFS caches work. Furthermore, while sometimes users expect
this behavior, it is misguided for two reasons. First, it would be an
OS-level access control mechanism largely redundant with existing access
control mechanisms such as UNIX file permissions, ACLs, LSMs, etc.
Encryption is actually for protecting the data at rest.
Second, almost all users of fscrypt actually do need the keys to be
global. The largest users of fscrypt, Android and Chromium OS, achieve
this by having PID 1 create a "session keyring" that is inherited by
every process. This works, but it isn't scalable because it prevents
session keyrings from being used for any other purpose.
On general-purpose Linux distros, the 'fscrypt' userspace tool [1] can't
similarly abuse the session keyring, so to make 'sudo' work on all
systems it has to link all the user keyrings into root's user keyring
[2]. This is ugly and raises security concerns. Moreover it can't make
the keys available to system services, such as sshd trying to access the
user's '~/.ssh' directory (see [3], [4]) or NetworkManager trying to
read certificates from the user's home directory (see [5]); or to Docker
containers (see [6], [7]).
By having an API to add a key to the *filesystem* we'll be able to fix
the above bugs, remove userspace workarounds, and clearly express the
intended semantics: the locked/unlocked status of an encrypted directory
is global, and encryption is orthogonal to OS-level access control.
Why not use the add_key() syscall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We use an ioctl for this API rather than the existing add_key() system
call because the ioctl gives us the flexibility needed to implement
fscrypt-specific semantics that will be introduced in later patches:
- Supporting key removal with the semantics such that the secret is
removed immediately and any unused inodes using the key are evicted;
also, the eviction of any in-use inodes can be retried.
- Calculating a key-dependent cryptographic identifier and returning it
to userspace.
- Allowing keys to be added and removed by non-root users, but only keys
for v2 encryption policies; and to prevent denial-of-service attacks,
users can only remove keys they themselves have added, and a key is
only really removed after all users who added it have removed it.
Trying to shoehorn these semantics into the keyrings syscalls would be
very difficult, whereas the ioctls make things much easier.
However, to reuse code the implementation still uses the keyrings
service internally. Thus we get lockless RCU-mode key lookups without
having to re-implement it, and the keys automatically show up in
/proc/keys for debugging purposes.
References:
[1] https://github.com/google/fscrypt
[2] https://goo.gl/55cCrI#heading=h.vf09isp98isb
[3] https://github.com/google/fscrypt/issues/111#issuecomment-444347939
[4] https://github.com/google/fscrypt/issues/116
[5] https://bugs.launchpad.net/ubuntu/+source/fscrypt/+bug/1770715
[6] https://github.com/google/fscrypt/issues/128
[7] https://askubuntu.com/questions/1130306/cannot-run-docker-on-an-encrypted-filesystem
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
|
2019-08-05 10:35:45 +08:00
|
|
|
/* keysetup.c */
|
2020-05-12 03:13:57 +08:00
|
|
|
extern int fscrypt_get_encryption_info(struct inode *inode);
|
|
|
|
extern void fscrypt_put_encryption_info(struct inode *inode);
|
|
|
|
extern void fscrypt_free_inode(struct inode *inode);
|
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl
Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY. This ioctl
removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY.
It wipes the secret key itself, then "locks" the encrypted files and
directories that had been unlocked using that key -- implemented by
evicting the relevant dentries and inodes from the VFS caches.
The problem this solves is that many fscrypt users want the ability to
remove encryption keys, causing the corresponding encrypted directories
to appear "locked" (presented in ciphertext form) again. Moreover,
users want removing an encryption key to *really* remove it, in the
sense that the removed keys cannot be recovered even if kernel memory is
compromised, e.g. by the exploit of a kernel security vulnerability or
by a physical attack. This is desirable after a user logs out of the
system, for example. In many cases users even already assume this to be
the case and are surprised to hear when it's not.
It is not sufficient to simply unlink the master key from the keyring
(or to revoke or invalidate it), since the actual encryption transform
objects are still pinned in memory by their inodes. Therefore, to
really remove a key we must also evict the relevant inodes.
Currently one workaround is to run 'sync && echo 2 >
/proc/sys/vm/drop_caches'. But, that evicts all unused inodes in the
system rather than just the inodes associated with the key being
removed, causing severe performance problems. Moreover, it requires
root privileges, so regular users can't "lock" their encrypted files.
Another workaround, used in Chromium OS kernels, is to add a new
VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of
drop_caches that operates on a single super_block. It does:
shrink_dcache_sb(sb);
invalidate_inodes(sb, false);
But it's still a hack. Yet, the major users of filesystem encryption
want this feature badly enough that they are actually using these hacks.
To properly solve the problem, start maintaining a list of the inodes
which have been "unlocked" using each master key. Originally this
wasn't possible because the kernel didn't keep track of in-use master
keys at all. But, with the ->s_master_keys keyring it is now possible.
Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY. It finds the specified
master key in ->s_master_keys, then wipes the secret key itself, which
prevents any additional inodes from being unlocked with the key. Then,
it syncs the filesystem and evicts the inodes in the key's list. The
normal inode eviction code will free and wipe the per-file keys (in
->i_crypt_info). Note that freeing ->i_crypt_info without evicting the
inodes was also considered, but would have been racy.
Some inodes may still be in use when a master key is removed, and we
can't simply revoke random file descriptors, mmap's, etc. Thus, the
ioctl simply skips in-use inodes, and returns -EBUSY to indicate that
some inodes weren't evicted. The master key *secret* is still removed,
but the fscrypt_master_key struct remains to keep track of the remaining
inodes. Userspace can then retry the ioctl to evict the remaining
inodes. Alternatively, if userspace adds the key again, the refreshed
secret will be associated with the existing list of inodes so they
remain correctly tracked for future key removals.
The ioctl doesn't wipe pagecache pages. Thus, we tolerate that after a
kernel compromise some portions of plaintext file contents may still be
recoverable from memory. This can be solved by enabling page poisoning
system-wide, which security conscious users may choose to do. But it's
very difficult to solve otherwise, e.g. note that plaintext file
contents may have been read in other places than pagecache pages.
Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is
initially restricted to privileged users only. This is sufficient for
some use cases, but not all. A later patch will relax this restriction,
but it will require introducing key hashes, among other changes.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
extern int fscrypt_drop_inode(struct inode *inode);
|
2018-12-12 17:50:12 +08:00
|
|
|
|
|
|
|
/* fname.c */
|
2020-05-12 03:13:57 +08:00
|
|
|
extern int fscrypt_setup_filename(struct inode *inode, const struct qstr *iname,
|
|
|
|
int lookup, struct fscrypt_name *fname);
|
2018-12-12 17:50:12 +08:00
|
|
|
|
|
|
|
static inline void fscrypt_free_filename(struct fscrypt_name *fname)
|
|
|
|
{
|
|
|
|
kfree(fname->crypto_buf.name);
|
|
|
|
}
|
|
|
|
|
2020-05-12 03:13:57 +08:00
|
|
|
extern int fscrypt_fname_alloc_buffer(const struct inode *inode,
|
|
|
|
u32 max_encrypted_len,
|
|
|
|
struct fscrypt_str *crypto_str);
|
|
|
|
extern void fscrypt_fname_free_buffer(struct fscrypt_str *crypto_str);
|
2019-12-16 05:39:47 +08:00
|
|
|
extern int fscrypt_fname_disk_to_usr(const struct inode *inode,
|
|
|
|
u32 hash, u32 minor_hash,
|
|
|
|
const struct fscrypt_str *iname,
|
|
|
|
struct fscrypt_str *oname);
|
fscrypt: improve format of no-key names
When an encrypted directory is listed without the key, the filesystem
must show "no-key names" that uniquely identify directory entries, are
at most 255 (NAME_MAX) bytes long, and don't contain '/' or '\0'.
Currently, for short names the no-key name is the base64 encoding of the
ciphertext filename, while for long names it's the base64 encoding of
the ciphertext filename's dirhash and second-to-last 16-byte block.
This format has the following problems:
- Since it doesn't always include the dirhash, it's incompatible with
directories that will use a secret-keyed dirhash over the plaintext
filenames. In this case, the dirhash won't be computable from the
ciphertext name without the key, so it instead must be retrieved from
the directory entry and always included in the no-key name.
Casefolded encrypted directories will use this type of dirhash.
- It's ambiguous: it's possible to craft two filenames that map to the
same no-key name, since the method used to abbreviate long filenames
doesn't use a proper cryptographic hash function.
Solve both these problems by switching to a new no-key name format that
is the base64 encoding of a variable-length structure that contains the
dirhash, up to 149 bytes of the ciphertext filename, and (if any bytes
remain) the SHA-256 of the remaining bytes of the ciphertext filename.
This ensures that each no-key name contains everything needed to find
the directory entry again, contains only legal characters, doesn't
exceed NAME_MAX, is unambiguous unless there's a SHA-256 collision, and
that we only take the performance hit of SHA-256 on very long filenames.
Note: this change does *not* address the existing issue where users can
modify the 'dirhash' part of a no-key name and the filesystem may still
accept the name.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
[EB: improved comments and commit message, fixed checking return value
of base64_decode(), check for SHA-256 error, continue to set disk_name
for short names to keep matching simpler, and many other cleanups]
Link: https://lore.kernel.org/r/20200120223201.241390-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-01-21 06:32:01 +08:00
|
|
|
extern bool fscrypt_match_name(const struct fscrypt_name *fname,
|
|
|
|
const u8 *de_name, u32 de_name_len);
|
fscrypt: derive dirhash key for casefolded directories
When we allow indexed directories to use both encryption and
casefolding, for the dirhash we can't just hash the ciphertext filenames
that are stored on-disk (as is done currently) because the dirhash must
be case insensitive, but the stored names are case-preserving. Nor can
we hash the plaintext names with an unkeyed hash (or a hash keyed with a
value stored on-disk like ext4's s_hash_seed), since that would leak
information about the names that encryption is meant to protect.
Instead, if we can accept a dirhash that's only computable when the
fscrypt key is available, we can hash the plaintext names with a keyed
hash using a secret key derived from the directory's fscrypt master key.
We'll use SipHash-2-4 for this purpose.
Prepare for this by deriving a SipHash key for each casefolded encrypted
directory. Make sure to handle deriving the key not only when setting
up the directory's fscrypt_info, but also in the case where the casefold
flag is enabled after the fscrypt_info was already set up. (We could
just always derive the key regardless of casefolding, but that would
introduce unnecessary overhead for people not using casefolding.)
Signed-off-by: Daniel Rosenberg <drosen@google.com>
[EB: improved commit message, updated fscrypt.rst, squashed with change
that avoids unnecessarily deriving the key, and many other cleanups]
Link: https://lore.kernel.org/r/20200120223201.241390-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-01-21 06:31:57 +08:00
|
|
|
extern u64 fscrypt_fname_siphash(const struct inode *dir,
|
|
|
|
const struct qstr *name);
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
/* bio.c */
|
2020-05-12 03:13:57 +08:00
|
|
|
extern void fscrypt_decrypt_bio(struct bio *bio);
|
|
|
|
extern int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
|
|
|
|
sector_t pblk, unsigned int len);
|
2018-12-12 17:50:12 +08:00
|
|
|
|
|
|
|
/* hooks.c */
|
|
|
|
extern int fscrypt_file_open(struct inode *inode, struct file *filp);
|
2019-03-21 02:39:10 +08:00
|
|
|
extern int __fscrypt_prepare_link(struct inode *inode, struct inode *dir,
|
|
|
|
struct dentry *dentry);
|
2018-12-12 17:50:12 +08:00
|
|
|
extern int __fscrypt_prepare_rename(struct inode *old_dir,
|
|
|
|
struct dentry *old_dentry,
|
|
|
|
struct inode *new_dir,
|
|
|
|
struct dentry *new_dentry,
|
|
|
|
unsigned int flags);
|
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
->lookup() in an encrypted directory begins as follows:
1. fscrypt_prepare_lookup():
a. Try to load the directory's encryption key.
b. If the key is unavailable, mark the dentry as a ciphertext name
via d_flags.
2. fscrypt_setup_filename():
a. Try to load the directory's encryption key.
b. If the key is available, encrypt the name (treated as a plaintext
name) to get the on-disk name. Otherwise decode the name
(treated as a ciphertext name) to get the on-disk name.
But if the key is concurrently added, it may be found at (2a) but not at
(1a). In this case, the dentry will be wrongly marked as a ciphertext
name even though it was actually treated as plaintext.
This will cause the dentry to be wrongly invalidated on the next lookup,
potentially causing problems. For example, if the racy ->lookup() was
part of sys_mount(), then the new mount will be detached when anything
tries to access it. This is despite the mountpoint having a plaintext
path, which should remain valid now that the key was added.
Of course, this is only possible if there's a userspace race. Still,
the additional kernel-side race is confusing and unexpected.
Close the kernel-side race by changing fscrypt_prepare_lookup() to also
set the on-disk filename (step 2b), consistent with the d_flags update.
Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-21 02:39:13 +08:00
|
|
|
extern int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry,
|
|
|
|
struct fscrypt_name *fname);
|
2020-01-21 06:31:56 +08:00
|
|
|
extern int fscrypt_prepare_setflags(struct inode *inode,
|
|
|
|
unsigned int oldflags, unsigned int flags);
|
2018-12-12 17:50:12 +08:00
|
|
|
extern int __fscrypt_prepare_symlink(struct inode *dir, unsigned int len,
|
|
|
|
unsigned int max_len,
|
|
|
|
struct fscrypt_str *disk_link);
|
|
|
|
extern int __fscrypt_encrypt_symlink(struct inode *inode, const char *target,
|
|
|
|
unsigned int len,
|
|
|
|
struct fscrypt_str *disk_link);
|
|
|
|
extern const char *fscrypt_get_symlink(struct inode *inode, const void *caddr,
|
|
|
|
unsigned int max_size,
|
|
|
|
struct delayed_call *done);
|
2019-03-26 15:52:31 +08:00
|
|
|
static inline void fscrypt_set_ops(struct super_block *sb,
|
|
|
|
const struct fscrypt_operations *s_cop)
|
|
|
|
{
|
|
|
|
sb->s_cop = s_cop;
|
|
|
|
}
|
2018-12-12 17:50:12 +08:00
|
|
|
#else /* !CONFIG_FS_ENCRYPTION */
|
|
|
|
|
|
|
|
static inline bool fscrypt_has_encryption_key(const struct inode *inode)
|
|
|
|
{
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2019-12-10 04:50:21 +08:00
|
|
|
static inline bool fscrypt_needs_contents_encryption(const struct inode *inode)
|
|
|
|
{
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
static inline bool fscrypt_dummy_context_enabled(struct inode *inode)
|
|
|
|
{
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2019-03-21 02:39:11 +08:00
|
|
|
static inline void fscrypt_handle_d_move(struct dentry *dentry)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
/* crypto.c */
|
|
|
|
static inline void fscrypt_enqueue_decrypt_work(struct work_struct *work)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
2019-05-21 00:29:44 +08:00
|
|
|
static inline struct page *fscrypt_encrypt_pagecache_blocks(struct page *page,
|
|
|
|
unsigned int len,
|
|
|
|
unsigned int offs,
|
|
|
|
gfp_t gfp_flags)
|
2018-12-12 17:50:12 +08:00
|
|
|
{
|
|
|
|
return ERR_PTR(-EOPNOTSUPP);
|
|
|
|
}
|
|
|
|
|
2019-05-21 00:29:43 +08:00
|
|
|
static inline int fscrypt_encrypt_block_inplace(const struct inode *inode,
|
|
|
|
struct page *page,
|
|
|
|
unsigned int len,
|
|
|
|
unsigned int offs, u64 lblk_num,
|
|
|
|
gfp_t gfp_flags)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
2019-05-21 00:29:47 +08:00
|
|
|
static inline int fscrypt_decrypt_pagecache_blocks(struct page *page,
|
|
|
|
unsigned int len,
|
|
|
|
unsigned int offs)
|
2018-12-12 17:50:12 +08:00
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
2019-05-21 00:29:46 +08:00
|
|
|
static inline int fscrypt_decrypt_block_inplace(const struct inode *inode,
|
|
|
|
struct page *page,
|
|
|
|
unsigned int len,
|
|
|
|
unsigned int offs, u64 lblk_num)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
2019-05-21 00:29:39 +08:00
|
|
|
static inline bool fscrypt_is_bounce_page(struct page *page)
|
|
|
|
{
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
|
2018-12-12 17:50:12 +08:00
|
|
|
{
|
|
|
|
WARN_ON_ONCE(1);
|
|
|
|
return ERR_PTR(-EINVAL);
|
|
|
|
}
|
|
|
|
|
2019-05-21 00:29:39 +08:00
|
|
|
static inline void fscrypt_free_bounce_page(struct page *bounce_page)
|
2018-12-12 17:50:12 +08:00
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
/* policy.c */
|
|
|
|
static inline int fscrypt_ioctl_set_policy(struct file *filp,
|
|
|
|
const void __user *arg)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline int fscrypt_ioctl_get_policy(struct file *filp, void __user *arg)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
fscrypt: v2 encryption policy support
Add a new fscrypt policy version, "v2". It has the following changes
from the original policy version, which we call "v1" (*):
- Master keys (the user-provided encryption keys) are only ever used as
input to HKDF-SHA512. This is more flexible and less error-prone, and
it avoids the quirks and limitations of the AES-128-ECB based KDF.
Three classes of cryptographically isolated subkeys are defined:
- Per-file keys, like used in v1 policies except for the new KDF.
- Per-mode keys. These implement the semantics of the DIRECT_KEY
flag, which for v1 policies made the master key be used directly.
These are also planned to be used for inline encryption when
support for it is added.
- Key identifiers (see below).
- Each master key is identified by a 16-byte master_key_identifier,
which is derived from the key itself using HKDF-SHA512. This prevents
users from associating the wrong key with an encrypted file or
directory. This was easily possible with v1 policies, which
identified the key by an arbitrary 8-byte master_key_descriptor.
- The key must be provided in the filesystem-level keyring, not in a
process-subscribed keyring.
The following UAPI additions are made:
- The existing ioctl FS_IOC_SET_ENCRYPTION_POLICY can now be passed a
fscrypt_policy_v2 to set a v2 encryption policy. It's disambiguated
from fscrypt_policy/fscrypt_policy_v1 by the version code prefix.
- A new ioctl FS_IOC_GET_ENCRYPTION_POLICY_EX is added. It allows
getting the v1 or v2 encryption policy of an encrypted file or
directory. The existing FS_IOC_GET_ENCRYPTION_POLICY ioctl could not
be used because it did not have a way for userspace to indicate which
policy structure is expected. The new ioctl includes a size field, so
it is extensible to future fscrypt policy versions.
- The ioctls FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY,
and FS_IOC_GET_ENCRYPTION_KEY_STATUS now support managing keys for v2
encryption policies. Such keys are kept logically separate from keys
for v1 encryption policies, and are identified by 'identifier' rather
than by 'descriptor'. The 'identifier' need not be provided when
adding a key, since the kernel will calculate it anyway.
This patch temporarily keeps adding/removing v2 policy keys behind the
same permission check done for adding/removing v1 policy keys:
capable(CAP_SYS_ADMIN). However, the next patch will carefully take
advantage of the cryptographically secure master_key_identifier to allow
non-root users to add/remove v2 policy keys, thus providing a full
replacement for v1 policies.
(*) Actually, in the API fscrypt_policy::version is 0 while on-disk
fscrypt_context::format is 1. But I believe it makes the most sense
to advance both to '2' to have them be in sync, and to consider the
numbering to start at 1 except for the API quirk.
Reviewed-by: Paul Crowley <paulcrowley@google.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:47 +08:00
|
|
|
static inline int fscrypt_ioctl_get_policy_ex(struct file *filp,
|
|
|
|
void __user *arg)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
2020-03-15 04:50:49 +08:00
|
|
|
static inline int fscrypt_ioctl_get_nonce(struct file *filp, void __user *arg)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
static inline int fscrypt_has_permitted_context(struct inode *parent,
|
|
|
|
struct inode *child)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline int fscrypt_inherit_context(struct inode *parent,
|
|
|
|
struct inode *child,
|
|
|
|
void *fs_data, bool preload)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl
Add a new fscrypt ioctl, FS_IOC_ADD_ENCRYPTION_KEY. This ioctl adds an
encryption key to the filesystem's fscrypt keyring ->s_master_keys,
making any files encrypted with that key appear "unlocked".
Why we need this
~~~~~~~~~~~~~~~~
The main problem is that the "locked/unlocked" (ciphertext/plaintext)
status of encrypted files is global, but the fscrypt keys are not.
fscrypt only looks for keys in the keyring(s) the process accessing the
filesystem is subscribed to: the thread keyring, process keyring, and
session keyring, where the session keyring may contain the user keyring.
Therefore, userspace has to put fscrypt keys in the keyrings for
individual users or sessions. But this means that when a process with a
different keyring tries to access encrypted files, whether they appear
"unlocked" or not is nondeterministic. This is because it depends on
whether the files are currently present in the inode cache.
Fixing this by consistently providing each process its own view of the
filesystem depending on whether it has the key or not isn't feasible due
to how the VFS caches work. Furthermore, while sometimes users expect
this behavior, it is misguided for two reasons. First, it would be an
OS-level access control mechanism largely redundant with existing access
control mechanisms such as UNIX file permissions, ACLs, LSMs, etc.
Encryption is actually for protecting the data at rest.
Second, almost all users of fscrypt actually do need the keys to be
global. The largest users of fscrypt, Android and Chromium OS, achieve
this by having PID 1 create a "session keyring" that is inherited by
every process. This works, but it isn't scalable because it prevents
session keyrings from being used for any other purpose.
On general-purpose Linux distros, the 'fscrypt' userspace tool [1] can't
similarly abuse the session keyring, so to make 'sudo' work on all
systems it has to link all the user keyrings into root's user keyring
[2]. This is ugly and raises security concerns. Moreover it can't make
the keys available to system services, such as sshd trying to access the
user's '~/.ssh' directory (see [3], [4]) or NetworkManager trying to
read certificates from the user's home directory (see [5]); or to Docker
containers (see [6], [7]).
By having an API to add a key to the *filesystem* we'll be able to fix
the above bugs, remove userspace workarounds, and clearly express the
intended semantics: the locked/unlocked status of an encrypted directory
is global, and encryption is orthogonal to OS-level access control.
Why not use the add_key() syscall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We use an ioctl for this API rather than the existing add_key() system
call because the ioctl gives us the flexibility needed to implement
fscrypt-specific semantics that will be introduced in later patches:
- Supporting key removal with the semantics such that the secret is
removed immediately and any unused inodes using the key are evicted;
also, the eviction of any in-use inodes can be retried.
- Calculating a key-dependent cryptographic identifier and returning it
to userspace.
- Allowing keys to be added and removed by non-root users, but only keys
for v2 encryption policies; and to prevent denial-of-service attacks,
users can only remove keys they themselves have added, and a key is
only really removed after all users who added it have removed it.
Trying to shoehorn these semantics into the keyrings syscalls would be
very difficult, whereas the ioctls make things much easier.
However, to reuse code the implementation still uses the keyrings
service internally. Thus we get lockless RCU-mode key lookups without
having to re-implement it, and the keys automatically show up in
/proc/keys for debugging purposes.
References:
[1] https://github.com/google/fscrypt
[2] https://goo.gl/55cCrI#heading=h.vf09isp98isb
[3] https://github.com/google/fscrypt/issues/111#issuecomment-444347939
[4] https://github.com/google/fscrypt/issues/116
[5] https://bugs.launchpad.net/ubuntu/+source/fscrypt/+bug/1770715
[6] https://github.com/google/fscrypt/issues/128
[7] https://askubuntu.com/questions/1130306/cannot-run-docker-on-an-encrypted-filesystem
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
/* keyring.c */
|
|
|
|
static inline void fscrypt_sb_free(struct super_block *sb)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline int fscrypt_ioctl_add_key(struct file *filp, void __user *arg)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl
Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY. This ioctl
removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY.
It wipes the secret key itself, then "locks" the encrypted files and
directories that had been unlocked using that key -- implemented by
evicting the relevant dentries and inodes from the VFS caches.
The problem this solves is that many fscrypt users want the ability to
remove encryption keys, causing the corresponding encrypted directories
to appear "locked" (presented in ciphertext form) again. Moreover,
users want removing an encryption key to *really* remove it, in the
sense that the removed keys cannot be recovered even if kernel memory is
compromised, e.g. by the exploit of a kernel security vulnerability or
by a physical attack. This is desirable after a user logs out of the
system, for example. In many cases users even already assume this to be
the case and are surprised to hear when it's not.
It is not sufficient to simply unlink the master key from the keyring
(or to revoke or invalidate it), since the actual encryption transform
objects are still pinned in memory by their inodes. Therefore, to
really remove a key we must also evict the relevant inodes.
Currently one workaround is to run 'sync && echo 2 >
/proc/sys/vm/drop_caches'. But, that evicts all unused inodes in the
system rather than just the inodes associated with the key being
removed, causing severe performance problems. Moreover, it requires
root privileges, so regular users can't "lock" their encrypted files.
Another workaround, used in Chromium OS kernels, is to add a new
VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of
drop_caches that operates on a single super_block. It does:
shrink_dcache_sb(sb);
invalidate_inodes(sb, false);
But it's still a hack. Yet, the major users of filesystem encryption
want this feature badly enough that they are actually using these hacks.
To properly solve the problem, start maintaining a list of the inodes
which have been "unlocked" using each master key. Originally this
wasn't possible because the kernel didn't keep track of in-use master
keys at all. But, with the ->s_master_keys keyring it is now possible.
Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY. It finds the specified
master key in ->s_master_keys, then wipes the secret key itself, which
prevents any additional inodes from being unlocked with the key. Then,
it syncs the filesystem and evicts the inodes in the key's list. The
normal inode eviction code will free and wipe the per-file keys (in
->i_crypt_info). Note that freeing ->i_crypt_info without evicting the
inodes was also considered, but would have been racy.
Some inodes may still be in use when a master key is removed, and we
can't simply revoke random file descriptors, mmap's, etc. Thus, the
ioctl simply skips in-use inodes, and returns -EBUSY to indicate that
some inodes weren't evicted. The master key *secret* is still removed,
but the fscrypt_master_key struct remains to keep track of the remaining
inodes. Userspace can then retry the ioctl to evict the remaining
inodes. Alternatively, if userspace adds the key again, the refreshed
secret will be associated with the existing list of inodes so they
remain correctly tracked for future key removals.
The ioctl doesn't wipe pagecache pages. Thus, we tolerate that after a
kernel compromise some portions of plaintext file contents may still be
recoverable from memory. This can be solved by enabling page poisoning
system-wide, which security conscious users may choose to do. But it's
very difficult to solve otherwise, e.g. note that plaintext file
contents may have been read in other places than pagecache pages.
Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is
initially restricted to privileged users only. This is sufficient for
some use cases, but not all. A later patch will relax this restriction,
but it will require introducing key hashes, among other changes.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
static inline int fscrypt_ioctl_remove_key(struct file *filp, void __user *arg)
|
2019-08-05 10:35:47 +08:00
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline int fscrypt_ioctl_remove_key_all_users(struct file *filp,
|
|
|
|
void __user *arg)
|
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl
Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY. This ioctl
removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY.
It wipes the secret key itself, then "locks" the encrypted files and
directories that had been unlocked using that key -- implemented by
evicting the relevant dentries and inodes from the VFS caches.
The problem this solves is that many fscrypt users want the ability to
remove encryption keys, causing the corresponding encrypted directories
to appear "locked" (presented in ciphertext form) again. Moreover,
users want removing an encryption key to *really* remove it, in the
sense that the removed keys cannot be recovered even if kernel memory is
compromised, e.g. by the exploit of a kernel security vulnerability or
by a physical attack. This is desirable after a user logs out of the
system, for example. In many cases users even already assume this to be
the case and are surprised to hear when it's not.
It is not sufficient to simply unlink the master key from the keyring
(or to revoke or invalidate it), since the actual encryption transform
objects are still pinned in memory by their inodes. Therefore, to
really remove a key we must also evict the relevant inodes.
Currently one workaround is to run 'sync && echo 2 >
/proc/sys/vm/drop_caches'. But, that evicts all unused inodes in the
system rather than just the inodes associated with the key being
removed, causing severe performance problems. Moreover, it requires
root privileges, so regular users can't "lock" their encrypted files.
Another workaround, used in Chromium OS kernels, is to add a new
VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of
drop_caches that operates on a single super_block. It does:
shrink_dcache_sb(sb);
invalidate_inodes(sb, false);
But it's still a hack. Yet, the major users of filesystem encryption
want this feature badly enough that they are actually using these hacks.
To properly solve the problem, start maintaining a list of the inodes
which have been "unlocked" using each master key. Originally this
wasn't possible because the kernel didn't keep track of in-use master
keys at all. But, with the ->s_master_keys keyring it is now possible.
Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY. It finds the specified
master key in ->s_master_keys, then wipes the secret key itself, which
prevents any additional inodes from being unlocked with the key. Then,
it syncs the filesystem and evicts the inodes in the key's list. The
normal inode eviction code will free and wipe the per-file keys (in
->i_crypt_info). Note that freeing ->i_crypt_info without evicting the
inodes was also considered, but would have been racy.
Some inodes may still be in use when a master key is removed, and we
can't simply revoke random file descriptors, mmap's, etc. Thus, the
ioctl simply skips in-use inodes, and returns -EBUSY to indicate that
some inodes weren't evicted. The master key *secret* is still removed,
but the fscrypt_master_key struct remains to keep track of the remaining
inodes. Userspace can then retry the ioctl to evict the remaining
inodes. Alternatively, if userspace adds the key again, the refreshed
secret will be associated with the existing list of inodes so they
remain correctly tracked for future key removals.
The ioctl doesn't wipe pagecache pages. Thus, we tolerate that after a
kernel compromise some portions of plaintext file contents may still be
recoverable from memory. This can be solved by enabling page poisoning
system-wide, which security conscious users may choose to do. But it's
very difficult to solve otherwise, e.g. note that plaintext file
contents may have been read in other places than pagecache pages.
Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is
initially restricted to privileged users only. This is sufficient for
some use cases, but not all. A later patch will relax this restriction,
but it will require introducing key hashes, among other changes.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl
Add a new fscrypt ioctl, FS_IOC_GET_ENCRYPTION_KEY_STATUS. Given a key
specified by 'struct fscrypt_key_specifier' (the same way a key is
specified for the other fscrypt key management ioctls), it returns
status information in a 'struct fscrypt_get_key_status_arg'.
The main motivation for this is that applications need to be able to
check whether an encrypted directory is "unlocked" or not, so that they
can add the key if it is not, and avoid adding the key (which may
involve prompting the user for a passphrase) if it already is.
It's possible to use some workarounds such as checking whether opening a
regular file fails with ENOKEY, or checking whether the filenames "look
like gibberish" or not. However, no workaround is usable in all cases.
Like the other key management ioctls, the keyrings syscalls may seem at
first to be a good fit for this. Unfortunately, they are not. Even if
we exposed the keyring ID of the ->s_master_keys keyring and gave
everyone Search permission on it (note: currently the keyrings
permission system would also allow everyone to "invalidate" the keyring
too), the fscrypt keys have an additional state that doesn't map cleanly
to the keyrings API: the secret can be removed, but we can be still
tracking the files that were using the key, and the removal can be
re-attempted or the secret added again.
After later patches, some applications will also need a way to determine
whether a key was added by the current user vs. by some other user.
Reserved fields are included in fscrypt_get_key_status_arg for this and
other future extensions.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
static inline int fscrypt_ioctl_get_key_status(struct file *filp,
|
|
|
|
void __user *arg)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
2019-08-05 10:35:45 +08:00
|
|
|
/* keysetup.c */
|
2018-12-12 17:50:12 +08:00
|
|
|
static inline int fscrypt_get_encryption_info(struct inode *inode)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void fscrypt_put_encryption_info(struct inode *inode)
|
|
|
|
{
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2019-04-11 04:21:15 +08:00
|
|
|
static inline void fscrypt_free_inode(struct inode *inode)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl
Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY. This ioctl
removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY.
It wipes the secret key itself, then "locks" the encrypted files and
directories that had been unlocked using that key -- implemented by
evicting the relevant dentries and inodes from the VFS caches.
The problem this solves is that many fscrypt users want the ability to
remove encryption keys, causing the corresponding encrypted directories
to appear "locked" (presented in ciphertext form) again. Moreover,
users want removing an encryption key to *really* remove it, in the
sense that the removed keys cannot be recovered even if kernel memory is
compromised, e.g. by the exploit of a kernel security vulnerability or
by a physical attack. This is desirable after a user logs out of the
system, for example. In many cases users even already assume this to be
the case and are surprised to hear when it's not.
It is not sufficient to simply unlink the master key from the keyring
(or to revoke or invalidate it), since the actual encryption transform
objects are still pinned in memory by their inodes. Therefore, to
really remove a key we must also evict the relevant inodes.
Currently one workaround is to run 'sync && echo 2 >
/proc/sys/vm/drop_caches'. But, that evicts all unused inodes in the
system rather than just the inodes associated with the key being
removed, causing severe performance problems. Moreover, it requires
root privileges, so regular users can't "lock" their encrypted files.
Another workaround, used in Chromium OS kernels, is to add a new
VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of
drop_caches that operates on a single super_block. It does:
shrink_dcache_sb(sb);
invalidate_inodes(sb, false);
But it's still a hack. Yet, the major users of filesystem encryption
want this feature badly enough that they are actually using these hacks.
To properly solve the problem, start maintaining a list of the inodes
which have been "unlocked" using each master key. Originally this
wasn't possible because the kernel didn't keep track of in-use master
keys at all. But, with the ->s_master_keys keyring it is now possible.
Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY. It finds the specified
master key in ->s_master_keys, then wipes the secret key itself, which
prevents any additional inodes from being unlocked with the key. Then,
it syncs the filesystem and evicts the inodes in the key's list. The
normal inode eviction code will free and wipe the per-file keys (in
->i_crypt_info). Note that freeing ->i_crypt_info without evicting the
inodes was also considered, but would have been racy.
Some inodes may still be in use when a master key is removed, and we
can't simply revoke random file descriptors, mmap's, etc. Thus, the
ioctl simply skips in-use inodes, and returns -EBUSY to indicate that
some inodes weren't evicted. The master key *secret* is still removed,
but the fscrypt_master_key struct remains to keep track of the remaining
inodes. Userspace can then retry the ioctl to evict the remaining
inodes. Alternatively, if userspace adds the key again, the refreshed
secret will be associated with the existing list of inodes so they
remain correctly tracked for future key removals.
The ioctl doesn't wipe pagecache pages. Thus, we tolerate that after a
kernel compromise some portions of plaintext file contents may still be
recoverable from memory. This can be solved by enabling page poisoning
system-wide, which security conscious users may choose to do. But it's
very difficult to solve otherwise, e.g. note that plaintext file
contents may have been read in other places than pagecache pages.
Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is
initially restricted to privileged users only. This is sufficient for
some use cases, but not all. A later patch will relax this restriction,
but it will require introducing key hashes, among other changes.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-05 10:35:46 +08:00
|
|
|
static inline int fscrypt_drop_inode(struct inode *inode)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
/* fname.c */
|
|
|
|
static inline int fscrypt_setup_filename(struct inode *dir,
|
|
|
|
const struct qstr *iname,
|
|
|
|
int lookup, struct fscrypt_name *fname)
|
|
|
|
{
|
|
|
|
if (IS_ENCRYPTED(dir))
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
|
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
->lookup() in an encrypted directory begins as follows:
1. fscrypt_prepare_lookup():
a. Try to load the directory's encryption key.
b. If the key is unavailable, mark the dentry as a ciphertext name
via d_flags.
2. fscrypt_setup_filename():
a. Try to load the directory's encryption key.
b. If the key is available, encrypt the name (treated as a plaintext
name) to get the on-disk name. Otherwise decode the name
(treated as a ciphertext name) to get the on-disk name.
But if the key is concurrently added, it may be found at (2a) but not at
(1a). In this case, the dentry will be wrongly marked as a ciphertext
name even though it was actually treated as plaintext.
This will cause the dentry to be wrongly invalidated on the next lookup,
potentially causing problems. For example, if the racy ->lookup() was
part of sys_mount(), then the new mount will be detached when anything
tries to access it. This is despite the mountpoint having a plaintext
path, which should remain valid now that the key was added.
Of course, this is only possible if there's a userspace race. Still,
the additional kernel-side race is confusing and unexpected.
Close the kernel-side race by changing fscrypt_prepare_lookup() to also
set the on-disk filename (step 2b), consistent with the d_flags update.
Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-21 02:39:13 +08:00
|
|
|
memset(fname, 0, sizeof(*fname));
|
2018-12-12 17:50:12 +08:00
|
|
|
fname->usr_fname = iname;
|
|
|
|
fname->disk_name.name = (unsigned char *)iname->name;
|
|
|
|
fname->disk_name.len = iname->len;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void fscrypt_free_filename(struct fscrypt_name *fname)
|
|
|
|
{
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline int fscrypt_fname_alloc_buffer(const struct inode *inode,
|
|
|
|
u32 max_encrypted_len,
|
|
|
|
struct fscrypt_str *crypto_str)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void fscrypt_fname_free_buffer(struct fscrypt_str *crypto_str)
|
|
|
|
{
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2019-12-16 05:39:47 +08:00
|
|
|
static inline int fscrypt_fname_disk_to_usr(const struct inode *inode,
|
2018-12-12 17:50:12 +08:00
|
|
|
u32 hash, u32 minor_hash,
|
|
|
|
const struct fscrypt_str *iname,
|
|
|
|
struct fscrypt_str *oname)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool fscrypt_match_name(const struct fscrypt_name *fname,
|
|
|
|
const u8 *de_name, u32 de_name_len)
|
|
|
|
{
|
|
|
|
/* Encryption support disabled; use standard comparison */
|
|
|
|
if (de_name_len != fname->disk_name.len)
|
|
|
|
return false;
|
|
|
|
return !memcmp(de_name, fname->disk_name.name, fname->disk_name.len);
|
|
|
|
}
|
|
|
|
|
fscrypt: derive dirhash key for casefolded directories
When we allow indexed directories to use both encryption and
casefolding, for the dirhash we can't just hash the ciphertext filenames
that are stored on-disk (as is done currently) because the dirhash must
be case insensitive, but the stored names are case-preserving. Nor can
we hash the plaintext names with an unkeyed hash (or a hash keyed with a
value stored on-disk like ext4's s_hash_seed), since that would leak
information about the names that encryption is meant to protect.
Instead, if we can accept a dirhash that's only computable when the
fscrypt key is available, we can hash the plaintext names with a keyed
hash using a secret key derived from the directory's fscrypt master key.
We'll use SipHash-2-4 for this purpose.
Prepare for this by deriving a SipHash key for each casefolded encrypted
directory. Make sure to handle deriving the key not only when setting
up the directory's fscrypt_info, but also in the case where the casefold
flag is enabled after the fscrypt_info was already set up. (We could
just always derive the key regardless of casefolding, but that would
introduce unnecessary overhead for people not using casefolding.)
Signed-off-by: Daniel Rosenberg <drosen@google.com>
[EB: improved commit message, updated fscrypt.rst, squashed with change
that avoids unnecessarily deriving the key, and many other cleanups]
Link: https://lore.kernel.org/r/20200120223201.241390-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-01-21 06:31:57 +08:00
|
|
|
static inline u64 fscrypt_fname_siphash(const struct inode *dir,
|
|
|
|
const struct qstr *name)
|
|
|
|
{
|
|
|
|
WARN_ON_ONCE(1);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
/* bio.c */
|
|
|
|
static inline void fscrypt_decrypt_bio(struct bio *bio)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
|
|
|
|
sector_t pblk, unsigned int len)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* hooks.c */
|
|
|
|
|
|
|
|
static inline int fscrypt_file_open(struct inode *inode, struct file *filp)
|
|
|
|
{
|
|
|
|
if (IS_ENCRYPTED(inode))
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2019-03-21 02:39:10 +08:00
|
|
|
static inline int __fscrypt_prepare_link(struct inode *inode, struct inode *dir,
|
|
|
|
struct dentry *dentry)
|
2018-12-12 17:50:12 +08:00
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline int __fscrypt_prepare_rename(struct inode *old_dir,
|
|
|
|
struct dentry *old_dentry,
|
|
|
|
struct inode *new_dir,
|
|
|
|
struct dentry *new_dentry,
|
|
|
|
unsigned int flags)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline int __fscrypt_prepare_lookup(struct inode *dir,
|
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
->lookup() in an encrypted directory begins as follows:
1. fscrypt_prepare_lookup():
a. Try to load the directory's encryption key.
b. If the key is unavailable, mark the dentry as a ciphertext name
via d_flags.
2. fscrypt_setup_filename():
a. Try to load the directory's encryption key.
b. If the key is available, encrypt the name (treated as a plaintext
name) to get the on-disk name. Otherwise decode the name
(treated as a ciphertext name) to get the on-disk name.
But if the key is concurrently added, it may be found at (2a) but not at
(1a). In this case, the dentry will be wrongly marked as a ciphertext
name even though it was actually treated as plaintext.
This will cause the dentry to be wrongly invalidated on the next lookup,
potentially causing problems. For example, if the racy ->lookup() was
part of sys_mount(), then the new mount will be detached when anything
tries to access it. This is despite the mountpoint having a plaintext
path, which should remain valid now that the key was added.
Of course, this is only possible if there's a userspace race. Still,
the additional kernel-side race is confusing and unexpected.
Close the kernel-side race by changing fscrypt_prepare_lookup() to also
set the on-disk filename (step 2b), consistent with the d_flags update.
Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-21 02:39:13 +08:00
|
|
|
struct dentry *dentry,
|
|
|
|
struct fscrypt_name *fname)
|
2018-12-12 17:50:12 +08:00
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
2020-01-21 06:31:56 +08:00
|
|
|
static inline int fscrypt_prepare_setflags(struct inode *inode,
|
|
|
|
unsigned int oldflags,
|
|
|
|
unsigned int flags)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
static inline int __fscrypt_prepare_symlink(struct inode *dir,
|
|
|
|
unsigned int len,
|
|
|
|
unsigned int max_len,
|
|
|
|
struct fscrypt_str *disk_link)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
static inline int __fscrypt_encrypt_symlink(struct inode *inode,
|
|
|
|
const char *target,
|
|
|
|
unsigned int len,
|
|
|
|
struct fscrypt_str *disk_link)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline const char *fscrypt_get_symlink(struct inode *inode,
|
|
|
|
const void *caddr,
|
|
|
|
unsigned int max_size,
|
|
|
|
struct delayed_call *done)
|
|
|
|
{
|
|
|
|
return ERR_PTR(-EOPNOTSUPP);
|
|
|
|
}
|
2019-03-26 15:52:31 +08:00
|
|
|
|
|
|
|
static inline void fscrypt_set_ops(struct super_block *sb,
|
|
|
|
const struct fscrypt_operations *s_cop)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
2018-12-12 17:50:12 +08:00
|
|
|
#endif /* !CONFIG_FS_ENCRYPTION */
|
2017-10-10 03:15:34 +08:00
|
|
|
|
2017-10-10 03:15:39 +08:00
|
|
|
/**
|
2020-05-12 03:13:56 +08:00
|
|
|
* fscrypt_require_key() - require an inode's encryption key
|
2017-10-10 03:15:39 +08:00
|
|
|
* @inode: the inode we need the key for
|
|
|
|
*
|
|
|
|
* If the inode is encrypted, set up its encryption key if not already done.
|
|
|
|
* Then require that the key be present and return -ENOKEY otherwise.
|
|
|
|
*
|
|
|
|
* No locks are needed, and the key will live as long as the struct inode --- so
|
|
|
|
* it won't go away from under you.
|
|
|
|
*
|
|
|
|
* Return: 0 on success, -ENOKEY if the key is missing, or another -errno code
|
|
|
|
* if a problem occurred while setting up the encryption key.
|
|
|
|
*/
|
|
|
|
static inline int fscrypt_require_key(struct inode *inode)
|
|
|
|
{
|
|
|
|
if (IS_ENCRYPTED(inode)) {
|
|
|
|
int err = fscrypt_get_encryption_info(inode);
|
|
|
|
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
if (!fscrypt_has_encryption_key(inode))
|
|
|
|
return -ENOKEY;
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
2017-10-10 03:15:34 +08:00
|
|
|
|
2017-10-10 03:15:41 +08:00
|
|
|
/**
|
2020-05-12 03:13:56 +08:00
|
|
|
* fscrypt_prepare_link() - prepare to link an inode into a possibly-encrypted
|
|
|
|
* directory
|
2017-10-10 03:15:41 +08:00
|
|
|
* @old_dentry: an existing dentry for the inode being linked
|
|
|
|
* @dir: the target directory
|
|
|
|
* @dentry: negative dentry for the target filename
|
|
|
|
*
|
|
|
|
* A new link can only be added to an encrypted directory if the directory's
|
|
|
|
* encryption key is available --- since otherwise we'd have no way to encrypt
|
|
|
|
* the filename. Therefore, we first set up the directory's encryption key (if
|
|
|
|
* not already done) and return an error if it's unavailable.
|
|
|
|
*
|
|
|
|
* We also verify that the link will not violate the constraint that all files
|
|
|
|
* in an encrypted directory tree use the same encryption policy.
|
|
|
|
*
|
|
|
|
* Return: 0 on success, -ENOKEY if the directory's encryption key is missing,
|
fscrypt: return -EXDEV for incompatible rename or link into encrypted dir
Currently, trying to rename or link a regular file, directory, or
symlink into an encrypted directory fails with EPERM when the source
file is unencrypted or is encrypted with a different encryption policy,
and is on the same mountpoint. It is correct for the operation to fail,
but the choice of EPERM breaks tools like 'mv' that know to copy rather
than rename if they see EXDEV, but don't know what to do with EPERM.
Our original motivation for EPERM was to encourage users to securely
handle their data. Encrypting files by "moving" them into an encrypted
directory can be insecure because the unencrypted data may remain in
free space on disk, where it can later be recovered by an attacker.
It's much better to encrypt the data from the start, or at least try to
securely delete the source data e.g. using the 'shred' program.
However, the current behavior hasn't been effective at achieving its
goal because users tend to be confused, hack around it, and complain;
see e.g. https://github.com/google/fscrypt/issues/76. And in some cases
it's actually inconsistent or unnecessary. For example, 'mv'-ing files
between differently encrypted directories doesn't work even in cases
where it can be secure, such as when in userspace the same passphrase
protects both directories. Yet, you *can* already 'mv' unencrypted
files into an encrypted directory if the source files are on a different
mountpoint, even though doing so is often insecure.
There are probably better ways to teach users to securely handle their
files. For example, the 'fscrypt' userspace tool could provide a
command that migrates unencrypted files into an encrypted directory,
acting like 'shred' on the source files and providing appropriate
warnings depending on the type of the source filesystem and disk.
Receiving errors on unimportant files might also force some users to
disable encryption, thus making the behavior counterproductive. It's
desirable to make encryption as unobtrusive as possible.
Therefore, change the error code from EPERM to EXDEV so that tools
looking for EXDEV will fall back to a copy.
This, of course, doesn't prevent users from still doing the right things
to securely manage their files. Note that this also matches the
behavior when a file is renamed between two project quota hierarchies;
so there's precedent for using EXDEV for things other than mountpoints.
xfstests generic/398 will require an update with this change.
[Rewritten from an earlier patch series by Michael Halcrow.]
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Joe Richey <joerichey@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-01-23 08:20:21 +08:00
|
|
|
* -EXDEV if the link would result in an inconsistent encryption policy, or
|
2017-10-10 03:15:41 +08:00
|
|
|
* another -errno code.
|
|
|
|
*/
|
|
|
|
static inline int fscrypt_prepare_link(struct dentry *old_dentry,
|
|
|
|
struct inode *dir,
|
|
|
|
struct dentry *dentry)
|
|
|
|
{
|
|
|
|
if (IS_ENCRYPTED(dir))
|
2019-03-21 02:39:10 +08:00
|
|
|
return __fscrypt_prepare_link(d_inode(old_dentry), dir, dentry);
|
2017-10-10 03:15:41 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2017-10-10 03:15:42 +08:00
|
|
|
/**
|
2020-05-12 03:13:56 +08:00
|
|
|
* fscrypt_prepare_rename() - prepare for a rename between possibly-encrypted
|
|
|
|
* directories
|
2017-10-10 03:15:42 +08:00
|
|
|
* @old_dir: source directory
|
|
|
|
* @old_dentry: dentry for source file
|
|
|
|
* @new_dir: target directory
|
|
|
|
* @new_dentry: dentry for target location (may be negative unless exchanging)
|
|
|
|
* @flags: rename flags (we care at least about %RENAME_EXCHANGE)
|
|
|
|
*
|
|
|
|
* Prepare for ->rename() where the source and/or target directories may be
|
|
|
|
* encrypted. A new link can only be added to an encrypted directory if the
|
|
|
|
* directory's encryption key is available --- since otherwise we'd have no way
|
|
|
|
* to encrypt the filename. A rename to an existing name, on the other hand,
|
|
|
|
* *is* cryptographically possible without the key. However, we take the more
|
|
|
|
* conservative approach and just forbid all no-key renames.
|
|
|
|
*
|
|
|
|
* We also verify that the rename will not violate the constraint that all files
|
|
|
|
* in an encrypted directory tree use the same encryption policy.
|
|
|
|
*
|
fscrypt: return -EXDEV for incompatible rename or link into encrypted dir
Currently, trying to rename or link a regular file, directory, or
symlink into an encrypted directory fails with EPERM when the source
file is unencrypted or is encrypted with a different encryption policy,
and is on the same mountpoint. It is correct for the operation to fail,
but the choice of EPERM breaks tools like 'mv' that know to copy rather
than rename if they see EXDEV, but don't know what to do with EPERM.
Our original motivation for EPERM was to encourage users to securely
handle their data. Encrypting files by "moving" them into an encrypted
directory can be insecure because the unencrypted data may remain in
free space on disk, where it can later be recovered by an attacker.
It's much better to encrypt the data from the start, or at least try to
securely delete the source data e.g. using the 'shred' program.
However, the current behavior hasn't been effective at achieving its
goal because users tend to be confused, hack around it, and complain;
see e.g. https://github.com/google/fscrypt/issues/76. And in some cases
it's actually inconsistent or unnecessary. For example, 'mv'-ing files
between differently encrypted directories doesn't work even in cases
where it can be secure, such as when in userspace the same passphrase
protects both directories. Yet, you *can* already 'mv' unencrypted
files into an encrypted directory if the source files are on a different
mountpoint, even though doing so is often insecure.
There are probably better ways to teach users to securely handle their
files. For example, the 'fscrypt' userspace tool could provide a
command that migrates unencrypted files into an encrypted directory,
acting like 'shred' on the source files and providing appropriate
warnings depending on the type of the source filesystem and disk.
Receiving errors on unimportant files might also force some users to
disable encryption, thus making the behavior counterproductive. It's
desirable to make encryption as unobtrusive as possible.
Therefore, change the error code from EPERM to EXDEV so that tools
looking for EXDEV will fall back to a copy.
This, of course, doesn't prevent users from still doing the right things
to securely manage their files. Note that this also matches the
behavior when a file is renamed between two project quota hierarchies;
so there's precedent for using EXDEV for things other than mountpoints.
xfstests generic/398 will require an update with this change.
[Rewritten from an earlier patch series by Michael Halcrow.]
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Joe Richey <joerichey@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-01-23 08:20:21 +08:00
|
|
|
* Return: 0 on success, -ENOKEY if an encryption key is missing, -EXDEV if the
|
2017-10-10 03:15:42 +08:00
|
|
|
* rename would cause inconsistent encryption policies, or another -errno code.
|
|
|
|
*/
|
|
|
|
static inline int fscrypt_prepare_rename(struct inode *old_dir,
|
|
|
|
struct dentry *old_dentry,
|
|
|
|
struct inode *new_dir,
|
|
|
|
struct dentry *new_dentry,
|
|
|
|
unsigned int flags)
|
|
|
|
{
|
|
|
|
if (IS_ENCRYPTED(old_dir) || IS_ENCRYPTED(new_dir))
|
|
|
|
return __fscrypt_prepare_rename(old_dir, old_dentry,
|
|
|
|
new_dir, new_dentry, flags);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2017-10-10 03:15:43 +08:00
|
|
|
/**
|
2020-05-12 03:13:56 +08:00
|
|
|
* fscrypt_prepare_lookup() - prepare to lookup a name in a possibly-encrypted
|
|
|
|
* directory
|
2017-10-10 03:15:43 +08:00
|
|
|
* @dir: directory being searched
|
|
|
|
* @dentry: filename being looked up
|
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
->lookup() in an encrypted directory begins as follows:
1. fscrypt_prepare_lookup():
a. Try to load the directory's encryption key.
b. If the key is unavailable, mark the dentry as a ciphertext name
via d_flags.
2. fscrypt_setup_filename():
a. Try to load the directory's encryption key.
b. If the key is available, encrypt the name (treated as a plaintext
name) to get the on-disk name. Otherwise decode the name
(treated as a ciphertext name) to get the on-disk name.
But if the key is concurrently added, it may be found at (2a) but not at
(1a). In this case, the dentry will be wrongly marked as a ciphertext
name even though it was actually treated as plaintext.
This will cause the dentry to be wrongly invalidated on the next lookup,
potentially causing problems. For example, if the racy ->lookup() was
part of sys_mount(), then the new mount will be detached when anything
tries to access it. This is despite the mountpoint having a plaintext
path, which should remain valid now that the key was added.
Of course, this is only possible if there's a userspace race. Still,
the additional kernel-side race is confusing and unexpected.
Close the kernel-side race by changing fscrypt_prepare_lookup() to also
set the on-disk filename (step 2b), consistent with the d_flags update.
Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-21 02:39:13 +08:00
|
|
|
* @fname: (output) the name to use to search the on-disk directory
|
2017-10-10 03:15:43 +08:00
|
|
|
*
|
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
->lookup() in an encrypted directory begins as follows:
1. fscrypt_prepare_lookup():
a. Try to load the directory's encryption key.
b. If the key is unavailable, mark the dentry as a ciphertext name
via d_flags.
2. fscrypt_setup_filename():
a. Try to load the directory's encryption key.
b. If the key is available, encrypt the name (treated as a plaintext
name) to get the on-disk name. Otherwise decode the name
(treated as a ciphertext name) to get the on-disk name.
But if the key is concurrently added, it may be found at (2a) but not at
(1a). In this case, the dentry will be wrongly marked as a ciphertext
name even though it was actually treated as plaintext.
This will cause the dentry to be wrongly invalidated on the next lookup,
potentially causing problems. For example, if the racy ->lookup() was
part of sys_mount(), then the new mount will be detached when anything
tries to access it. This is despite the mountpoint having a plaintext
path, which should remain valid now that the key was added.
Of course, this is only possible if there's a userspace race. Still,
the additional kernel-side race is confusing and unexpected.
Close the kernel-side race by changing fscrypt_prepare_lookup() to also
set the on-disk filename (step 2b), consistent with the d_flags update.
Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-21 02:39:13 +08:00
|
|
|
* Prepare for ->lookup() in a directory which may be encrypted by determining
|
|
|
|
* the name that will actually be used to search the directory on-disk. Lookups
|
|
|
|
* can be done with or without the directory's encryption key; without the key,
|
2017-10-10 03:15:43 +08:00
|
|
|
* filenames are presented in encrypted form. Therefore, we'll try to set up
|
|
|
|
* the directory's encryption key, but even without it the lookup can continue.
|
|
|
|
*
|
2019-03-21 02:39:09 +08:00
|
|
|
* This also installs a custom ->d_revalidate() method which will invalidate the
|
|
|
|
* dentry if it was created without the key and the key is later added.
|
2017-10-10 03:15:43 +08:00
|
|
|
*
|
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
->lookup() in an encrypted directory begins as follows:
1. fscrypt_prepare_lookup():
a. Try to load the directory's encryption key.
b. If the key is unavailable, mark the dentry as a ciphertext name
via d_flags.
2. fscrypt_setup_filename():
a. Try to load the directory's encryption key.
b. If the key is available, encrypt the name (treated as a plaintext
name) to get the on-disk name. Otherwise decode the name
(treated as a ciphertext name) to get the on-disk name.
But if the key is concurrently added, it may be found at (2a) but not at
(1a). In this case, the dentry will be wrongly marked as a ciphertext
name even though it was actually treated as plaintext.
This will cause the dentry to be wrongly invalidated on the next lookup,
potentially causing problems. For example, if the racy ->lookup() was
part of sys_mount(), then the new mount will be detached when anything
tries to access it. This is despite the mountpoint having a plaintext
path, which should remain valid now that the key was added.
Of course, this is only possible if there's a userspace race. Still,
the additional kernel-side race is confusing and unexpected.
Close the kernel-side race by changing fscrypt_prepare_lookup() to also
set the on-disk filename (step 2b), consistent with the d_flags update.
Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-21 02:39:13 +08:00
|
|
|
* Return: 0 on success; -ENOENT if key is unavailable but the filename isn't a
|
|
|
|
* correctly formed encoded ciphertext name, so a negative dentry should be
|
|
|
|
* created; or another -errno code.
|
2017-10-10 03:15:43 +08:00
|
|
|
*/
|
|
|
|
static inline int fscrypt_prepare_lookup(struct inode *dir,
|
|
|
|
struct dentry *dentry,
|
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
->lookup() in an encrypted directory begins as follows:
1. fscrypt_prepare_lookup():
a. Try to load the directory's encryption key.
b. If the key is unavailable, mark the dentry as a ciphertext name
via d_flags.
2. fscrypt_setup_filename():
a. Try to load the directory's encryption key.
b. If the key is available, encrypt the name (treated as a plaintext
name) to get the on-disk name. Otherwise decode the name
(treated as a ciphertext name) to get the on-disk name.
But if the key is concurrently added, it may be found at (2a) but not at
(1a). In this case, the dentry will be wrongly marked as a ciphertext
name even though it was actually treated as plaintext.
This will cause the dentry to be wrongly invalidated on the next lookup,
potentially causing problems. For example, if the racy ->lookup() was
part of sys_mount(), then the new mount will be detached when anything
tries to access it. This is despite the mountpoint having a plaintext
path, which should remain valid now that the key was added.
Of course, this is only possible if there's a userspace race. Still,
the additional kernel-side race is confusing and unexpected.
Close the kernel-side race by changing fscrypt_prepare_lookup() to also
set the on-disk filename (step 2b), consistent with the d_flags update.
Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-21 02:39:13 +08:00
|
|
|
struct fscrypt_name *fname)
|
2017-10-10 03:15:43 +08:00
|
|
|
{
|
|
|
|
if (IS_ENCRYPTED(dir))
|
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
->lookup() in an encrypted directory begins as follows:
1. fscrypt_prepare_lookup():
a. Try to load the directory's encryption key.
b. If the key is unavailable, mark the dentry as a ciphertext name
via d_flags.
2. fscrypt_setup_filename():
a. Try to load the directory's encryption key.
b. If the key is available, encrypt the name (treated as a plaintext
name) to get the on-disk name. Otherwise decode the name
(treated as a ciphertext name) to get the on-disk name.
But if the key is concurrently added, it may be found at (2a) but not at
(1a). In this case, the dentry will be wrongly marked as a ciphertext
name even though it was actually treated as plaintext.
This will cause the dentry to be wrongly invalidated on the next lookup,
potentially causing problems. For example, if the racy ->lookup() was
part of sys_mount(), then the new mount will be detached when anything
tries to access it. This is despite the mountpoint having a plaintext
path, which should remain valid now that the key was added.
Of course, this is only possible if there's a userspace race. Still,
the additional kernel-side race is confusing and unexpected.
Close the kernel-side race by changing fscrypt_prepare_lookup() to also
set the on-disk filename (step 2b), consistent with the d_flags update.
Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-21 02:39:13 +08:00
|
|
|
return __fscrypt_prepare_lookup(dir, dentry, fname);
|
|
|
|
|
|
|
|
memset(fname, 0, sizeof(*fname));
|
|
|
|
fname->usr_fname = &dentry->d_name;
|
|
|
|
fname->disk_name.name = (unsigned char *)dentry->d_name.name;
|
|
|
|
fname->disk_name.len = dentry->d_name.len;
|
2017-10-10 03:15:43 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2017-10-10 03:15:44 +08:00
|
|
|
/**
|
2020-05-12 03:13:56 +08:00
|
|
|
* fscrypt_prepare_setattr() - prepare to change a possibly-encrypted inode's
|
|
|
|
* attributes
|
2017-10-10 03:15:44 +08:00
|
|
|
* @dentry: dentry through which the inode is being changed
|
|
|
|
* @attr: attributes to change
|
|
|
|
*
|
|
|
|
* Prepare for ->setattr() on a possibly-encrypted inode. On an encrypted file,
|
|
|
|
* most attribute changes are allowed even without the encryption key. However,
|
|
|
|
* without the encryption key we do have to forbid truncates. This is needed
|
|
|
|
* because the size being truncated to may not be a multiple of the filesystem
|
|
|
|
* block size, and in that case we'd have to decrypt the final block, zero the
|
|
|
|
* portion past i_size, and re-encrypt it. (We *could* allow truncating to a
|
|
|
|
* filesystem block boundary, but it's simpler to just forbid all truncates ---
|
|
|
|
* and we already forbid all other contents modifications without the key.)
|
|
|
|
*
|
|
|
|
* Return: 0 on success, -ENOKEY if the key is missing, or another -errno code
|
|
|
|
* if a problem occurred while setting up the encryption key.
|
|
|
|
*/
|
|
|
|
static inline int fscrypt_prepare_setattr(struct dentry *dentry,
|
|
|
|
struct iattr *attr)
|
|
|
|
{
|
|
|
|
if (attr->ia_valid & ATTR_SIZE)
|
|
|
|
return fscrypt_require_key(d_inode(dentry));
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
fscrypt: new helper functions for ->symlink()
Currently, filesystems supporting fscrypt need to implement some tricky
logic when creating encrypted symlinks, including handling a peculiar
on-disk format (struct fscrypt_symlink_data) and correctly calculating
the size of the encrypted symlink. Introduce helper functions to make
things a bit easier:
- fscrypt_prepare_symlink() computes and validates the size the symlink
target will require on-disk.
- fscrypt_encrypt_symlink() creates the encrypted target if needed.
The new helpers actually fix some subtle bugs. First, when checking
whether the symlink target was too long, filesystems didn't account for
the fact that the NUL padding is meant to be truncated if it would cause
the maximum length to be exceeded, as is done for filenames in
directories. Consequently users would receive ENAMETOOLONG when
creating symlinks close to what is supposed to be the maximum length.
For example, with EXT4 with a 4K block size, the maximum symlink target
length in an encrypted directory is supposed to be 4093 bytes (in
comparison to 4095 in an unencrypted directory), but in
FS_POLICY_FLAGS_PAD_32-mode only up to 4064 bytes were accepted.
Second, symlink targets of "." and ".." were not being encrypted, even
though they should be, as these names are special in *directory entries*
but not in symlink targets. Fortunately, we can fix this simply by
starting to encrypt them, as old kernels already accept them in
encrypted form.
Third, the output string length the filesystems were providing when
doing the actual encryption was incorrect, as it was forgotten to
exclude 'sizeof(struct fscrypt_symlink_data)'. Fortunately though, this
bug didn't make a difference.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-06 02:45:01 +08:00
|
|
|
/**
|
2020-05-12 03:13:56 +08:00
|
|
|
* fscrypt_prepare_symlink() - prepare to create a possibly-encrypted symlink
|
fscrypt: new helper functions for ->symlink()
Currently, filesystems supporting fscrypt need to implement some tricky
logic when creating encrypted symlinks, including handling a peculiar
on-disk format (struct fscrypt_symlink_data) and correctly calculating
the size of the encrypted symlink. Introduce helper functions to make
things a bit easier:
- fscrypt_prepare_symlink() computes and validates the size the symlink
target will require on-disk.
- fscrypt_encrypt_symlink() creates the encrypted target if needed.
The new helpers actually fix some subtle bugs. First, when checking
whether the symlink target was too long, filesystems didn't account for
the fact that the NUL padding is meant to be truncated if it would cause
the maximum length to be exceeded, as is done for filenames in
directories. Consequently users would receive ENAMETOOLONG when
creating symlinks close to what is supposed to be the maximum length.
For example, with EXT4 with a 4K block size, the maximum symlink target
length in an encrypted directory is supposed to be 4093 bytes (in
comparison to 4095 in an unencrypted directory), but in
FS_POLICY_FLAGS_PAD_32-mode only up to 4064 bytes were accepted.
Second, symlink targets of "." and ".." were not being encrypted, even
though they should be, as these names are special in *directory entries*
but not in symlink targets. Fortunately, we can fix this simply by
starting to encrypt them, as old kernels already accept them in
encrypted form.
Third, the output string length the filesystems were providing when
doing the actual encryption was incorrect, as it was forgotten to
exclude 'sizeof(struct fscrypt_symlink_data)'. Fortunately though, this
bug didn't make a difference.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-06 02:45:01 +08:00
|
|
|
* @dir: directory in which the symlink is being created
|
|
|
|
* @target: plaintext symlink target
|
|
|
|
* @len: length of @target excluding null terminator
|
|
|
|
* @max_len: space the filesystem has available to store the symlink target
|
|
|
|
* @disk_link: (out) the on-disk symlink target being prepared
|
|
|
|
*
|
|
|
|
* This function computes the size the symlink target will require on-disk,
|
|
|
|
* stores it in @disk_link->len, and validates it against @max_len. An
|
|
|
|
* encrypted symlink may be longer than the original.
|
|
|
|
*
|
|
|
|
* Additionally, @disk_link->name is set to @target if the symlink will be
|
|
|
|
* unencrypted, but left NULL if the symlink will be encrypted. For encrypted
|
|
|
|
* symlinks, the filesystem must call fscrypt_encrypt_symlink() to create the
|
|
|
|
* on-disk target later. (The reason for the two-step process is that some
|
|
|
|
* filesystems need to know the size of the symlink target before creating the
|
|
|
|
* inode, e.g. to determine whether it will be a "fast" or "slow" symlink.)
|
|
|
|
*
|
|
|
|
* Return: 0 on success, -ENAMETOOLONG if the symlink target is too long,
|
|
|
|
* -ENOKEY if the encryption key is missing, or another -errno code if a problem
|
|
|
|
* occurred while setting up the encryption key.
|
|
|
|
*/
|
|
|
|
static inline int fscrypt_prepare_symlink(struct inode *dir,
|
|
|
|
const char *target,
|
|
|
|
unsigned int len,
|
|
|
|
unsigned int max_len,
|
|
|
|
struct fscrypt_str *disk_link)
|
|
|
|
{
|
|
|
|
if (IS_ENCRYPTED(dir) || fscrypt_dummy_context_enabled(dir))
|
|
|
|
return __fscrypt_prepare_symlink(dir, len, max_len, disk_link);
|
|
|
|
|
|
|
|
disk_link->name = (unsigned char *)target;
|
|
|
|
disk_link->len = len + 1;
|
|
|
|
if (disk_link->len > max_len)
|
|
|
|
return -ENAMETOOLONG;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2020-05-12 03:13:56 +08:00
|
|
|
* fscrypt_encrypt_symlink() - encrypt the symlink target if needed
|
fscrypt: new helper functions for ->symlink()
Currently, filesystems supporting fscrypt need to implement some tricky
logic when creating encrypted symlinks, including handling a peculiar
on-disk format (struct fscrypt_symlink_data) and correctly calculating
the size of the encrypted symlink. Introduce helper functions to make
things a bit easier:
- fscrypt_prepare_symlink() computes and validates the size the symlink
target will require on-disk.
- fscrypt_encrypt_symlink() creates the encrypted target if needed.
The new helpers actually fix some subtle bugs. First, when checking
whether the symlink target was too long, filesystems didn't account for
the fact that the NUL padding is meant to be truncated if it would cause
the maximum length to be exceeded, as is done for filenames in
directories. Consequently users would receive ENAMETOOLONG when
creating symlinks close to what is supposed to be the maximum length.
For example, with EXT4 with a 4K block size, the maximum symlink target
length in an encrypted directory is supposed to be 4093 bytes (in
comparison to 4095 in an unencrypted directory), but in
FS_POLICY_FLAGS_PAD_32-mode only up to 4064 bytes were accepted.
Second, symlink targets of "." and ".." were not being encrypted, even
though they should be, as these names are special in *directory entries*
but not in symlink targets. Fortunately, we can fix this simply by
starting to encrypt them, as old kernels already accept them in
encrypted form.
Third, the output string length the filesystems were providing when
doing the actual encryption was incorrect, as it was forgotten to
exclude 'sizeof(struct fscrypt_symlink_data)'. Fortunately though, this
bug didn't make a difference.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-01-06 02:45:01 +08:00
|
|
|
* @inode: symlink inode
|
|
|
|
* @target: plaintext symlink target
|
|
|
|
* @len: length of @target excluding null terminator
|
|
|
|
* @disk_link: (in/out) the on-disk symlink target being prepared
|
|
|
|
*
|
|
|
|
* If the symlink target needs to be encrypted, then this function encrypts it
|
|
|
|
* into @disk_link->name. fscrypt_prepare_symlink() must have been called
|
|
|
|
* previously to compute @disk_link->len. If the filesystem did not allocate a
|
|
|
|
* buffer for @disk_link->name after calling fscrypt_prepare_link(), then one
|
|
|
|
* will be kmalloc()'ed and the filesystem will be responsible for freeing it.
|
|
|
|
*
|
|
|
|
* Return: 0 on success, -errno on failure
|
|
|
|
*/
|
|
|
|
static inline int fscrypt_encrypt_symlink(struct inode *inode,
|
|
|
|
const char *target,
|
|
|
|
unsigned int len,
|
|
|
|
struct fscrypt_str *disk_link)
|
|
|
|
{
|
|
|
|
if (IS_ENCRYPTED(inode))
|
|
|
|
return __fscrypt_encrypt_symlink(inode, target, len, disk_link);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2019-05-21 00:29:39 +08:00
|
|
|
/* If *pagep is a bounce page, free it and set *pagep to the pagecache page */
|
|
|
|
static inline void fscrypt_finalize_bounce_page(struct page **pagep)
|
|
|
|
{
|
|
|
|
struct page *page = *pagep;
|
|
|
|
|
|
|
|
if (fscrypt_is_bounce_page(page)) {
|
|
|
|
*pagep = fscrypt_pagecache_page(page);
|
|
|
|
fscrypt_free_bounce_page(page);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2017-10-10 03:15:34 +08:00
|
|
|
#endif /* _LINUX_FSCRYPT_H */
|