Documentation: path-lookup: include new LOOKUP flags

Now that we have new LOOKUP flags, we should document them in the
relevant path-walking documentation. And now that we've settled on a
common name for nd_jump_link() style symlinks ("magic links"), use that
term where magic-link semantics are described.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This commit is contained in:
Aleksa Sarai 2019-12-07 01:13:38 +11:00 committed by Al Viro
parent b28a10aedc
commit b55eef872a
1 changed files with 62 additions and 6 deletions

View File

@ -13,6 +13,7 @@ It has subsequently been updated to reflect changes in the kernel
including: including:
- per-directory parallel name lookup. - per-directory parallel name lookup.
- ``openat2()`` resolution restriction flags.
Introduction to pathname lookup Introduction to pathname lookup
=============================== ===============================
@ -235,6 +236,13 @@ renamed. If ``d_lookup`` finds that a rename happened while it
unsuccessfully scanned a chain in the hash table, it simply tries unsuccessfully scanned a chain in the hash table, it simply tries
again. again.
``rename_lock`` is also used to detect and defend against potential attacks
against ``LOOKUP_BENEATH`` and ``LOOKUP_IN_ROOT`` when resolving ".." (where
the parent directory is moved outside the root, bypassing the ``path_equal()``
check). If ``rename_lock`` is updated during the lookup and the path encounters
a "..", a potential attack occurred and ``handle_dots()`` will bail out with
``-EAGAIN``.
inode->i_rwsem inode->i_rwsem
~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~
@ -348,6 +356,13 @@ any changes to any mount points while stepping up. This locking is
needed to stabilize the link to the mounted-on dentry, which the needed to stabilize the link to the mounted-on dentry, which the
refcount on the mount itself doesn't ensure. refcount on the mount itself doesn't ensure.
``mount_lock`` is also used to detect and defend against potential attacks
against ``LOOKUP_BENEATH`` and ``LOOKUP_IN_ROOT`` when resolving ".." (where
the parent directory is moved outside the root, bypassing the ``path_equal()``
check). If ``mount_lock`` is updated during the lookup and the path encounters
a "..", a potential attack occurred and ``handle_dots()`` will bail out with
``-EAGAIN``.
RCU RCU
~~~ ~~~
@ -405,6 +420,10 @@ is requested. Keeping a reference in the ``nameidata`` ensures that
only one root is in effect for the entire path walk, even if it races only one root is in effect for the entire path walk, even if it races
with a ``chroot()`` system call. with a ``chroot()`` system call.
It should be noted that in the case of ``LOOKUP_IN_ROOT`` or
``LOOKUP_BENEATH``, the effective root becomes the directory file descriptor
passed to ``openat2()`` (which exposes these ``LOOKUP_`` flags).
The root is needed when either of two conditions holds: (1) either the The root is needed when either of two conditions holds: (1) either the
pathname or a symbolic link starts with a "'/'", or (2) a "``..``" pathname or a symbolic link starts with a "'/'", or (2) a "``..``"
component is being handled, since "``..``" from the root must always stay component is being handled, since "``..``" from the root must always stay
@ -1149,7 +1168,7 @@ so ``NULL`` is returned to indicate that the symlink can be released and
the stack frame discarded. the stack frame discarded.
The other case involves things in ``/proc`` that look like symlinks but The other case involves things in ``/proc`` that look like symlinks but
aren't really:: aren't really (and are therefore commonly referred to as "magic-links")::
$ ls -l /proc/self/fd/1 $ ls -l /proc/self/fd/1
lrwx------ 1 neilb neilb 64 Jun 13 10:19 /proc/self/fd/1 -> /dev/pts/4 lrwx------ 1 neilb neilb 64 Jun 13 10:19 /proc/self/fd/1 -> /dev/pts/4
@ -1286,7 +1305,9 @@ A few flags
A suitable way to wrap up this tour of pathname walking is to list A suitable way to wrap up this tour of pathname walking is to list
the various flags that can be stored in the ``nameidata`` to guide the the various flags that can be stored in the ``nameidata`` to guide the
lookup process. Many of these are only meaningful on the final lookup process. Many of these are only meaningful on the final
component, others reflect the current state of the pathname lookup. component, others reflect the current state of the pathname lookup, and some
apply restrictions to all path components encountered in the path lookup.
And then there is ``LOOKUP_EMPTY``, which doesn't fit conceptually with And then there is ``LOOKUP_EMPTY``, which doesn't fit conceptually with
the others. If this is not set, an empty pathname causes an error the others. If this is not set, an empty pathname causes an error
very early on. If it is set, empty pathnames are not considered to be very early on. If it is set, empty pathnames are not considered to be
@ -1310,13 +1331,48 @@ longer needed.
``LOOKUP_JUMPED`` means that the current dentry was chosen not because ``LOOKUP_JUMPED`` means that the current dentry was chosen not because
it had the right name but for some other reason. This happens when it had the right name but for some other reason. This happens when
following "``..``", following a symlink to ``/``, crossing a mount point following "``..``", following a symlink to ``/``, crossing a mount point
or accessing a "``/proc/$PID/fd/$FD``" symlink. In this case the or accessing a "``/proc/$PID/fd/$FD``" symlink (also known as a "magic
filesystem has not been asked to revalidate the name (with link"). In this case the filesystem has not been asked to revalidate the
``d_revalidate()``). In such cases the inode may still need to be name (with ``d_revalidate()``). In such cases the inode may still need
revalidated, so ``d_op->d_weak_revalidate()`` is called if to be revalidated, so ``d_op->d_weak_revalidate()`` is called if
``LOOKUP_JUMPED`` is set when the look completes - which may be at the ``LOOKUP_JUMPED`` is set when the look completes - which may be at the
final component or, when creating, unlinking, or renaming, at the penultimate component. final component or, when creating, unlinking, or renaming, at the penultimate component.
Resolution-restriction flags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to allow userspace to protect itself against certain race conditions
and attack scenarios involving changing path components, a series of flags are
available which apply restrictions to all path components encountered during
path lookup. These flags are exposed through ``openat2()``'s ``resolve`` field.
``LOOKUP_NO_SYMLINKS`` blocks all symlink traversals (including magic-links).
This is distinctly different from ``LOOKUP_FOLLOW``, because the latter only
relates to restricting the following of trailing symlinks.
``LOOKUP_NO_MAGICLINKS`` blocks all magic-link traversals. Filesystems must
ensure that they return errors from ``nd_jump_link()``, because that is how
``LOOKUP_NO_MAGICLINKS`` and other magic-link restrictions are implemented.
``LOOKUP_NO_XDEV`` blocks all ``vfsmount`` traversals (this includes both
bind-mounts and ordinary mounts). Note that the ``vfsmount`` which contains the
lookup is determined by the first mountpoint the path lookup reaches --
absolute paths start with the ``vfsmount`` of ``/``, and relative paths start
with the ``dfd``'s ``vfsmount``. Magic-links are only permitted if the
``vfsmount`` of the path is unchanged.
``LOOKUP_BENEATH`` blocks any path components which resolve outside the
starting point of the resolution. This is done by blocking ``nd_jump_root()``
as well as blocking ".." if it would jump outside the starting point.
``rename_lock`` and ``mount_lock`` are used to detect attacks against the
resolution of "..". Magic-links are also blocked.
``LOOKUP_IN_ROOT`` resolves all path components as though the starting point
were the filesystem root. ``nd_jump_root()`` brings the resolution back to to
the starting point, and ".." at the starting point will act as a no-op. As with
``LOOKUP_BENEATH``, ``rename_lock`` and ``mount_lock`` are used to detect
attacks against ".." resolution. Magic-links are also blocked.
Final-component flags Final-component flags
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~