OpenCloudOS-Kernel/fs
Eric W. Biederman 22d917d80e userns: Rework the user_namespace adding uid/gid mapping support
- Convert the old uid mapping functions into compatibility wrappers
- Add a uid/gid mapping layer from user space uid and gids to kernel
  internal uids and gids that is extent based for simplicty and speed.
  * Working with number space after mapping uids/gids into their kernel
    internal version adds only mapping complexity over what we have today,
    leaving the kernel code easy to understand and test.
- Add proc files /proc/self/uid_map /proc/self/gid_map
  These files display the mapping and allow a mapping to be added
  if a mapping does not exist.
- Allow entering the user namespace without a uid or gid mapping.
  Since we are starting with an existing user our uids and gids
  still have global mappings so are still valid and useful they just don't
  have local mappings.  The requirement for things to work are global uid
  and gid so it is odd but perfectly fine not to have a local uid
  and gid mapping.
  Not requiring global uid and gid mappings greatly simplifies
  the logic of setting up the uid and gid mappings by allowing
  the mappings to be set after the namespace is created which makes the
  slight weirdness worth it.
- Make the mappings in the initial user namespace to the global
  uid/gid space explicit.  Today it is an identity mapping
  but in the future we may want to twist this for debugging, similar
  to what we do with jiffies.
- Document the memory ordering requirements of setting the uid and
  gid mappings.  We only allow the mappings to be set once
  and there are no pointers involved so the requirments are
  trivial but a little atypical.

Performance:

In this scheme for the permission checks the performance is expected to
stay the same as the actuall machine instructions should remain the same.

The worst case I could think of is ls -l on a large directory where
all of the stat results need to be translated with from kuids and
kgids to uids and gids.  So I benchmarked that case on my laptop
with a dual core hyperthread Intel i5-2520M cpu with 3M of cpu cache.

My benchmark consisted of going to single user mode where nothing else
was running. On an ext4 filesystem opening 1,000,000 files and looping
through all of the files 1000 times and calling fstat on the
individuals files.  This was to ensure I was benchmarking stat times
where the inodes were in the kernels cache, but the inode values were
not in the processors cache.  My results:

v3.4-rc1:         ~= 156ns (unmodified v3.4-rc1 with user namespace support disabled)
v3.4-rc1-userns-: ~= 155ns (v3.4-rc1 with my user namespace patches and user namespace support disabled)
v3.4-rc1-userns+: ~= 164ns (v3.4-rc1 with my user namespace patches and user namespace support enabled)

All of the configurations ran in roughly 120ns when I performed tests
that ran in the cpu cache.

So in summary the performance impact is:
1ns improvement in the worst case with user namespace support compiled out.
8ns aka 5% slowdown in the worst case with user namespace support compiled in.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-04-26 02:01:39 -07:00
..
9p 9p changes for the 3.4 merge window 2012-03-28 09:58:38 -07:00
adfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
affs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
afs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
autofs4 Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
befs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
bfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2012-03-30 12:44:29 -07:00
cachefiles switch touch_atime to struct path 2012-03-20 21:29:41 -04:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2012-03-28 10:01:29 -07:00
cifs [CIFS] Update CIFS version number to 1.77 2012-03-27 14:55:13 -05:00
coda Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
configfs make configfs_pin_fs() return root dentry on success 2012-03-20 21:29:48 -04:00
cramfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
debugfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
devpts Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
dlm dlm for 3.4 2012-03-21 13:54:22 -07:00
ecryptfs userns: Use cred->user_ns instead of cred->user->user_ns 2012-04-07 16:55:51 -07:00
efs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
exofs Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd 2012-03-28 20:04:27 -07:00
exportfs
ext2 migrate ext2_fs.h guts to fs/ext2/ext2.h 2012-03-31 16:03:16 -04:00
ext3 ext3: move headers to fs/ext3/ 2012-03-31 16:03:16 -04:00
ext4 Revert "ext4: don't release page refs in ext4_end_bio()" 2012-03-29 17:00:56 -07:00
fat fat: fix bug in enforcing Long File Name length 2012-03-23 16:58:40 -07:00
freevxfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
fscache
fuse Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
gfs2 get rid of pointless includes of ext2_fs.h 2012-03-31 16:03:15 -04:00
hfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
hfsplus hfsplus: add an ioctl to bless files 2012-03-20 21:29:53 -04:00
hostfs Merge branch 'for-linus-3.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml 2012-03-27 18:29:53 -07:00
hpfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
hppfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
hugetlbfs Merge branch 'akpm' (Andrew's patch-bomb) 2012-03-22 09:04:48 -07:00
isofs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
jbd Power management updates for 3.4 2012-03-21 10:15:51 -07:00
jbd2 Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
jffs2 MTD merge for 3.4 2012-03-30 17:31:56 -07:00
jfs jfs: mising cleanup on register_filesystem() failure 2012-03-20 21:29:48 -04:00
lockd Merge nfs containerization work from Trond's tree 2012-03-26 11:48:54 -04:00
logfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
minix Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
ncpfs Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
nfs NFS client bugfixes for Linux 3.4 2012-03-28 19:02:35 -07:00
nfs_common
nfsd Merge branch 'for-3.4' of git://linux-nfs.org/~bfields/linux 2012-03-29 14:53:25 -07:00
nilfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
nls NLS: raname "maxlen" to "maxout" in UTF conversion routines 2011-11-26 19:58:47 -08:00
notify fs/notify/notification.c: make subsys_initcall function static 2012-03-23 16:58:31 -07:00
ntfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
ocfs2 get rid of pointless includes of ext2_fs.h 2012-03-31 16:03:15 -04:00
omfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
openpromfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
proc userns: Rework the user_namespace adding uid/gid mapping support 2012-04-26 02:01:39 -07:00
pstore pstore: trim pstore_get_inode() 2012-03-31 16:03:15 -04:00
qnx4 qnx4: new helper - try_extent() 2012-03-20 21:29:52 -04:00
qnx6 fs: initial qnx6fs addition 2012-03-20 21:29:38 -04:00
quota Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-03-28 10:00:14 -07:00
ramfs tidy up after d_make_root() conversion 2012-03-20 21:29:37 -04:00
reiserfs Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
romfs MTD merge for 3.4 2012-03-30 17:31:56 -07:00
squashfs Add an extra mount time sanity check, plus some code cleanups and bug fixes. 2012-03-28 18:05:54 -07:00
sysfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
sysv switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
ubifs - Improve error messages 2012-03-23 09:27:40 -07:00
udf Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-03-28 10:00:14 -07:00
ufs Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
xfs Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
Kconfig Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
Kconfig.binfmt fs: binfmt_elf: create Kconfig variable for PIE randomization 2012-01-10 16:30:51 -08:00
Makefile fs: initial qnx6fs addition 2012-03-20 21:29:38 -04:00
aio.c aio: take final put_ioctx() into callers of io_destroy() 2012-03-31 16:03:15 -04:00
anon_inodes.c anon_inodes: move allocation of anon_inode into ->mount() 2012-03-20 21:29:45 -04:00
attr.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
bad_inode.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
binfmt_aout.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
binfmt_elf.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
binfmt_elf_fdpic.c Add #includes needed to permit the removal of asm/system.h 2012-03-28 18:30:03 +01:00
binfmt_em86.c __register_binfmt() made void 2012-03-20 21:29:46 -04:00
binfmt_flat.c Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
binfmt_misc.c magic.h: move some FS magic numbers into magic.h 2012-03-23 16:58:31 -07:00
binfmt_script.c __register_binfmt() made void 2012-03-20 21:29:46 -04:00
binfmt_som.c take removal of PF_FORKNOEXEC to flush_old_exec() 2012-03-20 21:29:51 -04:00
bio-integrity.c fs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:21 +08:00
bio.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
block_dev.c magic.h: move some FS magic numbers into magic.h 2012-03-23 16:58:31 -07:00
buffer.c fs: only send IPI to invalidate LRU BH when needed 2012-03-28 17:14:35 -07:00
char_dev.c char_dev.c: fix up some whitespace errors 2011-12-13 11:18:17 -08:00
compat.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
compat_binfmt_elf.c
compat_ioctl.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
dcache.c vfs: fix d_ancestor() case in d_materialize_unique 2012-03-28 09:54:34 -07:00
dcookies.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
direct-io.c Restore direct_io / truncate locking API 2012-02-23 15:56:21 -08:00
drop_caches.c
eventfd.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
eventpoll.c Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
exec.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
fcntl.c Wrap accesses to the fd_sets in struct fdtable 2012-02-19 10:30:52 -08:00
fhandle.c vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
fifo.c
file.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
file_table.c vfs: drop_file_write_access() made static 2012-03-20 21:29:32 -04:00
filesystems.c vfs: convert fs_supers to hlist 2012-01-03 22:52:39 -05:00
fs-writeback.c trivial writeback fixes 2012-03-28 10:07:27 -07:00
fs_struct.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
generic_acl.c
inode.c userns: Replace the hard to write inode_userns with inode_capable. 2012-04-07 17:02:46 -07:00
internal.h vfs: protect remounting superblock read-only 2012-01-06 23:20:12 -05:00
ioctl.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
ioprio.c userns: Disassociate user_struct from the user_namespace. 2012-04-07 17:11:46 -07:00
libfs.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
locks.c vfs: fix handling of lock allocation failure in lease-break case 2011-12-26 10:25:26 -08:00
mbcache.c
mount.h vfs: keep list of mounts for each superblock 2012-01-06 23:20:12 -05:00
mpage.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
namei.c userns: Replace the hard to write inode_userns with inode_capable. 2012-04-07 17:02:46 -07:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
no-block.c
open.c Wrap accesses to the fd_sets in struct fdtable 2012-02-19 10:30:52 -08:00
pipe.c magic.h: move some FS magic numbers into magic.h 2012-03-23 16:58:31 -07:00
pnode.c vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
pnode.h vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
posix_acl.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
proc_namespace.c vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
read_write.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
read_write.h
readdir.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
select.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
seq_file.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
signalfd.c epoll: ep_unregister_pollwait() can use the freed pwq->whead 2012-02-24 11:42:50 -08:00
splice.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
stack.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
stat.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
statfs.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
super.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
sync.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
timerfd.c
utimes.c
xattr.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
xattr_acl.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00