linux-sg2042/fs/ext3
Linus Torvalds 9c64daff9d ext3: avoid unnecessary spinlock in critical POSIX ACL path
If a filesystem supports POSIX ACLs, the VFS layer expects the filesystem
to do POSIX ACL checks on any files not owned by the caller, and it does
this for every single pathname component that it looks up.

That obviously can be pretty expensive if the filesystem isn't careful
about it, especially with locking. That's doubly sad, since the common
case tends to be that there are no ACLs associated with the files in
question.

ext3 already caches the ACL data so that it doesn't have to look it up
over and over again, but it does so by taking the inode->i_lock spinlock
on every lookup. That is a noticeable overhead even if it's a private
lock, especially on CPUs where the serialization is expensive (e.g. Intel
Netburst aka 'P4').

For the special case of not actually having any ACLs, all that locking is
unnecessary. Even if somebody else were to be changing the ACLs on
another CPU, we simply don't care - if we've seen a NULL ACL, we might as
well use it.

So just load the ACL speculatively without any locking, and if it was
NULL, just use it. If it's non-NULL (either because we had a cached
entry, or because the cache hasn't been filled in at all), it means that
we'll need to get the lock and re-load it properly.
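
The fast path described above can be sketched in user space roughly as
follows. This is only an illustration, not the code from the patch: a
pthread mutex stands in for the inode->i_lock spinlock, and every name
(struct fake_inode, ACL_NOT_CACHED, get_cached_acl, demo_lock_count) is
a hypothetical stand-in chosen for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not kernel identifiers. */
struct acl {
	int refcount;
};

/* Sentinel for "the cache has not been filled in yet". */
#define ACL_NOT_CACHED ((struct acl *)-1)

struct fake_inode {
	pthread_mutex_t lock;             /* stands in for inode->i_lock */
	_Atomic(struct acl *) cached_acl; /* NULL, ACL_NOT_CACHED, or an ACL */
	unsigned lock_count;              /* demo only: times the lock was taken */
};

static void fake_inode_init(struct fake_inode *inode, struct acl *acl)
{
	pthread_mutex_init(&inode->lock, NULL);
	atomic_init(&inode->cached_acl, acl);
	inode->lock_count = 0;
}

static struct acl *get_cached_acl(struct fake_inode *inode)
{
	/* Speculative load with no lock held. */
	struct acl *acl = atomic_load_explicit(&inode->cached_acl,
					       memory_order_relaxed);

	/* Fast path: a NULL we have seen is good enough to use. */
	if (acl == NULL)
		return NULL;

	/* Slow path: take the lock and re-load the pointer properly. */
	pthread_mutex_lock(&inode->lock);
	inode->lock_count++;
	acl = atomic_load_explicit(&inode->cached_acl, memory_order_relaxed);
	if (acl != NULL && acl != ACL_NOT_CACHED)
		acl->refcount++;	/* take a reference for the caller */
	pthread_mutex_unlock(&inode->lock);

	return acl;
}

/* Exercises the three cache states; returns the number of locks taken. */
static unsigned demo_lock_count(void)
{
	struct acl real = { .refcount = 1 };
	struct fake_inode none, unfilled, cached;
	struct acl *a, *b, *c;

	fake_inode_init(&none, NULL);
	fake_inode_init(&unfilled, ACL_NOT_CACHED);
	fake_inode_init(&cached, &real);

	a = get_cached_acl(&none);      /* no lock taken */
	b = get_cached_acl(&unfilled);  /* lock taken, cache still unfilled */
	c = get_cached_acl(&cached);    /* lock taken, reference acquired */

	assert(a == NULL);
	assert(b == ACL_NOT_CACHED);
	assert(c == &real && real.refcount == 2);

	return none.lock_count + unfilled.lock_count + cached.lock_count;
}
```

Calling demo_lock_count() from a main() shows that only the two non-NULL
cases touch the lock. The "no ACL" inode, which is the common case, never
does.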

This is noticeable even on Nehalem, which does locking quite well (much
better than P4). From lmbench:

	Processor, Processes - times in microseconds - smaller is better
	--------------------------------------------------------------------
	Host                 OS  Mhz null null      open slct fork exec sh
	                             call  I/O stat clos TCP  proc proc proc
	--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ----
 - before:
	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.95 1.45 2.18 69.1 273. 1141
	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.95 1.48 2.28 69.9 253. 1140
	nehalem.l Linux 2.6.30- 3193 0.04 0.10 0.95 1.42 2.19 68.6 284. 1141
 - after:
	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.44 2.12 68.3 282. 1094
	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.39 2.20 67.0 308. 1123
	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.39 2.36 67.4 293. 1148

where you can see what appears to be a roughly 3% improvement in stat
and open/close latencies from just the removal of the locking overhead.
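
As a quick check of the "roughly 3%" figure, the arithmetic can be redone
from the stat and open/close columns of the table above. The helper names
below (mean, percent_improvement, stat_improvement,
open_close_improvement) are made up for this sketch. Only the sample
values come from the lmbench output.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *samples, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += samples[i];
	return sum / (double)n;
}

/* Relative latency reduction in percent; positive means "after" is faster. */
static double percent_improvement(double before, double after)
{
	assert(before > 0.0);
	return 100.0 * (before - after) / before;
}

/* Values copied from the "stat" and "open clos" columns (microseconds). */
static const double stat_before[] = { 0.95, 0.95, 0.95 };
static const double stat_after[]  = { 0.92, 0.92, 0.92 };
static const double open_before[] = { 1.45, 1.48, 1.42 };
static const double open_after[]  = { 1.44, 1.39, 1.39 };

static double stat_improvement(void)
{
	return percent_improvement(mean(stat_before, 3), mean(stat_after, 3));
}

static double open_close_improvement(void)
{
	return percent_improvement(mean(open_before, 3), mean(open_after, 3));
}
```

With these numbers the stat latency drops by about 3.2% and the averaged
open/close latency by about 3.0%.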

Of course, this only matters for files you don't own (the owner never
needs to do the ACL checks), but that's the common case for libraries,
header files, and executables, as well as for the base components of any
absolute pathname, even if you are the owner of the final file.

[ At some point we probably want to move this ACL caching logic entirely
  into the VFS layer (and only call down to the filesystem when
  uncached), but in the meantime this improves ext3 a bit.

  A similar fix to btrfs makes a much bigger difference (15x improvement
  in lmbench) due to broken caching. ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-17 00:36:35 -04:00
Kconfig ext3: make default data ordering mode configurable 2009-04-06 17:16:47 -07:00
Makefile [PATCH] ext3: uninline large functions 2006-12-07 08:39:35 -08:00
acl.c ext3: avoid unnecessary spinlock in critical POSIX ACL path 2009-06-17 00:36:35 -04:00
acl.h [PATCH] sanitize ->permission() prototype 2008-07-26 20:53:14 -04:00
balloc.c ext3: remove ->write_super and stop maintaining ->s_dirt 2009-06-11 21:36:05 -04:00
bitmap.c fs: mark nibblemap const 2007-10-17 08:42:47 -07:00
dir.c ext3: remove the BKL in ext3/ioctl.c 2009-04-02 19:04:52 -07:00
ext3_jbd.c ext3: replace remaining __FUNCTION__ occurrences 2008-04-28 08:58:45 -07:00
file.c Merge branch 'ext3-latency-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2009-04-03 11:10:33 -07:00
fsync.c ext3: fdatasync should skip metadata writeout when overwriting 2008-04-28 08:58:43 -07:00
hash.c ext3: Add support for non-native signed/unsigned htree hash algorithms 2008-10-28 13:21:55 -04:00
ialloc.c ext3: remove ->write_super and stop maintaining ->s_dirt 2009-06-11 21:36:05 -04:00
inode.c ext3: remove ->write_super and stop maintaining ->s_dirt 2009-06-11 21:36:05 -04:00
ioctl.c ext3: remove the BKL in ext3/ioctl.c 2009-04-02 19:04:52 -07:00
namei.c Merge branch 'ext3-latency-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2009-04-03 11:10:33 -07:00
namei.h
resize.c ext3: remove ->write_super and stop maintaining ->s_dirt 2009-06-11 21:36:05 -04:00
super.c Push BKL down into ->remount_fs() 2009-06-11 21:36:11 -04:00
symlink.c [PATCH] mark struct inode_operations const 1 2007-02-12 09:48:46 -08:00
xattr.c ext3: remove ->write_super and stop maintaining ->s_dirt 2009-06-11 21:36:05 -04:00
xattr.h make ext3_xattr_list() static 2008-04-28 08:58:44 -07:00
xattr_security.c ext3: remove double definitions of xattr macros 2008-07-25 10:53:32 -07:00
xattr_trusted.c ext3: remove double definitions of xattr macros 2008-07-25 10:53:32 -07:00
xattr_user.c ext3: remove double definitions of xattr macros 2008-07-25 10:53:32 -07:00