OpenCloudOS-Kernel/fs/btrfs
Josef Bacik 9630308170 Btrfs: use hybrid extents+bitmap rb tree for free space
Currently btrfs has a problem where it can use a ridiculous amount of RAM simply
tracking free space.  As free space gets fragmented, we end up with thousands of
entries on an rb-tree per block group, which usually spans 1 gig of area.  Since
we currently don't ever flush free space cache back to disk this gets to be a
bit unwieldy on large filesystems with lots of fragmentation.

This patch solves this problem by using PAGE_SIZE bitmaps for parts of the free
space cache.  Initially we calculate a threshold of extent entries we can
handle, which is however many extent entries we can cram into 16k of RAM.  The
maximum amount of RAM that should ever be used to track 1 gigabyte of disk space
will be 32k of RAM, which scales much better than we did before.
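
The arithmetic behind those two figures can be sketched as follows. This is only an illustration, not code from the patch: the 4096-byte page and sector sizes, the `sketch_extent_entry` layout, and all function names are assumptions made for the example. With one bit per sector, a single page-sized bitmap covers 128 MiB, so a 1 GiB block group needs at most 8 bitmaps, which is 32k.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE  4096u          /* assumed page size */
#define SKETCH_SECTORSIZE 4096u          /* assumed sector size */
#define EXTENT_BUDGET     (16u * 1024u)  /* the 16k budget from the message */

/* Hypothetical extent entry; the real layout is not shown in the message. */
struct sketch_extent_entry {
	uint64_t offset;
	uint64_t bytes;
	void *tree_links[3];  /* stand-in for an rb-tree node */
};

/* Disk bytes covered by one page-sized bitmap at one bit per sector. */
static inline uint64_t bytes_per_bitmap(void)
{
	return (uint64_t)SKETCH_PAGE_SIZE * 8 * SKETCH_SECTORSIZE;
}

/* Bitmaps needed to cover a whole block group, rounded up. */
static inline uint64_t bitmaps_needed(uint64_t group_bytes)
{
	uint64_t per = bytes_per_bitmap();

	assert(group_bytes > 0);
	return (group_bytes + per - 1) / per;
}

/* How many extent entries fit in the 16k budget. */
static inline uint64_t extent_threshold(void)
{
	return EXTENT_BUDGET / sizeof(struct sketch_extent_entry);
}
```

The exact extent threshold depends on the entry size, which differs between 32-bit and 64-bit builds, so the sketch derives it from `sizeof` rather than hard-coding a count.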

Once we pass the extent threshold, we start adding bitmaps and using those
instead for tracking the free space.  This patch also puts any free space
smaller than 4 * sectorsize straight into a bitmap.  This is
nice since we try and allocate out of the front of a block group, so if the
front of a block group is heavily fragmented and then has a huge chunk of free
space at the end, we go ahead and add the fragmented areas to bitmaps and use a
normal extent entry to track the big chunk at the back of the block group.
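
A minimal sketch of the placement rule described above, under stated assumptions: `sketch_ctl`, `choose_tracking` and `bitmap_mark_free` are hypothetical names, the exact boundary conditions (`<` versus `<=`) are guesses, and the bitmap uses one bit per sector. It is not the patch's implementation.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_kind { SKETCH_EXTENT, SKETCH_BITMAP };

/* Hypothetical per-block-group bookkeeping. */
struct sketch_ctl {
	unsigned int free_extents;    /* extent entries currently in use */
	unsigned int extents_thresh;  /* limit derived from the 16k budget */
	uint32_t sectorsize;
};

/* Decide how a newly freed range should be tracked. */
static enum sketch_kind choose_tracking(const struct sketch_ctl *ctl,
					uint64_t bytes)
{
	/* Small fragments always go to a bitmap. */
	if (bytes < 4ull * ctl->sectorsize)
		return SKETCH_BITMAP;
	/* Past the extent threshold, everything goes to bitmaps. */
	if (ctl->free_extents >= ctl->extents_thresh)
		return SKETCH_BITMAP;
	return SKETCH_EXTENT;
}

/* Set one bit per free sector; bitmap_start is the disk offset of bit 0. */
static void bitmap_mark_free(uint8_t *bitmap, uint64_t bitmap_start,
			     uint64_t offset, uint64_t bytes,
			     uint32_t sectorsize)
{
	uint64_t first, count, i;

	assert(offset >= bitmap_start);
	first = (offset - bitmap_start) / sectorsize;
	count = bytes / sectorsize;
	for (i = first; i < first + count; i++)
		bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
}
```

With this rule a heavily fragmented front of a block group costs a fixed number of bitmap pages, while a large free chunk at the back still costs only one extent entry.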

I've also taken the opportunity to revamp how we search for free space.
Previously we indexed free space via an offset indexed rb tree and a bytes
indexed rb tree.  I've dropped the bytes indexed rb tree and use only the offset
indexed rb tree.  This cuts the number of tree operations we were doing
previously down by half, and gives us a little bit of a better allocation
pattern since we will always start from a specific offset and search forward
from there, instead of searching for the size we need and trying to get it as
close as possible to the offset we want.
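
The search-forward-from-an-offset idea can be illustrated with a sorted array standing in for the offset-indexed rb tree. This swap is deliberate to keep the example short; the names `sketch_free_extent` and `find_free_from` are hypothetical, and bitmap entries are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free extent; entries are sorted by offset and do not overlap. */
struct sketch_free_extent {
	uint64_t offset;
	uint64_t bytes;
};

/*
 * Return the index of the first entry, searching forward from `start`,
 * that has at least `want` free bytes at or after `start`; -1 if none.
 */
static long find_free_from(const struct sketch_free_extent *e, size_t n,
			   uint64_t start, uint64_t want)
{
	size_t lo = 0, hi = n, i;

	/* Binary search for the first entry that ends after `start`. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (e[mid].offset + e[mid].bytes <= start)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Walk forward in offset order until something is big enough. */
	for (i = lo; i < n; i++) {
		uint64_t begin = e[i].offset > start ? e[i].offset : start;
		uint64_t avail = e[i].offset + e[i].bytes - begin;

		if (avail >= want)
			return (long)i;
	}
	return -1;
}
```

Because there is only one index to maintain, each insert or removal touches one ordered structure instead of two, which is where the halving of tree operations comes from.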

I've given this a healthy amount of testing pre-new format stuff, as well as
post-new format stuff.  I've booted up my Fedora box, which is installed on btrfs,
with this patch and ran with it for a few days without issues.  I've not seen
any performance regressions in any of my tests.

Since the last patch Yan Zheng fixed a problem where we could have overlapping
entries, so updating their offset inline would cause problems.  Thanks,

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-07-24 09:23:30 -04:00
Kconfig Btrfs: make btrfs acls selectable 2009-02-04 09:28:28 -05:00
Makefile Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE) 2009-06-10 11:29:46 -04:00
acl.c Fix btrfs when ACLs are configured out 2009-06-10 11:36:43 -04:00
async-thread.c Btrfs: convert nested spin_lock_irqsave to spin_lock 2009-07-22 16:49:00 -04:00
async-thread.h Btrfs: add a priority queue to the async thread helpers 2009-04-20 15:53:08 -04:00
btrfs_inode.h Btrfs: implement FS_IOC_GETFLAGS/SETFLAGS/GETVERSION 2009-06-10 11:29:52 -04:00
compat.h Btrfs: drop remaining LINUX_KERNEL_VERSION checks and compat code 2009-01-06 09:38:55 -05:00
compression.c Btrfs: implement FS_IOC_GETFLAGS/SETFLAGS/GETVERSION 2009-06-10 11:29:52 -04:00
compression.h Btrfs: Add zlib compression support 2008-10-29 14:49:59 -04:00
ctree.c Btrfs: fix locking issue in btrfs_find_next_key 2009-07-22 09:59:00 -04:00
ctree.h Btrfs: use hybrid extents+bitmap rb tree for free space 2009-07-24 09:23:30 -04:00
delayed-ref.c Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE) 2009-06-10 11:29:46 -04:00
delayed-ref.h Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE) 2009-06-10 11:29:46 -04:00
dir-item.c Btrfs: leave btree locks spinning more often 2009-03-24 16:14:28 -04:00
disk-io.c Btrfs: Fix crash on read failures at mount 2009-07-22 16:52:13 -04:00
disk-io.h Btrfs: leave btree locks spinning more often 2009-03-24 16:14:28 -04:00
export.c Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE) 2009-06-10 11:29:46 -04:00
export.h NFS support for btrfs - v3 2008-09-25 11:04:06 -04:00
extent-tree.c Btrfs: use hybrid extents+bitmap rb tree for free space 2009-07-24 09:23:30 -04:00
extent_io.c btrfs: Fix set/clear_extent_bit for 'end == (u64)-1' 2009-06-10 11:29:46 -04:00
extent_io.h Btrfs: leave btree locks spinning more often 2009-03-24 16:14:28 -04:00
extent_map.c Btrfs: kill btrfs_cache_create 2009-04-24 15:46:04 -04:00
extent_map.h Btrfs: Fix csum error for compressed data 2008-11-10 07:34:43 -05:00
file-item.c Btrfs: leave btree locks spinning more often 2009-03-24 16:14:28 -04:00
file.c Btrfs: don't log the inode in file_write while growing the file 2009-07-02 13:41:16 -04:00
free-space-cache.c Btrfs: use hybrid extents+bitmap rb tree for free space 2009-07-24 09:23:30 -04:00
free-space-cache.h Btrfs: use hybrid extents+bitmap rb tree for free space 2009-07-24 09:23:30 -04:00
hash.h Btrfs: remove crc32c.h and use libcrc32c directly. 2009-06-10 11:29:53 -04:00
inode-item.c Btrfs: leave btree locks spinning more often 2009-03-24 16:14:28 -04:00
inode-map.c Btrfs: Fix a trivial warning using max() of u64 vs ULL. 2009-04-27 08:37:49 -04:00
inode.c Btrfs: adjust NULL test 2009-07-22 16:49:01 -04:00
ioctl.c Btrfs: fix the file clone ioctl for preallocated extents 2009-07-02 13:41:16 -04:00
ioctl.h Btrfs: fix ioctl arg size (userland incompatible change!) 2009-01-16 11:59:08 -05:00
locking.c Btrfs: fix typos in comments 2009-04-02 16:46:06 -04:00
locking.h Btrfs: fix spinlock assertions on UP systems 2009-03-09 11:45:38 -04:00
ordered-data.c Btrfs: use WRITE_SYNC for synchronous writes 2009-04-20 15:53:08 -04:00
ordered-data.h Btrfs: add extra flushing for renames and truncates 2009-03-31 14:27:58 -04:00
orphan.c Btrfs: Create orphan inode records to prevent lost files after a crash 2008-09-25 11:04:05 -04:00
print-tree.c Btrfs: remove of redundant btrfs_header_level 2009-07-22 16:52:13 -04:00
print-tree.h Btrfs: Create extent_buffer interface for large blocksizes 2008-09-25 11:03:56 -04:00
ref-cache.c Btrfs: Make btrfs_drop_snapshot work in larger and more efficient chunks 2009-02-04 09:27:02 -05:00
ref-cache.h Btrfs: Make btrfs_drop_snapshot work in larger and more efficient chunks 2009-02-04 09:27:02 -05:00
relocation.c Btrfs: fix locking issue in btrfs_find_next_key 2009-07-22 09:59:00 -04:00
root-tree.c Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE) 2009-06-10 11:29:46 -04:00
struct-funcs.c Btrfs: Fix checkpatch.pl warnings 2009-01-05 21:25:51 -05:00
super.c Btrfs: fix -o nodatasum printk spelling 2009-06-11 09:30:13 -04:00
sysfs.c Btrfs: Fix checkpatch.pl warnings 2009-01-05 21:25:51 -05:00
transaction.c Btrfs: make sure all dirty blocks are written at commit time 2009-07-22 10:07:05 -04:00
transaction.h Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE) 2009-06-10 11:29:46 -04:00
tree-defrag.c Btrfs: do extent allocation and reference count updates in the background 2009-03-24 16:14:25 -04:00
tree-log.c Btrfs: fix extent_buffer leak during tree log replay 2009-06-11 11:24:47 -04:00
tree-log.h Btrfs: tree logging unlink/rename fixes 2009-03-24 16:14:52 -04:00
version.h Update Btrfs files for in-kernel usage 2008-09-25 15:41:59 -04:00
version.sh Btrfs: Fixes for 2.6.28-rc API changes 2008-11-19 21:17:22 -05:00
volumes.c Btrfs: Remove broken sanity check from btrfs_rmap_block() 2009-07-22 16:49:01 -04:00
volumes.h Btrfs: avoid races between super writeout and device list updates 2009-06-10 15:17:02 -04:00
xattr.c Btrfs: selinux support 2009-02-04 09:29:13 -05:00
xattr.h Btrfs: selinux support 2009-02-04 09:29:13 -05:00
zlib.c Btrfs: Fix checkpatch.pl warnings 2009-01-05 21:25:51 -05:00