Commit Graph

373 Commits

Author SHA1 Message Date
Josef Bacik 9ec7267751 Btrfs: stop using GFP_ATOMIC when allocating rewind ebs
There is no reason we can't just set the path to blocking and then do normal
GFP_NOFS allocations for these extent buffers.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-01 08:04:50 -04:00
Josef Bacik db7f3436c1 Btrfs: deal with enomem in the rewind path
We can get ENOMEM trying to allocate dummy bufs for the rewind operation of the
tree mod log.  Instead of BUG_ON()'ing in this case pass up ENOMEM.  I looked
back through the callers and I'm pretty sure I got everybody who did BUG_ON(ret)
in this path.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-01 08:04:49 -04:00
Josef Bacik c8cc634165 Btrfs: stop using GFP_ATOMIC for the tree mod log allocations
Previously we held the tree mod lock when adding stuff because we use it to
check and see if we truly do want to track tree modifications.  This is
admirable, but GFP_ATOMIC in a critical area that is going to get hit pretty
hard and often is not nice.  So instead do our basic checks to see if we don't
need to track modifications, and if those pass then do our allocation, and then
when we go to insert the new modification check if we still care, and if we
don't just free up our mod and return.  Otherwise we're good to go and we can
carry on.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-01 07:57:17 -04:00
Liu Bo b5b9b5b318 Btrfs: fix extent buffer leak after backref walking
commit 47fb091fb787420cd195e66f162737401cce023f(Btrfs: fix unlock after free on rewinded tree blocks)
takes an extra increment on the reference of allocated dummy extent buffer, so now we
cannot free this dummy one, and end up with extent buffer leak.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-08-09 19:29:42 -04:00
Josef Bacik 7fb7d76f96 Btrfs: only do the tree_mod_log_free_eb if this is our last ref
There is another bug in the tree mod log stuff in that we're calling
tree_mod_log_free_eb every single time a block is cow'ed.  The problem with this
is that if this block is shared by multiple snapshots we will call this multiple
times per block, so if we go to rewind the mod log for this block we'll BUG_ON()
in __tree_mod_log_rewind because we try to rewind a free twice.  We only want to
call tree_mod_log_free_eb if we are actually freeing the block.  With this patch
I no longer hit the panic in __tree_mod_log_rewind.  Thanks,

Cc: stable@vger.kernel.org
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02 11:51:20 -04:00
Josef Bacik f1ca7e98a6 Btrfs: hold the tree mod lock in __tree_mod_log_rewind
We need to hold the tree mod log lock in __tree_mod_log_rewind since we walk
forward in the tree mod entries, otherwise we'll end up with random entries and
trip the BUG_ON() at the front of __tree_mod_log_rewind.  This fixes the panics
people were seeing when running

find /whatever -type f -exec btrfs fi defrag {} \;

Thansk,

Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-02 11:51:18 -04:00
Josef Bacik 0b08851fda Btrfs: optimize reada_for_balance
This patch does two things.  First we no longer explicitly read in the blocks
we're trying to readahead.  For things like balance_level we may never actually
use the blocks so this just adds uneeded latency, and balance_level and
split_node will both read in the blocks they care about explicitly so if the
blocks need to be waited on it will be done there.  Secondly we no longer drop
the path if we do readahead, we just set the path blocking before we call
reada_for_balance() and then we're good to go.  Hopefully this will cut down on
the number of re-searches.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:32 -04:00
Josef Bacik bdf7c00e8f Btrfs: optimize read_block_for_search
This patch does two things, first it only does one call to
btrfs_buffer_uptodate() with the gen specified instead of once with 0 and then
again with gen specified.  The other thing is to call btrfs_read_buffer() on the
buffer we've found instead of dropping it and then calling read_tree_block().
This will keep us from doing yet another radix tree lookup for a buffer we've
already found.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:32 -04:00
Liu Bo 33157e05db Btrfs: check if leaf's parent exists before pushing items around
During splitting a leaf, pushing items around to hopefully get some space only
works when we have a parent, ie. we have at least one sibling leaf.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-06-14 11:29:58 -04:00
Liu Bo fdd99c7294 Btrfs: dont do log_removal in insert_new_root
As for splitting a leaf, root is just the leaf, and tree mod log does not apply
on leaf, so in this case, we don't do log_removal.

As for splitting a node, the old root is kept as a normal node and we have nicely
put records in tree mod log for moving keys and items, so in this case we don't do
that either.

As above, insert_new_root can get rid of log_removal.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-06-14 11:29:57 -04:00
Stefan Behrens 8f69dbd236 Btrfs: fix a comment
The size parameter to btrfs_extend_item() is the number of bytes
to add to the item, not the size of the item after the operation
(like it is for btrfs_truncate_item(), there the size parameter
is not the number of bytes to take away, but the total size of
the item after truncation).
Fix it in the comment.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-06-14 11:29:23 -04:00
Josef Bacik b1c79e0947 Btrfs: handle running extent ops with skinny metadata
Chris hit a bug where we weren't finding extent records when running extent ops.
This is because we use the delayed_ref_head when running the extent op, which
means we can't use the ->type checks to see if we are metadata.  We also lose
the level of the metadata we are working on.  So to fix this we can just check
the ->is_data section of the extent_op, and we can store the level of the buffer
we were modifying in the extent_op.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:15 -04:00
Eric Sandeen 48a3b6366f btrfs: make static code static & remove dead code
Big patch, but all it does is add statics to functions which
are in fact static, then remove the associated dead-code fallout.

removed functions:

btrfs_iref_to_path()
__btrfs_lookup_delayed_deletion_item()
__btrfs_search_delayed_insertion_item()
__btrfs_search_delayed_deletion_item()
find_eb_for_page()
btrfs_find_block_group()
range_straddles_pages()
extent_range_uptodate()
btrfs_file_extent_length()
btrfs_scrub_cancel_devid()
btrfs_start_transaction_lflush()

btrfs_print_tree() is left because it is used for debugging.
btrfs_start_transaction_lflush() and btrfs_reada_detach() are
left for symmetry.

ulist.c functions are left, another patch will take care of those.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:23 -04:00
Jan Schmidt fc36ed7e0b Btrfs: separate sequence numbers for delayed ref tracking and tree mod log
Sequence numbers for delayed refs have been introduced in the first version
of the qgroup patch set. To solve the problem of find_all_roots on a busy
file system, the tree mod log was introduced. The sequence numbers for that
were simply shared between those two users.

However, at one point in qgroup's quota accounting, there's a statement
accessing the previous sequence number, that's still just doing (seq - 1)
just as it would have to in the very first version.

To satisfy that requirement, this patch makes the sequence number counter 64
bit and splits it into a major part (used for qgroup sequence number
counting) and a minor part (incremented for each tree modification in the
log). This enables us to go exactly one major step backwards, as required
for qgroups, while still incrementing the sequence counter for tree mod log
insertions to keep track of their order. Keeping them in a single variable
means there's no need to change all the code dealing with comparisons of two
sequence numbers.

The sequence number is reset to 0 on commit (not new in this patch), which
ensures we won't overflow the two 32 bit counters.

Without this fix, the qgroup tracking can occasionally go wrong and WARN_ONs
from the tree mod log code may happen.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:17 -04:00
Josef Bacik 416bc6580b Btrfs: fix all callers of read_tree_block
We kept leaking extent buffers when mounting a broken file system and it turns
out it's because not everybody uses read_tree_block properly.  You need to check
and make sure the extent_buffer is uptodate before you use it.  This patch fixes
everybody who calls read_tree_block directly to make sure they check that it is
uptodate and free it and return an error if it is not.  With this we no longer
leak EB's when things go horribly wrong.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:07 -04:00
Tsutomu Itoh 4b90c68015 Btrfs: remove unused argument of btrfs_extend_item()
Argument 'trans' is not used in btrfs_extend_item().

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:54:53 -04:00
Tsutomu Itoh afe5fea72b Btrfs: cleanup of function where fixup_low_keys() is called
If argument 'trans' is unnecessary in the function where
fixup_low_keys() is called, 'trans' is deleted.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:54:52 -04:00
Tsutomu Itoh d6a0a12684 Btrfs: remove unused argument of fixup_low_keys()
Argument 'trans' is not used in fixup_low_keys(). So, remove it.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:54:52 -04:00
Jan Schmidt 47fb091fb7 Btrfs: fix unlock after free on rewinded tree blocks
When tree_mod_log_rewind decides to make a copy of the current tree buffer
for its modifications, it subsequently freed the buffer before unlocking it.
Obviously, those operations are required in reverse order.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:54:48 -04:00
Jan Schmidt 30b0463a93 Btrfs: fix accessing the root pointer in tree mod log functions
The tree mod log functions were accessing root->node->... directly, without
use of btrfs_root_node() or explicit rcu locking. This could lead to an
extent buffer reference being leaked and another reference being freed too
early when preemtion was enabled.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:54:47 -04:00
Jan Schmidt 90f8d62ebb Btrfs: fix tree mod log regression on root split operations
Commit d9abbf1c changed tree mod log locking around ROOT_REPLACE operations.
When a tree root is split, however, we were logging removal of all elements
from the root node before logging removal of half of the elements for the
split operation. This leads to a BUG_ON when rewinding.

This commit removes the erroneous logging of removal of all elements.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:54:47 -04:00
Josef Bacik 09a2a8f96e Btrfs: fix bad extent logging
A user sent me a btrfs-image of a file system that was panicing on mount during
the log recovery.  I had originally thought these problems were from a bug in
the free space cache code, but that was just a symptom of the problem.  The
problem is if your application does something like this

[prealloc][prealloc][prealloc]

the internal extent maps will merge those all together into one extent map, even
though on disk they are 3 separate extents.  So if you go to write into one of
these ranges the extent map will be right since we use the physical extent when
doing the write, but when we log the extents they will use the wrong sizes for
the remainder prealloc space.  If this doesn't happen to trip up the free space
cache (which it won't in a lot of cases) then you will get bogus entries in your
extent tree which will screw stuff up later.  The data and such will still work,
but everything else is broken.  This patch fixes this by not allowing extents
that are on the modified list to be merged.  This has the side effect that we
are no longer adding everything to the modified list all the time, which means
we now have to call btrfs_drop_extents every time we log an extent into the
tree.  So this allows me to drop all this speciality code I was using to get
around calling btrfs_drop_extents.  With this patch the testcase I've created no
longer creates a bogus file system after replaying the log.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:54:34 -04:00
Josef Bacik 3173a18f70 Btrfs: add a incompatible format change for smaller metadata extent refs
We currently store the first key of the tree block inside the reference for the
tree block in the extent tree.  This takes up quite a bit of space.  Make a new
key type for metadata which holds the level as the offset and completely removes
storing the btrfs_tree_block_info inside the extent ref.  This reduces the size
from 51 bytes to 33 bytes per extent reference for each tree block.  In practice
this results in a 30-35% decrease in the size of our extent tree, which means we
COW less and can keep more of the extent tree in memory which makes our heavy
metadata operations go much faster.  This is not an automatic format change, you
must enable it at mkfs time or with btrfstune.  This patch deals with having
metadata stored as either the old format or the new format so it is easy to
convert.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:54:18 -04:00
Jan Schmidt d9abbf1c31 Btrfs: fix locking on ROOT_REPLACE operations in tree mod log
To resolve backrefs, ROOT_REPLACE operations in the tree mod log are
required to be tied to at least one KEY_REMOVE_WHILE_FREEING operation.
Therefore, those operations must be enclosed by tree_mod_log_write_lock()
and tree_mod_log_write_unlock() calls.

Those calls are private to the tree_mod_log_* functions, which means that
removal of the elements of an old root node must be logged from
tree_mod_log_insert_root. This partly reverts and corrects commit ba1bfbd5
(Btrfs: fix a tree mod logging issue for root replacement operations).

This fixes the brand-new version of xfstest 276 as of commit cfe73f71.

Cc: stable@vger.kernel.org
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-03-21 19:31:52 -04:00
Eric Sandeen de78b51a28 btrfs: remove cache only arguments from defrag path
The entry point at the defrag ioctl always sets "cache only" to 0;
the codepaths haven't run for a long time as far as I can
tell.  Chris says they're dead code, so remove them.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:36 -05:00
Eric Sandeen 1c697d4acc btrfs: annotate intentional switch case fallthroughs
This keeps static checkers happy.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:19 -05:00
Arne Jansen 2a745b14bc Btrfs: fix crash in log replay with qgroups enabled
When replaying a log tree with qgroups enabled, tree_mod_log_rewind does a
sanity-check of the number of items against the maximum possible number.
It calculates that number with the nodesize of fs_root. Unfortunately
fs_root is not yet set at this stage. So instead use the nodesize from
tree_root, which is already initialized.

Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-02-14 20:47:41 -05:00
Chris Mason 57ba86c00f Revert "Btrfs: reorder tree mod log operations in deleting a pointer"
This reverts commit 6a7a665d78.

This was bug was fixed differently in 3.6, so this commit
isn't needed.

Conflicts:
	fs/btrfs/ctree.c

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-18 19:35:32 -05:00
Chris Mason 4c3e696981 Revert "Btrfs: MOD_LOG_KEY_REMOVE_WHILE_MOVING never change node's nritems"
This reverts commit 95c80bb1f6.

The bug addressed by this commit was fixed differently back in 3.6

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-18 15:43:18 -05:00
Josef Bacik 5124e00ec5 Btrfs: only unlock and relock if we have to
I noticed while doing fsync tests that we were always dropping the path and
re-searching when we first cow the log root even though we've already gotten
the write lock on the root.  That's because we don't take into account that
there might not be a parent node, so fix the check to make sure there is
actually a parent node before we undo all of this work for nothing.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16 20:46:27 -05:00
Josef Bacik 41be1f3b40 Btrfs: optimize leaf_space_used
This gets called at least 4 times for every level while adding an object,
and it involves 3 kmapping calls, which on my box take about 5us a piece.
So instead use a token, which brings us down to 1 kmap call and makes this
function take 1/3 of the time per call.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16 20:46:26 -05:00
Josef Bacik 70c8a91ce2 Btrfs: log changed inodes based on the extent map tree
We don't really need to copy extents from the source tree since we have all
of the information already available to us in the extent_map tree.  So
instead just write the extents straight to the log tree and don't bother to
copy the extent items from the source tree.

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16 20:46:24 -05:00
Josef Bacik d6393786cd Btrfs: add path->really_keep_locks
You'd think path->keep_locks would keep all the locks wouldn't you?  You'd
be wrong.  It only keeps them if the slot is pointing to the last item in
the node.  This is for use with btrfs_next_leaf, which needs this sort of
thing.  But the horrible horrible things I'm going to do to the tree log
means I really need everything held from root to leaf so I can add and
delete items in the same search.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16 20:46:24 -05:00
Anand Jain 5f3ab90a72 Btrfs: rename root_times_lock to root_item_lock
Originally root_times_lock was introduced as part of send/receive
code however newly developed patch to label the subvol reused
the same lock, so renaming it for a meaningful name.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16 20:46:21 -05:00
Julia Lawall 6c1500f22a fs/btrfs: drop if around WARN_ON
Just use WARN_ON rather than an if containing only WARN_ON(1).

A simplified version of the semantic patch that makes this transformation
is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e;
@@
- if (e) WARN_ON(1);
+ WARN_ON(e);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-12 17:15:24 -05:00
Julia Lawall 31b1a2bd75 fs/btrfs: use WARN
Use WARN rather than printk followed by WARN_ON(1), for conciseness.

A simplified version of the semantic patch that makes this transformation
is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression list es;
@@

-printk(
+WARN(1,
  es);
-WARN_ON(1);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-12 17:15:23 -05:00
Liu Bo 32adf09013 Btrfs: cleanup unused arguments
'disk_key' is not used at all.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-11 13:31:35 -05:00
Liu Bo 0e411ecec6 Btrfs: kill unnecessary arguments in del_ptr
The argument 'tree_mod_log' is not necessary since all of callers enable it.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-11 13:31:35 -05:00
Liu Bo 6a7a665d78 Btrfs: reorder tree mod log operations in deleting a pointer
Since we don't use MOD_LOG_KEY_REMOVE_WHILE_MOVING to add nritems
during rewinding, we should insert a MOD_LOG_KEY_REMOVE operation first.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-11 13:31:34 -05:00
Liu Bo 95c80bb1f6 Btrfs: MOD_LOG_KEY_REMOVE_WHILE_MOVING never change node's nritems
Key MOD_LOG_KEY_REMOVE_WHILE_MOVING means that we're doing memmove inside
an extent buffer node, and the node's number of items remains unchanged
(unless we are inserting a single pointer, but we have MOD_LOG_KEY_ADD for that).

So we don't need to increase node's number of items during rewinding,
otherwise we may get an node larger than leafsize and cause general protection
errors later.

Here is the details,
- If we do memory move for inserting a single pointer, we need to
  add node's nritems by one, and we honor MOD_LOG_KEY_ADD for adding.

- If we do memory move for deleting a single pointer, we need to
  decrease node's nritems by one, and we honor MOD_LOG_KEY_REMOVE for
  deleting.

- If we do memory move for balance left/right, we need to decrease
  node's nritems, and we honor MOD_LOG_KEY_REMOVE for balaning.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-11 13:31:33 -05:00
Liu Bo 7bfdcf7fba Btrfs: fix memory leak when cloning root's node
After cloning root's node, we forgot to dec the src's ref
which can lead to a memory leak.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-10-25 15:55:21 -04:00
Jan Schmidt 01763a2e37 Btrfs: comment for loop in tree_mod_log_insert_move
Emphasis the way tree_mod_log_insert_move avoids adding
MOD_LOG_KEY_REMOVE_WHILE_MOVING operations, depending on the direction of
the move operation.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-10-24 12:36:40 +02:00
Jan Schmidt d638108484 Btrfs: fix extent buffer reference for tree mod log roots
In get_old_root we grab a lock on the extent buffer before we obtain a
reference on that buffer. That order is changed now.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-10-24 12:36:39 +02:00
Jan Schmidt 5b6602e762 Btrfs: determine level of old roots
In btrfs_find_all_roots' termination condition, we compare the level of the
old buffer we got from btrfs_search_old_slot to the level of the current
root node. We'd better compare it to the level of the rewinded root node.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-10-24 12:36:38 +02:00
Jan Schmidt 834328a849 Btrfs: tree mod log's old roots could still be part of the tree
Tree mod log treated old root buffers as always empty buffers when starting
the rewind operations. However, the old root may still be part of the
current tree at a lower level, with still some valid entries.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-10-24 12:36:37 +02:00
Jan Schmidt ba1bfbd592 Btrfs: fix a tree mod logging issue for root replacement operations
Avoid the implicit free by tree_mod_log_set_root_pointer, which is wrong in
two places. Where needed, we call tree_mod_log_free_eb explicitly now.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-10-23 15:09:14 +02:00
Jan Schmidt 57911b8ba8 Btrfs: don't put removals from push_node_left into tree mod log twice
Independant of the check (push_items < src_items) tree_mod_log_eb_copy did
log the removal of the old data entries from the source buffer. Therefore,
we must not call tree_mod_log_eb_move if the check evaluates to true, as
that would log the removal twice, finally resulting in (rewinded) buffers
with wrong values for header_nritems.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-10-23 15:09:11 +02:00
Robin Dong 8d1a1317af btrfs: remove unused function btrfs_insert_some_items()
The function btrfs_insert_some_items() would not be called by any other functions,
so remove it.

Signed-off-by: Robin Dong <sanbai@taobao.com>
2012-10-09 09:15:43 -04:00
Chris Mason 74dd17fbe3 Btrfs: fix btrfs send for inline items and compression
The btrfs send code was assuming the offset of the file item into the
extent translated to bytes on disk.  If we're compressed, this isn't
true, and so it was off into extents owned by other files.

It was also improperly handling inline extents.  This solves a crash
where we may have gone past the end of the file extent item by not
testing early enough for an inline extent.  It also solves problems
where we have a whole between the end of the inline item and the start
of the full extent.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-10-01 15:19:00 -04:00
Chris Mason b12a3b1ea2 Btrfs: don't run __tree_mod_log_free_eb on leaves
When we split a leaf, we may end up inserting a new root on top of that
leaf.  The reflog code was incorrectly assuming the old root was always
a node.  This makes sure we skip over leaves.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28 16:53:34 -04:00
Arne Jansen 1fa11e265f Btrfs: fix deadlock in wait_for_more_refs
Commit a168650c introduced a waiting mechanism to prevent busy waiting in
btrfs_run_delayed_refs. This can deadlock with btrfs_run_ordered_operations,
where a tree_mod_seq is held while waiting for the io to complete, while
the end_io calls btrfs_run_delayed_refs.
This whole mechanism is unnecessary. If not enough runnable refs are
available to satisfy count, just return as count is more like a guideline
than a strict requirement.
In case we have to run all refs, commit transaction makes sure that no
other threads are working in the transaction anymore, so we just assert
here that no refs are blocked.

Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28 16:53:32 -04:00
Chris Mason 113c1cb530 Merge branch 'send-v2' of git://github.com/ablock84/linux-btrfs into for-linus
This is the kernel portion of btrfs send/receive

Conflicts:
	fs/btrfs/Makefile
	fs/btrfs/backref.h
	fs/btrfs/ctree.c
	fs/btrfs/ioctl.c
	fs/btrfs/ioctl.h

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-07-25 19:19:10 -04:00
Alexander Block 7069830a9e Btrfs: add btrfs_compare_trees function
This function is used to find the differences between
two trees. The tree compare skips whole subtrees if it
detects shared tree blocks and thus is pretty fast.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
Reviewed-by: David Sterba <dave@jikos.cz>
Reviewed-by: Arne Jansen <sensille@gmx.net>
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Reviewed-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
2012-07-25 23:30:14 +02:00
Arne Jansen e679376911 Btrfs: add helper for tree enumeration
Often no exact match is wanted but just the next lower or
higher item. There's a lot of duplicated code throughout
btrfs to deal with the corner cases. This patch adds a
helper function that can facilitate searching.

Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-25 17:33:18 +02:00
Arne Jansen 2f38b3e190 Btrfs: add helper for tree enumeration
Often no exact match is wanted but just the next lower or
higher item. There's a lot of duplicated code throughout
btrfs to deal with the corner cases. This patch adds a
helper function that can facilitate searching.

Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-10 15:14:42 +02:00
Jan Schmidt 097b8a7c9e Btrfs: join tree mod log code with the code holding back delayed refs
We've got two mechanisms both required for reliable backref resolving (tree
mod log and holding back delayed refs). You cannot make use of one without
the other. So instead of requiring the user of this mechanism to setup both
correctly, we join them into a single interface.

Additionally, we stop inserting non-blockers into fs_info->tree_mod_seq_list
as we did before, which was of no value.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-07-10 15:14:41 +02:00
Jan Schmidt cf5388307a Btrfs: fix buffer leak in btrfs_next_old_leaf
When calling btrfs_next_old_leaf, we were leaking an extent buffer in the
rare case of using the deadlock avoidance code needed for the tree mod log.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-07-10 15:14:41 +02:00
Jan Schmidt d42244a0d3 Btrfs: resolve tree mod log locking issue in btrfs_next_leaf
With the tree mod log, we may end up with two roots (the current root and a
rewinded version of it) both pointing to two leaves, l1 and l2, of which l2
had already been cow-ed in the current transaction. If we don't rewind any
tree blocks, we cannot have two roots both pointing to an already cowed tree
block.

Now there is btrfs_next_leaf, which has a leaf locked and wants a lock on
the next (right) leaf. And there is push_leaf_left, which has a (cowed!)
leaf locked and wants a lock on the previous (left) leaf.

In order to solve this dead lock situation, we use try_lock in
btrfs_next_leaf (only in case it's called with a tree mod log time_seq
paramter) and if we fail to get a lock on the next leaf, we give up our lock
on the current leaf and retry from the very beginning.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-27 16:34:40 +02:00
Jan Schmidt 19956c7e94 Btrfs: fix tree mod log rewind of ADD operations
When a MOD_LOG_KEY_ADD operation is rewinded, we remove the key from the
tree block. If its not the last key, removal involves a move operation.
This move operation was explicitly done before this commit.

However, at insertion time, there's a move operation before the actual
addition to make room for the new key, which is recorded in the tree mod
log as well. This means, we must drop the move operation when rewinding the
add operation, because the next operation we'll be rewinding will be the
corresponding MOD_LOG_MOVE_KEYS operation.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-27 16:34:40 +02:00
Jan Schmidt c3e0696523 Btrfs: always put insert_ptr modifications into the tree mod log
Several callers of insert_ptr set the tree_mod_log parameter to 0 to avoid
addition to the tree mod log. In fact, we need all of those operations. This
commit simply removes the additional parameter and makes addition to the
tree mod log unconditional.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-27 16:34:39 +02:00
Jan Schmidt 28da9fb446 Btrfs: fix tree mod log for root replacements at leaf level
For the tree mod log, we don't log any operations at leaf level. If the root
is at the leaf level (i.e. the tree consists only of the root), then
__tree_mod_log_oldest_root will find a ROOT_REPLACE operation in the log
(because we always log that one no matter which level), but no other
operations.

With this patch __tree_mod_log_oldest_root exits cleanly instead of
BUGging in this situation. get_old_root checks if its really a root at leaf
level in case we don't have any operations and WARNs if this assumption
breaks.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-27 16:34:38 +02:00
Chris Mason 4325edd078 Btrfs: init old_generation in get_old_root
gcc was giving an uninit variable warning here.  Strictly
speaking we don't need to init it, but this will make things
much less error prone.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-15 20:06:54 -04:00
Jan Schmidt 3310c36eef Btrfs: fix race in tree mod log addition
When adding to the tree modification log, we grab two locks at different
stages. We must not drop the outer lock until we're done with section
protected by the inner lock. This moves the unlock call for the outer lock
to the appropriate position.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-14 18:52:39 +02:00
Jan Schmidt 3d7806eca4 Btrfs: add btrfs_next_old_leaf
To make sense of the tree mod log, the backref walker not only needs
btrfs_search_old_slot, but it also called btrfs_next_leaf, which in turn was
calling btrfs_search_slot. This obviously didn't give the correct result.

This commit adds btrfs_next_old_leaf, a drop-in replacement for
btrfs_next_leaf with a time_seq parameter. If it is zero, it behaves exactly
like btrfs_next_leaf. If it is non-zero, it will use btrfs_search_old_slot
with this time_seq parameter.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-14 18:52:09 +02:00
Jan Schmidt a95236d99f Btrfs: fix return value for __tree_mod_log_oldest_root
In __tree_mod_log_oldest_root() we must return the found operation even if
it's not a ROOT_REPLACE operation. Otherwise, the caller assumes that there
are no operations to be rewinded and returns immediately.

The code in the caller is modified to improve readability.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-14 18:44:22 +02:00
Jan Schmidt 8ba97a15e7 Btrfs: use btrfs_read_lock_root_node in get_old_root
get_old_root could race with root node updates because we weren't locking
the node early enough. Use btrfs_read_lock_root_node to grab the root locked
in the very beginning and release the lock as soon as possible (just like
btrfs_search_slot does).

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-14 18:44:21 +02:00
Jan Schmidt 4d5a0565ce Btrfs: remove call to btrfs_header_nritems with no effect
This is a leftover from cleanup patch 559af821. Before the cleanup,
btrfs_header_nritems was called inside an if condition. As it has no side
effects we need to preserve here, it should simply be dropped.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-04 14:35:29 +02:00
Chris Mason 1e20932a23 Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus
Conflicts:
	fs/btrfs/ulist.h

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-05-31 16:49:53 -04:00
Jan Schmidt c31931088f Btrfs: fix tree mod log rewinded level and rewinding of moved keys
When we rewind REMOVE_WHILE_FREEING operations, there's code that allocates
a fresh buffer instead of cloning the old one. Setting that buffer's level
correctly was missing in this case.

When rewinding a MOVE_KEYS operation, btrfs_node_key_ptr_offset(slot) was
missing for memmove_extent_buffer()'s arguments.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-05-31 19:56:19 +02:00
Jan Schmidt f395694c2c Btrfs: fix tree mod log del_ptr
Logging for del_ptr when we're not deleting the last pointer was wrong. This
fixes both, duplicate log entries and log sequence.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-05-31 19:56:19 +02:00
Jan Schmidt e9b7fd4d8b Btrfs: add tree_mod_dont_log helper
Replace duplicate code by small inline helper function.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-05-31 19:56:18 +02:00
Jan Schmidt 926dd8a640 Btrfs: add missing spin_lock for insertion into tree mod log
tree_mod_alloc calls __get_tree_mod_seq and must acquire a spinlock before
doing so.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-05-31 19:56:18 +02:00
Tsutomu Itoh 018642a1f1 Btrfs: return value of btrfs_read_buffer is checked correctly
btrfs_read_buffer() has the possibility of returning the error.
Therefore, I add the code in which the return value of btrfs_read_buffer()
is checked.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
2012-05-30 10:23:41 -04:00
Jan Schmidt 5d9e75c41d Btrfs: add btrfs_search_old_slot
The tree modification log together with the current state of the tree gives
a consistent, old version of the tree. btrfs_search_old_slot is used to
search through this old version and return old (dummy!) extent buffers.
Naturally, this function cannot do any tree modifications.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-05-30 15:17:33 +02:00
Jan Schmidt f3ea38da3e Btrfs: add del_ptr and insert_ptr modifications to the tree mod log
Record all relevant modifications to block pointers in the tree mod log so
that we can rewind them later on for backref walking.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-05-30 15:17:32 +02:00
Jan Schmidt f230475e62 Btrfs: put all block modifications into the tree mod log
When running functions that can make changes to the internal trees
(e.g. btrfs_search_slot), we check if somebody may be interested in the
block we're currently modifying. If so, we record our modification to be
able to rewind it later on.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-05-30 15:17:29 +02:00
Jan Schmidt bd989ba359 Btrfs: add tree modification log functions
The tree mod log will log modifications made fs-tree nodes. Most
modifications are done by autobalance of the tree. Such changes are recorded
as long as a block entry exists. When released, the log is cleaned.

With the tree modification log, it's possible to reconstruct a consistent
old state of the tree. This is required to do backref walking on a busy
file system.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-05-30 15:17:01 +02:00
Jan Schmidt 5581a51a59 Btrfs: don't set for_cow parameter for tree block functions
Three callers of btrfs_free_tree_block or btrfs_alloc_tree_block passed
parameter for_cow = 1. In fact, these two functions should never mark
their tree modification operations as for_cow, because they can change
the number of blocks referenced by a tree.

Hence, we remove the extra for_cow parameter from these functions and
make them pass a zero down.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-05-26 12:17:53 +02:00
Wang Sheng-Hui f775738f6f btrfs/ctree.c: remove the unnecessary 'return -1;' at the end of bin_search
The code path should not reach there. Remove it.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
2012-05-11 10:56:37 -04:00
Chris Mason b9fab919b7 Btrfs: avoid sleeping in verify_parent_transid while atomic
verify_parent_transid needs to lock the extent range to make
sure no IO is underway, and so it can safely clear the
uptodate bits if our checks fail.

But, a few callers are using it with spinlocks held.  Most
of the time, the generation numbers are going to match, and
we don't want to switch to a blocking lock just for the error
case.  This adds an atomic flag to verify_parent_transid,
and changes it to return EAGAIN if it needs to block to
properly verifiy things.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-05-06 07:23:47 -04:00
Chris Mason e5846fc665 Btrfs: Add properly locking around add_root_to_dirty_list
add_root_to_dirty_list happens once at the very beginning of the
transaction, but it is still racey.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-05-04 15:14:11 -04:00
Chris Mason 1d4284bd6e Merge branch 'error-handling' into for-linus
Conflicts:
	fs/btrfs/ctree.c
	fs/btrfs/disk-io.c
	fs/btrfs/extent-tree.c
	fs/btrfs/extent_io.c
	fs/btrfs/extent_io.h
	fs/btrfs/inode.c
	fs/btrfs/scrub.c

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-03-28 20:31:37 -04:00
Chris Mason f7c79f30cb Btrfs: adjust the write_lock_level as we unlock
btrfs_search_slot sometimes needs write locks on high levels of
the tree.  It remembers the highest level that needs a write lock
and will use that for all future searches through the tree in a given
call.

But, very often we'll just cow the top level or the level below and we
won't really need write locks on the root again after that.  This patch
changes things to adjust the write lock requirement as it unlocks
levels.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-03-26 17:04:24 -04:00
Chris Mason cfed81a04e Btrfs: add the ability to cache a pointer into the eb
This cuts down on the CPU time used by map_private_extent_buffer

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-03-26 17:04:23 -04:00
Josef Bacik 3083ee2e18 Btrfs: introduce free_extent_buffer_stale
Because btrfs cow's we can end up with extent buffers that are no longer
necessary just sitting around in memory.  So instead of evicting these pages, we
could end up evicting things we actually care about.  Thus we have
free_extent_buffer_stale for use when we are freeing tree blocks.  This will
make it so that the ref for the eb being in the radix tree is dropped as soon as
possible and then is freed when the refcount hits 0 instead of waiting to be
released by releasepage.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-03-26 16:51:08 -04:00
Jeff Mahoney 79787eaab4 btrfs: replace many BUG_ONs with proper error handling
btrfs currently handles most errors with BUG_ON. This patch is a work-in-
 progress but aims to handle most errors other than internal logic
 errors and ENOMEM more gracefully.

 This iteration prevents most crashes but can run into lockups with
 the page lock on occasion when the timing "works out."

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
2012-03-22 11:52:54 +01:00
Mark Fasheh 305a26af5b btrfs: Go readonly on tree errors in balance_level
balace_level() seems to deal with missing tree nodes by BUG_ON(). Instead,
we can easily just set the file system readonly and bubble -EROFS back up
the stack.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2012-03-22 01:45:38 +01:00
Mark Fasheh b68dc2a93e btrfs: Don't BUG_ON errors from update_ref_for_cow()
__btrfs_cow_block(), the only caller of update_ref_for_cow() will BUG_ON()
any error return.  Instead, we can go read-only fs as update_ref_for_cow()
manipulates disk data in a way which doesn't look like it's easily rolled
back.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
2012-03-22 01:45:38 +01:00
Mark Fasheh e5df957328 btrfs: Go readonly on bad extent refs in update_ref_for_cow()
update_ref_for_cow() will BUG_ON() after it's call to
btrfs_lookup_extent_info() if no existing references are found.  Since refs
are computed directly from disk, this should be treated as a corruption
instead of a logic error.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
2012-03-22 01:45:37 +01:00
Mark Fasheh be1a5564fd btrfs: Don't BUG_ON() errors in update_ref_for_cow()
The only caller of update_ref_for_cow() is __btrfs_cow_block() which was
originally ignoring any return values. update_ref_for_cow() however doesn't
look like a candidate to become a void function - there are a few places
where errors can occur.

So instead I changed update_ref_for_cow() to bubble all errors up (instead
of BUG_ON). __btrfs_cow_block() was then updated to catch and BUG_ON() any
errors from update_ref_for_cow(). The end effect is that we have no change
in behavior, but about 8 different places where a BUG_ON(ret) was removed.

Obviously a future patch will have to address the BUG_ON() in
__btrfs_cow_block().

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
2012-03-22 01:45:36 +01:00
Jeff Mahoney 143bede527 btrfs: return void in functions without error conditions
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
2012-03-22 01:45:34 +01:00
Arne Jansen 66d7e7f09f Btrfs: mark delayed refs as for cow
Add a for_cow parameter to add_delayed_*_ref and pass the appropriate value
from every call site. The for_cow parameter will later on be used to
determine if a ref will change anything with respect to qgroups.

Delayed refs coming from relocation are always counted as for_cow, as they
don't change subvol quota.

Also pass in the fs_info for later use.

btrfs_find_all_roots() will use this as an optimization, as changes that are
for_cow will not change anything with respect to which root points to a
certain leaf. Thus, we don't need to add the current sequence number to
those delayed refs.

Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-12-22 16:22:27 +01:00
Liu Bo f1ebcc74d5 Btrfs: fix tree corruption after multi-thread snapshots and inode_cache flush
The btrfs snapshotting code requires that once a root has been
snapshotted, we don't change it during a commit.

But there are two cases to lead to tree corruptions:

1) multi-thread snapshots can commit serveral snapshots in a transaction,
   and this may change the src root when processing the following pending
   snapshots, which lead to the former snapshots corruptions;

2) the free inode cache was changing the roots when it root the cache,
   which lead to corruptions.

This fixes things by making sure we force COW the block after we create a
snapshot during commiting a transaction, then any changes to the roots
will result in COW, and we get all the fs roots and snapshot roots to be
consistent.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-11-15 09:53:28 -05:00
Li Zefan a05a9bb18a Btrfs: fix array bound checking
Otherwise we can execced the array bound of path->slots[].

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
2011-10-20 18:10:41 +02:00
Chris Mason 31533fb263 Btrfs: remove lockdep magic from btrfs_next_leaf
Before the reader/writer locks, btrfs_next_leaf needed to keep
the path blocking to avoid making lockdep upset.

Now that btrfs_next_leaf only takes read locks, this isn't required.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-07-27 12:46:47 -04:00
Chris Mason bd681513fa Btrfs: switch the btrfs tree locks to reader/writer
The btrfs metadata btree is the source of significant
lock contention, especially in the root node.   This
commit changes our locking to use a reader/writer
lock.

The lock is built on top of rw spinlocks, and it
extends the lock tracking to remember if we have a
read lock or a write lock when we go to blocking.  Atomics
count the number of blocking readers or writers at any
given time.

It removes all of the adaptive spinning from the old code
and uses only the spinning/blocking hints inside of btrfs
to decide when it should continue spinning.

In read heavy workloads this is dramatically faster.  In write
heavy workloads we're still faster because of less contention
on the root node lock.

We suffer slightly in dbench because we schedule more often
during write locks, but all other benchmarks so far are improved.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-07-27 12:46:46 -04:00
Chris Mason a65917156e Btrfs: stop using highmem for extent_buffers
The extent_buffers have a very complex interface where
we use HIGHMEM for metadata and try to cache a kmap mapping
to access the memory.

The next commit adds reader/writer locks, and concurrent use
of this kmap cache would make it even more complex.

This commit drops the ability to use HIGHMEM with extent buffers,
and rips out all of the related code.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-07-27 12:46:45 -04:00
Josef Bacik 25b8b936ed Btrfs: don't map extent buffer if path->skip_locking is set
Arne's scrub stuff exposed a problem with mapping the extent buffer in
reada_for_search.  He searches the commit root with multiple threads and with
skip_locking set, so we can race and overwrite node->map_token since node isn't
locked.  So fix this so that we only map the extent buffer if we don't already
have a map_token and skip_locking isn't set.  Without this patch scrub would
panic almost immediately, with the patch it doesn't panic anymore.  Thanks,

Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Josef Bacik <josef@redhat.com>
2011-06-09 10:12:07 -04:00
Chris Mason ff5714cca9 Merge branch 'for-chris' of
git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work into for-linus

Conflicts:
	fs/btrfs/disk-io.c
	fs/btrfs/extent-tree.c
	fs/btrfs/free-space-cache.c
	fs/btrfs/inode.c
	fs/btrfs/transaction.c

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-05-28 07:00:39 -04:00
Chris Mason d6c0cb379c Merge branch 'cleanups_and_fixes' into inode_numbers
Conflicts:
	fs/btrfs/tree-log.c
	fs/btrfs/volumes.c

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-05-23 14:37:47 -04:00