OpenCloudOS-Kernel/fs/btrfs
Filipe Manana 5522a27e59 btrfs: do not take the log_mutex of the subvolume when pinning the log
During a rename we pin the log to make sure no one commits a log that
reflects an ongoing rename operation, as it might result in a committed
log where it recorded the unlink of the old name without having recorded
the new name. However we are taking the subvolume's log_mutex before
incrementing the log_writers counter, which is not necessary since that
counter is atomic and we only remove the old name from the log and add
the new name to the log after we have incremented log_writers, ensuring
that no one can commit the log after we have removed the old name from
the log and before we added the new name to the log.

By taking the log_mutex lock we are just adding unnecessary contention on
the lock, which can become visible for workloads that mix renames with
fsyncs, writes for files opened with O_SYNC and unlink operations (if the
inode or its parent were fsynced before in the current transaction).

So just remove the lock and unlock of the subvolume's log_mutex at
btrfs_pin_log_trans().

Using dbench, which mixes different types of operations that end up taking
that mutex (fsyncs, renames, unlinks and writes into files opened with
O_SYNC) revealed some small gains. The following script that calls dbench
was used:

  #!/bin/bash

  DEV=/dev/nvme0n1
  MNT=/mnt/btrfs
  MOUNT_OPTIONS="-o ssd -o space_cache=v2"
  MKFS_OPTIONS="-m single -d single"
  THREADS=32

  echo "performance" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
  mkfs.btrfs -f $MKFS_OPTIONS $DEV
  mount $MOUNT_OPTIONS $DEV $MNT

  dbench -s -t 600 -D $MNT $THREADS

  umount $MNT

The test was run on bare metal, no virtualization, on a box with 12 cores
(Intel i7-8700), 64Gb of RAM and using a NVMe device, with a kernel
configuration that is the default of typical distributions (debian in this
case), without debug options enabled (kasan, kmemleak, slub debug, debug
of page allocations, lock debugging, etc).

Results before this patch:

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    4410848     0.017   738.640
 Close        3240222     0.001     0.834
 Rename        186850     7.478  1272.476
 Unlink        890875     0.128   785.018
 Deltree          128     2.846    12.081
 Mkdir             64     0.002     0.003
 Qpathinfo    3997659     0.009    11.171
 Qfileinfo     701307     0.001     0.478
 Qfsinfo       733494     0.002     1.103
 Sfileinfo     359362     0.004     3.266
 Find         1546226     0.041     4.128
 WriteX       2202803     7.905  1376.989
 ReadX        6917775     0.003     3.887
 LockX          14392     0.002     0.043
 UnlockX        14392     0.001     0.085
 Flush         309225     0.128  1033.936

Throughput 231.555 MB/sec (sync open)  32 clients  32 procs  max_latency=1376.993 ms

Results after this patch:

Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    4603244     0.017   232.776
 Close        3381299     0.001     1.041
 Rename        194871     7.251  1073.165
 Unlink        929730     0.133   119.233
 Deltree          128     2.871    10.199
 Mkdir             64     0.002     0.004
 Qpathinfo    4171343     0.009    11.317
 Qfileinfo     731227     0.001     1.635
 Qfsinfo       765079     0.002     3.568
 Sfileinfo     374881     0.004     1.220
 Find         1612964     0.041     4.675
 WriteX       2296720     7.569  1178.204
 ReadX        7213633     0.003     3.075
 LockX          14976     0.002     0.076
 UnlockX        14976     0.001     0.061
 Flush         322635     0.102   579.505

Throughput 241.4 MB/sec (sync open)  32 clients  32 procs  max_latency=1178.207 ms
(+4.3% throughput, -14.4% max latency)

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-10-07 12:06:55 +02:00
..
tests btrfs: make btrfs_set_extent_delalloc take btrfs_inode 2020-07-27 12:55:35 +02:00
Kconfig Revert "btrfs: switch to iomap_dio_rw() for dio" 2020-06-14 01:19:02 +02:00
Makefile Btrfs: move all reflink implementation code into its own file 2020-03-23 17:01:54 +01:00
acl.c btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
async-thread.c Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
async-thread.h Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
backref.c btrfs: remove unnecessarily shadowed variables 2020-10-07 12:06:55 +02:00
backref.h btrfs: rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE 2020-05-25 11:25:35 +02:00
block-group.c btrfs: call btrfs_try_granting_tickets when reserving space 2020-10-07 12:06:51 +02:00
block-group.h btrfs: convert block group refcount to refcount_t 2020-07-27 12:55:42 +02:00
block-rsv.c btrfs: rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE 2020-05-25 11:25:35 +02:00
block-rsv.h btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
btrfs_inode.h btrfs: reduce contention on log trees when logging checksums 2020-07-27 12:55:45 +02:00
check-integrity.c btrfs: check-integrity: remove unnecessary failure messages during memory allocation 2020-07-27 12:55:21 +02:00
check-integrity.h btrfs: remove btrfsic_submit_bh() 2020-03-23 17:01:39 +01:00
compression.c btrfs: compression: move declarations to header 2020-10-07 12:06:55 +02:00
compression.h btrfs: compression: move declarations to header 2020-10-07 12:06:55 +02:00
ctree.c btrfs: delete duplicated words + other fixes in comments 2020-10-07 12:06:50 +02:00
ctree.h btrfs: do async reclaim for data reservations 2020-10-07 12:06:54 +02:00
delalloc-space.c btrfs: add btrfs_reserve_data_bytes and use it 2020-10-07 12:06:52 +02:00
delalloc-space.h btrfs: make btrfs_delalloc_reserve_space take btrfs_inode 2020-07-27 12:55:36 +02:00
delayed-inode.c btrfs: use nofs allocations for running delayed items 2020-03-25 16:26:00 +01:00
delayed-inode.h btrfs: delayed-inode: Replace zero-length array with flexible-array member 2020-03-23 17:01:53 +01:00
delayed-ref.c btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
delayed-ref.h btrfs: migrate the delayed refs rsv code 2019-07-04 17:26:17 +02:00
dev-replace.c btrfs: change nr to u64 in btrfs_start_delalloc_roots 2020-10-07 12:06:50 +02:00
dev-replace.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
dir-item.c btrfs: remove unused parameter fs_info from btrfs_extend_item 2019-04-29 19:02:50 +02:00
discard.c btrfs: discard: add missing put when grabbing block group from unused list 2020-07-07 16:06:28 +02:00
discard.h btrfs: discard: Use the correct style for SPDX License Identifier 2020-04-20 17:43:42 +02:00
disk-io.c btrfs: do async reclaim for data reservations 2020-10-07 12:06:54 +02:00
disk-io.h btrfs: preallocate anon block device at first phase of snapshot creation 2020-07-27 12:55:38 +02:00
export.c btrfs: simplify iget helpers 2020-05-25 11:25:37 +02:00
export.h btrfs: export helpers for subvolume name/id resolution 2020-03-23 17:01:42 +01:00
extent-io-tree.h btrfs: trim: fix underflow in trim length to prevent access beyond device boundary 2020-08-12 10:15:58 +02:00
extent-tree.c btrfs: call btrfs_try_granting_tickets when unpinning anything 2020-10-07 12:06:51 +02:00
extent_io.c btrfs: delete duplicated words + other fixes in comments 2020-10-07 12:06:50 +02:00
extent_io.h btrfs: fix potential deadlock in the search ioctl 2020-08-27 13:56:27 +02:00
extent_map.c Btrfs: fix race between using extent maps and merging them 2020-02-12 17:16:46 +01:00
extent_map.h btrfs: remove extent_map::bdev 2019-11-18 23:43:44 +01:00
file-item.c btrfs: make btrfs_csum_one_bio takae btrfs_inode 2020-07-27 12:55:26 +02:00
file.c btrfs: cleanup calculation of lockend in lock_and_cleanup_extent_if_need() 2020-10-07 12:06:54 +02:00
free-space-cache.c btrfs: delete duplicated words + other fixes in comments 2020-10-07 12:06:50 +02:00
free-space-cache.h btrfs: let btrfs_return_cluster_to_free_space() return void 2020-07-27 12:55:21 +02:00
free-space-tree.c btrfs: block-group: fix free-space bitmap threshold 2020-08-27 13:37:54 +02:00
free-space-tree.h btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
inode-item.c btrfs: Make btrfs_find_name_in_ext_backref return struct btrfs_inode_extref 2019-09-09 14:59:16 +02:00
inode-map.c btrfs: make btrfs_delalloc_reserve_space take btrfs_inode 2020-07-27 12:55:36 +02:00
inode-map.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
inode.c btrfs: remove unnecessarily shadowed variables 2020-10-07 12:06:55 +02:00
ioctl.c btrfs: change nr to u64 in btrfs_start_delalloc_roots 2020-10-07 12:06:50 +02:00
locking.c btrfs: add missing annotation for btrfs_tree_lock() 2020-05-25 11:25:16 +02:00
locking.h btrfs: Implement DREW lock 2020-03-23 17:01:43 +01:00
lzo.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00
misc.h btrfs: rename tree_entry to rb_simple_node and export it 2020-05-25 11:25:19 +02:00
ordered-data.c btrfs: make btrfs_add_ordered_extent_dio take btrfs_inode 2020-07-27 12:55:34 +02:00
ordered-data.h btrfs: make btrfs_add_ordered_extent_dio take btrfs_inode 2020-07-27 12:55:34 +02:00
orphan.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
print-tree.c btrfs: require only sector size alignment for parent eb bytenr 2020-09-07 14:51:05 +02:00
print-tree.h btrfs: print-tree: debugging output enhancement 2018-04-20 19:18:16 +02:00
props.c btrfs: simplify iget helpers 2020-05-25 11:25:37 +02:00
props.h btrfs: delete unused function btrfs_set_prop_trans 2019-04-29 19:02:54 +02:00
qgroup.c btrfs: delete duplicated words + other fixes in comments 2020-10-07 12:06:50 +02:00
qgroup.h btrfs: qgroup: export qgroups in sysfs 2020-07-27 12:55:37 +02:00
raid56.c btrfs: raid56: remove out label in __raid56_parity_recover 2020-07-27 12:55:44 +02:00
raid56.h btrfs: constify map parameter for nr_parity_stripes and nr_data_stripes 2019-07-01 13:34:58 +02:00
rcu-string.h btrfs: rcu-string: Replace zero-length array with flexible-array member 2020-03-23 17:01:53 +01:00
reada.c btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
ref-verify.c btrfs: ref-verify: fix memory leak in add_block_entry 2020-07-27 12:55:43 +02:00
ref-verify.h btrfs: ref-verify: Use btrfs_ref to refactor btrfs_ref_tree_mod() 2019-04-29 19:02:49 +02:00
reflink.c btrfs: reduce contention on log trees when logging checksums 2020-07-27 12:55:45 +02:00
reflink.h Btrfs: move all reflink implementation code into its own file 2020-03-23 17:01:54 +01:00
relocation.c btrfs: relocation: review the call sites which can be interrupted by signal 2020-07-27 12:55:45 +02:00
root-tree.c btrfs: simplify root lookup by id 2020-05-25 11:25:36 +02:00
scrub.c btrfs: scrub: rename ratelimit state varaible to avoid shadowing 2020-10-07 12:06:55 +02:00
send.c btrfs: send: remove indirect callback parameter for changed_cb 2020-10-07 12:06:55 +02:00
send.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
space-info.c btrfs: fix possible infinite loop in data async reclaim 2020-10-07 12:06:54 +02:00
space-info.h btrfs: add btrfs_reserve_data_bytes and use it 2020-10-07 12:06:52 +02:00
struct-funcs.c btrfs: update documentation of set/get helpers 2020-05-25 11:25:35 +02:00
super.c btrfs: do async reclaim for data reservations 2020-10-07 12:06:54 +02:00
sysfs.c btrfs: remove const from btrfs_feature_set_name 2020-10-07 12:06:55 +02:00
sysfs.h btrfs: remove const from btrfs_feature_set_name 2020-10-07 12:06:55 +02:00
transaction.c btrfs: fix NULL pointer dereference after failure to create snapshot 2020-09-07 21:18:35 +02:00
transaction.h btrfs: qgroup: remove ASYNC_COMMIT mechanism in favor of reserve retry-after-EDQUOT 2020-07-27 12:55:43 +02:00
tree-checker.c btrfs: tree-checker: fix the error message for transid error 2020-08-27 14:16:05 +02:00
tree-checker.h btrfs: get fs_info from eb in btrfs_check_chunk_valid 2019-04-29 19:02:39 +02:00
tree-defrag.c btrfs: remove unused btrfs_root::defrag_trans_start 2020-07-27 12:55:28 +02:00
tree-log.c btrfs: do not take the log_mutex of the subvolume when pinning the log 2020-10-07 12:06:55 +02:00
tree-log.h btrfs: get fs_info from trans in btrfs_set_log_full_commit 2019-04-29 19:02:41 +02:00
ulist.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
ulist.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
uuid-tree.c btrfs: simplify root lookup by id 2020-05-25 11:25:36 +02:00
volumes.c btrfs: remove fsid argument from btrfs_sysfs_update_sprout_fsid 2020-10-07 12:06:50 +02:00
volumes.h btrfs: move btrfs_scratch_superblocks into btrfs_dev_replace_finishing 2020-09-25 16:40:22 +02:00
xattr.c Btrfs: fix failure to persist compression property xattr deletion on fsync 2019-06-17 16:37:17 +02:00
xattr.h btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
zlib.c btrfs: use larger zlib buffer for s390 hardware compression 2020-01-31 10:30:40 -08:00
zstd.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00