Alex Lorenz 986a2438b3 [swift] make sure the flow_sampling target depends on the flow Swift … (#10641) * [swift] make sure the flow_sampling target depends on the flow Swift generated header * [swift] Allow the googlebenmark g++ to be used with Clang and libstdc++ combination		2023-07-19 04:23:01 -05:00
include/flowbench	Move sample benchmarks to Flowbench.	2022-09-12 01:31:09 -07:00
BenchBlobDelta.cpp	blob granule inplace encryption (#10619)	2023-07-17 10:44:11 -07:00
BenchCallback.actor.cpp	Update copyright header dates	2022-03-21 13:36:23 -07:00
BenchEncrypt.cpp	EaR: Implement Key Check Value semantics (#9936)	2023-04-12 14:29:31 -07:00
BenchHash.cpp	Use faster KeepRunning loops in flowbench	2022-09-11 20:32:28 -07:00
BenchIONet2.actor.cpp	moved wellknownendpoints and fixed some includes	2022-06-23 17:03:53 -06:00
BenchIdempotencyIds.cpp	Performance improvements and benchmark tweaks	2022-10-12 13:53:11 -07:00
BenchIterate.cpp	Use faster KeepRunning loops in flowbench	2022-09-11 20:32:28 -07:00
BenchMem.cpp	Use faster KeepRunning loops in flowbench	2022-09-11 20:32:28 -07:00
BenchMetadataCheck.cpp	Convert literal string ref instances to use _sr suffix	2022-09-19 11:35:58 -07:00
BenchNet2.actor.cpp	Bug fix in items processed count, changed delay/yield test to be a template.	2022-10-31 12:49:38 -07:00
BenchPopulate.cpp	Use faster KeepRunning loops in flowbench	2022-09-11 20:32:28 -07:00
BenchPriorityMultiLock.actor.cpp	Remove runners list from PriorityMultiLock and rely on reference counting in the release handler instead of canceling the release handlers. This improves the microbenchmark by 26%.	2022-11-11 00:34:03 -08:00
BenchRandom.cpp	Use faster KeepRunning loops in flowbench	2022-09-11 20:32:28 -07:00
BenchRef.cpp	Use faster KeepRunning loops in flowbench	2022-09-11 20:32:28 -07:00
BenchSamples.cpp	Change Histogram::Unit::microseconds to milliseconds	2022-11-21 08:03:56 -08:00
BenchStream.actor.cpp	Update copyright header dates	2022-03-21 13:36:23 -07:00
BenchTimer.cpp	Revert "Use real clock source for trace events in real fdbserver, but now() in simulation. (#9270)"	2023-02-02 10:30:31 -08:00
BenchVersionVector.cpp	Use faster KeepRunning loops in flowbench	2022-09-11 20:32:28 -07:00
BenchVersionVectorSerialization.cpp	Use faster KeepRunning loops in flowbench	2022-09-11 20:32:28 -07:00
BenchZstd.cpp	add benchmark for zstd	2022-10-11 08:23:55 -07:00
CMakeLists.txt	[swift] make sure the flow_sampling target depends on the flow Swift … (#10641)	2023-07-19 04:23:01 -05:00
GlobalData.cpp	Place generateRandomData() under {I|Deterministic}Random	2022-07-20 13:21:11 +02:00
README.md	Exclude flowbench target from all	2020-10-10 15:03:53 -07:00
benchmark.cmake	Add support for pre-built googlebenchmark	2023-02-27 15:38:58 -06:00
flowbench.actor.cpp	Simplify function call when transaction is null	2022-06-22 14:50:17 -07:00

README.md

Summary

flowbench is an executable that can be used to microbenchmark parts of the FoundationDB code. The goal is to make it easy to test the performance of various sub-millisecond operations using flow and fdbrpc. Specifically, this tool can be used to:

Test the performance effects of changes to the actor compiler or to the flow and fdbrpc libraries
Test the performance of various uses of the flow and fdbrpc libraries
Find areas for improvement in the flow and fdbrpc libraries
Compare flow/fdbrpc primitives to alternatives provided by the standard library or third-party libraries

Usage

To build the flowbench executable, run ninja flowbench or make flowbench, depending on which build system you're using.
Then you can run bin/flowbench --help to see possible uses of flowbench.
Running bin/flowbench directly runs all registered benchmarks; pass --benchmark_filter=<regex> to limit the run to benchmarks whose names match the regex.
bin/flowbench --benchmark_list_tests lists all benchmark names.
Example output:

$ bin/flowbench --benchmark_filter=bench_ref
2020-08-04 21:49:40
Running bin/flowbench
Run on (7 X 2904 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x7)
  L1 Instruction 32 KiB (x7)
  L2 Unified 256 KiB (x7)
  L3 Unified 12288 KiB (x1)
Load Average: 0.15, 0.15, 0.72
---------------------------------------------------------------------------------------------------------------
Benchmark                                                     Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------
bench_ref_create_and_destroy<RefType::RawPointer>          4.90 ns         4.90 ns    116822124 items_per_second=203.88M/s
bench_ref_create_and_destroy<RefType::UniquePointer>       4.94 ns         4.94 ns    141101924 items_per_second=202.555M/s
bench_ref_create_and_destroy<RefType::SharedPointer>       42.5 ns         42.5 ns     13802909 items_per_second=23.531M/s
bench_ref_create_and_destroy<RefType::FlowReference>       5.05 ns         5.05 ns    100000000 items_per_second=197.955M/s
bench_ref_copy<RefType::RawPointer>                        1.15 ns         1.15 ns    612121585 items_per_second=871.218M/s
bench_ref_copy<RefType::SharedPointer>                     10.0 ns         10.0 ns     67553102 items_per_second=99.8113M/s
bench_ref_copy<RefType::FlowReference>                     2.33 ns         2.33 ns    292317474 items_per_second=428.507M/s

flowbench is built on Google Benchmark. More detailed documentation of the command-line options and the output format can be found at https://github.com/google/benchmark
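
The way --benchmark_filter=<regex> narrows a run can be sketched with the standard library alone. The snippet below is only an illustration, not flowbench or Google Benchmark code: selectBenchmarks is a hypothetical helper, and it assumes the filter is applied as an unanchored regular-expression search, which is consistent with the example run where bench_ref selects names such as bench_ref_copy<RefType::RawPointer>. The regex dialect used by the real tool may differ from the std::regex default used here.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of flowbench or Google Benchmark): returns the
// names that contain a match for `pattern` anywhere in the string.
std::vector<std::string> selectBenchmarks(const std::vector<std::string>& names,
                                          const std::string& pattern) {
    assert(!pattern.empty());
    const std::regex re(pattern);
    std::vector<std::string> selected;
    for (const auto& name : names) {
        // regex_search succeeds if any substring matches; regex_match would
        // require the whole name to match the pattern.
        if (std::regex_search(name, re)) {
            selected.push_back(name);
        }
    }
    return selected;
}
```

Under this assumption, a pattern such as bench_ref keeps every name containing that substring, while anchors such as ^ and $ can be used to pin the match to the start or end of a name.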

Existing Benchmarks

bench_populate measures the population of a vector of mutations
bench_ref compares the performance of the flow Reference type to other pointer types
bench_iterate measures iteration over a list of mutations
bench_stream measures the performance of writing to and reading from a PromiseStream
bench_random measures the performance of DeterministicRandom
bench_timer measures the performance of FoundationDB timers
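
To give a feel for what a comparison like bench_ref reports, here is a minimal stdlib-only sketch of a create-and-destroy timing loop. It is not the flowbench implementation: the real benchmarks are registered with Google Benchmark, which picks iteration counts and reports CPU time itself. All names below (LoopTiming, summarize, timeLoop, keepAlive, timeUniquePtr, timeSharedPtr) are hypothetical, and the sketch assumes one item is processed per iteration, so items per second is simply the iteration count divided by the elapsed time.

```cpp
#include <chrono>
#include <cstdint>
#include <memory>

// Hypothetical result type for this sketch.
struct LoopTiming {
    double nsPerItem;      // average wall-clock nanoseconds per iteration
    double itemsPerSecond; // iterations divided by elapsed seconds
};

// Converts an item count and an elapsed time (in ns) into the two figures above.
inline LoopTiming summarize(std::uint64_t items, double elapsedNs) {
    return {elapsedNs / static_cast<double>(items),
            static_cast<double>(items) / (elapsedNs * 1e-9)};
}

// Runs `body` `iterations` times and times the whole loop with steady_clock.
template <class Fn>
LoopTiming timeLoop(std::uint64_t iterations, Fn body) {
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        body();
    }
    const auto stop = std::chrono::steady_clock::now();
    const double ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return summarize(iterations, ns);
}

// Writes the pointer to a volatile sink so the compiler cannot discard the
// allocation being measured.
inline void keepAlive(const void* p) {
    static const void* volatile sink = nullptr;
    sink = p;
}

// Create and destroy a std::unique_ptr<int> on every iteration.
inline LoopTiming timeUniquePtr(std::uint64_t iterations) {
    return timeLoop(iterations, [] {
        auto p = std::make_unique<int>(42);
        keepAlive(p.get());
    });
}

// Create and destroy a std::shared_ptr<int> on every iteration.
inline LoopTiming timeSharedPtr(std::uint64_t iterations) {
    return timeLoop(iterations, [] {
        auto p = std::make_shared<int>(42);
        keepAlive(p.get());
    });
}
```

Calling timeUniquePtr(1000000) and timeSharedPtr(1000000) and comparing their nsPerItem fields gives a rough analogue of the UniquePointer and SharedPointer rows in the example output. The absolute numbers depend on the machine, compiler, and allocator, and a hand-rolled loop like this lacks the warm-up and iteration tuning that Google Benchmark provides.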

Future use cases

Benchmark the overhead of sending and receiving messages through FlowTransport
Benchmark the performance of serializing/deserializing various types