foundationdb/flow
Evan Tschannen 1a4ca05a04
Merge pull request #1889 from ajbeamon/add-cache-memory-parameter
Add cache_memory parameter to fdbserver
2019-07-26 13:34:24 -07:00
..
actorcompiler Merge pull request #1146 from atn34/fix-actor-warning 2019-02-13 11:01:37 -08:00
coveragetool coveragetool: suppress output unless VERBOSE is set 2019-05-07 16:54:05 -10:00
stacktrace_internal Import //base/debugging:stacktrace from abseil. 2017-10-16 16:05:02 -07:00
ActorCollection.actor.cpp remove memory-hungry benchmark 2019-05-17 14:12:44 -07:00
ActorCollection.h Merge branch 'release-5.2' of github.com:apple/foundationdb into feature-redwood 2018-04-06 16:29:37 -07:00
Arena.h Pass type as param to VectorRef instead of bool 2019-07-15 15:08:49 -07:00
AsioReactor.h Mark all network sockets as close-on-exec. 2019-05-13 16:05:49 -10:00
CMakeLists.txt Make valgrind work on Fedora 30 2019-06-11 10:27:08 -07:00
CompressedInt.actor.cpp Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
CompressedInt.h Replace & operator with variadic function 2018-12-28 11:33:42 -08:00
Deque.cpp Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Deque.h Fix some signed and unsigned mismatch warnings. 2019-03-26 14:54:11 -07:00
DeterministicRandom.h Make IRandom reference counted. Simplify the logic around deterministicRandom() to require less state and branching. 2019-05-23 18:51:59 -07:00
Error.cpp Extend RebootRequest API to include time to suspend the process before reboot. This is intended to be used for testing purposes to simulate failures. 2019-06-14 11:35:38 -07:00
Error.h Implementation complete (not yet working) 2019-05-13 14:15:22 -07:00
EventTypes.actor.h Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
FastAlloc.cpp Merge branch 'master' into thread-safe-random-number-generation 2019-05-23 08:35:47 -07:00
FastAlloc.h Refactor NextFastAllocatedSize to be constexpr function 2019-06-11 15:55:23 -07:00
FastRef.h Remove obsolete function 2019-05-13 14:15:23 -07:00
FaultInjection.cpp Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
FaultInjection.h remove trailing whitespace from our copyright headers ; fixed formatting of python setup.py 2018-02-21 10:25:11 -08:00
FileIdentifier.h Compiling on clang again 2019-05-13 14:15:23 -07:00
FileTraceLogWriter.cpp Update flow/FileTraceLogWriter.cpp 2019-07-22 16:38:18 -07:00
FileTraceLogWriter.h Remove dead code and revert back to base 10 2019-07-22 16:20:03 -07:00
Hash3.c Fixed headers and some whitespace 2018-02-23 04:50:23 -08:00
Hash3.h Add includes for missing definitions of size_t and uint32_t. 2018-08-14 15:50:26 -07:00
IDispatched.h remove trailing whitespace from our copyright headers ; fixed formatting of python setup.py 2018-02-21 10:25:11 -08:00
IRandom.h Expose serialization context too all traits 2019-07-15 12:58:31 -07:00
IThreadPool.cpp Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
IThreadPool.h Fix errors and TaskPriority more priorities. 2019-07-03 21:03:58 -07:00
IndexedSet.actor.h Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
IndexedSet.cpp Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
IndexedSet.h Remove noexcept macro and replace with BOOST_NOEXCEPT. 2019-03-05 22:06:12 -08:00
JsonTraceLogFormatter.cpp Emit valid json for unprintable characters 2018-12-20 11:43:18 -08:00
JsonTraceLogFormatter.h Integrate JsonTraceLogFormatter into build system + more. 2018-12-13 22:07:16 -08:00
Knobs.cpp Add cache_memory parameter to fdbserver to control the size of the (4K) page cache. Change the default slighty from 2000 MiB to 2GiB. 2019-07-23 15:05:21 -07:00
Knobs.h Merge pull request #1820 from fzhjon/load-balance-locality 2019-07-16 16:40:43 -07:00
MetricSample.h Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Net2.actor.cpp Merge branch 'master' into track-run-loop-busyness 2019-07-09 18:39:23 -07:00
Net2Packet.cpp Eliminate unnecessary memcpy 2019-07-11 23:03:31 -07:00
Net2Packet.h Support PacketBuffer's of arbitrary size 2019-07-11 23:03:31 -07:00
ObjectSerializer.h Fix a few IncludeVersion bugs 2019-07-16 13:28:29 -07:00
ObjectSerializerTraits.h Merge remote-tracking branch 'upstream/master' into flatbuffers-fixes2 2019-07-18 09:50:08 -07:00
Platform.cpp Fixed two silly compilation bugs 2019-06-27 10:29:59 -07:00
Platform.h Merge pull request #1749 from satherton/feature-redwood 2019-06-28 16:22:06 -07:00
Profiler.actor.cpp A giant translation of TaskFooPriority -> TaskPriority::Foo 2019-06-25 02:47:35 -07:00
Profiler.h remove trailing whitespace from our copyright headers ; fixed formatting of python setup.py 2018-02-21 10:25:11 -08:00
ProtocolVersion.h Track the priority of sampled Transaction as part of GetReadVersion event. 2019-07-19 17:31:49 -07:00
README.md Update TODOs 2019-07-12 11:17:11 -07:00
SignalSafeUnwind.cpp Avoid possibly concurrent reassignment of dl_iterate_phdr. 2019-01-24 13:28:45 -08:00
SignalSafeUnwind.h Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
SimpleOpt.h Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
Stats.actor.cpp Add a random UID to TransactionMetrics in case a client opens multiple connections and also a field to indicate whether the connection is internal. Convert some of the metrics to our Counter object instead of running totals. 2019-07-08 14:01:04 -07:00
Stats.h Add a random UID to TransactionMetrics in case a client opens multiple connections and also a field to indicate whether the connection is internal. Convert some of the metrics to our Counter object instead of running totals. 2019-07-08 14:01:04 -07:00
SystemMonitor.cpp Fix a merge issue 2019-07-10 09:46:23 -07:00
SystemMonitor.h Change the way cache hits and misses are tracked to avoid counting blind page writes as misses and count the results of partial page writes. Report cache hit rate in status. 2019-07-10 14:43:20 -07:00
TDMetric.actor.h Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
TDMetric.cpp Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
ThreadHelper.actor.h A giant translation of TaskFooPriority -> TaskPriority::Foo 2019-06-25 02:47:35 -07:00
ThreadHelper.cpp Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
ThreadPrimitives.cpp Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
ThreadPrimitives.h Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
ThreadSafeQueue.h Consistent use of #if VALGRIND 2019-05-13 12:44:13 -10:00
Trace.cpp Fix unsafe usage of now() function from multiple threads in trace logging. 2019-07-22 22:31:38 -07:00
Trace.h Fix unsafe usage of now() function from multiple threads in trace logging. 2019-07-22 22:31:38 -07:00
UnitTest.cpp Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
UnitTest.h Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Util.h Add a missing #include <algorithm> to flow/Util.h 2018-08-14 15:50:26 -07:00
XmlTraceLogFormatter.cpp Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
XmlTraceLogFormatter.h Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00
actorcompiler.h Moved THIS and THIS_ADDR to actorcompiler.h 2019-02-19 15:16:59 -08:00
error_definitions.h fdbrpc: Reduced connection monitoring from clients 2019-07-09 14:24:16 -07:00
flat_buffers.cpp Merge remote-tracking branch 'upstream/master' into flatbuffers-fixes2 2019-07-18 09:50:08 -07:00
flat_buffers.h Expose serialization context too all traits 2019-07-15 12:58:31 -07:00
flow.cpp Make object serializer versioned 2019-07-12 11:53:14 -07:00
flow.h Fix Void flatbuffers serialization 2019-07-25 17:30:43 -07:00
flow.vcxproj Enabled C++17 for all Windows projects 2019-05-16 17:44:13 -07:00
flow.vcxproj.filters do not log a SevError trace event if we cannot deserialize the connect packet 2019-04-10 17:41:02 -07:00
genericactors.actor.cpp Remove including flow.h in actorcompiler.h, and fix resulting breakage. 2018-08-14 15:50:26 -07:00
genericactors.actor.h get rid of all timeouts and other changes 2019-07-24 15:36:28 -07:00
hgVersion.h.cmake fdbserver now compiling 2018-12-13 14:13:47 -08:00
local.mk makefile changes to accommodate boost/process.hpp 2019-05-28 22:07:46 -07:00
network.cpp statically link libstdc++ on Linux and remove std::variant 2019-07-16 14:53:16 -07:00
network.h Merge pull request #1869 from alexmiller-apple/sharded-txs-performance 2019-07-26 13:30:13 -07:00
no_intellisense.opt Initial repository commit 2017-05-25 13:48:44 -07:00
serialize.cpp Revert "Revert "Make protocol version a type"" 2019-06-18 14:49:04 -07:00
serialize.h Make object serializer versioned 2019-07-12 11:53:14 -07:00
stacktrace.amalgamation.cpp Fix some typos. 2018-04-19 11:44:01 -07:00
stacktrace.h Import //base/debugging:stacktrace from abseil. 2017-10-16 16:05:02 -07:00
unactorcompiler.h Moved THIS and THIS_ADDR to actorcompiler.h 2019-02-19 15:16:59 -08:00
version.cpp Adjust all includes to be relative to the root. 2018-10-19 17:35:33 +00:00

README.md

Flow Tutorial

Using Flow

Flow introduces some new keywords and flow controls. Combining these into workable units also introduces some new design patterns to C++ programmers.

Keywords/primitives

The essence of Flow is the capability of passing messages asynchronously between components. The basic data types that connect asynchronous senders and receivers are Promise<> and Future<>. The sender holds a Promise<X> to, sometime in the future, deliver a value of type X to the holder of the Future<X>. A receiver, holding a Future<X>, at some point needs the X to continue computation, and invokes the wait(Future<> f) statement to pause until the value is delivered. To use the wait() statement, a function needs to be declared as an ACTOR function, a special flow keyword which directs the flow compiler to create the necessary internal callbacks, etc. Similarly, When a component wants to deal not with one asynchronously delivered value, but with a series, there are PromiseStream<> and FutureStream<>. These two constructs allow for “reliable delivery” of messages, and play an important role in the design patterns.

Promise, Future

Promise<T> and Future<T> are intrinsically linked (they go in pairs) and are two wrappers around a construct called SingleAssignmentVar, a variable that can be set only once. A Promise is a handle to a SingleAssignmentVar that allows for a one-time set of the value; a Future is a read-only handle to the variable that only allows reading of the value.

The following example uses these two simple types:

Promise<int> p;
Future<int> f = p.getFuture();
p.send( 4 );
printf( "%d\n", f.get() ); // f is already set

Network traversal

Promises and futures can be used within a single process, but their real strength in a distributed system is that they can traverse the network. For example, one computer could create a promise/future pair, then send the promise to another computer over the network. The promise and future will still be connected, and when the promise is fulfilled by the remote computer, the original holder of the future will see the value appear.

[TODO: network delivery guarantees]

wait()

Wait allows for the calling code to pause execution while the value of a Future is set. This statement is called with a Future<T> as its parameter and returns a T; the eventual value of the Future. Errors that are generated in the code that is setting the Future, will be thrown from the location of the wait(), so Errors must be caught at this location.

The following example shows a snippet (from an ACTOR) of waiting on a Future:

Future<int> f = asyncCalculation(); // defined elsewhere
int count = wait( f );
printf( "%d\n", count );

It is worth nothing that, although the function wait() is declared in actorcompiler.h, this “function” is compiled by the Actor Compiler into a complex set of integrated statements and callbacks. It is therefore never present in generated code or at link time. Note : because of the way that the actor compiler is built, wait() must always assign the resulting value to a newly declared variable.

From 6.1, wait() on Void actors shouldn't assign the resulting value. So, the following code

Future<Void> asyncTask(); //defined elsewhere
wait(asyncTask());

becomes

Future<Void> asyncTask(); //defined elsewhere
wait(asyncTask());

ACTOR

The only code that can call the wait() function are functions that are themselves labeled with the “ACTOR” tag. This is the essential unit of asynchronous work that can be chained together to create complex message-passing systems. An actor, although declared as returning a Future<T>, simply returns a T. Because an actor may wait on the results of other actors, an actor must return either a Future or void. In most cases returning void is less advantageous than returning a Future, since there are implications for actor cancellation. See the Actor Cancellation section for details.

The following simple actor function waits on the Future to be ready, when it is ready adds offset and returns the result:

ACTOR Future<int> asyncAdd(Future<int> f, int offset) {
    int value = wait( f );
    return value + offset;
}

State Variables

Since ACTOR-labeled functions are compiled into a c++ class and numerous supporting functions, the variable scoping rules that normally apply are altered. The differences arise out of the fact that control flow is broken at wait() statements. Generally the compiled code is broken into chunks at wait statements, so scoping variables so that they can be seen in multiple “chunks” requires the state keyword. The following function waits on two inputs and outputs the sum with an offset attached:

ACTOR Future<int> asyncCalculation(Future<int> f1, Future<int> f2, int offset ) {
    state int value1 = wait( f1 );
    int value2 = wait( f2 );
    return value1 + value2 + offset;
}

Void

The Void type is used as a signalling-only type for coordination of asynchronous processes. The following function waits on an input, send an output to a Promise, and signals completion:

ACTOR Future<Void> asyncCalculation(Future<int> f, Promise<int> p, int offset ) {
    int value = wait( f );
    p.send( value + offset );
    return Void();
}

PromiseStream<>, FutureStream<>

PromiseStream and FutureStream are groupings of a series of asynchronous messages.

These allow for two important features: multiplexing and network reliability, discussed later. They can be waited on with the waitNext() function.

waitNext()

Like wait(), waitNext() pauses program execution and awaits the next value in a FutureStream. If there is a value ready in the stream, execution continues without delay. The following “server” waits on input, send an output to a PromiseStream:

ACTOR void asyncCalculation(FutureStream<int> f, PromiseStream<int> p, int offset ) {
    while( true ) {
        int value = waitNext( f );
        p.send( value + offset );
    }
}

choose / when

The choose / when construct allows an Actor to wait for multiple Future events at once in a ordered and predictable way. Only the when associated with the first future to become ready will be executed. The following shows the general use of choose and when:

choose {
    when( int number = waitNext( futureStreamA ) ) {
        // clause A
    }
    when( std::string text = wait( futureB ) ) {
        // clause B
    }
}

You can put this construct in a loop if you need multiple when clauses to execute.

Future composition

Futures can be chained together with the result of one depending on the output of another.

ACTOR Future<int> asyncAddition(Future<int> f, int offset ) {
    int value = wait( f );
    return value + offset;
}

ACTOR Future<int> asyncDivision(Future<int> f, int divisor ) {
    int value = wait( f );
    return value / divisor;
}

ACTOR Future<int> asyncCalculation( Future<int> f ) {
    int value = wait( asyncDivision(
    asyncAddition( f, 10 ), 2 ) );
    return value;
}

Design Patterns

RPC

Many of the “servers” in FoundationDB that communicate over the network expose their interfaces as a struct of PromiseStreams--one for each request type. For instance, a logical server that keeps a count could look like this:

struct CountingServerInterface {
    PromiseStream<int> addCount;
    PromiseStream<int> subtractCount;
    PromiseStream<Promise<int>> getCount;

    // serialization code required for use on a network
    template <class Ar>
    void serialize( Ar& ar ) {
        serializer(ar, addCount, subtractCount, getCount);
    }
};

Clients can then pass messages to the server with calls such as this:

CountingServerInterface csi = ...; // comes from somewhere
csi.addCount.send(5);
csi.subtractCount.send(2);
Promise<int> finalCount;
csi.getCount.send(finalCount);
int value = wait( finalCount.getFuture() );

There is even a utility function to take the place of the last three lines: [TODO: And is necessary when sending requests over a real network to ensure delivery]

CountingServerInterface csi = ...; // comes from somewhere
csi.addCount.send(5);
csi.subtractCount.send(2);
int value = wait( csi.getCount.getReply<int>() );

Canonically, a single server ACTOR that implements the interface is a loop with a choose statement between all of the request types:

ACTOR void serveCountingServerInterface(CountingServerInterface csi) {
    state int count = 0;
    loop {
        choose {
            when (int x = waitNext(csi.addCount.getFuture())){
                count += x;
            }
            when (int x = waitNext(csi.subtractCount.getFuture())){
                count -= x;
            }
            when (Promise<int> r = waitNext(csi.getCount.getFuture())){
                r.send( count ); // goes to client
            }
        }
    }
}

In this example, the add and subtract interfaces modify the count itself, stored with a state variable. The get interface is a bit more complicated, taking a Promise<int> instead of just an int. In the interface class, you can see a PromiseStream<Promise<int>>. This is a common construct that is analogous to sending someone a self-addressed envelope. You send a promise to a someone else, who then unpacks it and send the answer back to you, because you are holding the corresponding future.

Flatbuffers/ObjectSerializer

  1. Motivation and Goals
  2. Correspondence to flatbuffers IDL
    • Tables
    // Flow type
    struct A {
        constexpr static FileIdentifier file_identifier = 12345;
        int a;
        template <class Ar>
        void serialize(Ar& ar) {
            serializer(ar, a);
        }
    }
    
    // IDL equivalent
    table A {
        a:int;
    }
    
    • Unions
    // Flow type
    using T = std::variant<A, B, C>;
    
    // IDL equivalent
    union T { A, B, C}
    
    • Strings (there's a string type in the idl that guarantees null termination, but flow does not, so it's comparable to a vector of bytes)
    // Flow type
    StringRef, std::string
    
    // IDL equivalent
    [ubyte]
    
    • Vectors
    // Flow type
    VectorRef<T>, std::vector<T>
    
    // IDL equivalent
    [T]
    

TODO finish documenting/implementing the following.

  1. Vtables collected from default-constructed instances
  2. Requirements (serialize must be cheap for a default-constructed instance, must have a serialize method or implement a trait.)
  3. Traits/Concepts: vector_like, union_like, dynamic_size, scalar
  4. isDeserializing idiom
  5. Gotchas (serialize gets called more than once on save path, maybe more)
  6. File identifiers
  7. Schema evolution
  8. Testing plan: have buggify sometimes default initialize fields that are introduced without changing the protocol version.
  9. (Future work) Allow ObjectSerializer to take the usual version specifications, IncludeVersion, AssumeVersion, or Unversioned.
  10. (Future work) Smaller messages for deprecated fields
  11. (Future work) Deprecated<...> template that knows whether or not the field was present? Automatically buggifies the field being absent?

ACTOR return values

An actor can have only one returned Future, so there is a case that one actor wants to perform some operation more than once:

ACTOR Future<Void> periodically(PromiseStream<Void> ps, int seconds) {
    loop {
        wait( delay( seconds ) );
        ps.send(Void());
    }
}

In this example, the PromiseStream is actually a way for the actor to return data from some operation that it ongoing.

“gotchas”

Actor compiler

There are some things about the actor compiler that can confuse and may change over time

Switch statements

Do not use these with wait statements inside!

try/catch with no wait()

When a try/catch block does not wait() the blocks are still decomposed into separate functions. This means that variables that you want to access both before and after such a block will need to be declared state.

ACTOR cancellation

When the reference to the returned Future of an actor is dropped, that actor will be cancelled. Cancellation of an actor means that any wait()s that were currently active (the callback was currently registered) will be delivered an exception (actor_cancelled). In almost every case this exception should not be caught, though there are cetainly exceptions!

Memory Management

Reference Counting

The FoundationDB solution uses reference counting to manage the lifetimes of many of its constituent classes. In order for a class T to be reference counted, the following two globally defined methods must be defined (see FastRef.h):

void addref(T*);
void delref(T*);

The easiest way to implement these methods is by making your class a descendant of ReferenceCounted.

NOTE: Any descendants of ReferenceCounted should either have virtual destructors or be sealed. If you fail to meet these criteria, then references to descendants of your class will never be deleted.

If you choose not to inherit from ReferenceCounted, you will have to manage the reference count yourself. One way this can be done is to define void addref() and void delref() methods on your class, which will make it compatible with the existing global addref and delref methods. Otherwise, you will need to create the global addref and delref methods for your class, as mentioned above. In either case, you will need to manage the reference count on your object and delete it when the count reaches 0. Note that the reference count should usually be initialized to 1, as the addRef(T*) function is not called when the object is created.

To create a reference counted instance of a class T, you instantiate a Reference<T> on the stack with a pointer to your T object:

Reference<T> refCountedInstance(new T());

The Reference<T> class automatically calls addref on your T instance every time it is copied (such as by argument passing or assignment), but not when the object is initially created (consequently, ReferenceCounted classes are initialized with reference count 1). It will call delref on your T instance whenever a particular Reference<T> instance gets deleted (usually by going out of scope). When no more instances of Reference<T> holding a particular T instance exist, then that T instance will be destroyed.

Potential Gotchas

Reference Cycles

You must be cautious about creating reference cycles when using reference counting. For example, if two Reference<T> objects refer to each other, then without specific intervention their reference counts will never reach 0 and the objects will never be deleted.

Arenas

In addition to using reference counting, the FoundationDB solution also uses memory pools to allocate buffers. In this scheme, buffers are allocated from a common pool, called an Arena, and remain valid for the entire lifetime of that Arena. When the Arena is destroyed, all of the memory it held for the buffers is deallocated along with it. As a general convention, types which can use these Arenas and do not manage their own memory are given the "Ref" suffix. When a *Ref object is being used, consideration should be given to how its buffers are being managed (much in the same way that you would consider memory management when you see a T*).

As an example, consider the StringRef class. A StringRef is an object which contains a pointer to a sequence of bytes, but does not actually manage that buffer. Thus, if a StringRef is deallocated, the data remains intact. Conversely, if the data is deallocated, the StringRef becomes invalid. In order for the StringRef to manage its own buffer, we need to create an instance of the Standalone<StringRef> class:

Standalone<StringRef> str("data");

A Standalone<T> object has its own arena (technically, it is an Arena), and for classes like StringRef which support the use of arenas, the memory buffers used by the class are allocated from that arena. Standalone<T> is also a subclass of T, and so for all other purposes operates just like a T.

There are a number of classes which support the use of arenas, and some which have convenience types for their Standalone versions (not a complete list):

T Standalone alias
StringRef N/A
LiteralStringRef N/A
KeyRef Key
ValueRef Value
KeyValueRef KeyValue
KeyRangeRef KeyRange
KeySelectorRef KeySelector
VectorRef N/A

The VectorRef<T> class is an std::vector-like object which is used to manage a list of these *Ref objects. A Standalone<VectorRef<T>> has its own arena which can be used to store the buffers held by its constituents. In order for that to happen, one of the two deep insertion methods (push_back_deep or append_deep) should be used when placing items in the vector. The shallow insertion methods will hold the objects only; any arena-managed memory is not copied. Thus, the Standalone<VectorRef<T>> will hold the T objects without managing their memory. Note that the arena(s) used by the VectorRef need not be its own (and cannot be unless the VectorRef is a Standalone object), and are determined by arguments to the functions that insert items.

VectorRef<T> can also be used with types besides the standard Ref types, in which case the deep copy methods should not be used. In this case, the VectorRef<T> object holds the items in an arena much like a normal vector would hold the items in its buffer. Again, the arena used by the VectorRef<T> need not be its own.

When a Standalone<T> is copied (e.g. by argument passing or assignment) to another Standalone<T>, they will share the same memory. The actual memory contents of the arena are stored in a reference counted structure (ArenaBlock), so the memory will persist until all instances of Arena holding that memory are destroyed. If instead a T object is copied to a Standalone<T>, then its entire contents are copied into the arena of the new Standalone<T> object using a deep copy. Thus, it is generally more efficient to consistently use *Ref objects and manage the memory with something external, or to consistently use Standalone<T> objects (where assignments just increment reference counters) to avoid memory copies.

Potential Gotchas

Function Creating and Returning a non-Standalone Ref Object

A function which creates a Ref object should generally return a Standalone version of that object. Otherwise, make certain that the Arena on which that Ref object was created still exists when the caller uses the returned Ref.

Assigning Returned Standalone Object to non Standalone Variable

A caller which receives a Standalone return value should assign that return value to a Standalone variable. Consider the following example:

Standalone<StringRef> foo() {
    return Standalone<StringRef>("string");
}

void bar() {
    StringRef val = foo();
}

When val is copy-assigned in bar, its data is stored in the Arena of the StringRef that was returned from foo. When this returned StringRef is subsequently deallocated, val will no longer be valid.

Use of Standalone Objects in ACTOR Functions

Special care needs to be taken when using using Standalone values in actor functions. Consider the following example:

ACTOR Future<void> foo(StringRef param)
{
    //Do something
    return Void();
}

ACTOR Future<Void> bar()
{
    Standalone<StringRef> str("string");
    wait(foo(str));
    return Void();
}

Although it appears at first glance that bar keeps the Arena for str alive during the call to foo, it will actually go out of scope in the class generated by the actor compiler. As a result, param in foo will become invalid. To prevent this, either declare param to be of type Standalone<StringRef> or make str a state variable.