updated anti-features, known limitations, and fault tolerance documentation

Evan Tschannen 2018-03-15 15:00:07 -07:00
parent 9f89454638
commit 0cc3ffa335
3 changed files with 33 additions and 34 deletions

@ -37,17 +37,10 @@ The rise of mobile computing has led to the model of *disconnected operation* in
While a central server running FoundationDB could be used as a database for a mobile application to connect and sync to from time to time, FoundationDB's core does not itself directly provide disconnected operation. Because it would sacrifice ACID properties, we believe that in those applications where disconnected operation is needed, the database is the wrong tier to implement it.
Long-running transactions
=========================
Long-running read/write transactions
====================================
FoundationDB aims to provide low latencies across a range of metrics. Transaction latencies, in particular, are typically under 15 milliseconds. Some applications need very large operations that take several seconds or more, several orders of magnitude longer than our usual transaction latency. Large operations of this kind are best approached in FoundationDB by decomposition into a set of smaller transactions.
FoundationDB does not support *long-running transactions*, currently defined as those
:ref:`lasting over five seconds <long-transactions>`. The system employs multiversion concurrency control and maintains older versions of the database for a five second period. A transaction that is kept open longer will not be able to commit. If you have a requirement to support large operations, we would be happy to assist you to implement a decomposition strategy within a layer.
Content delivery networks (CDN)
===============================
A *content delivery network* (CDN) employs geographically dispersed datacenters to serve data with high performance to similarly dispersed end-users. While FoundationDB does support multiple datacenters, it has not been designed as a CDN. The FoundationDB core does not locate data in a geographically aware manner and does not aim to provide low write latencies (e.g., under 5 milliseconds) over large geographic distances.
In FoundationDB's configuration for multiple datacenters, each datacenter contains a complete, up-to-date copy of the database. Each client will have a primary datacenter, with other datacenters acting in a secondary mode to support minimal downtime if a datacenter becomes unavailable.
FoundationDB does not support *long-running read/write transactions*, currently defined as those
:ref:`lasting over five seconds <long-transactions>`. The system employs multiversion concurrency control and maintains conflict information for a five second period. A transaction that is kept open longer will not be able to commit.

@ -23,10 +23,12 @@ Any distributed system faces some basic probabilistic constraints. For example,
FoundationDB improves these probabilities by selecting "teams" of machines on which to distribute data. Instead of putting each chunk of data on a different set of machines, each machine can participate in multiple teams. In the above example, by selecting only 450 teams of 4 machines that each chunk of data can be on, the chance of data unavailability is reduced to about 0.5%.
The number of machines in each team is based on the replication mode; the total number of teams increases with the size of the cluster.
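The 0.5% figure can be checked with a short calculation. The sketch below assumes the example cluster has 40 machines (the cluster size is not stated in this excerpt, so treat it as an assumption), that exactly 4 machines fail at once, and that all 450 teams are distinct. Under those assumptions, data becomes unavailable only if the failed set happens to be one of the teams.

```python
from math import comb


def unavailability_probability(machines: int, team_size: int, teams: int) -> float:
    """Chance that a uniformly random set of `team_size` failed machines
    is exactly one of the `teams` distinct teams holding data."""
    possible_sets = comb(machines, team_size)  # all ways to pick the failed machines
    return min(1.0, teams / possible_sets)


# Assumed example: 40 machines, teams of 4, 450 teams.
p = unavailability_probability(machines=40, team_size=4, teams=450)
print(f"{p:.2%}")
```

With these numbers there are 91,390 possible sets of 4 machines, so 450 teams cover roughly half a percent of them.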
Independence assumptions
========================
As a further refinement, FoundationDB can be made aware that certain machines might tend to fail together. For example, every machine in a rack might share a network and power connection. If either failed, then the entire rack of machines would fail. We use this knowledge when choosing teams, taking care not to place any two machines in a team that would have a tendency to fail together. Pieces of data can then be intelligently distributed across racks or even datacenters, so that characteristic multimachine failures (for example, based on rack configuration) do not cause service interruption or data loss. Using this method, FoundationDB can continuously operate through a failure of an entire datacenter.
As a further refinement, FoundationDB can be made aware that certain machines might tend to fail together by specifying the locality of each process. For example, every machine in a rack might share a network and power connection. If either failed, then the entire rack of machines would fail. We use this knowledge when choosing teams, taking care not to place any two machines in a team that would have a tendency to fail together. Pieces of data can then be intelligently distributed across racks or even datacenters, so that characteristic multimachine failures (for example, based on rack configuration) do not cause service interruption or data loss. Our ``three_data_hall`` and ``multi_dc`` configurations use this technique to continuously operate through a failure of a data hall or datacenter respectively.
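The idea of locality-aware team selection can be illustrated with a small sketch. This is not FoundationDB's actual team-building algorithm; the function name ``valid_teams`` and the machine and rack names are hypothetical, and the sketch simply enumerates every team in which no two machines share a rack.

```python
from __future__ import annotations

from itertools import combinations


def valid_teams(machine_rack: dict[str, str], team_size: int) -> list[tuple[str, ...]]:
    """Return every team of `team_size` machines in which no two machines share a rack."""
    teams = []
    for team in combinations(sorted(machine_rack), team_size):
        racks = {machine_rack[m] for m in team}
        if len(racks) == team_size:  # every member sits in a different rack
            teams.append(team)
    return teams


# Hypothetical cluster: three racks with two machines each.
machine_rack = {
    "m1": "rackA", "m2": "rackA",
    "m3": "rackB", "m4": "rackB",
    "m5": "rackC", "m6": "rackC",
}
teams = valid_teams(machine_rack, team_size=3)
print(len(teams), "teams; losing one rack removes at most one member of each")
```

Because each team draws at most one machine from any rack, the loss of a whole rack leaves every team with its remaining replicas intact.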
Other types of failure
======================

@ -20,28 +20,6 @@ Design limitations
These limitations come from fundamental design decisions and are unlikely to change in the short term. Applications using FoundationDB should plan to work around these limitations. See :doc:`anti-features` for related discussion of our design approach to the FoundationDB core.
.. _long-transactions:
Long transactions
-----------------
FoundationDB currently does not support transactions running for over five seconds. In particular, after 5 seconds from the first read in a transaction:
* subsequent reads that go to the database will usually raise a ``past_version`` :doc:`error <api-error-codes>` (although reads cached by the client will not);
* a commit with any write will raise a ``past_version`` or ``not_committed`` :doc:`error <api-error-codes>`.
Clients need to avoid these cases. For the design reasons behind this limitation, see the discussion in :doc:`anti-features`.
.. admonition:: Workarounds
The effect of long and large transactions can be achieved using short and small transactions with a variety of techniques, depending on the desired behavior:
* If an application wants long transactions because of an external process in the loop, it can perform optimistic validation itself at a higher layer.
* If it needs long-running read snapshots, it can perform versioning in a layer.
* If it needs large bulk inserts, it can use a level of indirection to swap in the inserted data quickly.
As with all data modeling problems, please ask for help on the community site (or via e-mail) with your specific needs.
.. _large-transactions:
Large transactions
@ -90,12 +68,36 @@ The current version of FoundationDB resolves key selectors with large offsets in
The RankedSet layer provides a data structure in which large offsets and counting operations require only O(log N) time. It is a good choice for applications such as large leaderboards that require such functionality.
Not a security boundary
-----------------------
Anyone who can connect to a FoundationDB cluster can read and write every key in the database. There is no user-level access control. External protections must be put into place to protect your database.
Current limitations
===================
These limitations do not reflect fundamental aspects of our design and are likely to be resolved or mitigated in future versions. Administrators should be aware of these issues, but longer-term application development should be less driven by them.
.. _long-transactions:
Long running transactions
-------------------------
FoundationDB currently does not support transactions running for over five seconds. In particular, after 5 seconds from the first read in a transaction:
* subsequent reads that go to the database will usually raise a ``transaction_too_old`` :doc:`error <api-error-codes>` (although reads cached by the client will not);
* a commit with any write will raise a ``transaction_too_old`` or ``not_committed`` :doc:`error <api-error-codes>`.
Long-running read/write transactions are a design limitation; see the discussion in :doc:`anti-features`.
.. admonition:: Workarounds
The effect of long and large transactions can be achieved using short and small transactions with a variety of techniques, depending on the desired behavior:
* If an application wants long transactions because of an external process in the loop, it can perform optimistic validation itself at a higher layer.
* If it needs long-running read snapshots, it can perform versioning in a layer.
* If it needs large bulk inserts, it can use a level of indirection to swap in the inserted data quickly.
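The bulk-insert workaround can be sketched as follows. The example uses a toy in-memory store rather than the FoundationDB API, and every name in it (``ToyStore``, ``bulk_load``, the ``data`` and ``current_generation`` keys) is hypothetical. The data is written in many short transactions under a generation that readers do not yet consult, and a final small transaction swaps a pointer to make it visible.

```python
from itertools import islice


class ToyStore:
    """In-memory stand-in for a transactional key-value store (not the FoundationDB API)."""

    def __init__(self):
        self.data = {}
        self.commits = 0

    def commit(self, writes: dict) -> None:
        # Apply one small batch of writes as a single short transaction.
        self.data.update(writes)
        self.commits += 1


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def bulk_load(store, rows: dict, generation: int, batch_size: int = 100) -> None:
    # Step 1: many short transactions write under a generation readers ignore for now.
    for chunk in batches(rows.items(), batch_size):
        store.commit({("data", generation, k): v for k, v in chunk})
    # Step 2: one tiny transaction flips the pointer, exposing the new data at once.
    store.commit({("current_generation",): generation})


def read(store, key):
    # Readers follow the pointer, so they never see a half-loaded generation.
    generation = store.data.get(("current_generation",))
    return store.data.get(("data", generation, key))


store = ToyStore()
bulk_load(store, {i: i * i for i in range(250)}, generation=1)
print(store.commits, read(store, 7))
```

Each batch stays well inside the transaction time limit, and only the final pointer swap needs to be atomic with respect to readers.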
Cluster size
------------
@ -116,3 +118,5 @@ FoundationDB load balances reads across the servers with replicas of the data be
If data is accessed exceptionally frequently, an application could avoid this limitation by storing such data in multiple subspaces, effectively increasing its replication factor.
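A minimal sketch of this approach, using a plain dictionary in place of the database: the helper names and the ``copyN`` subspace prefixes are hypothetical. A writer stores the same value under several subspaces, and each reader picks one copy at random so that reads spread across the servers holding the different copies.

```python
import random


def write_hot(store: dict, key: str, value, copies: int) -> None:
    """Store `value` under `copies` different subspaces.

    In a real application all copies should be written in one transaction
    so that readers never observe copies that disagree.
    """
    for i in range(copies):
        store[(f"copy{i}", key)] = value


def read_hot(store: dict, key: str, copies: int, rng=random):
    """Read one randomly chosen copy, spreading load across the subspaces."""
    i = rng.randrange(copies)
    return store[(f"copy{i}", key)]


store = {}
write_hot(store, "config", "v1", copies=4)
print(read_hot(store, "config", copies=4))
```

The trade-off is write amplification: every update touches all copies, so this suits data that is read far more often than it is written.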