Commit Graph

49 Commits

Author SHA1 Message Date
Alex Miller 7a500cd37f A giant translation of TaskFooPriority -> TaskPriority::Foo
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
A.J. Beamon 603721e125 Merge branch 'master' into thread-safe-random-number-generation
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/genericactors.actor.cpp
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DiskQueue.actor.cpp
#	fdbserver/workloads/BulkSetup.actor.h
#	flow/ActorCollection.actor.cpp
#	flow/Net2.actor.cpp
#	flow/Trace.cpp
#	flow/flow.cpp
2019-05-23 08:35:47 -07:00
mpilman 44db3450ec Several flatbuffers bug fixes 2019-05-13 14:15:23 -07:00
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Andrew Noyes 6888baf1c3 Fix compile error 2019-04-16 15:44:55 -07:00
Andrew Noyes 08e66d1764 Prefer success to cast-to-void 2019-04-16 15:29:03 -07:00
Andrew Noyes 781b6ece77 Fix OPEN_FOR_IDE -Wunused-variable warnings
CC #1255, #1173
2019-04-16 15:28:01 -07:00
Evan Tschannen 05869a8383 do not log a degraded reset message if the previous reset was more than a week ago 2019-04-07 23:00:58 -07:00
Evan Tschannen 390ab9cfed A process will mark itself as degraded if it continually disconnects from a different process which the failure monitor thinks is healthy 2019-04-04 14:11:12 -07:00
Evan Tschannen 5392742902 fixed review comments 2019-03-12 14:38:54 -07:00
Evan Tschannen 2ff37f49da fix: compiler error 2019-03-10 22:56:12 -07:00
Evan Tschannen c6e94293bf reset a process to not be degraded after 2 days 2019-03-10 22:39:21 -07:00
Alex Miller 4891d31cc0 Remove a broken ASSERT.
It's now totally valid to have:
permits=3
take(1)
take(2)
take(72)

and the take(72) will only be granted once the first two finish.
2019-03-07 18:31:18 -08:00
Alex Miller c6a65389ae Remove noexcept macro and replace with BOOST_NOEXCEPT.
BOOST_NOEXCEPT does what the noexcept macro was supposed to do, but in a
way that is correctly maintained over time.
2019-03-05 22:06:12 -08:00
Alex Miller 75b2546e35 Allow FlowLock::take to succeed if there's no other active requests. 2019-03-04 01:42:38 -08:00
Alex Miller 98bb58628f Upgrade FlowLock from int to int64_t. 2019-03-04 01:42:38 -08:00
Jingyu Zhou 5e6577cc82 Final cleanup per review comments
Make distributor interface optional in ServerDBInfo and many other small
changes.
2019-02-14 16:37:17 -08:00
Jingyu Zhou 578473a974 Various review comments fixes 2019-02-14 16:37:16 -08:00
Alex Miller 0750dc0418 Change store from (Future, T&) to (T&, Future).
LHS = RHS, and the name of what's being modified is easier to find.
2019-02-04 18:04:22 -08:00
Evan Tschannen 1f3b6e8bdf Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/BackupContainer.actor.cpp
#	fdbclient/BlobStore.actor.cpp
#	versions.target
2018-11-27 14:41:46 -08:00
Stephen Atherton ec9410492d Changed backup folder scheme to use a much smaller number of unique folders for kv range files, which will speed up list and expire operations. The old scheme is still readable but will no longer be written except in simulation in order to test backward compatibility. 2018-11-23 05:23:56 -08:00
Robert Escriva 268093a96d Adjust all includes to be relative to the root.
Remove the use of relative paths.  A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h".  Adjust so that every include references such a header with the
latter form.

Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
Alex Miller 7feb5d8209 Remove including flow.h in actorcompiler.h, and fix resulting breakage.
For files that required flow.h, and only got it through actorcompiler.h,
their version of flow.h would have the actorcompiler #defines defined.
Then, if it included a STL/boost file, the same breakage would result.

This needs to not happen, so the include of flow.h in actorcompiler.h
was removed.
2018-08-14 15:50:26 -07:00
Alex Miller fb31a6999f Rewrite all files to have #include actorcompiler.h as the last include. 2018-08-14 15:50:26 -07:00
Alex Miller 07e5281142 Restrict actor keyword #defines to actor files.
This introduces a new rule in our codebase, that any file that #includes
actorcompiler.h needs to do it as the last #include, and it needs to
then #include unactorcompiler.h at the end of the file.

The point of this is that it prevents our actorcompiler.h #defines from
leaking into boost or the c++ standard library.  Both of these start
throwing errors if you s/state// their code, which `#define state `
effectively does.
2018-08-14 15:50:26 -07:00
Alex Miller 535b5701e5 Rewrite all `Void _ = wait(...)` -> `wait(...)`.
This takes advantage of the new actorcompiler functionality to avoid
having duplicate definitions of `Void _` when trying to feed the
un-actorompiled source through clang.
2018-08-14 15:50:26 -07:00
A.J. Beamon 3535ddad80
Merge pull request #674 from alexmiller-apple/glibcxx-debug-fixes
Fix bugs uncovered by -D_GLIBCXX_DEBUG
2018-08-09 08:18:51 -07:00
Alex Miller 1a7cda4149 Stop performing self-moves. (e.g. a = std::move(a))
self-moves are frowned upon in C++, and in our code this generally happens from
calls to swap as part of trying to implement a "unordered erase" function via
swap-to-the-end-and-pop_back.  For convenience, a swapAndPop() function is now
offered that performs this, while disallowing self-moves.
2018-08-01 18:09:54 -07:00
Evan Tschannen 1c29275672 call all methods which could disable a trace event before it is initialized. In practice this means calling .error first, then .suppressFor, then all your details. 2018-08-01 14:30:57 -07:00
Evan Tschannen e0caa28758 code cleanup 2018-07-16 15:56:43 -07:00
Evan Tschannen f72a9f60c0 only disable fearless if a datacenter has actually been killed
fix: we must prevent recovery into the dead datacenter while reducing usable_regions
2018-07-16 10:06:57 -07:00
Evan Tschannen 8bd7eaebdb fix: broken_promise from push can be throw into the proxy’s actor collection 2018-06-21 15:55:27 -07:00
Evan Tschannen 889889323e The master will tell the cluster controller if it is going to take a long time to recruit new logs in its DC; the cluster controller can determine if the other DC would be better and recruit there.
The cluster controller will not switch to the other data center if remote logs are too far behind.
We will not recruit in DCs with negative priority.
2018-06-13 18:14:14 -07:00
Alex Miller 8518f6a8a8 smartQuorum shouldn't return if more responses are desired than futures provided. 2018-06-12 16:50:25 -07:00
Evan Tschannen e4d5817679 fix: we must server getTeam requests before readyToStart is set because we cannot complete relocateShard requests without getTeam responses from both team collections 2018-06-07 16:14:40 -07:00
Evan Tschannen 7af892f50b first working version of non-copying recovery working with fearless configurations 2018-04-08 21:24:05 -07:00
Evan Tschannen b36e08f08f first version of non-copying recovery. Upgrades are broken, and it has not been tested using fearless configurations yet 2018-03-29 15:12:38 -07:00
Evan Tschannen 37a6a81634 Merge commit '7f6fc3e039c911cd84b8540f7f799fc38a1c1822' into feature-remote-logs
# Conflicts:
#	fdbserver/workloads/RestartRecovery.actor.cpp
2018-02-23 12:33:28 -08:00
Alec Grieser 0bae9880f1 remove trailing whitespace from our copyright headers ; fixed formatting of python setup.py 2018-02-21 10:25:11 -08:00
Evan Tschannen 42405c78a5 Merge commit '4038bd2fd968d88861f2cebd442ce511724816cb' into feature-remote-logs
# Conflicts:
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/Knobs.cpp
2018-02-10 12:08:52 -08:00
Evan Tschannen ebd94bb654 removed a separately configurable storage team size for the remote data center, because it did not make sense
fix: the master did not monitor for the failure of remote logs
stop merge attempts when a data center is failed
fixed a variety of other problems with data distribution when a data center is failed
2018-02-02 11:46:04 -08:00
A.J. Beamon 080a454051 fix: getVersionstamp would return broken promise if a transaction was disposed before being set. getAddressesForKey would not return when resetPromise was set. 2018-01-31 13:47:36 -08:00
Evan Tschannen 982f0dcb1e Merge pull request #222 from cie/alexmiller/drtimefix2
Fix yet another VersionStamp DR issue.
2017-12-20 15:09:23 -08:00
Alex Miller b5a6bc0ab7 Fix VersionStamp problems by instead adding a COMMIT_ON_FIRST_PROXY transaction option.
Simulation identified the fact that we can violate the
VersionStamps-are-always-increasing promise via the following series of events:

1. On proxy 0, dumpData adds commit requests to proxy 0's commit promise stream
2. To any proxy, a client submits the first transaction of abortBackup, which stops further dumpData calls on proxy 0.
3. To any proxy that is not proxy 0, submit a transaction that checks if it needs to upgrade the destination version.
4. The transaction from (3) is committed
5. Transactions from (1) are committed

This is possible because the dumpData transactions have no read conflict
ranges, and thus it's impossible to make them abort due to "conflicting"
transactions.  There's also no promise that if client C sends a commit to proxy
A, and later a client D sends a commit to proxy B, that B must log its commit
after A.  (We only promise that if C is told it was committed before D is told
it was committed, then A committed before B.)

There was a failed attempt to fix this problem.  We tried to add read conflict
ranges to dumpData transactions so that they could be aborted by "conflicting"
transactions.  However, this failed because this now means that dumpData
transactions require conflict resolution, and the stale read version that they
use can cause them to be aborted with a transaction_too_old error.
(Transactions that don't have read conflict ranges will never return
transaction_too_old, because with no reads, the read snapshot version is
effectively meaningless.)  This was never previously possible, so the existing
code doesn't retry commits, and to make things more complicated, the dumpData
commits must be applied in order.  This would require either adding
dependencies to transactions (if A is going to commit then B must also be/have
committed), which would be complicated, or submitting transactions with a fixed
read version, and replaying the failed commits with a higher read version once
we get a transaction_too_old error, which would unacceptably slow down the
maximum throughput of dumpData.

Thus, we've instead elected to add a special transaction option that bypasses
proxy load balancing for commits, and always commits against proxy 0.  We can
know for certain that after the transaction from (2) is committed, all of the
dumpData transactions that will be committed have been added to the commit
promise stream on proxy 0.  Thus, if we enqueue another transaction against
proxy 0, we can know that it will be placed into the promise stream after all
of the dumpData transactions, thus providing the semantics that we require:  no
dumpData transaction can commit after the destination version upgrade
transaction.
2017-12-20 15:04:04 -08:00
Stephen Atherton 7caa012fbf Added snapshot interval option to "fdbbackup start" which defaults to a new knob's value. Added snapshot info to backup status text. Improvements to fdbbackup help. 2017-12-20 00:49:08 -08:00
Stephen Atherton f8e89a40ac Bug fixes, take(1) is incorrect usage of FlowLock. 2017-12-04 10:25:47 -08:00
Evan Tschannen b1e3864c0e fix: stop after does not print errors if actor cancelled 2017-11-01 10:58:39 -07:00
Evan Tschannen 19fa31ffff fix: uncancellable was not forwarding errors to the result promise 2017-07-27 13:29:08 -07:00
FDB Dev Team a674cb4ef4 Initial repository commit 2017-05-25 13:48:44 -07:00