foundationdb

Commit Graph

Author	SHA1	Message	Date
Evan Tschannen	0217aed74c	Merge branch 'release-6.0' # Conflicts: # bindings/go/README.md # documentation/sphinx/source/release-notes.rst # fdbserver/MasterProxyServer.actor.cpp # versions.target	2018-10-15 18:38:51 -07:00
Evan Tschannen	0acfae1e76	fixed the windows linker error	2018-10-15 18:19:51 -07:00
Evan Tschannen	a8feecbfad	added a comment to explain code ordering	2018-10-12 16:27:13 -07:00
Evan Tschannen	8ed4ce183c	Merge branch 'release-6.0' of github.com:apple/foundationdb into release-6.0	2018-10-12 14:56:19 -07:00
Evan Tschannen	17a1e3ce35	fix: the master proxy would log an OpCommit for empty commits to the txnStateStore	2018-10-12 12:58:17 -07:00
A.J. Beamon	419231d798	Fix: status was trying to read a metric under the wrong name, leading to an error that caused the cluster to report itself unhealthy and some metrics to be missing.	2018-10-10 13:33:28 -07:00
Evan Tschannen	4c95a5ee0f	added the basic structure for parallel restore	2018-10-09 18:47:28 -07:00
Evan Tschannen	ecddeab2ae	fixed review comments; demote killRegionCycle test for now	2018-10-08 10:39:39 -07:00
Evan Tschannen	1314bcec9e	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst	2018-10-05 12:54:00 -07:00
Evan Tschannen	47e31133aa	fix: only create a new version if the version has not been created before	2018-10-05 12:37:29 -07:00
Evan Tschannen	06be70bace	fix: if localEnd is smaller than begin, we cannot peek from the local dc	2018-10-05 12:36:34 -07:00
Evan Tschannen	daed31708b	fix: we can only repair dead DCs if we have a fearless configuration	2018-10-05 12:35:37 -07:00
Evan Tschannen	3922e477a5	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbclient/ManagementAPI.actor.cpp # fdbserver/ClusterController.actor.cpp # fdbserver/DataDistribution.actor.cpp # fdbserver/LogSystemDiskQueueAdapter.actor.cpp # fdbserver/SimulatedCluster.actor.cpp # fdbserver/TLogServer.actor.cpp	2018-10-03 16:57:18 -07:00
Evan Tschannen	9de55f362b	Merge pull request #793 from ajbeamon/add-new-storage-status-metrics Add new metrics for bytes queried, keys queried, mutation bytes, muta…	2018-10-03 16:34:26 -07:00
Evan Tschannen	598788f60b	Merge pull request #801 from etschannen/feature-fix-forced-recovery Fixed a number of problems with forced recoveries	2018-10-03 16:32:03 -07:00
Evan Tschannen	636420abee	fix: if the disk queue adapter peek hangs for a while, switch to a peek from a different locality	2018-10-03 13:58:55 -07:00
Evan Tschannen	28545e0f8d	multi cursors start a get more for the first 10 cursors to hide latency	2018-10-03 13:57:45 -07:00
Evan Tschannen	aa51d69b2d	fix: set peekLocality for upgraded tags	2018-10-03 13:54:59 -07:00
Evan Tschannen	c9f4109539	fix: add some additional time in the kill region workload to detect if we recovered successfully	2018-10-02 17:47:15 -07:00
Evan Tschannen	cdaf5e1192	fix: forced recovery does not recover tags from any DC besides the surviving one	2018-10-02 17:46:22 -07:00
Evan Tschannen	69711a107b	fix: because of forced recovery, 0 log router tags does not mean we are a special tlog set	2018-10-02 17:45:11 -07:00
Evan Tschannen	e7e1c634e0	fix: we need to restart the peek cursor when the known committed version becomes available	2018-10-02 17:44:14 -07:00
Evan Tschannen	a92fc911ac	do not spin on a failed storage server recruitment	2018-10-02 17:31:07 -07:00
Evan Tschannen	15ce215c1b	fix: parallel peek requests leaked memory	2018-10-02 17:28:39 -07:00
A.J. Beamon	84c2e3567f	Fix keys queried to use the RowsQueried metric instead of BytesQueried.	2018-10-01 11:19:28 -07:00
A.J. Beamon	a98fcf5972	Rename durable_lag to durability_lag	2018-10-01 09:58:49 -07:00
Evan Tschannen	bd6b743a81	fix: the storage server must always keep MAX_READ_TRANSACTION_LIFE_VERSIONS of history in memory, because forced recovery could roll back an epoch end. fix: rollbacks were triggered unnecessarily	2018-09-28 16:04:59 -07:00
Evan Tschannen	3fdf72c626	fix: we need to force recovery if the master is still attempting to read the txs tag	2018-09-28 13:33:33 -07:00
Evan Tschannen	59335aa757	fix: the latest generation of remote transaction logs might has less data the a previous generation, because they take over at known committed version. Detect this case and end at the version that has the most data	2018-09-28 12:25:27 -07:00
Evan Tschannen	c577840020	fix: forced recovery should remove all references to the old primary tlogs in all generations of logs to help the peek logic avoid attempting to read from them	2018-09-28 12:23:09 -07:00
Evan Tschannen	05e7f08b26	added a peek method which will attempt to read the txsTag from the local region as much as possible	2018-09-28 12:21:08 -07:00
Evan Tschannen	a24eadd73a	fix: for remote logs, their known committed version cannot be set to 1, because they can be used when their durable version is 0, leading to a known committed version being greater than a queue committed version	2018-09-28 12:17:21 -07:00
Evan Tschannen	e64c55dce0	fix: data distribution would use the wrong priority sometimes when fixing an incomplete movement, this lead to the cluster thinking the data was replicated in all regions before it actually was	2018-09-28 12:15:23 -07:00
Evan Tschannen	b1fe069165	fix: during forced recovery logs can be removed from the logSystemConfig. We need to avoid killing the removed logs as unneeded until we actually complete the recovery	2018-09-28 12:13:46 -07:00
Evan Tschannen	22e6afbb18	fix: the cluster controller did not pass in its own locality when creating its database object, therefore it was not using locality aware load balancing	2018-09-28 12:12:06 -07:00
Evan Tschannen	b560b94ebc	fix: do not force a recovery if the master was already in the other region (and therefore already recovered) fix: reboot the remaining DC, because any storage server rejoins that were rolled back will cause that server to be unusable	2018-09-28 12:10:04 -07:00
A.J. Beamon	f196e2d4dc	Lot metrics about read requests as well as completed reads.	2018-09-27 15:32:39 -07:00
A.J. Beamon	118e21c446	Add new metrics for bytes queried, keys queried, mutation bytes, mutations, and durable lag to the storage role in status.	2018-09-27 14:33:21 -07:00
Steve Atherton	6756188f53	Merge pull request #760 from ajbeamon/fix-actor-warnings Fix warnings about ACTORs not having waits. Fix shadowing of future v…	2018-09-24 10:07:59 -07:00
A.J. Beamon	48e620c680	Change the first of two trace events named "BTreeIntegrityCheck" to have the name "BTreeIntegrityCheckResults"	2018-09-24 08:40:18 -07:00
A.J. Beamon	92990d6aef	Merge release-6.0 into master	2018-09-21 16:14:39 -07:00
Evan Tschannen	77e2fb787e	Merge branch 'release-6.0' into feature-fix-forced-recovery	2018-09-21 14:55:37 -07:00
Evan Tschannen	3f86905ea7	fix: restore did not take into account that the end version of a log file does not exist in that file. This resulted in restores done at the same version a snapshot completes to not apply the mutations at that final version.	2018-09-21 11:48:28 -07:00
Evan Tschannen	6b6d7a087d	The cluster controller should never consider itself as failed (that will be handled by the coordinators) Simplified the check that the cluster controller is excluded	2018-09-20 17:01:11 -07:00
Evan Tschannen	31d0b0315f	fix: tlog spill policy would spill everything when it wanted to spill nothing use a flow lock to protect updatePersistData and initPersistentState from committing simultaneously	2018-09-20 15:33:38 -07:00
Evan Tschannen	03728db99b	do not trigger better master exists if the cluster controller is excluded, since the master will change anyways once the cluster controller is moved	2018-09-19 18:28:24 -07:00
Evan Tschannen	861c8aa675	consider server health when building subsets of emergency teams	2018-09-19 17:57:01 -07:00
Evan Tschannen	702d018882	fix: we cannot use count on an async map, because someone waiting onChange for an item will cause it to exist in the map before it is set	2018-09-19 16:11:57 -07:00
Evan Tschannen	6d18193b3a	fix: team->setHealthy was not being called correctly on initially unhealthy teams	2018-09-19 14:48:07 -07:00
Evan Tschannen	270b1b24a6	fix: we have to use durableKnownCommittedVersion, because the is the true lower bound on the recovery version of the remote logs fixed a compiler error	2018-09-18 16:29:03 -07:00

946 Commits