foundationdb

Commit Graph

Author	SHA1	Message	Date
Jingyu Zhou	7a205b1732	Move remoteRecovered to dataDistributionTeamCollection() Let the remote DC wait until fully recovered before team collection starts.	2019-02-14 16:37:16 -08:00
Jingyu Zhou	3f7bbc68aa	Remove getDistributorInterface from cluster controller	2019-02-14 16:37:16 -08:00
Jingyu Zhou	ef868f599c	Add DataDistributorInterface to ServerDBInfo Also change the Proxy and QuietDatabase to use the DataDistributorInterface.	2019-02-14 16:37:16 -08:00
Jingyu Zhou	0490160714	Fix according to Evan's comments Use getRateInfo's endpoint as the ID for the DataDistributorInterface. For now, added a "rejoined" flag for ClusterControllerData and Proxy. TODO: move DataDistributorInterface into ServerDBInfo.	2019-02-14 16:30:13 -08:00
Jingyu Zhou	c35d1bf2ef	Fix according to Alex's comment	2019-02-14 16:30:13 -08:00
Evan Tschannen	1818aab205	Apply suggestions from code review Co-Authored-By: jzhou77 <jingyuzhou@gmail.com>	2019-02-14 16:30:13 -08:00
Jingyu Zhou	886e7ab2ba	Add a new DataDistributor role. Let the cluster controller start a new data distributor role by sending a message to a chosen worker. Change MasterInterface usage in DataDistribution to masterId Add DataDistributor rejoin handling. This allows the data distributor to tell the new cluster controller of its existence so that the controller doesn't spawn a new one. I.e., there should be only ONE data distributor in the cluster. If DataDistributor (DD) doesn't join in a while, then ClusterController (CC) tries to recruit one as DD. CC also monitors DD and restarts one if it failed. The Proxy is also monitoring the DD. If DD failed, the Proxy will ask CC for the new DD. Add GetRecoveryInfo RPC to master server, which is called by data distributor to obtain the recovery Transaction version from the master server.	2019-02-14 16:30:13 -08:00
Meng Xu	8ee8b98122	TeamCollection: Cosmetic change	2019-02-14 15:59:20 -08:00
Vishesh Yadav	907446d0ce	Merge remote-tracking branch 'apple/master' into task/tls-upgrade	2019-02-14 11:37:38 -08:00
A.J. Beamon	8a17905621	Add a couple new files to CMakeLists	2019-02-14 08:08:44 -08:00
A.J. Beamon	b435d51061	Merge branch 'master' into track-server-request-latencies	2019-02-14 08:07:32 -08:00
Meng Xu	628f7ac8c0	TeamCollection: Remove an unused knob	2019-02-13 16:22:55 -08:00
Meng Xu	5481851e82	TeamCollection: Add knobs for team remover Added three knobs to control team remover: bool TR_FLAG_DISABLE_TEAM_REMOVER: Disable the teamRemover actor; double TR_REMOVE_MACHINE_TEAM_DELAY: Wait for the specified time before trying to remove the next machine team; double TR_WAIT_FOR_ALL_MACHINES_HEALTHY_DELAY: Wait before checking if all machines are healthy	2019-02-13 15:11:56 -08:00
Alex Miller	12123f41d6	Plumb a read function up the stack to IDiskQueue	2019-02-12 23:44:13 -08:00
Alex Miller	6c7229ec07	read fix while recovery	2019-02-12 23:44:13 -08:00
Alex Miller	8b21d1ac8f	Add a standalone recovery initialization function.	2019-02-12 23:44:13 -08:00
Alex Miller	2f49acc8a0	Add a read function.	2019-02-12 23:44:13 -08:00
Alex Miller	63eb62cd36	Fix a bug when a read was delayed until after the entire disk queue has been rewritten.	2019-02-12 23:44:13 -08:00
Alex Miller	9886386a83	temporarily verify committed data as a test for read	2019-02-12 23:44:13 -08:00
Alex Miller	efa8aa7e2e	Adjust findPhysicalLocation to not spam. Context is now optional, so that our high-volume calls don't get logged, but low-volume calls still get logged the same way that they did before.	2019-02-12 23:44:13 -08:00
Alex Miller	f1c31e2305	Add a read function to disk queue	2019-02-12 23:44:13 -08:00
Alex Miller	2d2b03a9ff	prepare DiskQueue for actors	2019-02-12 23:44:13 -08:00
Alex Miller	40fe29c29b	Abstract TrackMe into a reusable CRTP class.	2019-02-12 23:44:13 -08:00
Alex Miller	018d12fe90	use firstpages instead of recoveryfirstpages	2019-02-12 23:43:10 -08:00
Alex Miller	dbf7cefcd8	Add firstPages to DiskQueue	2019-02-12 23:43:10 -08:00
Alex Miller	2570b37e6e	Add function to read pages from RawDiskQueue_TwoFiles	2019-02-12 23:43:10 -08:00
Meng Xu	01e55e43bd	TeamCollection: Minor improve code efficiency and style Rewording the feature item in the release document as well.	2019-02-12 19:10:53 -08:00
Andrew Noyes	65136a2ecd	Forward declare actors with ACTOR keyword. #1148 There are several more occurrences of this, but they're in .h files that now need to be .actor.h files. This gets the easy ones out of the way.	2019-02-12 17:56:20 -08:00
Andrew Noyes	067a445e06	Replace unused _ variables with wait(success(...))	2019-02-12 17:30:30 -08:00
Meng Xu	c8db205fd9	TeamCollection: Fix bug in removing a server When we remove a server due to server failure, we need to remove the related server teams AND remove the server team from the machine team. In the previous commit, we forgot to remove the server team from the machine team.	2019-02-12 16:18:19 -08:00
Meng Xu	fe4f43203d	TeamCollection: getTeam may add a new team getTeam function may add a new team for the GetTeamRequest. We need to check if the number of teams is larger than the desired team number.	2019-02-12 14:57:35 -08:00
Meng Xu	3ae8767ee8	TeamCollection: Apply clang-format	2019-02-12 13:41:18 -08:00
Meng Xu	214a72fba3	TeamCollection: Resolve review comments 1) Reduce the frequency of checking if we need to call teamRemover 2) Improve code efficiency in finding the machine team to remove 3) Remove unused code 4) Add sanity check	2019-02-12 10:59:57 -08:00
mpilman	6da5971e79	Guard all versions.h to not break old WIN32 build	2019-02-08 16:06:00 -08:00
Meng Xu	3b8ae0fe95	TeamCollection: Add into 6.1 release note	2019-02-08 13:50:27 -08:00
Balachandar Namasivayam	f44f26c232	Dynamically rate limit consistency check.	2019-02-07 16:08:39 -08:00
mpilman	7e26b4ef0d	Address comments from PR	2019-02-07 15:37:04 -08:00
mpilman	5737349676	Fix weird bug with boost interprocess Strangely, boost interprocess didn't compile with VS 2017. However, it does compile if it is included as the first thing. I don't quite know what is happening here, but for now this fix makes it that I am not blocked	2019-02-07 15:37:04 -08:00
mpilman	8a94d80deb	fdbservice and fdbrpc now compiling	2019-02-07 15:37:04 -08:00
A.J. Beamon	eb7c678e59	Return Void() in an actor return statement	2019-02-07 14:03:36 -08:00
Meng Xu	7cfe6de27e	TeamCollection: Server team number must match machine team number DESIRED_TEAMS_PER_MACHINE must equal DESIRED_TEAMS_PER_SERVER. Otherwise, we may have too few machine teams to create enough server teams. Note that the BUGGIFY macro value is based on a random number generator. When you have two BUGGIFY uses, one may be true while the other is false. Also fix a bug in getting the number of healthy machine teams.	2019-02-07 13:53:55 -08:00
A.J. Beamon	d4349293b9	Reworked the way latency counters are tracked. Report the latency bands in separate events from StorageMetrics and ProxyMetrics. Fix a problem when the latency band configuration was changed. Add correctness testing.	2019-02-07 13:39:22 -08:00
Meng Xu	76d022f71c	TeamCollection: Remove redundant teams When the total number of teams is larger than the desired number, we should gracefully remove the redundant teams so that the number of teams is kept to a low number and the possibility of losing data is guaranteed to be extremely low even when multiple racks fail at the same time.	2019-02-07 11:24:51 -08:00
Meng Xu	455024b3fe	SimulationTest: Test the number of teams Magnify the possibility that the number of created machine teams is larger than the number of desired machine teams if we do NOT try to remove the surplus machine teams. This helps test the upgrade to machine team in FDB 6.1	2019-02-06 11:04:41 -08:00
Evan Tschannen	7e0e0a7673	Merge pull request #1105 from vishesh/task/issue-218-compare-and-clear Implements CompareAndClear AtomicOp	2019-02-05 18:11:28 -08:00
Evan Tschannen	486e0e13c3	Merge pull request #1116 from alexmiller-apple/tstlog Random cleanups that prepare for Spill-By-Reference TLog	2019-02-05 18:09:06 -08:00
Meng Xu	2b73c89e98	TeamCollection: Test the number of teams Call the traceTeamCollectionInfo function to record the team numbers when we add a team directly from the shard information, instead of using addTeamsBestOf logic.	2019-02-05 15:58:16 -08:00
A.J. Beamon	882f8d70b7	Merge pull request #1066 from etschannen/master fix: coordinators auto could put two coordinators in the same zone	2019-02-05 11:52:04 -08:00
Meng Xu	f5171d1b57	TeamCollection: Test the number of teams The current simulator does not validate if the number of teams in the system is larger than the maximum desired number of teams. This validation should be added because we do NOT want too many teams in the system, which may impede the system's availability when multiple fault zones (e.g., machines) crash at the same time. This commit adds the test at the consistency check in simulation. Since the current code does not handle the upgrading situation when we enforce the machine teams, the test is expected to fail. A later commit will handle the upgrading situation by gracefully removing the surplus teams.	2019-02-04 18:14:36 -08:00
Alex Miller	22a08b2b4e	Change mutable ref to pointer so outparams are obvious.	2019-02-04 18:04:22 -08:00
Alex Miller	0efcccc06f	Fix a long standing minor bug in disk queue that could lead to unnecessary commits. If the disk queue is called with the following series of operations: Push(a) -> 1 Commit() Pop(1) Push(b) Commit() Commit() Then the last Commit() should be a no-op, and not actually run accordingly. However, anyPopped was only set to `false` if no pages were pushed, and thus we'd falsely think that an extra empty page commit needed to happen to log to record the new popped position, but there actually was no new popped page position to record. Aside from the extra commit, it maybe makes getCommitOverhead slightly inaccurate, but that's only used for some accounting inside of the memory storage engine and at a quick glance doesn't look like it should have caused any bad effects. I dug through history, and this code has been this way since the initial commit by Dave, and then no one has touched the anyPopped logic since.	2019-02-04 18:04:22 -08:00
Vishesh Yadav	5985566a8e	Don't issue a second read in storageserver if possible for CompareAndClear If the previous eager read request is equal to the CompareAndClear op key, do not issue a read again.	2019-02-04 16:10:59 -08:00
Vishesh Yadav	c532d5c277	Implements CompareAndClear AtomicOp Adds CompareAndClear mutation. If the given parameter is equal to the current value of the key, the key is cleared. At client, the mutation is added to the operation stack. Hence if the mutation evaluates to clear, we only get to know so when `read()` evaluates the stack in `RYWIterator::kv()`, which is unlike what we currently do for typical ClearRange.	2019-02-04 14:59:56 -08:00
Trevor Clinkenbeard	a09afe5906	Added Throttling workload to test native health metrics API	2019-02-04 13:04:25 -08:00
Evan Tschannen	8bfde8c571	fix: increased the rate of ssl tests too much	2019-02-04 11:39:49 -08:00
Evan Tschannen	5b471699df	fix: restarting simulation tests looked for directories in the wrong location	2019-02-04 11:39:06 -08:00
Trevor Clinkenbeard	93d4ed6339	Fixed typo that was messing up storage server diskUsage calculation	2019-02-02 21:04:29 -08:00
Trevor Clinkenbeard	80cf5e057f	Compute worstStorageNDV for Ratekeeper health metrics	2019-02-02 21:03:02 -08:00
Trevor Clinkenbeard	b7eaaaf1e5	Proxy must update health metrics after receiving GetRateInfoReply	2019-02-02 17:08:54 -08:00
Trevor Clinkenbeard	4daf49ff4d	Proxy runs healthMetricsRequestServer to handle incoming health metrics requests	2019-02-01 10:58:42 -08:00
Evan Tschannen	e9ddd94e27	The failure monitor is given a list of all IP addresses associated with a process The connect packet includes the correct remote address Did a lot of code cleanup Simulation test mixed TLS and non-TLS listeners on the same process	2019-01-31 18:20:14 -08:00
Trevor Clinkenbeard	03e5e3ccbc	Proxies periodically request health metrics from Ratekeeper in the getRate function. Occasionally (determined by DETAILED_METRIC_UPDATE_RATE), requests are for detailed per-process metrics.	2019-01-31 13:25:57 -08:00
Trevor Clinkenbeard	5822bd65bf	Track health metrics in Ratekeeper and send these metrics to proxies in GetRateInfoReply messages	2019-01-31 12:56:58 -08:00
Trevor Clinkenbeard	d7930af2cb	Storage server periodically calculates cpuUsage and diskUsage metrics. These metrics (as well as all other metrics necessary for health metrics calculation) are sent in the StorageQueuingMetricsReply message.	2019-01-31 12:23:04 -08:00
Evan Tschannen	a678f778fa	Increase severity to SevWarnAlways for TooManyStatusRequests trace Co-Authored-By: tclinken <trevorclinkenbeard@gmail.com>	2019-01-28 17:50:50 -08:00
Trevor Clinkenbeard	5b89db811a	Throttle status requests with MAX_STATUS_REQUESTS_PER_SECOND knob, whenever status batching is used.	2019-01-28 15:37:30 -08:00
Evan Tschannen	1d7fec3074	Merge commit '048bfc5c368063d9e009513078dab88be0cbd5b0' into task/tls-upgrade-2 # Conflicts: # .gitignore	2019-01-24 17:43:06 -08:00
Alex Miller	ec32d3308b	Merge pull request #1086 from mpilman/features/c++-compiler-errors Fixed several minor code issues	2019-01-24 15:24:33 -08:00
mpilman	79637f07ac	Fixed several minor code issues These will become a problem as soon as we switch to C++17	2019-01-24 14:43:12 -08:00
A.J. Beamon	2198d24ce1	Merge commit '3b2700d25334c53d13496ca16682642aac951beb' into track-server-request-latencies # Conflicts: # fdbclient/MasterProxyInterface.h # fdbserver/ClusterController.actor.cpp # fdbserver/MasterProxyServer.actor.cpp # fdbserver/ServerDBInfo.h # fdbserver/Status.actor.cpp # fdbserver/fdbserver.vcxproj # fdbserver/storageserver.actor.cpp	2019-01-24 11:43:26 -08:00
mpilman	58964af7e1	ctest improvements - #1058 - A set of CMake variables controls whether to keep the simfdb directory and the traces and whether we want to aggregate the traces into a single file - Test labels now contain the directory they are in so that one can now run `ctest -R fast/` - A different binary can be used for restart tests. CMake will automatically look for an installed fdb and use that by default. If none is found, it will use the built one but it will also print a warning - CMake will throw an error if there are any text files in the tests directory that are not associated with a test. - Moved testing from fdbserver/CMakeLists.txt to tests/CMakeLists.txt - Moved fdb testing functions to its own cmake module	2019-01-22 14:34:51 -08:00
A.J. Beamon	8e05e95045	Added the ability to configure the latency band settings by setting a special key in \xff keyspace.	2019-01-18 16:18:34 -08:00
Evan Tschannen	699f8dd617	fix: coordinators auto could put two coordinators in the same zone simulation now tests two machines in the same zone	2019-01-18 15:42:48 -08:00
A.J. Beamon	7498c2308c	Merge branch 'release-6.0' into track-server-request-latencies	2019-01-16 13:39:01 -08:00
Alex Miller	b4a446756a	Remove more top-level tests that were out of order.	2019-01-14 20:28:40 -08:00
Alex Miller	0d579b4730	Top level tests aren't run by default, and some fail.	2019-01-14 19:14:25 -08:00
Alex Miller	b3e977d7c1	Apply test directory as a label. This isn't ideal, as it makes `restarting/from_5.2.0/potato.txt` have the label "from_5.2.0" instead of "restarting", but it does make the fast label work right.	2019-01-14 19:14:25 -08:00
mpilman	414fb0b6c8	made TestRunner work with XML traces	2019-01-14 19:14:25 -08:00
Markus Pilman	14f0a6958b	added all tests to ctest	2019-01-14 19:14:25 -08:00
Markus Pilman	b096b8e3f8	First tests working with ctest	2019-01-14 19:14:25 -08:00
Evan Tschannen	7dbf06162e	Update fdbserver/ClusterController.actor.cpp Co-Authored-By: bnamasivayam <36455962+bnamasivayam@users.noreply.github.com>	2019-01-14 16:57:00 -08:00
Balachandar Namasivayam	ff661bca22	Fix a minor bug in the RoleFitness Class.	2019-01-14 14:54:54 -08:00
Evan Tschannen	9912d17c35	Merge pull request #1030 from bnamasivayam/master Add some sanity checks to tester.actor.cpp	2019-01-11 17:15:40 -08:00
Evan Tschannen	4eb11d74af	Merge pull request #1029 from bnamasivayam/reenable-check_desired_classes Re-enable CheckDesiredClasses after making necessary changes for mult…	2019-01-11 17:15:05 -08:00
A.J. Beamon	d4d5740282	* Add Optional.map and ErrorOr.map. * Rename Optional/ErrorOr cast_to to castTo. * Make printable(Optional<T>) templated rather than restricted to StringRef types. * Fixes bug in (unused) ErrorOr.castTo where an ErrorOr that was not set would lose its error.	2019-01-11 09:03:38 -08:00
Balachandar Namasivayam	baeaa490e4	Add some sanity checks to tester.actor.cpp	2019-01-10 11:05:50 -08:00
Balachandar Namasivayam	a8e2e75cd5	Re-enable CheckDesiredClasses after making necessary changes for multi-region setup. Fixed a couple of bugs 1) A rare race condition where a worker is being roles even after it died. 2) Fix how RoleFitness is calculated for TLog and LogRouter. Only worst fitness is compared to see if a better fit is available.	2019-01-10 10:28:32 -08:00
Evan Tschannen	684a22a52b	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbbackup/backup.actor.cpp # fdbclient/BackupContainer.actor.cpp # fdbclient/HTTP.actor.cpp # fdbserver/storageserver.actor.cpp # fdbserver/workloads/BackupCorrectness.actor.cpp # versions.target	2019-01-09 16:14:46 -08:00
A.J. Beamon	7c5b2ab330	Merge pull request #976 from alexmiller-apple/jsonlogs Allow trace logs to be output as JSON instead of XML	2019-01-09 17:04:50 -05:00
Evan Tschannen	5d2b11cba9	Merge pull request #1019 from satherton/http-request-id Backup usability and sanity check improvements	2019-01-09 13:29:37 -08:00
Vishesh Yadav	51b89ae083	WIP	2019-01-09 07:41:02 -08:00
Stephen Atherton	604ad062d5	Updated backup correctness test to new behavior. WaitBackup() can now return the UID and BackupContainer atomically with the status code for a backup tag.	2019-01-08 18:12:15 -08:00
A.J. Beamon	d265517156	Fix: fast allocator would not cleanup memory for a thread if that thread never called getMagazine. This could happen if the first thing the thread did was to release memory. Added a new metric for the number of threads that hold memory for each size and improve some existing metrics. Fix: a failed ASSERT would crash if done early in the program lifetime.	2019-01-08 14:36:01 -08:00
Evan Tschannen	57293a2db0	byte sample recovery did not use limits for its range reads, leading to slow tasks	2019-01-04 10:32:31 -08:00
Markus Pilman	dbe9baff1f	Several small compilation fixes for new versions of gcc There are several missing includes for cmath in the code, I added those. Next, Coro returns a reference to a stack variable and this causes a warning. As this is probably ok for Coro, I disabled the warning in that file for GCC. I want to have this warning in the build system as it is generally a very useful warning to have. Another change is that major and minor are deprecated for a while now. I replaced those with gnu_dev_major and gnu_dev_minor. ErrorOr currently implements operators ==, !=, and <. These do not compile because Error does not implement ==. This compiles on older versions of gcc and clang because ErrorOr<T>::operator== is not used anywhere. It is still wrong though and newer gcc versions complain. I simply removed these methods. The most interesting fix is that TraceEvent::~TraceEvent is currently throwing exceptions. This is illegal behavior in C++11 and a bad idea in older versions of C++. For now I simply removed the throw, but this might need some more thought.	2019-01-03 12:44:19 -08:00
Evan Tschannen	4901e37b8f	Merge pull request #983 from alexmiller-apple/compilationfixes Various minor fixes	2019-01-03 10:01:05 -08:00
Andrew Noyes	7eb6765698	Mention that xml is the default	2019-01-03 08:48:31 -08:00
Bhaskar Muppana	aa2a76ef4c	Merge pull request #981 from alexmiller-apple/cmake Add a CMake build system	2019-01-02 18:50:15 -08:00
Andrew Noyes	bce5b03340	Fix whitespace	2019-01-02 15:24:11 -08:00
Alex Miller	73f53f9861	Merge pull request #991 from atn34/replace-ampersand-operator Replace & operator with variadic function	2018-12-30 22:51:59 -06:00
Simon Zhou	7edf221986	Avoid null check	2018-12-28 13:09:04 -08:00
anoyes	6a4d87802b	Replace & operator with variadic function	2018-12-28 11:33:42 -08:00
anoyes	1bca665b29	Document --trace_format flag	2018-12-20 16:22:41 -08:00
anoyes	b8df5acc15	Add --trace_format flag to fdbserver	2018-12-20 15:02:01 -08:00
Alex Miller	bfab7c150a	Require PageHeader to be 36 bytes, and don't use magic numbers.	2018-12-17 13:37:44 -08:00
Alex Miller	b4b7f382a7	Fix issues that a newer compiler warned about.	2018-12-14 14:43:50 -08:00
Meng Xu	486a7b04fa	TeamCollection: Fix build in osX In osX, we cannot add an unsigned long to a string to append to the string.	2018-12-14 13:44:11 -08:00
Alex Miller	550daa05f8	Default configuration now builds.	2018-12-13 15:52:27 -08:00
Markus Pilman	df0f491c29	Some more improvements to the build and preparations for packaging	2018-12-13 15:04:13 -08:00
Alex Miller	e70e59a895	Change some file locations.	2018-12-13 14:53:19 -08:00
Markus Pilman	dce290909d	fdbserver now compiling	2018-12-13 14:13:47 -08:00
Vishesh Yadav	e04abf25f7	simulator: Support multiple listeners on single process Sim2Listener can now take the network address to listen on. This is used to listen to multiple ports in simulator and test the patch which added multiple network addresses to single endpoint.	2018-12-13 13:36:52 -08:00
Vishesh Yadav	3eb9b23024	Listen to multiple addresses and start using vector<NetworkAddress> in Endpoint - This patch will make FDB listen to multiple addresses given via command line. Although, we'll still use first address in most places, this patch starts using vector<NetworkAddress> in Endpoint at some basic places. - When sending packets to an endpoint, pick a random network address in endpoints - Renames Endpoint::address to Endpoint::addresses since it now holds a vector of addresses.	2018-12-13 13:36:52 -08:00
Vishesh Yadav	43e5a46f9b	Change Endpoint::address(NetworkAddress) to vector<NetworkAddress> Extend `Endpoint` class to take multiple NetworkAddresses instead of just one. Hence, to talk to an endpoint instead of one IP:PORT, we'll have multiple IP:PORT pairs. This patch simply adds the field and makes changes to compile the codebase. The first element of the `address` field is used everywhere. Hence the way we talk to an endpoint remains the same with this patch. NOTE: Directly accessing the first member of Endpoint::address is unsafe as Endpoint() doesn't enforce a non-empty address list. However, since the correctness tests pass for now and we are anyway replacing all those unsafe accesses with ones considering the whole vector, this patch does not try to access them in a safe way.	2018-12-13 13:36:52 -08:00
Vishesh Yadav	42dffd4dff	Take a vector of network addresses from CLI to start FDB server Extends the CLI interface to take multiple public and listen addresses. We however do not do anything with those extra addresses and just consider the first one for now.	2018-12-13 13:36:52 -08:00
Vishesh Yadav	e8e01b2406	Remove unused localAddress parameter from newNet2 and Net2 classes	2018-12-13 13:36:52 -08:00
Evan Tschannen	d9626895b1	Merge pull request #964 from xumengpanda/mengxu/teamcollection-release TeamCollection: Use machine teams to create server teams to increase availability at scale when a machine has multiple servers	2018-12-13 13:18:54 -08:00
Meng Xu	79d94f78f1	TeamCollection: Improve code efficiency Further improve code efficiency by 1) Avoid rebuild machine locality map when machine locality is changed. This may leave the global machine locality map stale. This is ok as long as we do not use the global map to validate the machine team follows the locality policy. 2) Use ASSERT_WE_THINK instead of ASSERT to avoid runtime overhead. ASSERT_WE_THINK will only validate the condition in simulation mode. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-12 22:38:38 -08:00
Meng Xu	e197926c80	TeamCollection: Remove a duplicate function Remove a duplicate function that has different signature. No functionality change. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-12 15:21:37 -08:00
Meng Xu	ad7040efcd	TeamCollection: Bug fix in handle server locality change Make sure the link between server and machine is updated in both server and machine. Rename function name to better reflect its functionality. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-12 14:03:29 -08:00
Meng Xu	e069b5c31c	TeamCollection: Use clang format No functional change. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-06 11:39:35 -08:00
Meng Xu	5d47b9c884	TeamCollection: Handle server locality change A server locality may change from one machine to another. This affects the old machine and machine team the server is on, and the new machine the server moves to. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-05 22:23:14 -08:00
Meng Xu	c5047bc8c3	TeamCollection: All machine teams are correct size We only create correct size machine teams. When configuration (e.g., team size) is changed, the DDTeamCollection will be destroyed and rebuilt so that the invariant will not be violated. Based on the invariant, we can count the number of machine teams more quickly. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-05 15:09:38 -08:00
Meng Xu	57eab1f283	DataDistribution: Remove addAllTeams function The addAllTeams function can be replaced with the new addTeamsBestOf function by passing a large enough number of teams to build. Remove addAllTeams function and update the related unit tests. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-05 15:03:16 -08:00
Meng Xu	38c5c2562b	DataDistribution: Update NotEnoughServers unit test The buggify option may set 1 to the knob parameters (DESIRED_TEAMS_PER_SERVER and MAX_TEAMS_PER_SERVER). When this happens, the number of machine teams to build will be less than what we want, which prevents us from building enough server teams. To avoid this problem, we build machine teams before we call addTeamsBestOf to build server teams. We also add the ASSERT to ensure we build enough machine teams and server teams in the test case. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-05 14:36:48 -08:00
Meng Xu	f32c04c834	DataDistribution: Update NotEnoughServers unit test Change the test condition for the NotEnoughServers unit test. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-03 23:14:01 -08:00
Meng Xu	54a4d6b308	TeamCollection: Improve code efficiency Improve code efficiency with the following changes: 1) Change always-true if-statement to ASSERT; 2) Return when we are confident we will not find more machine teams. No functionality change. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-01 17:10:50 -08:00
Meng Xu	8d6c6e000b	DataDistribution: Mute the NotEnoughServers test Due to the randomness in choosing a server, we cannot guarantee to find all teams. The NotEnoughServers test case may create false positive bug reports in the correctness test. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-01 13:29:45 -08:00
Meng Xu	68dcec2240	DataDistribution: Change a unit test Try multiple times of addTeamsBestOf() when we cannot find an available team due to the pure randomness in choosing the server teams. The changes for the unit test reduces the false positive in the simulation test results. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-01 13:12:55 -08:00
Meng Xu	a43f579f66	TeamCollection: Change 1 unit test Relax the assert condition on the random unit test. Due to the randomness in choosing the machine team and the server team from the machine team, it is possible that we may not find the remaining several (e.g., 1 or 2) available teams. For example, there are at most 10 teams available, and we have found 9 teams, the chance of finding the last one is low when we do pure random selection. It is ok to not find every available team because 1) In reality, we only create a small fraction of available teams, and 2) In practical system, this situation only happens when most of servers are temporarily unhealthy. When this situation happens, we will abandon all existing teams and restart the build team from scratch. In simulation test, the situation happens 100 times out of 128613 test cases when we run RandomUnitTests.txt only. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-01 13:11:19 -08:00
Meng Xu	f311455c45	TeamCollection: Cleanup code and add checks Remove unnecessary sanity checks and remove the dead code. Add some necessary sanity checks. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-30 17:40:21 -08:00
A.J. Beamon	eb2f27b8e5	Work in progress implementation of server-side latency tracking. The intent of this is to be able to measure the number of requests that achieve certain latency targets across the system relative to the total number of requests.	2018-11-30 10:46:04 -08:00
Meng Xu	ea3bd1502d	TeamCollection: Calculate machine team number Calculate the number of machine teams in the same way as we calculate the number of server teams. Only count the machine teams that have the correct size and are healthy. Simplify code by removing unnecessary check. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-29 15:38:23 -08:00
Meng Xu	2b41ad5e57	TeamCollection: Pick server team randomly Pick server team purely randomly instead of picking the least used one. This is to avoid creating correlation in the server teams we pick when new machines are added. The logic is: First pick the one random least used server as chosen server; Then pick a machine team that has the server; Then pick a server on each machine in the machine team. We make sure the chosen server is picked. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-28 15:57:53 -08:00
Meng Xu	e4c9d4cbae	TeamCollection: Build all machine teams first Before we build server teams, we build the desired number of machine teams. Then we pick the least used server, from which we pick the least used machine team. Then we pick the least used server on each machine in the least used machine team to get the server team. Note: The logic of building machine teams should be independent from server teams. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-27 18:06:36 -08:00
A.J. Beamon	975711c389	Merge branch 'release-6.0' of github.com:apple/foundationdb # Conflicts: # documentation/sphinx/source/release-notes.rst	2018-11-27 09:50:39 -08:00
Meng Xu	4c2c65c1b3	TeamCollection: Replace TraceEvent with ASSERT Replace one TraceEvent that never happens in correctness test with an ASSERT. Change format in one comment. Signed-off-by: Meng xu <meng_xu@apple.com>	2018-11-27 09:48:24 -08:00
Evan Tschannen	530b5e3763	fix: do not track txsPopVersions unless there are remote logs to pop from	2018-11-26 15:17:17 -08:00
Evan Tschannen	512c00d304	added dump token trace events for storage server interfaces after rollbacks	2018-11-26 11:01:10 -08:00
Meng Xu	5cbff740ca	TeamCollection: Add ASSERT Remove sanity check code for performance benefit. Replace TraceEvent(SevError) with ASSERT. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-21 13:16:52 -08:00
Meng Xu	8de031f9a6	TeamCollection: clang-format Format the changes with git clang-format. No functional changes. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-21 11:18:26 -08:00
Meng Xu	12c3bec968	TeamCollection: Misc changes to resolve review comments No functional change. Report error in TraceEvent when invariant is violated. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-19 20:44:52 -08:00
Meng Xu	52c6a66601	TeamCollection: Fix a bug introduced in code review When we GetTeam, the data distribution actor may have zero teams in rare situation in the ConfigureTest.txt test. We should return an empty team in this situation instead of triggering error. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-16 16:34:38 -08:00
Meng Xu	f7a7e069f0	TeamCollection: Remove unnecessary comments Pass 41806 tests with no failure Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-16 15:56:35 -08:00
Meng Xu	73c58852f0	TeamCollection: Resolve code review comments Resolve code review comments: 1) Improve the code efficiency by avoiding unnecessary map search and avoiding unnecessary checking 2) Remove or comment out trace events when they can be spammy 3) Improve coding style Tested for 1 hour and no error was found. KillRegionCycle.txt test was excluded from the test because existing code cannot pass that test either Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-16 15:55:33 -08:00
Meng Xu	5051b35c61	TeamCollection: Use machine team to create server team Current server team collection logic does not consider the fact that multiple storage servers can run on the same machine. When multiple machines fail, all servers on the machines will fail, and the possibility of having one process team fail and lose data is very high. To reduce the possibility of losing data when multiple machines fail, we first create machine teams which span across different fault zones; we then create server teams based on machine teams by first picking 1 machine team, and then picking 1 server from each machine in the machine team. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-16 15:53:22 -08:00
Evan Tschannen	e45952bc53	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbclient/BackupContainer.actor.cpp # fdbclient/BlobStore.actor.cpp # fdbclient/HTTP.actor.cpp # tests/BlobStore.txt # versions.target	2018-11-13 16:06:39 -08:00
Evan Tschannen	1bd615f954	fix: remoteDcIds will not actually have transaction logs unless usable regions is > 1	2018-11-13 12:36:04 -08:00
Evan Tschannen	4e54690005	Merge branch 'release-6.0' # Conflicts: # fdbserver/DataDistribution.actor.cpp # fdbserver/MoveKeys.actor.cpp	2018-11-12 20:26:58 -08:00
Evan Tschannen	3f3a562f75	updated resolution balancing knobs to be a little more aggressive	2018-11-12 19:11:28 -08:00
Evan Tschannen	239bf882d8	Merge branch 'release-6.0' into feature-resolution-balancing-fix	2018-11-12 18:43:20 -08:00
Evan Tschannen	3f461f3706	updated comments	2018-11-12 18:42:29 -08:00
Evan Tschannen	6353a6724b	strengthened the protections related to changing regions	2018-11-12 17:40:40 -08:00
Evan Tschannen	26c49f21be	fix: we do not know a region is fully replicated until all the initial storage servers have either been heard from or have been removed	2018-11-12 17:39:40 -08:00
Evan Tschannen	3f39024640	buggify resolution balancing so that it still happens in simulation	2018-11-12 00:03:07 -08:00
Evan Tschannen	536ee826da	tuned resolver balancing to keep the resolvers within 5MB per second of each other	2018-11-11 23:42:45 -08:00
Evan Tschannen	50f481b149	fix: peek local should not call peek all, because it is possible to still peek from remote log sets after a special tag	2018-11-11 19:16:25 -08:00
Evan Tschannen	7892da032f	fix: Do not remove the locality entry for the current transaction logs when removing storage servers fix: dcId_locality map could be incorrect after restarting recruitEverything	2018-11-11 12:37:53 -08:00
Evan Tschannen	cd188a351e	fix: if a destination team became unhealthy and then healthy again, it would lower the priority of a move even though the source servers we are moving from are still unhealthy fix: badTeams were not accounted for when checking priorities	2018-11-11 12:33:31 -08:00
Evan Tschannen	4b5d0b4e2c	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbclient/AsyncFileBlobStore.actor.cpp # fdbclient/AsyncFileBlobStore.actor.h # fdbclient/BlobStore.actor.cpp # fdbclient/BlobStore.h # fdbclient/HTTP.actor.cpp # fdbclient/ManagementAPI.actor.cpp # fdbclient/NativeAPI.actor.cpp # fdbrpc/LoadBalance.actor.h # fdbrpc/batcher.actor.h # fdbrpc/fdbrpc.vcxproj # fdbrpc/sim2.actor.cpp # fdbserver/DataDistribution.actor.cpp # fdbserver/DataDistributionTracker.actor.cpp # fdbserver/SimulatedCluster.actor.cpp # fdbserver/TLogServer.actor.cpp # fdbserver/masterserver.actor.cpp	2018-11-10 13:04:24 -08:00
Evan Tschannen	a654183f63	Merge pull request #791 from ajbeamon/remove-cluster-from-iclientapi Remove cluster from IClientApi (phase 2 of removing DB names)	2018-11-10 10:16:18 -08:00
Evan Tschannen	6a406bae72	Merge pull request #896 from ajbeamon/downgrade-incorrect-cluster-file-event Downgrade the severity of IncorrectClusterFileContents the first time…	2018-11-10 10:06:36 -08:00
Evan Tschannen	6f4ad84777	Merge pull request #903 from ajbeamon/move-batcher-into-proxy Move the sort of generic batcher from fdbrpc and make it specific to …	2018-11-10 09:56:03 -08:00
Evan Tschannen	7c23b68501	fix: we need to build teams if a server becomes healthy and it is not already on any teams	2018-11-09 18:06:00 -08:00
A.J. Beamon	c3a06aa6f1	Fix indentation	2018-11-09 14:25:40 -08:00
A.J. Beamon	67a152ae9f	Move the sort of generic batcher from fdbrpc and make it specific to batching commits in master proxy. Also a couple minor formatting changes.	2018-11-09 14:19:18 -08:00
Evan Tschannen	3e2484baf7	fix: a team tracker could downgrade the priority of a relocation issued by the team tracker for the other region	2018-11-09 10:07:55 -08:00
Evan Tschannen	6874e379fc	fix: set the simulator’s view of usable regions to one during configure tests which can disable usable regions	2018-11-09 10:06:03 -08:00
Evan Tschannen	19ae063b66	fix: storage servers need to be rebooted when increasing replication so that clients become aware that new options are available	2018-11-08 15:44:03 -08:00
Evan Tschannen	1cf5689d62	fix: workers could only create a shared transaction log for one store type. This resulted in the old store type being used for new transaction logs after configuration changes which changed the store type	2018-11-07 21:09:51 -08:00
Evan Tschannen	599cc6260e	fix: data distribution would not always add all subsets of emergency teams fix: data distribution would not stop tracking bad teams after all their data was moved to other teams fix: data distribution did not properly handle a server changing locality such that the teams it used to be on no longer satisfy the policy	2018-11-07 21:05:31 -08:00
Stephen Atherton	ade75ac692	Merge branch 'master' of github.com:apple/foundationdb into feature-redwood # Conflicts: # fdbserver/worker.actor.cpp	2018-11-07 11:43:54 -08:00
Stephen Atherton	9d73166b3b	Many bug fixes related to concurrent page operations and pager shutdown.	2018-11-06 19:31:16 -08:00
Evan Tschannen	6bb283aebc	fix: dcId to Locality changes could be lost if an emergency transaction happened that did not change the configuration fix: master proxy was starting dcId’s at 1 number too large	2018-11-05 11:12:43 -08:00
Evan Tschannen	04fa2a7202	fix: we could recover in a region with priority < 0	2018-11-05 10:14:26 -08:00
A.J. Beamon	187e507e53	Downgrade the severity of IncorrectClusterFileContents the first time it is logged to avoid transient issues that appear like the cluster file hasn't been updated (e.g. the cluster file is shared between multiple processes).	2018-11-05 09:28:08 -08:00
Evan Tschannen	87295cc263	suppressed spammy trace events, and avoid reporting a long master recovery duration when the cluster is first created	2018-11-04 23:07:56 -08:00
Evan Tschannen	87d0b4c294	fix: the remote region does not have a full replica if usable_regions==1	2018-11-04 22:05:37 -08:00
Evan Tschannen	c1bd279a4e	addressed review comments	2018-11-04 20:26:23 -08:00
Evan Tschannen	bd60027544	test region priority changes	2018-11-04 20:11:23 -08:00
Evan Tschannen	c02690471d	added protection against configuration changes which cannot be immediately reverted the configure database workload tests region configurations	2018-11-04 19:53:55 -08:00
Evan Tschannen	3304c83229	added additional checks in peek which determine when a tag will never get additional versions	2018-11-04 19:28:15 -08:00
Evan Tschannen	accba4fa1d	keep track of the last time a process became available to set a better starting value for remoteStartTime	2018-11-04 14:33:03 -08:00
Evan Tschannen	45c8f2dfcb	restarting tests will sometimes configure to a fearless configuration on startup if possible	2018-11-02 14:16:47 -07:00
Evan Tschannen	2a8c628d82	fix: even if a peek cursor cannot find a local set for the most recent data, it still may be able to find data from older log sets	2018-11-02 14:13:57 -07:00
Evan Tschannen	f045c041eb	fix: if a storage server already exists in a remote region after converting to fearless, it did not receive mutations between the known committed version and the recovery version	2018-11-02 14:11:39 -07:00
Evan Tschannen	bf6545a9cf	clients cache storage server interfaces individually, instead of as a team. This is needed because in fearless every shard has storage servers from two separate teams, leading to a lot of possible combinations allAlternatives failed logic was simplified, because we are already doing a global rate limiting, so a per shard limit is unnecessary reduced unnecessary state variables in waitMetrics requests	2018-11-02 13:15:09 -07:00
Evan Tschannen	3b97f5a899	fix: the storage server still has to pop old tags, even if it does not need any data from them	2018-11-02 13:10:14 -07:00
Evan Tschannen	979597a2ca	fix: upgraded tags must be popped from all log sets	2018-11-02 13:09:18 -07:00
Evan Tschannen	1b5d28386a	fix: the Tlog would not update the durable version properly when version_sizes was empty	2018-11-02 13:05:54 -07:00
Evan Tschannen	2d9a670774	fix: nested multCursors would improperly hang on getMore, because an inner pop of cursors would not be detected by the outer instance	2018-11-02 13:04:09 -07:00
Evan Tschannen	e68c07ae35	fix: trackShardBytes was called with the incorrect range, resulting in incorrect shard sizes reduced the size of shard tracker actors by removing unnecessary state variable. Because we have a large number of these actors these extra state variables add up to a lot of memory	2018-11-02 13:03:01 -07:00
Evan Tschannen	ad98acf795	fix: if the team started unhealthy and initialFailureReactionDelay was ready, we would not send relocations to the queue print wrong shard size team messages in simulation	2018-11-02 13:00:15 -07:00
Evan Tschannen	1d591acd0a	removed the countHealthyTeams check, because it was incorrect if it triggered during the wait(yield()) at the top of team tracker	2018-11-02 12:58:16 -07:00
Evan Tschannen	30fbc29af1	Renamed TimeKeeperStarted to TimeKeeperCommit	2018-11-02 12:57:03 -07:00
Evan Tschannen	278dbd5096	call debug transaction on timekeeper	2018-11-02 12:56:29 -07:00
Stephen Atherton	df3bdde50b	Many bug fixes. AsyncFileCached write() on a page with a zero-copy read in progress would orphan the old page before the read was finished. Pager file operations were not converting page id to int64 for byte offset calculation. Pager was not calling releaseZeroCopy() after readZeroCopy() if there was an error or cancellation. Pager reads were using some variables that could go out of scope. BusyPage's mechanism for notifying when a physical page is no longer in use is itself no longer in use and therefore removed. Pager shutdown now cancels all outstanding reads. Improved some debug output.	2018-10-31 02:14:55 -07:00
Stephen Atherton	b08497b7ea	Bug fix, at least some users of IKeyValueStore expect the read actors to make their own copies of key arguments.	2018-10-25 19:48:31 -07:00
Stephen Atherton	0277dab747	Removed accidental config change.	2018-10-25 04:01:42 -07:00
Stephen Atherton	342466817a	Added pagefile name to debug output. Shutdown will no longer throw if actor collection or delete futures have errors.	2018-10-25 04:00:02 -07:00
Stephen Atherton	0e84c1f438	Pager and btree debug output macro now prints local network address and time.	2018-10-25 03:57:09 -07:00
Stephen Atherton	32b43cc02b	Added 'simple' flag in simulation config generation. Defaults to false, must be changed in code.	2018-10-25 01:25:41 -07:00
Stephen Atherton	93501947f8	Process_behind should also be sent to clients.	2018-10-24 20:38:15 -07:00
Stephen Atherton	f17cc1e20f	StorageServer will no longer send io_error or other inappropriate errors to a client (this would never happen on SQLite). Many bug fixes around error handling, initialization, and shutdown in Redwood. StorageServer now calls init() on its underlying storage engine.	2018-10-24 15:57:06 -07:00
Alex Miller	a074dc2a60	Revert one line from #856 that accidentally changed the include path for Windows.h	2018-10-23 18:58:00 -07:00
Evan Tschannen	6bb2ba5d92	Merge commit '2a34115e65639b7aad368a148de3c4189bc34bfc' # Conflicts: # fdbserver/storageserver.actor.cpp # fdbserver/worker.actor.cpp	2018-10-23 17:05:42 -07:00
Evan Tschannen	a123c9c533	forgot to commit as part of the merge	2018-10-23 16:53:41 -07:00
Evan Tschannen	7e215b7e0c	Merge branch 'release-6.0' of github.com:apple/foundationdb into release-6.0 # Conflicts: # fdbserver/worker.actor.cpp	2018-10-23 16:53:07 -07:00
Alex Miller	6bb1f4093d	Merge pull request #856 from dropbox/pr/include-fix Adjust all includes to be relative to the root.	2018-10-22 09:51:55 -07:00
Clement Pang	3a30621071	Merge branch 'release-6.0' into memory-fast-rollback	2018-10-21 12:48:46 +09:00
Alex Miller	e2fc1c9b95	Remove specifying non-root directory as a path to search for includes.	2018-10-19 18:56:45 -07:00
Clement Pang	3ceec01392	Address review comments.	2018-10-19 18:55:35 -07:00
Evan Tschannen	22b68d6cf3	fix: we need to get the result of the future before getting the error from the ErrorOr	2018-10-19 17:34:28 -07:00
Evan Tschannen	1ef29cbf0d	more windows build fixes	2018-10-19 17:00:24 -07:00
Robert Escriva	268093a96d	Adjust all includes to be relative to the root. Remove the use of relative paths. A header at foo/bar.h could be included by files under foo/ with "bar.h", but would be included everywhere else as "foo/bar.h". Adjust so that every include references such a header with the latter form. Signed-off-by: Robert Escriva <rescriva@dropbox.com>	2018-10-19 17:35:33 +00:00
Evan Tschannen	8dd900a337	fixed the windows build	2018-10-18 20:26:45 -07:00
Evan Tschannen	5c52711f01	Merge pull request #854 from satherton/feature-redwood Fixed line endings.	2018-10-18 19:56:05 -07:00
Stephen Atherton	3b641643cb	Fixed line endings.	2018-10-18 19:46:58 -07:00
Evan Tschannen	db71b60d72	Merge pull request #819 from satherton/feature-redwood Redwood storage engine, initial/experimental version	2018-10-18 18:38:11 -07:00
Evan Tschannen	ed7036139a	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbserver/DataDistribution.actor.cpp # fdbserver/storageserver.actor.cpp	2018-10-18 17:00:52 -07:00
Evan Tschannen	952b26d746	removed upgrade code from 2.0.3	2018-10-18 16:53:40 -07:00
Evan Tschannen	9b6c7f253c	changed a knob	2018-10-18 15:26:19 -07:00
Evan Tschannen	0b304495ad	added a yield to the proxy when committing a large batch of mutations	2018-10-18 15:26:00 -07:00
Evan Tschannen	0613a34845	The storage server would block the main thread when processing a single version with a large amount of data	2018-10-18 13:37:31 -07:00
Evan Tschannen	e36b7cd417	Only log teamTracker trace events if sizes are not wrong, to avoid spammy messages when dropping a fearless configuration wrongSize previous was unneeded	2018-10-17 11:45:47 -07:00
Evan Tschannen	2db17af815	separate code coverage into 3 different files	2018-10-16 16:12:25 -07:00
Evan Tschannen	cae1efee4e	divide workloads into their own item group	2018-10-16 15:29:44 -07:00
Evan Tschannen	c89b56355a	move unreadable in vcxproj to test how it changes the windows build	2018-10-16 14:57:15 -07:00
Evan Tschannen	ce4826217a	RestoreInterface.h was labeled with ClCompile instead of ClInclude	2018-10-16 11:47:36 -07:00
Evan Tschannen	0217aed74c	Merge branch 'release-6.0' # Conflicts: # bindings/go/README.md # documentation/sphinx/source/release-notes.rst # fdbserver/MasterProxyServer.actor.cpp # versions.target	2018-10-15 18:38:51 -07:00
Evan Tschannen	0acfae1e76	fixed the windows linker error	2018-10-15 18:19:51 -07:00
Evan Tschannen	d8dc8e83b9	Do not rollback while uncommitted sets exist	2018-10-15 15:09:12 -07:00
Stephen Atherton	5bc45958d8	Finished VersionedBTree's IClosable implementation. Added deletion of existing unit test pager state.	2018-10-15 03:43:43 -07:00
Evan Tschannen	a8feecbfad	added a comment to explain code ordering	2018-10-12 16:27:13 -07:00
Evan Tschannen	8ed4ce183c	Merge branch 'release-6.0' of github.com:apple/foundationdb into release-6.0	2018-10-12 14:56:19 -07:00
Evan Tschannen	17a1e3ce35	fix: the master proxy would log an OpCommit for empty commits to the txnStateStore	2018-10-12 12:58:17 -07:00
Clement Pang	88e8422511	Per etschannen, wait on durable for reboots	2018-10-10 17:42:40 -07:00
A.J. Beamon	419231d798	Fix: status was trying to read a metric under the wrong name, leading to an error that caused the cluster to report itself unhealthy and some metrics to be missing.	2018-10-10 13:33:28 -07:00
Evan Tschannen	4c95a5ee0f	added the basic structure for parallel restore	2018-10-09 18:47:28 -07:00
Clement Pang	4f1cb97222	add missing semiCommit() on reset.	2018-10-08 17:30:39 -07:00
Clement Pang	403b4c5d94	fix tabs in worker.actor.cpp	2018-10-08 17:28:58 -07:00
Clement Pang	eb72427923	Revert "Fix formatting with clang-format" This reverts commit `448751c`	2018-10-08 17:26:10 -07:00
Clement Pang	40ad06b0ac	Revert "clang-format still looks weird, trying something else." This reverts commit `24c64bd`	2018-10-08 17:26:04 -07:00
Clement Pang	24c64bd4bc	clang-format still looks weird, trying something else.	2018-10-08 17:24:47 -07:00
Clement Pang	448751ce83	Fix formatting with clang-format	2018-10-08 17:21:57 -07:00
Evan Tschannen	ecddeab2ae	fixed review comments; demote killRegionCycle test for now	2018-10-08 10:39:39 -07:00
Clement Pang	2fc60299d4	Fix comment.	2018-10-07 03:09:15 -07:00
Clement Pang	5e258677bd	Fix build.	2018-10-07 03:07:58 -07:00
Clement Pang	cf8f0686f8	Fix build.	2018-10-07 03:06:18 -07:00
Clement Pang	ebc42ba609	Instead of ignoring onClosed on all IOErrors, only ignore reboots (and only for memory and if the flag is on).	2018-10-07 02:53:38 -07:00

1504 Commits