Evan Tschannen
72203ba47a
Merge commit '56f3f0b1bc60604f965152d856ae29a591227703'
2019-04-01 18:45:38 -07:00
Evan Tschannen
f5de52de91
fix: cancel the previous log system recruitment before calling newEpoch, to avoid multiple actors attempting to modify oldLogSystem at the same time
2019-04-01 16:38:25 -07:00
Evan Tschannen
a46620fbee
Merge branch 'release-6.1'
2019-03-30 17:59:28 -07:00
Evan Tschannen
8ebf771392
cleanup cluster controller trace events
2019-03-30 14:17:18 -07:00
Alex Miller
e7ad39246c
Fix typo
2019-03-29 20:16:26 -07:00
Evan Tschannen
a44ffd851e
fix: the shared tlog could fail to update a stopped tlog’s queueCommitVersion to version if a second tlog registered before it could issue the first commit for the tlog
2019-03-29 20:11:30 -07:00
Evan Tschannen
d882c060bf
Merge commit '5dd6396eed0de0dfea6cf9eecc307995eff5cedc'
2019-03-28 18:00:55 -07:00
Balachandar Namasivayam
0bbdc15f71
The multi-test process waits until a timeout if any of the tester processes restarts. Use getReplyUnlessFailedFor instead of getReply to detect the restarts and fail quickly instead of waiting for a timeout, which is usually large.
2019-03-28 17:05:30 -07:00
Evan Tschannen
b6008558d3
renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>()
...
eliminated an unnecessary copy from the proxy commit path
eliminated an unnecessary copy from buffered peek cursor
2019-03-28 11:52:50 -07:00
Evan Tschannen
836bb95a7a
Merge pull request #1372 from etschannen/master
...
Merge 6.1 into master
2019-03-27 21:00:49 -07:00
Evan Tschannen
34b9d5e722
Merge pull request #1364 from etschannen/feature-fast-serialize
...
A few performance optimizations
2019-03-27 20:57:25 -07:00
Evan Tschannen
e5a80f2c94
optimized IPaddress
2019-03-27 18:21:13 -07:00
A.J. Beamon
91014d4529
Add file changes that I accidentally failed to commit; fix naming issue in worker.
2019-03-27 08:41:19 -07:00
A.J. Beamon
71e2fdafb8
Changes to ratekeeper camel case
2019-03-27 08:24:25 -07:00
A.J. Beamon
d508658569
Make ratekeeper one word to match our existing convention
2019-03-27 08:15:19 -07:00
Jingyu Zhou
38c6681349
Fix some signed and unsigned mismatch warnings.
2019-03-26 14:54:11 -07:00
Jingyu Zhou
c0b58080ee
Fix type name warning for DDTeamCollection
...
Seen using 'class' now seen using 'struct' in DataDistribution.actor.cpp
2019-03-26 14:18:25 -07:00
Jingyu Zhou
7c02ee6fdd
Fix compiler warning about unreferenced exception variable
2019-03-26 13:43:47 -07:00
Jingyu Zhou
466a59a99d
Merge remote-tracking branch 'apple/release-6.1' into ratekeeper
2019-03-25 15:27:38 -07:00
Jingyu Zhou
f57a22e2ed
Add data distributor and ratekeeper to status output
2019-03-25 15:11:29 -07:00
Evan Tschannen
5e03e178de
Merge pull request #1345 from ajbeamon/support-multiple-client-or-worker-issues
...
Add support for a client or worker having multiple issues.
2019-03-24 17:27:50 -07:00
Evan Tschannen
d45159ebf7
Merge pull request #1307 from jzhou77/ratekeeper
...
Monitor placement of Ratekeeper and DataDistributor
2019-03-24 17:26:07 -07:00
Evan Tschannen
d6ad027d37
ratekeeper needs to be recruited for proxies to make progress, so if one has not registered with the cluster controller by the time we are accepting commits, recruit a new one
2019-03-24 16:48:24 -07:00
Evan Tschannen
f426d732ea
fix: forgot to remove one location where id_used was incremented for distributor and ratekeeper
2019-03-24 16:04:59 -07:00
Evan Tschannen
e8948726e8
once we recruit a ratekeeper, do not allow any other ratekeepers to register
2019-03-24 11:04:39 -07:00
Evan Tschannen
24c92a1870
Merge pull request #1352 from etschannen/feature-network-address-list
...
Changed NetworkAddressList to at most two addresses for performance
2019-03-24 10:22:38 -07:00
Evan Tschannen
50a4403661
fix: missing parenthesis
2019-03-23 21:52:15 -07:00
Jingyu Zhou
40eec20252
Restore master PID in worker registration
...
This fix was lost during a merge.
2019-03-23 21:02:11 -07:00
Jingyu Zhou
3ef26e6be3
Fix fitness assignment statements
...
Found by MacOS build.
2019-03-23 19:16:04 -07:00
Evan Tschannen
1fc6937802
changed NetworkAddressList to at most two addresses for performance
2019-03-23 17:54:46 -07:00
Evan Tschannen
b51a24453e
the data distributor and ratekeeper are not included in id_used, but when comparing equally good options we prefer to avoid sharing with those roles
...
excluded data distributor and ratekeeper were improperly killed when the best option was also excluded
2019-03-23 13:25:36 -07:00
Jingyu Zhou
10988f89d9
Code refactoring for ConsistencyCheck.actor.cpp
2019-03-23 11:06:43 -07:00
Jingyu Zhou
fdc5b5ddbf
Fix: spurious ratekeeper registration
...
A rare race condition:
-r simulation -f ./foundationdb/tests/slow/WriteDuringReadAtomicRestore.txt -s 114256311 -b on
- A is the ratekeeper.
- CC recruits B and B starts
- CC halts ratekeeper A and A is halted
- A registers back with CC, which then halts B. CC sets A to be the ratekeeper.
CC starts recruiting and finds that A is the best machine, but skips recruiting
because CC thinks A is already used. Now the cluster is left with no ratekeeper.
Fix by disallowing ratekeeper registration with previous ID.
2019-03-23 11:03:51 -07:00
Jingyu Zhou
6523cd4931
Fix: recruit ratekeeper is not triggered
2019-03-23 09:20:54 -07:00
Steve Atherton
09f37cf3d2
Merge pull request #533 from ajbeamon/fix-parent-directory
...
Fixes to parentDirectory() and abspath()
2019-03-22 23:53:46 -07:00
Evan Tschannen
2da46e3172
fix: halt if datacenters are different
2019-03-22 23:53:21 -07:00
Evan Tschannen
b68bc46042
Merge pull request #1348 from ajbeamon/fix-missing-metrics-when-ss-down
...
Fix missing read workload metrics
2019-03-22 19:08:04 -07:00
Evan Tschannen
d34c56c9a5
ensure that the processId exists in id_worker before accessing it
2019-03-22 18:54:39 -07:00
Balachandar Namasivayam
ac8ad07b45
Address review comments.
2019-03-22 18:48:49 -07:00
Balachandar Namasivayam
4ed323ac52
Fixed bug and addressed review comments.
2019-03-22 18:48:49 -07:00
Balachandar Namasivayam
d75020b44a
Fix bug where shared memory created by boost 1.52 leads to an error when accessed by boost 1.67.
2019-03-22 18:48:49 -07:00
Evan Tschannen
36ab852bb1
Merge branch 'master' into ratekeeper
...
# Conflicts:
# fdbserver/ClusterController.actor.cpp
2019-03-22 18:41:00 -07:00
Evan Tschannen
6254a1a8e4
fix: restarting the provisional proxy causes all tlog peeks to restart, so if tlog peeks take longer than 1 second this could end in an infinite loop
2019-03-22 18:37:39 -07:00
Evan Tschannen
7dd1c1b60c
fix: processClassFitness could be wrong if the client changed their class while rebooting
2019-03-22 18:37:39 -07:00
Evan Tschannen
ddb6058770
simplified ratekeeper monitoring loop
2019-03-22 18:22:45 -07:00
Jingyu Zhou
12917d8c7d
Add actors to store halt request futures
...
Address best fitness in checking better DD or RK.
2019-03-22 18:06:38 -07:00
Jingyu Zhou
e8977aeb98
Remove clusterControllerDcId check
...
This is no longer needed since it'll be set in the ctor.
2019-03-22 18:01:54 -07:00
Evan Tschannen
82bc447e29
startRatekeeper is responsible for updating serverDBInfo
2019-03-22 17:56:16 -07:00
Evan Tschannen
82c80c225d
make sure id_worker is updated before setting ratekeeper or data distribution
2019-03-22 17:08:54 -07:00
Evan Tschannen
6a9c9d79cc
Update fdbserver/ClusterController.actor.cpp
2019-03-22 17:00:58 -07:00