Evan Tschannen
c05c95cbe8
forgot to rename the knob
2020-02-25 15:47:39 -08:00
Evan Tschannen
12b5064041
a high free_space_ratio_cutoff is not needed anymore because avoiding teams with low disk space is no longer the responsibility of getLoadBytes()
2020-02-25 15:47:10 -08:00
Evan Tschannen
6e7d2ff7dd
prevent the proxy from delaying too long based on an incorrect estimate of the compute time
2020-02-25 15:46:13 -08:00
Evan Tschannen
65fbe0d0bc
revert AcceptSocket priority change because of bad performance results
2020-02-21 19:22:14 -08:00
A.J. Beamon
4c696d5bf2
Merge branch 'release-6.2' into dd-better-rebalance-logging
# Conflicts:
# fdbserver/DataDistributionQueue.actor.cpp
2020-02-21 17:41:00 -08:00
A.J. Beamon
dfa5f76c01
Remove unused parameter. Don't put check for g_network presence in ASSERT_WE_THINK.
2020-02-21 16:28:03 -08:00
Evan Tschannen
aa4d1357b3
handle the case that there is only one healthy team
2020-02-21 15:41:01 -08:00
Evan Tschannen
457dbc5215
Update fdbserver/DataDistribution.actor.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-02-21 15:39:17 -08:00
Evan Tschannen
6a634652c4
Update fdbserver/DataDistribution.actor.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-02-21 15:39:06 -08:00
Evan Tschannen
08914a2acd
Once available space ratio falls below 0.3 avoid moving data to teams with less free space than the median team
2020-02-21 15:14:32 -08:00
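A minimal sketch of the rule described in the commit above, not the code from the commit itself. It assumes the 0.3 threshold applies to the candidate team's own available-space ratio, and that "median" means the median ratio across all teams. `Team`, `medianAvailableSpace`, and `isEligibleDestination` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a storage team; not the FoundationDB type.
struct Team {
    int id;
    double availableSpaceRatio; // fraction of disk still available, 0.0 to 1.0
};

// Median of the teams' available-space ratios (upper median for even counts).
double medianAvailableSpace(std::vector<Team> const& teams) {
    assert(!teams.empty());
    std::vector<double> ratios;
    ratios.reserve(teams.size());
    for (auto const& t : teams) ratios.push_back(t.availableSpaceRatio);
    auto mid = ratios.begin() + ratios.size() / 2;
    std::nth_element(ratios.begin(), mid, ratios.end());
    return *mid;
}

// Assumed reading of the rule: a team at or above targetRatio is always an
// acceptable destination; below it, the team must have at least the median
// available space to receive more data.
bool isEligibleDestination(Team const& candidate,
                           std::vector<Team> const& allTeams,
                           double targetRatio = 0.3) {
    if (candidate.availableSpaceRatio >= targetRatio) return true;
    return candidate.availableSpaceRatio >= medianAvailableSpace(allTeams);
}
```

Under this reading, a cluster where every team is low on space still has destinations: the fuller half is skipped and the emptier half keeps accepting data.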
Evan Tschannen
e422874758
fix: reboot does not work on unreliable processes
2020-02-21 14:29:42 -08:00
A.J. Beamon
2e699fef55
Don't suppress actor cancellation because we've already initialized the trace event by adding details.
2020-02-21 11:28:59 -08:00
A.J. Beamon
6810a03283
Add more logging to valley filler and mountain chopper
2020-02-21 10:55:14 -08:00
Alvin Moore
9042cab7bc
Changed ordering of link libraries
2020-02-21 08:56:52 -08:00
Alvin Moore
87751df40a
Fixed problem with linking pthread
2020-02-21 08:45:39 -08:00
Evan Tschannen
a27ea63500
Merge branch 'release-6.2' into feature-boost-ssl
2020-02-20 23:06:22 -08:00
Evan Tschannen
59ff782927
fix: only delete the processId file on binaryReader errors
2020-02-20 23:04:39 -08:00
Evan Tschannen
6f1d3ccd35
Merge branch 'release-6.2' into feature-boost-ssl
2020-02-20 20:03:40 -08:00
Evan Tschannen
f04e311a1e
Merge commit 'b46d6e25e24993ab5a5f04091fd3235050b7cd09' into feature-boost-ssl
# Conflicts:
# fdbserver/SimulatedCluster.actor.cpp
# flow/Net2.actor.cpp
2020-02-20 17:36:38 -08:00
Alex Miller
927cff3317
Report errors on TLS misconfigurations ... or at least try to.
2020-02-20 16:57:29 -08:00
Evan Tschannen
819c55556c
More aggressively attempt to find teams that do not have low disk space
2020-02-20 16:47:50 -08:00
A.J. Beamon
e1fb568fd1
Merge branch 'release-6.2' into dd-use-available-space
# Conflicts:
# fdbserver/DataDistribution.actor.cpp
# fdbserver/DataDistribution.actor.h
# fdbserver/DataDistributionQueue.actor.cpp
2020-02-20 16:12:42 -08:00
A.J. Beamon
9e84fa965d
Merge pull request #2703 from ajbeamon/fix-stuck-dd-rebalancing
Fix issue with rebalancing data movement doing no work
2020-02-20 15:56:04 -08:00
Evan Tschannen
d7c841a28a
Merge pull request #2589 from etschannen/feature-proxy-delay
Improve version pipelining on the proxy
2020-02-20 15:23:30 -08:00
A.J. Beamon
e4b483796d
Combine some logic that was doing similar computations for free space ratio.
2020-02-20 14:52:08 -08:00
Evan Tschannen
8b768e66df
Merge pull request #2694 from dongxinEric/feature/2663/specialize-policy-for-zoneid-in-cc
Added a specialized algorithm for PolicyOne and PolicyAcross(,'zoneId…
2020-02-20 14:46:23 -08:00
Evan Tschannen
8129f74a10
Merge pull request #2698 from etschannen/feature-recruit-delay
The CC waits until no new workers register before starting a bad recruitment
2020-02-20 14:42:37 -08:00
Evan Tschannen
7d54acf4ca
removed an unnecessary yield
2020-02-20 14:41:49 -08:00
Evan Tschannen
574e88ba8e
updateGoodRemoteRecruitmentTime was unnecessary because the only way findRemoteWorkers would return would be after a new server has joined which already resets goodRemoteRecruitmentTime
2020-02-20 13:46:22 -08:00
A.J. Beamon
5586e6f6d8
Merge pull request #2697 from etschannen/feature-correctness-fixes
A variety of correctness fixes
2020-02-20 13:32:18 -08:00
A.J. Beamon
4f1301b2dd
Merge pull request #2583 from etschannen/feature-keep-status-connected
Clients should not disconnect from the CC after fetching status
2020-02-20 13:12:30 -08:00
A.J. Beamon
fcbdcda490
Merge pull request #2650 from ajbeamon/fix-reverse-range-read-byte-limit-bug
Fix reverse range read performance bug
2020-02-20 12:47:17 -08:00
A.J. Beamon
6d9decdf59
Merge pull request #2672 from ajbeamon/improve-tlog-role-event
Improve TLog "Role" event
2020-02-20 12:45:25 -08:00
A.J. Beamon
4c9c736253
Data distribution uses available space instead of free space when evaluating whether processes are low on space and penalizing them.
2020-02-20 11:21:03 -08:00
A.J. Beamon
3a1ba5a077
Rename variable for clarity
2020-02-20 10:59:52 -08:00
Evan Tschannen
08c318d28a
re-added the connect lock in the fdbcli so that the timeout is not spent before a connection has been initiated (because of the handshake lock)
2020-02-20 10:43:34 -08:00
Xin Dong
99095c9224
Again make Clang happy.
2020-02-20 09:50:22 -08:00
Xin Dong
298d6cb3d7
Address review comments.
2020-02-20 09:34:01 -08:00
A.J. Beamon
c164acb88d
Add new criteria to DD's GetTeamRequest that allow you to require shards be present on the team and that the team have a minimum free ratio. This avoids scenarios where the team chosen when processing the request is later rejected by the requestor, causing rebalancing movements to get stuck.
2020-02-20 09:32:00 -08:00
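The idea in the commit above, filtering at selection time so the requestor never has to reject the chosen team afterwards, can be sketched as follows. This is an illustration under stated assumptions, not the actual GetTeamRequest definition: `CandidateTeam`, `TeamRequest`, `pickTeam`, and all field names are hypothetical, and choosing the least-loaded survivor is an assumed tie-break.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical view of a team as seen by the selector.
struct CandidateTeam {
    int id;
    int shardCount;      // number of shards currently on the team
    double freeRatio;    // fraction of space still free, 0.0 to 1.0
    long long loadBytes; // current load, used here only to rank candidates
};

// Hypothetical request carrying the two criteria named in the commit message.
struct TeamRequest {
    bool teamMustHaveShards; // require at least one shard on the team
    double minFreeRatio;     // require at least this much free space
};

// Returns the index of the least-loaded team that satisfies every criterion,
// or -1 if none qualifies. Because the criteria are applied here, the caller
// has no reason to reject the result later.
int pickTeam(std::vector<CandidateTeam> const& teams, TeamRequest const& req) {
    int best = -1;
    for (std::size_t i = 0; i < teams.size(); ++i) {
        CandidateTeam const& t = teams[i];
        if (req.teamMustHaveShards && t.shardCount == 0) continue;
        if (t.freeRatio < req.minFreeRatio) continue;
        if (best < 0 || t.loadBytes < teams[best].loadBytes) best = static_cast<int>(i);
    }
    return best;
}
```

A return value of -1 tells the requestor immediately that no team qualifies, so it can back off rather than repeatedly receiving a team it would reject.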
Evan Tschannen
761da5a059
code cleanup
2020-02-19 17:59:45 -08:00
Evan Tschannen
fbd45963d8
The cluster controller waits until no new workers register for 1.0 seconds before starting a bad recruitment
2020-02-19 16:48:30 -08:00
Evan Tschannen
cf4efca852
fix: buffered cursor should always make sure all of the sub-cursors are completely exhausted before calculating minVersion. It is not legal to advance a cursor version past an epochEnd (+100 million versions) without also returning the epochEnd mutation; otherwise the storage servers might not be able to roll back far enough, because the end of the previous epoch will be made durable
2020-02-19 15:24:32 -08:00
Evan Tschannen
9b3254d5f4
A corrupted processId file should be deleted in simulation, as that is the manual operation that would fix the problem in the real world
2020-02-19 15:21:42 -08:00
Evan Tschannen
4326984b1d
fix: wait metrics can take a really long time to detect that two shards have been merged into one if both shards are assigned to the same team. Additional information should be added to the request to improve this.
2020-02-19 15:20:38 -08:00
Xin Dong
89fcbb2055
Make clang happy
2020-02-19 09:44:15 -08:00
Xin Dong
efc0d7f9d5
Added a specialized algorithm for PolicyOne and PolicyAcross(,'zoneId',PolicyOne()) to find a set of TLog servers which will be able to fulfill the policy later.
2020-02-19 09:25:57 -08:00
Alex Miller
88d36af9c7
Fix --tls_password and add better error logging
This refactors all tls settings into a TLSParams object so that we can
set the password before loading any certificates.
It turns out that the FDBLibTLS code did really nice things with error
logging, but I just didn't understand openssl enough before to realize
what pieces I should be copying.
2020-02-19 00:57:05 -08:00
A.J. Beamon
1d9140d874
Removed TLogVersion logging.
Added logging of SharedTLog ID for each TLog.
Switched ID logged for TLogRejoining event to the TLog instead of the SharedTLog.
Made some parameters to startRole passed by reference.
2020-02-14 12:33:43 -08:00
A.J. Beamon
a41aa41816
Merge pull request #2670 from Daniel-B-Smith/skip-memcmp
Revert to memcmp comparison in SkipList
2020-02-14 09:03:08 -08:00
Alex Miller
c859f859bc
Remove certBytes.
2020-02-13 21:34:23 -08:00