Commit Graph

2762 Commits

Author SHA1 Message Date
Evan Tschannen c05c95cbe8 forgot to rename the knob 2020-02-25 15:47:39 -08:00
Evan Tschannen 12b5064041 a high free_space_ratio_cutoff is not needed anymore because avoid teams with low disk space is no longer the responsibility of getLoadBytes() 2020-02-25 15:47:10 -08:00
Evan Tschannen 6e7d2ff7dd prevent the proxy from delaying too long based on an incorrect estimate of the compute time 2020-02-25 15:46:13 -08:00
Evan Tschannen 65fbe0d0bc revert AcceptSocket priority change because of bad performance results 2020-02-21 19:22:14 -08:00
A.J. Beamon 4c696d5bf2 Merge branch 'release-6.2' into dd-better-rebalance-logging
# Conflicts:
#	fdbserver/DataDistributionQueue.actor.cpp
2020-02-21 17:41:00 -08:00
A.J. Beamon dfa5f76c01 Remove unused parameter. Don't put check for g_network presence in ASSERT_WE_THINK. 2020-02-21 16:28:03 -08:00
Evan Tschannen aa4d1357b3 handle the case that there is only one healthy team 2020-02-21 15:41:01 -08:00
Evan Tschannen 457dbc5215
Update fdbserver/DataDistribution.actor.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-02-21 15:39:17 -08:00
Evan Tschannen 6a634652c4
Update fdbserver/DataDistribution.actor.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-02-21 15:39:06 -08:00
Evan Tschannen 08914a2acd Once available space ratio falls below 0.3 avoid moving data to teams with less free space than the median team 2020-02-21 15:14:32 -08:00
Evan Tschannen e422874758 fix: reboot does not work un unreliable processes 2020-02-21 14:29:42 -08:00
A.J. Beamon 2e699fef55 Don't suppress actor cancellation because we've already initialized the trace event by adding details. 2020-02-21 11:28:59 -08:00
A.J. Beamon 6810a03283 Add more logging to valley filler and mountain chopper 2020-02-21 10:55:14 -08:00
Alvin Moore 9042cab7bc Changed ordering of link libraries 2020-02-21 08:56:52 -08:00
Alvin Moore 87751df40a Fixed problem with linking pthread 2020-02-21 08:45:39 -08:00
Evan Tschannen a27ea63500 Merge branch 'release-6.2' into feature-boost-ssl 2020-02-20 23:06:22 -08:00
Evan Tschannen 59ff782927 fix: only delete the processId file on binaryReader errors 2020-02-20 23:04:39 -08:00
Evan Tschannen 6f1d3ccd35 Merge branch 'release-6.2' into feature-boost-ssl 2020-02-20 20:03:40 -08:00
Evan Tschannen f04e311a1e Merge commit 'b46d6e25e24993ab5a5f04091fd3235050b7cd09' into feature-boost-ssl
# Conflicts:
#	fdbserver/SimulatedCluster.actor.cpp
#	flow/Net2.actor.cpp
2020-02-20 17:36:38 -08:00
Alex Miller 927cff3317 Report errors on TLS misconfigurations ... or at least try to. 2020-02-20 16:57:29 -08:00
Evan Tschannen 819c55556c More aggressively attempt to find teams that do not have low disk space 2020-02-20 16:47:50 -08:00
A.J. Beamon e1fb568fd1 Merge branch 'release-6.2' into dd-use-available-space
# Conflicts:
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistribution.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
2020-02-20 16:12:42 -08:00
A.J. Beamon 9e84fa965d
Merge pull request #2703 from ajbeamon/fix-stuck-dd-rebalancing
Fix issue with rebalancing data movement doing no work
2020-02-20 15:56:04 -08:00
Evan Tschannen d7c841a28a
Merge pull request #2589 from etschannen/feature-proxy-delay
Improve version pipelining on the proxy
2020-02-20 15:23:30 -08:00
A.J. Beamon e4b483796d Combine some logic that was doing similar computations for free space ratio. 2020-02-20 14:52:08 -08:00
Evan Tschannen 8b768e66df
Merge pull request #2694 from dongxinEric/feature/2663/specialize-policy-for-zoneid-in-cc
Added a specialized algorithm for PolicyOne and PolicyAcross(,'zoneId…
2020-02-20 14:46:23 -08:00
Evan Tschannen 8129f74a10
Merge pull request #2698 from etschannen/feature-recruit-delay
The CC waits until no new workers register before starting a bad recruitment
2020-02-20 14:42:37 -08:00
Evan Tschannen 7d54acf4ca removed an unnecessary yield 2020-02-20 14:41:49 -08:00
Evan Tschannen 574e88ba8e updateGoodRemoteRecruitmentTime was unnecessary because the only way findRemoteWorkers would return would be after a new server has joined which already resets goodRemoteRecruitmentTime 2020-02-20 13:46:22 -08:00
A.J. Beamon 5586e6f6d8
Merge pull request #2697 from etschannen/feature-correctness-fixes
A variety of correctness fixes
2020-02-20 13:32:18 -08:00
A.J. Beamon 4f1301b2dd
Merge pull request #2583 from etschannen/feature-keep-status-connected
Clients should not disconnect from the CC after fetching status
2020-02-20 13:12:30 -08:00
A.J. Beamon fcbdcda490
Merge pull request #2650 from ajbeamon/fix-reverse-range-read-byte-limit-bug
Fix reverse range read performance bug
2020-02-20 12:47:17 -08:00
A.J. Beamon 6d9decdf59
Merge pull request #2672 from ajbeamon/improve-tlog-role-event
Improve TLog "Role" event
2020-02-20 12:45:25 -08:00
A.J. Beamon 4c9c736253 Data distribution uses available space instead of free space when evaluating whether processes are low on space and penalizing them. 2020-02-20 11:21:03 -08:00
A.J. Beamon 3a1ba5a077 Rename variable for clarity 2020-02-20 10:59:52 -08:00
Evan Tschannen 08c318d28a re-added the connect lock in the fdbcli so that the timeout is not spent before a connection has been initiated (because of the handshake lock) 2020-02-20 10:43:34 -08:00
Xin Dong 99095c9224 Again make Clang happy. 2020-02-20 09:50:22 -08:00
Xin Dong 298d6cb3d7 Address review comments. 2020-02-20 09:34:01 -08:00
A.J. Beamon c164acb88d Add new criteria to DD's GetTeamRequest that allow you to require shards be present on the team and that the team have a minimum free ratio. This avoids scenarios where the team chosen when processing the request is later rejected by the requestor, causing rebalancing movements to get stuck. 2020-02-20 09:32:00 -08:00
Evan Tschannen 761da5a059 code cleanup 2020-02-19 17:59:45 -08:00
Evan Tschannen fbd45963d8 The cluster controller waits until no new workers register for 1.0 before starting a bad recruitment 2020-02-19 16:48:30 -08:00
Evan Tschannen cf4efca852 fix: buffered cursor should always make sure all of the sub-cursors are completely exhausted before calculating minVersion. It is not legal to advance a cursor version past an epochEnd (+100 million versions) without also returning the epochEnd mutation, or the storage servers might not be able to rollback far enough because the end of the previous epoch will be made durable 2020-02-19 15:24:32 -08:00
Evan Tschannen 9b3254d5f4 A corrupted processId file should be deleted in simulation, as that is the manual operation that would fix the problem in the real world 2020-02-19 15:21:42 -08:00
Evan Tschannen 4326984b1d fix: wait metrics can take a really long time to detect that two shards have been merged into one if both shards are assigned to the same team. Additional information should be added to the request to improve this. 2020-02-19 15:20:38 -08:00
Xin Dong 89fcbb2055 Make clang happy 2020-02-19 09:44:15 -08:00
Xin Dong efc0d7f9d5 Added a specialized algorithm for PolicyOne and PoilcyAcross(,'zoneId',PolicyOne()) to find a set of TLog servers which will be able to fulfill the policy later. 2020-02-19 09:25:57 -08:00
Alex Miller 88d36af9c7 Fix --tls_password and add better error logging
This refactors all tls settings into a TLSParams object so that we can
set the password before loading any certificates.

It turns out that the FDBLibTLS code did really nice things with error
logging, but I just didn't understand openssl enough before to realize
what pieces I should be copying.
2020-02-19 00:57:05 -08:00
A.J. Beamon 1d9140d874 Removed TLogVersion logging.
Added logging of SharedTLog ID for each TLog.
Switched ID logged for TLogRejoining event to the TLog instead of the SharedTLog.
Made some parameters to startRole passed by reference.
2020-02-14 12:33:43 -08:00
A.J. Beamon a41aa41816
Merge pull request #2670 from Daniel-B-Smith/skip-memcmp
Revert to memcmp comparison in SkipList
2020-02-14 09:03:08 -08:00
Alex Miller c859f859bc Remove certBytes. 2020-02-13 21:34:23 -08:00