Commit Graph

7288 Commits

Author SHA1 Message Date
A.J. Beamon 2431d4d788 Always compute the time for a trace event when it is being logged rather than when it is being created. Usually these are the same, but if they aren't, doing the opposite can lead to out of order trace events. 2020-02-21 13:57:04 -08:00
A.J. Beamon 2e699fef55 Don't suppress actor cancellation because we've already initialized the trace event by adding details. 2020-02-21 11:28:59 -08:00
A.J. Beamon 6810a03283 Add more logging to valley filler and mountain chopper 2020-02-21 10:55:14 -08:00
Evan Tschannen 0b732ac9c8
Merge pull request #2714 from etschannen/release-6.2
fix: only delete the processId file on binaryReader errors
2020-02-20 23:05:39 -08:00
Evan Tschannen 59ff782927 fix: only delete the processId file on binaryReader errors 2020-02-20 23:04:39 -08:00
Evan Tschannen 11561b8cc2
Merge pull request #2712 from etschannen/release-6.2
fixed a number of problems with findBestPolicySetSimple
2020-02-20 20:02:43 -08:00
Evan Tschannen 7056c73f20 fixed a number of problems with findBestPolicySetSimple 2020-02-20 20:00:54 -08:00
Evan Tschannen 57dd0a3f28
Merge pull request #2709 from etschannen/feature-avoid-low-diskspace
More aggressively attempt to find teams that do not have low disk space
2020-02-20 18:09:55 -08:00
Evan Tschannen b46d6e25e2
Merge pull request #2710 from etschannen/release-6.2
fix: zoneid is all lower case
2020-02-20 17:27:29 -08:00
Evan Tschannen a50939417b fix: zoneid is all lower case 2020-02-20 17:26:44 -08:00
Evan Tschannen 819c55556c More aggressively attempt to find teams that do not have low disk space 2020-02-20 16:47:50 -08:00
Evan Tschannen 44f21ca332
Merge pull request #2708 from ajbeamon/dd-use-available-space
Use available space rather than free space in data distribution decisions
2020-02-20 16:33:51 -08:00
A.J. Beamon 9d526d1670 Add a release note. 2020-02-20 16:16:07 -08:00
A.J. Beamon e1fb568fd1 Merge branch 'release-6.2' into dd-use-available-space
# Conflicts:
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistribution.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
2020-02-20 16:12:42 -08:00
A.J. Beamon 9e84fa965d
Merge pull request #2703 from ajbeamon/fix-stuck-dd-rebalancing
Fix issue with rebalancing data movement doing no work
2020-02-20 15:56:04 -08:00
A.J. Beamon d1f050690a
Merge pull request #2454 from ajbeamon/document-range-read-efficiency
Document efficiency of reverse range reads
2020-02-20 15:25:38 -08:00
Evan Tschannen 859977ffbc
Merge pull request #2700 from alexmiller-apple/rpm-usr-bin-python
Don't require /usr/bin/python in RPMs.
2020-02-20 15:24:27 -08:00
Evan Tschannen d7c841a28a
Merge pull request #2589 from etschannen/feature-proxy-delay
Improve version pipelining on the proxy
2020-02-20 15:23:30 -08:00
A.J. Beamon e4b483796d Combine some logic that was doing similar computations for free space ratio. 2020-02-20 14:52:08 -08:00
Evan Tschannen 8b768e66df
Merge pull request #2694 from dongxinEric/feature/2663/specialize-policy-for-zoneid-in-cc
Added a specialized algorithm for PolicyOne and PolicyAcross(,'zoneId…
2020-02-20 14:46:23 -08:00
Evan Tschannen 8129f74a10
Merge pull request #2698 from etschannen/feature-recruit-delay
The CC waits until no new workers register before starting a bad recruitment
2020-02-20 14:42:37 -08:00
Evan Tschannen 7d54acf4ca removed an unnecessary yield 2020-02-20 14:41:49 -08:00
Evan Tschannen 574e88ba8e updateGoodRemoteRecruitmentTime was unnecessary because findRemoteWorkers only returns after a new server has joined, which already resets goodRemoteRecruitmentTime 2020-02-20 13:46:22 -08:00
A.J. Beamon 5586e6f6d8
Merge pull request #2697 from etschannen/feature-correctness-fixes
A variety of correctness fixes
2020-02-20 13:32:18 -08:00
A.J. Beamon 4f1301b2dd
Merge pull request #2583 from etschannen/feature-keep-status-connected
Clients should not disconnect from the CC after fetching status
2020-02-20 13:12:30 -08:00
A.J. Beamon fcbdcda490
Merge pull request #2650 from ajbeamon/fix-reverse-range-read-byte-limit-bug
Fix reverse range read performance bug
2020-02-20 12:47:17 -08:00
A.J. Beamon 6d9decdf59
Merge pull request #2672 from ajbeamon/improve-tlog-role-event
Improve TLog "Role" event
2020-02-20 12:45:25 -08:00
Evan Tschannen def8ca6da3 simulation advances timer() separately from now() to better model the real world 2020-02-20 12:10:20 -08:00
Evan Tschannen 24c6f7616f removed unused code 2020-02-20 11:57:54 -08:00
A.J. Beamon 4c9c736253 Data distribution uses available space instead of free space when evaluating whether processes are low on space and penalizing them. 2020-02-20 11:21:03 -08:00
A.J. Beamon 3a1ba5a077 Rename variable for clarity 2020-02-20 10:59:52 -08:00
Xin Dong 99095c9224 Again make Clang happy. 2020-02-20 09:50:22 -08:00
A.J. Beamon 223b2201e9 Add release note. 2020-02-20 09:36:05 -08:00
Xin Dong 298d6cb3d7 Address review comments. 2020-02-20 09:34:01 -08:00
A.J. Beamon c164acb88d Add new criteria to DD's GetTeamRequest that allow you to require shards be present on the team and that the team have a minimum free ratio. This avoids scenarios where the team chosen when processing the request is later rejected by the requestor, causing rebalancing movements to get stuck. 2020-02-20 09:32:00 -08:00
Alex Miller 6bacd11ab5 Don't require /usr/bin/python in RPMs.
This was being "helpfully" automatically added for us by rpmbuild as it
saw that we listed a python file (make_public.py) in %files.  As that
isn't a vital thing for running FDB, it's better to not make it a hard
requirement.
2020-02-20 02:10:25 -08:00
Evan Tschannen fbd45963d8 The cluster controller waits until no new workers register for 1.0 before starting a bad recruitment 2020-02-19 16:48:30 -08:00
Evan Tschannen cf4efca852 fix: buffered cursor should always make sure all of the sub-cursors are completely exhausted before calculating minVersion. It is not legal to advance a cursor version past an epochEnd (+100 million versions) without also returning the epochEnd mutation, or the storage servers might not be able to roll back far enough because the end of the previous epoch will be made durable 2020-02-19 15:24:32 -08:00
Evan Tschannen 9b3254d5f4 A corrupted processId file should be deleted in simulation, as that is the manual operation that would fix the problem in the real world 2020-02-19 15:21:42 -08:00
Evan Tschannen 4326984b1d fix: wait metrics can take a really long time to detect that two shards have been merged into one if both shards are assigned to the same team. Additional information should be added to the request to improve this. 2020-02-19 15:20:38 -08:00
Evan Tschannen a6486766c2 fix: rebooting an unreliable process will make it reliable again, but while unreliable the files for that process could have already been corrupted so simulation will think a process is healthy that is actually corrupted 2020-02-19 15:18:57 -08:00
Evan Tschannen 46d5f5e325 do not trigger the resetPing if we cannot actually remove the peer, because it will cause us to reset the timeout, so repeated calls to removePeer can keep a dead peer from being removed 2020-02-19 15:17:50 -08:00
Xin Dong 89fcbb2055 Make clang happy 2020-02-19 09:44:15 -08:00
Xin Dong efc0d7f9d5 Added a specialized algorithm for PolicyOne and PolicyAcross(,'zoneId',PolicyOne()) to find a set of TLog servers which will be able to fulfill the policy later. 2020-02-19 09:25:57 -08:00
Alex Miller 8cfc032929
Merge pull request #2693 from satherton/fix-backup-basename-issue
BackupContainerFilesystem no longer unnecessarily depends on abspath()..
2020-02-18 17:43:51 -08:00
Steve Atherton 32d9ede328 Added release note. 2020-02-18 16:45:31 -08:00
Steve Atherton 3d72c2a661 BackupContainerFilesystem no longer unnecessarily depends on abspath() to find the last part of a path string, since it shouldn't touch the local filesystem in the remote case. 2020-02-18 16:35:00 -08:00
Alex Miller 07c800668d
Merge pull request #2676 from atn34/atn34/fix-symbol-visibility
Fix symbol visibility
2020-02-18 16:06:09 -08:00
Andrew Noyes 40c9e2f3d8 Add -no_weak_exports 2020-02-18 10:49:43 -08:00
A.J. Beamon 1d9140d874 Removed TLogVersion logging.
Added logging of SharedTLog ID for each TLog.
Switched ID logged for TLogRejoining event to the TLog instead of the SharedTLog.
Made some parameters to startRole passed by reference.
2020-02-14 12:33:43 -08:00