Commit Graph

41 Commits

Author SHA1 Message Date
Evan Tschannen 68ac3bdc4c log routers now calculate a precise version to pop for their log router tag 2018-06-21 15:29:46 -07:00
Evan Tschannen eaca0fb2ea fixed incorrect priorities on the log router 2018-06-18 17:36:40 -07:00
Evan Tschannen f694f7c9ca removed hasBestPolicy 2018-06-15 12:36:19 -07:00
Alex Miller fcfa00928b Make RecoveryState an enum class.
This means that all the == 7 or != 0 checks go away, and explicit names must be used.
2018-06-12 16:50:25 -07:00
Evan Tschannen 372ed67497 Merge branch 'master' into feature-remote-logs
# Conflicts:
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/TagPartitionedLogSystem.actor.cpp
2018-06-11 11:34:10 -07:00
A.J. Beamon e5488419cc Attempt to normalize trace events:
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.

Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.

This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
Evan Tschannen f26a2f771d fix: log router popped one too many versions from messageBlocks 2018-06-05 13:42:48 -07:00
Evan Tschannen e95f663ebc fix: the log router could pop too much data from the logs in rare situations 2018-06-03 19:34:24 -07:00
Evan Tschannen c519339adb avoid peeking from logs that do not match the tag’s locality 2018-06-01 18:42:48 -07:00
Evan Tschannen cc6511a39e fix: we do not know that the minimum popped version on the log router is a known committed version until it has advanced. 2018-05-06 09:32:41 -07:00
Evan Tschannen 5143871fed passed debug ids into all versions of peek() to assist debugging 2018-04-30 13:36:35 -07:00
Evan Tschannen 99598d180b fix: the log router must be initialized with all expected tags to prevent mistakenly choosing a minPopped that is too high 2018-04-30 10:58:41 -07:00
Evan Tschannen 9cdabfed0e added useful trace events 2018-04-29 18:54:47 -07:00
Evan Tschannen 2e286b768d fix: locality is needed for a logSet to call getPushLocations
fix: accidentally deleted allowPops assignment on the log router
2018-04-29 13:47:32 -07:00
Evan Tschannen dbdeeaa5cf fix: log routers are given all the information they need to add remote tags in their initialization request 2018-04-28 18:04:57 -07:00
Evan Tschannen 33fa8f2cac fix: make sure log routers only add remote tags from the correct log set 2018-04-28 15:04:13 -07:00
Evan Tschannen a12b994966 fix: log routers need tlogs to be present before accepting data 2018-04-26 18:37:51 -07:00
Evan Tschannen 5da452db8e fix: pop the log routers again after the log system updates 2018-04-19 14:33:31 -07:00
Evan Tschannen 22526ef996 fix: do not tell storage servers about large sections of empty versions, because it can lead them to make mutations durable which have not been committed 2018-04-18 16:06:44 -07:00
Evan Tschannen 447c7bd15b fix: log routers use durable known committed version at the time of the pop to determine what is safe to pop from their logs
fix: storage server does not advance its version across large version increase until it has data associated with the version
2018-04-18 12:07:29 -07:00
Evan Tschannen e43fb6d8bc fix: the log routers were popping too many versions because the known committed version is less than minPopped version 2018-04-17 19:41:36 -07:00
Evan Tschannen 8569a85771 fix: only let a log router pop if the tlog it is serving is fully recovered 2018-04-17 15:03:22 -07:00
Evan Tschannen 760bc8bc99 fix: log router version needs to be fetched before it is available
fix: tlog did not fetch known committed version if start version was exactly equal to it
2018-04-17 11:16:48 -07:00
Evan Tschannen 093908b83f fix: log routers were starting one version too late 2018-04-17 00:29:16 -07:00
Evan Tschannen dcfa1847ff fix: log router’s starting popped version must be less than its starting version 2018-04-16 11:43:03 -07:00
Evan Tschannen f5141acae9 fix: log routers need all logs present in their log system since they call addRemoteTags 2018-04-13 17:33:36 -07:00
Evan Tschannen 3b7e4410cf fix: protect against peeking too early a version from a log router 2018-04-12 16:15:17 -07:00
Evan Tschannen 7af892f50b first working version of non-copying recovery working with fearless configurations 2018-04-08 21:24:05 -07:00
Evan Tschannen b36e08f08f first version of non-copying recovery. Upgrades are broken, and it has not been tested using fearless configurations yet 2018-03-29 15:12:38 -07:00
Evan Tschannen 4dcef08260 optimized the log router to use a vector instead of a map for tag data 2018-03-17 11:08:37 -07:00
Evan Tschannen ccd70fd005 The tlog uses the tags embedded in the message instead of a separate vector of locations
optimized remote tlog committing to avoid re-serializing the message
2018-03-16 16:47:05 -07:00
Evan Tschannen 820382ea68 optimized the log router commit path to avoid re-serializing the data 2018-03-16 11:40:21 -07:00
Evan Tschannen df74e2a373 re-added support for non-copying tlog recovery 2017-10-24 15:09:31 -07:00
Evan Tschannen 15962cf079 Merge branch 'master' into feature-remote-logs
# Conflicts:
#	fdbrpc/Locality.cpp
#	fdbrpc/Locality.h
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/ClusterRecruitmentInterface.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/TagPartitionedLogSystem.actor.cpp
#	fdbserver/WorkerInterface.h
#	fdbserver/fdbserver.vcxproj.filters
#	fdbserver/masterserver.actor.cpp
#	fdbserver/worker.actor.cpp
#	flow/error_definitions.h
2017-10-05 17:09:44 -07:00
Evan Tschannen d343d37274 fixed merge problems 2017-09-11 16:37:10 -07:00
Evan Tschannen ea26bc1c43 passed first tests which kill entire datacenters
added configuration options for the remote data center and satellite data centers
updated cluster controller recruitment logic
refactors how master writes core state
updated log recovery, and log system peeking
2017-09-07 15:32:08 -07:00
Evan Tschannen c22708b6d6 added tag localities
fix: remote logs need to stop the master when they are stopped
2017-08-03 16:16:36 -07:00
Evan Tschannen 5852a6301b fixed even more bugs 2017-07-15 15:15:03 -07:00
Evan Tschannen 57ba9d36af fixed a large number of bugs 2017-07-13 12:29:21 -07:00
Evan Tschannen 979ebcef6c changed to using a vector of logSets instead of a duplicate set of logs for remote servers
finished porting changes to the tlog
everything but peeking is finished in the TagPartitionedLogSystem
2017-07-09 14:46:16 -07:00
Evan Tschannen 0906250e78 merged everything from feature-remote-logs besides the tlog and tagpartitionedlogsystem
re-included tags in messages to the tlog
previously never committed the LogRouter
2017-06-29 15:50:19 -07:00