Evan Tschannen
68ac3bdc4c
log routers now calculate a precise version to pop for their log router tag
2018-06-21 15:29:46 -07:00
Evan Tschannen
eaca0fb2ea
fixed incorrect priorities on the log router
2018-06-18 17:36:40 -07:00
Evan Tschannen
f694f7c9ca
removed hasBestPolicy
2018-06-15 12:36:19 -07:00
Alex Miller
fcfa00928b
Make RecoveryState an enum class.
...
This means that all the == 7 or != 0 checks go away, and explicit names must be used.
2018-06-12 16:50:25 -07:00
Evan Tschannen
372ed67497
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# fdbserver/DataDistribution.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
2018-06-11 11:34:10 -07:00
A.J. Beamon
e5488419cc
Attempt to normalize trace events:
...
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.
Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.
This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
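The naming rules listed in commit e5488419cc above can be sketched as two small predicates. This is a minimal illustration, not the check that commit added: the function names `isValidDetailName` and `isValidTypeName` are hypothetical, and rejecting a trailing underscore is an assumption, since the commit message only says that the character after the underscore is uppercase.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not from the repository): a detail name must start
// with an uppercase character and contain no underscores.
bool isValidDetailName(const std::string& name) {
    if (name.empty() || !std::isupper(static_cast<unsigned char>(name[0]))) {
        return false;
    }
    return name.find('_') == std::string::npos;
}

// Hypothetical helper (not from the repository): a type name follows the
// same rules, except that one underscore is allowed (Context_Type) and the
// character right after it must also be uppercase.
bool isValidTypeName(const std::string& name) {
    if (name.empty() || !std::isupper(static_cast<unsigned char>(name[0]))) {
        return false;
    }
    const std::size_t first = name.find('_');
    if (first == std::string::npos) {
        return true;  // no underscore: same rule as a detail name
    }
    if (name.find('_', first + 1) != std::string::npos) {
        return false;  // more than one underscore
    }
    // Assumption: a trailing underscore is rejected because no uppercase
    // character follows it.
    return first + 1 < name.size() &&
           std::isupper(static_cast<unsigned char>(name[first + 1]));
}
```

Under these assumptions, `Context_Type` passes as a type name, while `Context_type` and any name with two underscores do not. Neither predicate checks for head-first camel case, which the commit message notes was harder to verify.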
Evan Tschannen
f26a2f771d
fix: log router popped one too many versions from messageBlocks
2018-06-05 13:42:48 -07:00
Evan Tschannen
e95f663ebc
fix: the log router could pop too much data from the logs in rare situations
2018-06-03 19:34:24 -07:00
Evan Tschannen
c519339adb
avoid peeking from logs that do not match the tag’s locality
2018-06-01 18:42:48 -07:00
Evan Tschannen
cc6511a39e
fix: we do not know that the minimum popped version on the log router is a known committed version until it has advanced.
2018-05-06 09:32:41 -07:00
Evan Tschannen
5143871fed
passed debug ids into all versions of peek() to assist debugging
2018-04-30 13:36:35 -07:00
Evan Tschannen
99598d180b
fix: the log router must be initialized with all expected tags to prevent mistakenly choosing a minPopped that is too high
2018-04-30 10:58:41 -07:00
Evan Tschannen
9cdabfed0e
added useful trace events
2018-04-29 18:54:47 -07:00
Evan Tschannen
2e286b768d
fix: locality is needed for a logSet to call getPushLocations
...
fix: accidentally deleted allowPops assignment on the log router
2018-04-29 13:47:32 -07:00
Evan Tschannen
dbdeeaa5cf
fix: log routers are given all the information they need to add remote tags in their initialization request
2018-04-28 18:04:57 -07:00
Evan Tschannen
33fa8f2cac
fix: make sure log routers only add remote tags from the correct log set
2018-04-28 15:04:13 -07:00
Evan Tschannen
a12b994966
fix: log routers need tlogs to be present before accepting data
2018-04-26 18:37:51 -07:00
Evan Tschannen
5da452db8e
fix: pop the log routers again after the log system updates
2018-04-19 14:33:31 -07:00
Evan Tschannen
22526ef996
fix: do not tell storage servers about large sections of empty versions, because it can lead them to make mutations durable which have not been committed
2018-04-18 16:06:44 -07:00
Evan Tschannen
447c7bd15b
fix: log routers use durable known committed version at the time of the pop to determine what is safe to pop from their logs
...
fix: storage server does not advance its version across large version increase until it has data associated with the version
2018-04-18 12:07:29 -07:00
Evan Tschannen
e43fb6d8bc
fix: the log routers were popping too many versions because the known committed version is less than minPopped version
2018-04-17 19:41:36 -07:00
Evan Tschannen
8569a85771
fix: only let a log router pop if the tlog it is serving is fully recovered
2018-04-17 15:03:22 -07:00
Evan Tschannen
760bc8bc99
fix: log router version needs to be fetched before it is available
...
fix: tlog did not fetch known committed version if start version was exactly equal to it
2018-04-17 11:16:48 -07:00
Evan Tschannen
093908b83f
fix: log routers were starting one version too late
2018-04-17 00:29:16 -07:00
Evan Tschannen
dcfa1847ff
fix: log router’s starting popped version must be less than its starting version
2018-04-16 11:43:03 -07:00
Evan Tschannen
f5141acae9
fix: log routers need all logs present in their log system since they call addRemoteTags
2018-04-13 17:33:36 -07:00
Evan Tschannen
3b7e4410cf
fix: protect against peeking at too early a version from a log router
2018-04-12 16:15:17 -07:00
Evan Tschannen
7af892f50b
first version of non-copying recovery that works with fearless configurations
2018-04-08 21:24:05 -07:00
Evan Tschannen
b36e08f08f
first version of non-copying recovery. Upgrades are broken, and it has not been tested using fearless configurations yet
2018-03-29 15:12:38 -07:00
Evan Tschannen
4dcef08260
optimized the log router to use a vector instead of a map for tag data
2018-03-17 11:08:37 -07:00
Evan Tschannen
ccd70fd005
The tlog uses the tags embedded in the message instead of a separate vector of locations
...
optimized remote tlog committing to avoid re-serializing the message
2018-03-16 16:47:05 -07:00
Evan Tschannen
820382ea68
optimized the log router commit path to avoid re-serializing the data
2018-03-16 11:40:21 -07:00
Evan Tschannen
df74e2a373
re-added support for non-copying tlog recovery
2017-10-24 15:09:31 -07:00
Evan Tschannen
15962cf079
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# fdbrpc/Locality.cpp
# fdbrpc/Locality.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/ClusterRecruitmentInterface.h
# fdbserver/TLogServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
# fdbserver/WorkerInterface.h
# fdbserver/fdbserver.vcxproj.filters
# fdbserver/masterserver.actor.cpp
# fdbserver/worker.actor.cpp
# flow/error_definitions.h
2017-10-05 17:09:44 -07:00
Evan Tschannen
d343d37274
fixed merge problems
2017-09-11 16:37:10 -07:00
Evan Tschannen
ea26bc1c43
passed first tests which kill entire datacenters
...
added configuration options for the remote data center and satellite data centers
updated cluster controller recruitment logic
refactored how the master writes core state
updated log recovery, and log system peeking
2017-09-07 15:32:08 -07:00
Evan Tschannen
c22708b6d6
added tag localities
...
fix: remote logs need to stop the master when they are stopped
2017-08-03 16:16:36 -07:00
Evan Tschannen
5852a6301b
fixed even more bugs
2017-07-15 15:15:03 -07:00
Evan Tschannen
57ba9d36af
fixed a large number of bugs
2017-07-13 12:29:21 -07:00
Evan Tschannen
979ebcef6c
changed to using a vector of logSets instead of a duplicate set of logs for remote servers
...
finished porting changes to the tlog
everything but peeking is finished in the TagPartitionedLogSystem
2017-07-09 14:46:16 -07:00
Evan Tschannen
0906250e78
merged everything from feature-remote-logs besides the tlog and tagpartitionedlogsystem
...
re-included tags in messages to the tlog
previously never committed the LogRouter
2017-06-29 15:50:19 -07:00