Commit Graph

2008 Commits

Author SHA1 Message Date
Evan Tschannen cad435499e
Merge pull request #485 from alexmiller-apple/feature-remotelogs-improvements
Feature remotelogs improvements
2018-06-13 09:52:08 -07:00
Alex Miller 8518f6a8a8 smartQuorum shouldn't return if more responses are desired than futures provided. 2018-06-12 16:50:25 -07:00
Alex Miller 6c2cb25c53 Rename BestOtherFit -> OkayFit.
The previous order of fitness was

  BestFit > GoodFit > BestOtherFit > ...

which is baffling.  It's now:

  BestFit > GoodFit > OkayFit > ...

which won't break anyone's expectations.
2018-06-12 16:50:25 -07:00
Alex Miller fcfa00928b Make RecoveryState an enum class.
This means that all the == 7 or != 0 checks go away, and explicit names must be used.
2018-06-12 16:50:25 -07:00
Steve Atherton 731c3b38e8
Merge pull request #481 from satherton/release-5.2
Reduce backup parallel tasks to decrease memory usage.
2018-06-12 13:16:24 -07:00
Steve Atherton 0481928bd8
Reduce backup parallel tasks to decrease memory usage. 2018-06-12 13:15:24 -07:00
A.J. Beamon bd9e416e8c
Merge pull request #479 from ajbeamon/release-notes-spelling-fix
Spelling fix in release notes
2018-06-12 12:25:38 -07:00
A.J. Beamon fc1c7c2f82 Spelling fix in release notes 2018-06-12 12:22:44 -07:00
A.J. Beamon a1901701d6
Merge pull request #449 from jkominek/python-getfullargspec
use inspect.getfullargspec when available
2018-06-12 11:58:04 -07:00
Evan Tschannen e12b7a79aa
Merge pull request #464 from etschannen/feature-remote-logs
fixed another trace event
2018-06-11 12:54:02 -07:00
Evan Tschannen 8dfda1e57b fixed another trace event 2018-06-11 12:53:07 -07:00
Evan Tschannen 3c50fb8d47
Merge pull request #462 from etschannen/feature-remote-logs
fixed trace event name
2018-06-11 12:44:08 -07:00
Evan Tschannen e28769b98e fixed trace event name 2018-06-11 12:43:08 -07:00
Evan Tschannen 33dd2b157a
Merge pull request #458 from etschannen/feature-remote-logs
Do not index tags on TLogs for remote localities
2018-06-11 12:23:21 -07:00
Evan Tschannen 372ed67497 Merge branch 'master' into feature-remote-logs
# Conflicts:
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/TagPartitionedLogSystem.actor.cpp
2018-06-11 11:34:10 -07:00
Evan Tschannen 588eaf4b36 fix: previous delay 0 could still cause us to recruit a tlog before processing disk errors 2018-06-11 11:26:30 -07:00
Alec Grieser 7817c4ac92
Merge pull request #457 from brownleej/site-map-fix
Add administration and TLS sections to the site map.
2018-06-11 11:20:25 -07:00
John Brownlee cd4ce7843d Add administration and TLS sections to the site map.
#264
2018-06-11 11:13:44 -07:00
Evan Tschannen 64e0260085 fix: assert did not properly handle default constructed policies 2018-06-10 21:51:59 -07:00
Evan Tschannen b60264024a fix: we need to copy the txsTag on satellite logs 2018-06-10 20:30:44 -07:00
Evan Tschannen a5c2a8ee8a fix: allow disk errors to cancel the actor before recruiting logs 2018-06-10 20:27:19 -07:00
Evan Tschannen 134b5d6f65 fix: only consider data distribution started when remote has recovered so quiet database works correctly 2018-06-10 20:25:15 -07:00
Evan Tschannen 2407e3774b fix: we cannot run with less storage replication than log replication because it breaks recruitment logic 2018-06-10 20:22:58 -07:00
Evan Tschannen 4903df5ce9 fix: give time to detect failed servers before building teams 2018-06-10 20:21:39 -07:00
Evan Tschannen 0bc7274d0e fix: hasSatelliteReplication was set incorrectly 2018-06-10 20:20:41 -07:00
Evan Tschannen 6e48d93d39 backed out the healthy team check because it was unnecessary 2018-06-10 12:43:32 -07:00
Evan Tschannen 8a24bf6124 describe did not list all the log sets 2018-06-10 12:38:50 -07:00
Evan Tschannen 82be52205b
Merge pull request #447 from bnamasivayam/save-fitness-info
Save fitness info of a process to become a cluster controller. This i…
2018-06-08 16:18:00 -07:00
Evan Tschannen b9826dc1cb fix: do not automatically reduce redundancy when we move keys if the database does not have remote replicas. This is to prevent problems when dropping remote replicas from a configuration. 2018-06-08 16:17:27 -07:00
Balachandar Namasivayam 8360f71cbb Merge branch 'master' of github.com:apple/foundationdb into save-fitness-info
# Conflicts:
#	fdbserver/worker.actor.cpp
2018-06-08 16:09:59 -07:00
Evan Tschannen 35ab8b9b8e
Merge pull request #455 from ajbeamon/master
Some more trace event normalization
2018-06-08 16:03:34 -07:00
A.J. Beamon 06ccd9a500 Allow trace event type names to end with an underscore. 2018-06-08 15:49:31 -07:00
A.J. Beamon 1fdfe20908 Relax the rules on trace event Types a bit by allowing multiple underscores, as well as starting with an underscore and consecutive underscores. 2018-06-08 15:40:29 -07:00
Evan Tschannen 48fbc407fd fix: we cannot kill all of the remote tlogs, because we still need their data to copy to the next generation in the same data center 2018-06-08 15:28:44 -07:00
Balachandar Namasivayam 32285ee958 Don't crash if fitness file is corrupted in real production use case. 2018-06-08 14:03:36 -07:00
A.J. Beamon 99c9958db7 Some more trace event normalization 2018-06-08 13:57:00 -07:00
Balachandar Namasivayam 34995d4d64 Address review comments. 2018-06-08 11:51:51 -07:00
Evan Tschannen 943dfa278b
Merge pull request #454 from ajbeamon/normalize-trace-events
Attempt to normalize trace events
2018-06-08 11:28:44 -07:00
A.J. Beamon c12b235080 Fix case in a few commented out trace events 2018-06-08 11:20:06 -07:00
A.J. Beamon e5488419cc Attempt to normalize trace events:
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.

Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.

This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
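
The first two naming rules in the commit message above can be sketched as a small checker. This is only an illustration of the rules as stated, not the check that was added to simulation; the function names `is_valid_detail_name` and `is_valid_type_name` are hypothetical, and the sketch does not try to detect camel case or cover the later commits that relax the underscore rules.

```python
def _starts_upper(part):
    # Hypothetical helper: non-empty and begins with an uppercase character.
    return bool(part) and part[0].isupper()


def is_valid_detail_name(name):
    # Detail names start with an uppercase character and contain no underscores.
    return _starts_upper(name) and "_" not in name


def is_valid_type_name(name):
    # Type names follow the same rules but allow one underscore (Context_Type);
    # the first character after the underscore must also be uppercase.
    parts = name.split("_")
    return len(parts) <= 2 and all(_starts_upper(p) for p in parts)
```

Under these assumptions, `is_valid_type_name("Context_Type")` passes, while `is_valid_detail_name("newSeverity")` fails because of the leading lowercase character.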
A.J. Beamon f1d389448c
Merge pull request #453 from apple/release-5.2
Merge release-5.2 into master
2018-06-08 10:41:44 -07:00
A.J. Beamon 6461478695
Merge pull request #452 from apple/release-5.1
Merge release-5.1 into release-5.2
2018-06-08 10:41:13 -07:00
Evan Tschannen d7d38c3544
Merge pull request #430 from ajbeamon/rename-logGroup-attribute
Rename trace file logGroup attribute to LogGroup
2018-06-08 10:30:45 -07:00
Evan Tschannen 953c27e570
Merge pull request #431 from ajbeamon/tlog-rename-variables
Rename several variables in TLogServer.actor.cpp to follow our normal camel case conventions.
2018-06-08 10:30:22 -07:00
Evan Tschannen 12c45ccf79
Merge pull request #451 from ajbeamon/release-5.1
Fix case of newSeverity detail in StderrSeverity trace event
2018-06-08 10:28:30 -07:00
A.J. Beamon c9543791fd Fix case of newSeverity detail in StderrSeverity trace event 2018-06-08 10:24:12 -07:00
Jay Kominek fb33412b3a use inspect.getfullargspec when available
getargspec was deprecated in Python 3; this should use
getfullargspec when available and degrade gracefully
otherwise.
2018-06-08 01:07:18 -06:00
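
The fallback described in the commit message above might look roughly like the following. This is a minimal sketch and not the code from the pull request; `get_arg_names` and `example` are hypothetical names used only for illustration.

```python
import inspect


def get_arg_names(func):
    # Prefer getfullargspec, which exists on Python 3.
    if hasattr(inspect, "getfullargspec"):
        return inspect.getfullargspec(func).args
    # Degrade gracefully on interpreters that only provide getargspec.
    return inspect.getargspec(func).args


def example(tr, key, value=None):
    # Hypothetical function used only to demonstrate the helper.
    return (tr, key, value)


arg_names = get_arg_names(example)
```

Checking for the attribute at call time, rather than comparing interpreter version numbers, keeps the helper working on any interpreter that provides either function.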
Evan Tschannen 7d392689fe fix: only update metrics for healthy destinations, because unhealthy destinations are already in the source 2018-06-07 18:12:04 -07:00
Evan Tschannen e4d5817679 fix: we must serve getTeam requests before readyToStart is set because we cannot complete relocateShard requests without getTeam responses from both team collections 2018-06-07 16:14:40 -07:00
Evan Tschannen 9f0c16f062 do not build teams which contain failed servers 2018-06-07 14:05:53 -07:00