Alec Grieser
2868908c14
make use of Tuple.pack(prefix) in java tests
2017-10-09 15:28:52 -07:00
Alec Grieser
152e10eba1
added hasIncompleteVersionstamp utility method to tuples
2017-10-09 13:52:00 -07:00
Alec Grieser
a9cc7af79e
added versionstamps to java tuples
2017-10-09 11:07:34 -07:00
Evan Tschannen
5e6eba365b
fix: always set confChange, because popVersion is not deterministic across proxies, and confChange needs to be set deterministically
2017-10-06 18:37:08 -07:00
Evan Tschannen
93b3d0e4e7
fix: toMap didn’t report logs, proxies, and resolvers
2017-10-06 15:55:50 -07:00
Alex Miller
a21c8a820b
Move cpuProfilerRequest from WorkerInterface to ClientWorkerInterface.
...
A way to access this stream is required if we wish to be able to toggle
profiling from fdbcli. There are two ways to do this:
1. Use `monitorLeader()` to get a `ClusterControllerFullInterface`, and use
`getWorkers` from there to get a list of `WorkerInterface`s, from which we can
access cpuProfilerRequest.
2. Move cpuProfilerRequest to ClientWorkerInterface and use the existing code
in the client that can fetch a list of all `ClientWorkerInterface`s.
The split between WorkerInterface and ClientWorkerInterface appears to be
what a client might have a need to call versus what is fdbserver-internal (and
thus no client should even want to call). Thus, it seems to make more sense to
acknowledge that profiling is useful to be able to toggle from a client, and go
with option (2).
2017-10-05 14:08:28 -07:00
Yichi Chiang
3edc2824a9
Add initialClass to RegisterWorkerRequest 2
2017-10-05 11:03:25 -07:00
John Brownlee
6ad9e389dc
Merge pull request #168 from cie/fdbmonitor-fork-retry-support
...
Add support for retrying a process if fork fails. The HUP signal now …
2017-10-05 10:19:43 -07:00
Alvin Moore
0c899c167a
Merge branch 'master' of github.com:apple/foundationdb
2017-10-05 08:54:37 -07:00
A.J. Beamon
c1bc355306
Add support for retrying a process if fork fails. The HUP signal now causes configuration to be reloaded and timeouts to be reset. A little refactoring to make this easier.
2017-10-05 08:23:52 -07:00
Alvin Moore
de8f875038
Fixed call to IsClear
...
Changed killMachine and killDataCenter interface to return final killtype
Updated TESTs for DataCenter to ensure that DataCenter was killed
Added assertion to ensure that failed DC kills were not downgrades
2017-10-05 03:07:20 -07:00
Yichi Chiang
05f7626e39
Add initialClass to RegisterWorkerRequest
2017-10-04 17:11:12 -07:00
Yichi Chiang
3c70df57b5
Fix cluster controller review comments
2017-10-04 15:48:55 -07:00
A.J. Beamon
63570ccb05
Merge pull request #163 from cie/alexmiller/txnprofcli
...
Allow client profiling to be configured from fdbcli.
2017-10-04 14:35:14 -07:00
Alex Miller
2e662b6bb6
Fixing review comments.
...
* parse_with_units found a proper home in flow.h while this was pending
* atof->strtod for error checking
2017-10-04 14:00:38 -07:00
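The atof->strtod change noted in the commit above can be illustrated with a minimal sketch. `atof` gives no way to tell "0" apart from unparsable input, whereas `strtod` reports where parsing stopped and sets `errno` on range errors. The helper name `parseDouble` is hypothetical and is not taken from the FoundationDB source.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper (not from the FoundationDB source): parse a whole
// string as a double and report failure instead of silently returning 0.
bool parseDouble(const std::string& text, double& out) {
    if (text.empty()) return false;

    errno = 0;
    char* end = nullptr;
    const double value = std::strtod(text.c_str(), &end);

    // Nothing was consumed: the input did not start with a number.
    if (end == text.c_str()) return false;
    // Parsing stopped early: trailing characters follow the number.
    if (end != text.c_str() + text.size()) return false;
    // The value was out of range for a double.
    if (errno == ERANGE) return false;

    out = value;
    return true;
}
```

A caller checks the return value before using `out`, which is the error check that `atof` cannot provide.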
Alex Miller
e55cc447d2
Address code review comments.
...
* Fixed memory corruption with SystemData key constants
* Removed duplication in ClusterController
* Reworked fdbcli actions to better represent explicit vs default assignments
2017-10-04 13:36:18 -07:00
Alex Miller
80fa597422
Allow client profiling to be configured from fdbcli.
...
This adds the following commands:
* profile client status
* profile client on 0.001 100MB
* profile client off
2017-10-04 13:36:18 -07:00
A.J. Beamon
5063793f36
Revert line ending change
2017-10-04 11:19:19 -07:00
A.J. Beamon
d886b95628
Merge pull request #131 from cie/33300740-with-shutdown-hooks
...
<rdar://problem/33300740> Java: support callbacks from external multi-version client threads
2017-10-04 09:17:25 -07:00
Alex Miller
706427ee62
Fix potential division by zero issues via RPC.
...
A carefully crafted SplitMetricRequest could have caused division by zero.
It's not really great to offer Division By Zero As A Service, so let's just
return an error instead.
2017-10-03 22:11:08 -07:00
Alex Miller
0ac868ad5d
"Simplify" IndexedSet's insert and addMetric API.
...
The existing code tried to work around the complexities of optionally using
rvalue references' move capabilities if they exist. As seen in the previous
MapPair, there's a combinatorial explosion of prototypes to declare as the
parameter length increases. Because of this, addMetric ended up with a strange
API, and there was a wrapper to make a copy for insert.
Instead, we can apply the idiom of using universal/forwarding references and
std::forward to allow the compiler to instantiate the combinations that are
needed. There's a TagData struct with no copy constructor that validates that
move constructors can be properly called still.
I measured a 12-byte difference between before and after this change, so no
template bloat was introduced.
2017-10-03 20:15:12 -07:00
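The forwarding-reference idiom described in the IndexedSet commit above can be sketched as follows. This is an illustration under stated assumptions, not the actual IndexedSet code: `TinySet` and `MoveOnly` are hypothetical names, with `MoveOnly` standing in for a type with no copy constructor such as the TagData struct the message mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical move-only type: copying is deleted, so insert() only
// compiles for it if the argument is really moved into the container.
struct MoveOnly {
    int value;
    explicit MoveOnly(int v) : value(v) {}
    MoveOnly(const MoveOnly&) = delete;
    MoveOnly& operator=(const MoveOnly&) = delete;
    MoveOnly(MoveOnly&&) = default;
    MoveOnly& operator=(MoveOnly&&) = default;
};

// Hypothetical container, not the real IndexedSet.
template <class T, class Metric>
class TinySet {
public:
    // One member template replaces the lvalue/rvalue overload combinations:
    // the compiler instantiates only the combinations that callers use.
    template <class T_, class Metric_>
    void insert(T_&& item, Metric_&& metric) {
        items.emplace_back(std::forward<T_>(item), std::forward<Metric_>(metric));
    }

    std::size_t size() const { return items.size(); }
    const std::pair<T, Metric>& at(std::size_t i) const { return items[i]; }

private:
    std::vector<std::pair<T, Metric>> items;
};
```

Passing an lvalue copies it and leaves the caller's object untouched; passing an rvalue moves it, so no separate copy-making wrapper is needed.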
Evan Tschannen
3a2ddcc84a
Add destinations that are read-write to the source list, so that cancelled data movement can contribute to copying the data for the next movement.
2017-10-03 17:39:08 -07:00
Stephen Atherton
fd5fe3a000
Add slightly better handling of HTTP 503 in blob client. Previously it would end the blob request loop and the task doing the blob action would see a failure, but now the blob request attempt loop will continue to back off and retry. This is better because previously the task that saw the failure would be re-run quickly.
2017-10-03 15:25:49 -07:00
Stephen Atherton
03c4cea511
Added rate-controlled TraceEvents for blob http connection attempts and failures.
2017-10-03 15:21:40 -07:00
Yichi Chiang
25870189ae
Merge pull request #165 from cie/fix-connection-count
...
Fix connection count
2017-10-03 14:03:54 -07:00
Yichi Chiang
284e35204a
Fix connection count
2017-10-03 10:54:20 -07:00
Alvin Moore
5257b99d3f
Fixed problem with machines RebootedAndCleared not being considered dead in availability consideration
2017-10-03 10:48:16 -07:00
Evan Tschannen
7818a7972b
fix: read_lock_aware had the same code as used_during_commit_protection_disable
2017-10-03 09:37:45 -07:00
Balachandar Namasivayam
0e153cdd35
Throttle Spammy logs. Three knobs are added.
...
Trace Events are sampled and cached with an expiration set. Every TraceEvent above SevDebug is checked against this cache to see if it exceeded a set threshold. If yes, then throttle the TraceEvent.
If a TraceEvent is throttled, a warning message is logged.
2017-10-02 18:43:11 -07:00
Alvin Moore
d099656557
Merge branch 'release-5.0'
2017-10-02 12:05:24 -07:00
Alvin Moore
25513d8e2c
Added tests for DataCenter kills
2017-10-02 12:04:28 -07:00
Evan Tschannen
6ea9903c82
Merge branch 'release-5.0'
...
# Conflicts:
# fdbbackup/backup.actor.cpp
# fdbserver/ClusterController.actor.cpp
# versions.target
2017-10-01 18:46:44 -07:00
Evan Tschannen
8250528756
updated versions.target for 5.0.6
2017-10-01 18:42:16 -07:00
Evan Tschannen
1a73e31959
updated wix guid
2017-10-01 18:37:22 -07:00
Evan Tschannen
5f82bfe533
updated release notes for 5.0.5
2017-10-01 17:45:56 -07:00
Stephen Atherton
fe7530ed53
Merge branch 'release-5.0' of github.com:apple/foundationdb into release-5.0
2017-10-01 16:45:56 -07:00
Stephen Atherton
ad9de674ac
Knob change, blob requests should be allowed more time.
2017-10-01 16:45:47 -07:00
Evan Tschannen
0949c4be65
Revert "Fixed problem with master being recruited on excluded servers"
...
This reverts commit 1f7b624734a8ad6e896dd3f01f9cdf334ca62486.
2017-10-01 16:30:19 -07:00
Evan Tschannen
696d432462
Revert "fix: excluded servers are worst fit for master rather than never assign (so that we can recover if every process has been excluded)"
...
This reverts commit 83b2ce68c8e1a29fc1559598cc38d3ef7eb46101.
2017-10-01 16:29:32 -07:00
Evan Tschannen
0dde15f1d2
fix: excluded servers are worst fit for master rather than never assign (so that we can recover if every process has been excluded)
...
fix: better master exists did not use exclusions because the configuration was reset
2017-10-01 16:26:58 -07:00
Stephen Atherton
058300be16
Each blobstore request will again select a random remote address. This used to happen before recent load balancing improvements related to focusing too much load on consistently up endpoints after others have recovered from being down.
2017-10-01 16:17:38 -07:00
Stephen Atherton
13a79482d8
Added comments for clarity.
2017-10-01 16:03:12 -07:00
Stephen Atherton
a95107417f
Improved behavior of slow writes during backup. KeyRange and Log backup tasks now use TaskBucket::saveAndExtend() to keep the task alive until flushing the file finishes or fails with an error (blob uploads fail after a limited number of retries). This prevents blob uploads from being retried too often if the destination is slow since a task abort and retry would start the backoff counters back at zero. Also removed a debugging behavior that was accidentally checked in.
2017-10-01 16:01:24 -07:00
Stephen Atherton
a098919b20
Bug fix, releaser declared in wrong place, and lots of whitespace cleanup from try blocks that were no longer needed.
2017-10-01 11:25:50 -07:00
Stephen Atherton
af87ac301d
Removed a "wait never" used for debugging, which was accidentally included in the bug fix.
2017-10-01 11:19:38 -07:00
Stephen Atherton
6000cafde1
Bug fix, locks were being taken inside try/catch so release would be done even if the take threw an error. Changed to using a Releaser.
2017-10-01 10:46:55 -07:00
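The bug class described in the commit above, and the Releaser-style fix, can be sketched with a simplified synchronous example. `CountingLock`, `Releaser`, and `doWork` here are hypothetical stand-ins and do not reproduce the FoundationDB classes. The point is that the release lives in a destructor that runs only if the take succeeded.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical counting lock: take() throws when no capacity is left.
struct CountingLock {
    int capacity;
    int held = 0;
    explicit CountingLock(int c) : capacity(c) {}
    void take() {
        if (held >= capacity) throw std::runtime_error("lock unavailable");
        ++held;
    }
    void release() { --held; }
};

// RAII guard: if take() throws, the constructor never completes, so the
// destructor (and therefore release()) is never run for a failed take.
class Releaser {
public:
    explicit Releaser(CountingLock& l) : lock(l) { lock.take(); }
    ~Releaser() { lock.release(); }
    Releaser(const Releaser&) = delete;
    Releaser& operator=(const Releaser&) = delete;

private:
    CountingLock& lock;
};

// Returns true if the work ran to completion, false on any error.
bool doWork(CountingLock& lock, bool fail) {
    try {
        Releaser guard(lock);  // take happens here
        if (fail) throw std::runtime_error("work failed");
        return true;
    } catch (const std::exception&) {
        // No manual release here: the guard already handled it, and only
        // when the take had actually succeeded.
        return false;
    }
}
```

With a manual release in the catch block, a failed take would still decrement the count; with the guard, the count changes only for takes that succeeded.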
Evan Tschannen
f84e7252e8
fix: there was a reference counting cycle in asyncFileBlobStore and asyncFileReadAhead
2017-09-29 19:13:08 -07:00
A.J. Beamon
38616424f6
Report a couple error cases in blobstore URL parsing when dealing with numbers.
2017-09-29 17:58:49 -07:00
Alex Miller
440437f190
Merge pull request #156 from cie/alexmiller/drtime
...
Make versionstamped operations always have a version less than the current database version
2017-09-29 17:30:53 -07:00
Yichi Chiang
636ce4a131
Replace the leader when a better one is found
2017-09-29 16:34:55 -07:00