Commit Graph

4798 Commits

Author SHA1 Message Date
Alec Grieser 0ad351751a
fix build failure on Windows from undefined constant and missing parenthesis 2019-03-24 14:03:24 -04:00
Evan Tschannen 24c92a1870
Merge pull request #1352 from etschannen/feature-network-address-list
Changed NetworkAddressList to at most two addresses for performance
2019-03-24 10:22:38 -07:00
Evan Tschannen 38ed21328a fix: the failure monitoring client did not update secondaryAddress correctly 2019-03-23 23:51:12 -07:00
Evan Tschannen 50a4403661 fix: missing parenthesis 2019-03-23 21:52:15 -07:00
Jingyu Zhou 40eec20252 Restore master PID in worker registration
This fix was lost during a merge.
2019-03-23 21:02:11 -07:00
Jingyu Zhou 3ef26e6be3 Fix fitness assignment statements
Found by MacOS build.
2019-03-23 19:16:04 -07:00
Evan Tschannen 1fc6937802 changed NetworkAddressList to at most two addresses for performance 2019-03-23 17:54:46 -07:00
Evan Tschannen b51a24453e the data distributor and ratekeeper are not included in id_used, but when comparing equally good options we prefer to avoid sharing with those roles
excluded data distributor and ratekeeper were improperly killed when the best option was also excluded
2019-03-23 13:25:36 -07:00
Jingyu Zhou 10988f89d9 Code refactoring for ConsistencyCheck.actor.cpp 2019-03-23 11:06:43 -07:00
Jingyu Zhou fdc5b5ddbf Fix: spurious ratekeeper registration
A rare race condition:
-r simulation -f ./foundationdb/tests/slow/WriteDuringReadAtomicRestore.txt -s 114256311 -b on

- A is the ratekeeper.
- CC recruits B and B starts.
- CC halts ratekeeper A and A is halted
- A registers back with CC, which then halts B. CC sets A to be the ratekeeper.

CC starts recruiting and finds that A is the best machine, but skips recruiting
because CC thinks A is already used. The cluster is now left with no ratekeeper.

Fix by disallowing ratekeeper registration with a previous ID.
2019-03-23 11:03:51 -07:00
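
The race described in fdc5b5ddbf can be illustrated with a minimal sketch. This is a toy model in Python, not FoundationDB's actual C++ actor code: the class `ClusterControllerSketch`, its fields, and the methods `recruit` and `register` are all hypothetical names, and the sketch assumes that "previous ID" means the ID of a ratekeeper that the cluster controller has already halted.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ClusterControllerSketch:
    """Toy model of the registration check; hypothetical, not FoundationDB code."""

    ratekeeper_id: Optional[str] = None
    halted_ids: Set[str] = field(default_factory=set)

    def recruit(self, worker_id: str) -> None:
        # Recruiting a replacement halts the current ratekeeper
        # and remembers its ID as a previous one.
        if self.ratekeeper_id is not None:
            self.halted_ids.add(self.ratekeeper_id)
        self.ratekeeper_id = worker_id

    def register(self, worker_id: str) -> bool:
        # Assumed fix: a registration carrying a previously halted ID is
        # rejected, so it can no longer displace the newly recruited ratekeeper.
        if worker_id in self.halted_ids:
            return False
        if self.ratekeeper_id is None:
            self.ratekeeper_id = worker_id
        return worker_id == self.ratekeeper_id


cc = ClusterControllerSketch()
first = cc.register("A")   # A becomes the ratekeeper
cc.recruit("B")            # CC recruits B and halts A
late = cc.register("A")    # A registers back with its previous ID
```

In this sketch the late registration from A is refused and B stays the ratekeeper, which avoids the sequence in which CC halts B and then skips recruitment because it believes A is still in use.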
Jingyu Zhou 6523cd4931 Fix: recruit ratekeeper is not triggered 2019-03-23 09:20:54 -07:00
Steve Atherton 09f37cf3d2
Merge pull request #533 from ajbeamon/fix-parent-directory
Fixes to parentDirectory() and abspath()
2019-03-22 23:53:46 -07:00
Evan Tschannen 2da46e3172 fix: halt if datacenters are different 2019-03-22 23:53:21 -07:00
Evan Tschannen b68bc46042
Merge pull request #1348 from ajbeamon/fix-missing-metrics-when-ss-down
Fix missing read workload metrics
2019-03-22 19:08:04 -07:00
Evan Tschannen d34c56c9a5 ensure that the processId exists in id_worker before accessing it 2019-03-22 18:54:39 -07:00
Balachandar Namasivayam ac8ad07b45 Address review comments. 2019-03-22 18:48:49 -07:00
Balachandar Namasivayam 4ed323ac52 Fixed bug and addressed review comments. 2019-03-22 18:48:49 -07:00
Balachandar Namasivayam d75020b44a Fix bug where accessing shared memory created by boost 1.52 leads to error when accessed by boost 1.67. 2019-03-22 18:48:49 -07:00
Andrew Noyes eacde47050 Disable setMemoryQuota for ASAN 2019-03-22 18:47:38 -07:00
Evan Tschannen 36ab852bb1 Merge branch 'master' into ratekeeper
# Conflicts:
#	fdbserver/ClusterController.actor.cpp
2019-03-22 18:41:00 -07:00
Evan Tschannen add66350f6 Merge branch 'master' of github.com:apple/foundationdb 2019-03-22 18:38:32 -07:00
Evan Tschannen e3400c13ae fixed a performance regression related to broadcasting a read version to too many transactions simultaneously 2019-03-22 18:37:39 -07:00
Evan Tschannen e37e45723c fix: CompareAndClear does not coalesce with itself 2019-03-22 18:37:39 -07:00
Evan Tschannen 6254a1a8e4 fix: restarting the provisional proxy causes all tlog peeks to restart, so if tlog peeks take longer than 1 second this could end in an infinite loop 2019-03-22 18:37:39 -07:00
Evan Tschannen 7dd1c1b60c fix: processClassFitness could be wrong if the client changed their class while rebooting 2019-03-22 18:37:39 -07:00
Evan Tschannen ddb6058770 simplified ratekeeper monitoring loop 2019-03-22 18:22:45 -07:00
Jingyu Zhou 12917d8c7d Add actors to store halt request futures
Address best fitness in checking better DD or RK.
2019-03-22 18:06:38 -07:00
Jingyu Zhou e8977aeb98 Remove clusterControllerDcId check
This is no longer needed since it'll be set in the ctor.
2019-03-22 18:01:54 -07:00
Evan Tschannen 82bc447e29 startRatekeeper is responsible for updating serverDBInfo 2019-03-22 17:56:16 -07:00
Evan Tschannen 82c80c225d make sure id_worker is updated before setting ratekeeper or data distribution 2019-03-22 17:08:54 -07:00
Evan Tschannen 6a9c9d79cc
Update fdbserver/ClusterController.actor.cpp 2019-03-22 17:00:58 -07:00
Evan Tschannen 70b1c88cdd
Update fdbserver/ClusterController.actor.cpp 2019-03-22 17:00:52 -07:00
Stephen Atherton 382a7bdc5f Changed behavior regarding ~ and ~user paths to treat them as unresolvable symbolic links. 2019-03-22 16:21:12 -07:00
Evan Tschannen efbcd18987 fixed a performance regression related to broadcasting a read version to too many transactions simultaneously 2019-03-22 16:05:20 -07:00
Evan Tschannen 6905db816e fix: CompareAndClear does not coalesce with itself 2019-03-22 15:55:09 -07:00
A.J. Beamon 01590fc14e
Merge pull request #1323 from alecgrieser/00775-database-level-tr-options
Resolves #775: Support setting Transaction options at the Database level
2019-03-22 15:16:10 -07:00
Jingyu Zhou 16f54577ee Restore master PID in cluster controller worker registration
CC may think master failed and clear the master PID, which can block both data
distributor and ratekeeper recruitment. Fix by restoring it during worker
registration.
2019-03-22 14:53:05 -07:00
A.J. Beamon 3a69d1da8e Add PR number to release notes 2019-03-22 14:24:33 -07:00
A.J. Beamon fc48b6050e When tabulating read workload metrics, ignore the absence of any particular storage server. 2019-03-22 14:22:22 -07:00
Evan Tschannen 78f7a2e40b fix: restarting the provisional proxy causes all tlog peeks to restart, so if tlog peeks take longer than 1 second this could end in an infinite loop 2019-03-22 14:13:58 -07:00
Alec Grieser 63f23c0818
add tests for new database behavior to python scripted tests
This also fixes the behavior for the tests of the options which are no longer reset when on_error is called.
2019-03-22 15:10:08 -04:00
Stephen Atherton 524a666290 Added back previous handling of ~ in paths, with the improvement that it only treats ~ as special if it is the first character. 2019-03-22 11:44:46 -07:00
Alec Grieser e6e2ea2af6
Merge remote-tracking branch 'upstream/master' into 00775-database-level-tr-options 2019-03-22 14:41:27 -04:00
A.J. Beamon f3a85f62c1
Merge pull request #1346 from alecgrieser/fix-scripted-bindingtester-test
Fix scripted bindingtester test
2019-03-22 11:38:43 -07:00
A.J. Beamon fd30969139
Merge pull request #1344 from satherton/fix-parsetime-linux
Stopped using %z to parse timezone offset with strptime()...
2019-03-22 11:04:04 -07:00
A.J. Beamon 76e6a3f56d
Update fdbclient/BackupAgentBase.actor.cpp
Co-Authored-By: satherton <stevea@apple.com>
2019-03-22 11:02:38 -07:00
Andrew Noyes 56955deed9 Set TRACE_FORMAT network option in fdbbackup 2019-03-22 10:39:13 -07:00
Andrew Noyes 555405d1ab Set TRACE_FORMAT network option in fdbcli
Previously --trace_format did not work since the network option has the
last say in what trace format to use.
2019-03-22 10:35:05 -07:00
Alec Grieser 5e8e2ef2a6
rename the function where it is defined as well as where it is called 2019-03-22 13:26:41 -04:00
Alec Grieser 9e15872418
remove test that is now extraneous 2019-03-22 13:20:00 -04:00