Jingyu Zhou
0db03f1d3c
Use backup_logging_enabled flag
...
The default is to enable new backup workers. Users can disable this flag to
turn off the backup worker feature.
2020-02-03 20:03:22 -08:00
Jingyu Zhou
297f22726c
Add backup_type database configuration option
...
Update simulation tests to randomly set backup types to be one of: old backup
(default), new backup (tagged), or both (default+tagged).
2020-01-31 19:29:09 -08:00
mengranwo
227edd4248
change memory storage engine name from memory-radixtree to memory-radixtree-beta
2020-01-15 13:49:45 -08:00
mengranwo
f597aa7e18
WIP: deployable/stable version since Nov 3. Start rebase to master branch
2020-01-15 13:49:45 -08:00
negoyal
a4a0bf18f9
Merging with Master.
2019-11-12 13:01:29 -08:00
Evan Tschannen
688940b685
merge 6.2 into master
2019-10-21 11:43:46 -07:00
Evan Tschannen
ef0890c23a
updated status schema
2019-10-16 22:37:57 -07:00
Evan Tschannen
35e816e9ad
added the ability to configure satellite_logs by satellite location; this overrides the region configuration if both are present
2019-10-14 18:30:15 -07:00
Meng Xu
d0147e5e5d
Merge branch 'release-6.2' into mengxu/merge-release620-to-master-v3
...
Resolved Conflicts:
documentation/sphinx/source/release-notes.rst
fdbserver/DataDistribution.actor.cpp
versions.target
2019-10-02 13:22:56 -07:00
Evan Tschannen
045175bd0e
added tracking for the size of the system keyspace
2019-09-27 22:39:19 -07:00
A.J. Beamon
6100d3274d
Merge pull request #2058 from tclinken/expose-lock-status
...
Added lockUID to status output if database is locked
2019-09-11 08:47:35 -07:00
Trevor Clinkenbeard
8c31a839be
s/lockUID/lock_uid in status
2019-09-06 22:20:55 -07:00
Trevor Clinkenbeard
2d216f7ae5
Added database_lock_state to statusSchema
2019-09-06 22:20:50 -07:00
A.J. Beamon
3f9e392668
Merge pull request #2014 from etschannen/feature-fdbcli-sleep
...
Added a sleep command to fdbcli
2019-08-30 11:22:13 -07:00
Evan Tschannen
f3bc7e0abd
do not duplicate data distribution disabled fields in status
...
fixed a few bugs related to the existing data distribution disabled fields in status
2019-08-29 18:41:34 -07:00
A.J. Beamon
2b80d836f4
Merge branch 'release-6.2' into add-coordinator-to-status-roles-list
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2019-08-19 15:03:59 -07:00
A.J. Beamon
b8e57f37d7
Add 'coordinator' to the list of roles that a process can have in status.
2019-08-15 14:42:49 -07:00
A.J. Beamon
bb72cdd36a
Report lag with the usual "seconds" and "versions" fields. Rename and deprecate the qos.*version_lag_storage_server fields.
2019-08-15 13:42:39 -07:00
A.J. Beamon
6581161dd3
Add ratekeeper's durability lag statistics to status
2019-08-15 11:07:04 -07:00
A.J. Beamon
438bc636d5
Rename max_machine_failures_without_losing_X to max_zone_failures_without_losing_X in status.
2019-07-30 14:02:31 -07:00
Evan Tschannen
90e3b50213
Merge branch 'master' into feature-coordinator-connection
...
# Conflicts:
# fdbclient/DatabaseContext.h
# fdbclient/NativeAPI.actor.cpp
# fdbclient/NativeAPI.actor.h
# fdbserver/workloads/KillRegion.actor.cpp
2019-07-26 15:05:02 -07:00
Evan Tschannen
be5d144b8b
added status information on connected clients
2019-07-25 17:15:31 -07:00
Evan Tschannen
846038b0e6
Merge pull request #1858 from bnamasivayam/rk-ssfetch-throttle
...
Ratekeeper throttling aggressively when unable to fetch storage server list
2019-07-19 16:41:58 -07:00
Evan Tschannen
94c66f8d58
Merge pull request #1738 from bnamasivayam/consistency-check-disable
...
Disable/Re-enable consistency check through a database key.
2019-07-18 10:56:02 -07:00
Balachandar Namasivayam
406bcebdc4
Ratekeeper to throttle tpsLimit to 1 if it is not able to fetch storage server list for some configurable amount of time.
2019-07-17 18:08:17 -07:00
A.J. Beamon
2cd05e9ac9
Merge pull request #1712 from tclinken/add-local-rk-to-status
...
Track the local ratekeeper rate in status
2019-07-15 15:17:11 -07:00
Balachandar Namasivayam
9169232fa9
Add the new messages to Schema.
2019-07-15 13:47:27 -07:00
Evan Tschannen
1a18c859c7
knobified the durability lag rate controls
2019-07-12 18:50:56 -07:00
A.J. Beamon
f31884c749
Merge branch 'master' into add-priority-starts-to-status
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2019-07-11 15:26:52 -07:00
A.J. Beamon
97609ad991
Add information about transaction starts at different priorities to status.
2019-07-11 13:54:44 -07:00
A.J. Beamon
b4dbc6d7fa
Change the way cache hits and misses are tracked to avoid counting blind page writes as misses and count the results of partial page writes. Report cache hit rate in status.
2019-07-10 14:43:20 -07:00
A.J. Beamon
69d7c4f79c
Merge branch 'master' into track-run-loop-busyness
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# flow/Net2.actor.cpp
# flow/network.h
2019-07-09 18:39:23 -07:00
Trevor Clinkenbeard
1bac04509e
Track the local ratekeeper rate as a percentage
...
This value is reported in status for each storage server.
2019-07-09 12:46:53 -07:00
A.J. Beamon
4be08d9b2d
Rename datacenter_version_difference to datacenter_lag and include both seconds and versions.
2019-07-05 14:36:18 -07:00
Evan Tschannen
b9a6271375
local ratekeeper no longer globally limits
2019-06-28 16:54:22 -07:00
Evan Tschannen
92b32855ca
ratekeeper’s control algorithm would oscillate when limited by local ratekeeper
2019-06-28 16:54:22 -07:00
A.J. Beamon
7f23814841
Track run loop busyness and report it in status.
2019-06-26 14:03:02 -07:00
Evan Tschannen
dccb9bc26d
fixed a number of correctness problems
2019-06-12 19:40:50 -07:00
Evan Tschannen
8590b710bf
added additional logging on the logs and log routers
2019-05-02 17:24:39 -07:00
Evan Tschannen
3356ac27bf
added three_data_hall_fallback configuration
2019-04-07 22:58:18 -07:00
Evan Tschannen
628fec8c8b
updated status with information about ongoing maintenance
...
clear the maintenance zone if a different storage server is detected failed
2019-04-02 14:15:51 -07:00
Jingyu Zhou
b81de9831f
Fix SchemaMismatch error
...
Add data_distributor and ratekeeper roles to schema.
2019-03-27 09:54:01 -07:00
Evan Tschannen
eb54a700ba
changed the old memory configuration to memory-1
2019-03-18 15:10:04 -07:00
Evan Tschannen
a372c7cf18
configure memory now selects the ssd engine for transaction log spilling. Transaction log spilling is only used when the transaction logs cannot keep all of their unpopped mutations in memory. Since we only use this data structure when we do not have enough memory, it is much less safe to use the memory storage engine for this purpose.
2019-03-16 22:48:24 -07:00
Meng Xu
5a10bf5dfc
Merge branch 'master' into mengxu/tls-switch-status-PR
2019-03-14 10:35:12 -07:00
Evan Tschannen
e068c478b5
merge master
2019-03-12 18:31:25 -07:00
Meng Xu
435e515985
Merge branch 'master' into mengxu/tls-switch-status-PR
2019-03-11 11:17:40 -07:00
Evan Tschannen
80c3f2f8e2
added status fields detailing which processes are degraded, and also the total number of degraded processes
2019-03-10 22:58:15 -07:00
Jingyu Zhou
7340998261
Fix status message for ratekeeper
2019-03-07 13:16:20 -08:00
Meng Xu
845f8fdcbc
Status:healthy: Add optimizing_team_collections
...
Change removing_redundant_teams status name to
optimizing_team_collections.
The new name is more general and can be applied in the future
when we switch storage engines.
2019-03-06 15:05:23 -08:00
Meng Xu
04880e3d4d
Merge branch 'master' into mengxu/tls-switch-status-PR
2019-03-06 13:41:16 -08:00
Meng Xu
b7a52e81e2
Status: Count connected coordinators per client
...
A client will always try to connect to all coordinators.
This commit lets Status track the number of connected coordinators
for each client.
This allows us to canary changes across coordinators. For example,
when we switch from non-TLS to TLS, we can switch 1 coordinator
from non-TLS to TLS. This can help check if a client has the ability
to connect through TLS.
We can make the non-TLS to TLS switch for each coordinator
one by one. This avoids the risk of losing connection in the switch.
2019-03-05 21:21:23 -08:00
Meng Xu
94385447bc
Status: Get if client configured TLS
...
To understand if all clients have configured TLS,
we check the tlsoption when a client tries to open a database.
This is similar to how we track the versions of multi-version clients.
2019-03-01 15:17:01 -08:00
A.J. Beamon
3e6a6a6569
Update status schema for correctness. Send the count of batch transactions started back to ratekeeper so that it can be logged with other ratekeeper metrics.
2019-02-28 12:00:58 -08:00
Evan Tschannen
8afb7fbb9d
Merge pull request #1160 from alexmiller-apple/tstlog-fork
...
Spill-By-Reference TLog Part 2: New and Old TLogServers co-exist harmoniously
2019-02-26 18:00:04 -08:00
Alex Miller
6d23eb2d1a
Implement log_version.
...
This mega-commit introduces a new configuration setting, `log_version`,
that controls the TLog implementations and features that are available
within FDB, so that users can opt in to new features if they're willing
to sacrifice backwards compatibility.
2019-02-22 12:15:23 -08:00
Meng Xu
64db109f20
Status: Add schema for the new data distributor role
2019-02-22 10:05:12 -08:00
Meng Xu
9445ac0b0c
Status: Use new data distributor worker to publish status
...
After we add a new data distributor role, we publish the data
related to data distributor and rate keeper through the new
role (and new worker).
So the status needs to contact the data distributor, instead of master,
to get the status information.
2019-02-21 18:05:50 -08:00
Meng Xu
db19b08762
TeamRemover: Add new status to fdbcli
...
Add the healthy_removing_redundant_teams status to fdbcli
2019-02-21 15:03:32 -08:00
Alex Miller
bf8bfb8137
Set log_spill in SimulationConfig.
...
Which also revealed that it needed to be added to the schema.
2019-02-19 22:30:15 -08:00
A.J. Beamon
d4349293b9
Reworked the way latency counters are tracked. Report the latency bands in separate events from StorageMetrics and ProxyMetrics. Fix a problem when the latency band configuration was changed. Add correctness testing.
2019-02-07 13:39:22 -08:00
A.J. Beamon
2198d24ce1
Merge commit '3b2700d25334c53d13496ca16682642aac951beb' into track-server-request-latencies
...
# Conflicts:
# fdbclient/MasterProxyInterface.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/ServerDBInfo.h
# fdbserver/Status.actor.cpp
# fdbserver/fdbserver.vcxproj
# fdbserver/storageserver.actor.cpp
2019-01-24 11:43:26 -08:00
A.J. Beamon
8e05e95045
Added the ability to configure the latency band settings by setting a special key in \xff keyspace.
2019-01-18 16:18:34 -08:00
Robert Escriva
268093a96d
Adjust all includes to be relative to the root.
...
Remove the use of relative paths. A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h". Adjust so that every include references such a header with the
latter form.
Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
Stephen Atherton
3ea9193fa7
Renamed redwood to redwood-experimental. UnitTest names can now be hidden using # as the first character so that random correctness tests will not run them. Excluded redwood tests from correctness testing. Reverted default storage engine to ssd.
2018-10-05 14:43:54 -07:00
Stephen Atherton
7c1dc305cb
Merge commit 'a72c8f5cb2e79a673abc0ed3d27ef1c51028fb13' into feature-redwood
2018-10-05 10:15:10 -07:00
A.J. Beamon
a98fcf5972
Rename durable_lag to durability_lag
2018-10-01 09:58:49 -07:00
A.J. Beamon
f196e2d4dc
Log metrics about read requests as well as completed reads.
2018-09-27 15:32:39 -07:00
A.J. Beamon
118e21c446
Add new metrics for bytes queried, keys queried, mutation bytes, mutations, and durable lag to the storage role in status.
2018-09-27 14:33:21 -07:00
Stephen Atherton
da70ba7e68
Update status schema to know about redwood storage engine. Remove status schema document which is no longer used.
2018-09-20 11:01:21 -07:00
Evan Tschannen
282e9e41c2
fix: windows build was broken because statusSchema string was too long
2018-09-05 11:40:04 -07:00
Evan Tschannen
40f5dbe423
fixed issues from review, added a safeguard to prevent configuring a cluster to an invalid configuration
2018-09-04 22:16:35 -07:00
Evan Tschannen
d8ea3dbf9a
Added the ability to configure a cluster from a JSON file
2018-08-16 17:34:59 -07:00