197 Commits

Author SHA1 Message Date
sramamoorthy 9afd162e2f remove snap v1 related code 2019-07-25 17:29:31 -07:00
sramamoorthy a2f2ad96ff code review comments and merge to master changes 2019-07-24 15:36:28 -07:00
sramamoorthy 30a90e546d make snap v2 the default version 2019-07-24 15:36:28 -07:00
sramamoorthy 209448807d snap v2: fdbclient related changes 2019-07-24 15:36:28 -07:00
Trevor Clinkenbeard 9ad9bd4c1f Merge branch 'master' of https://github.com/apple/foundationdb into change-connection-file 2019-07-24 15:22:26 -07:00
Evan Tschannen 846038b0e6
Merge pull request #1858 from bnamasivayam/rk-ssfetch-throttle
Ratekeeper throttling aggressively when unable to fetch storage server list
2019-07-19 16:41:58 -07:00
Evan Tschannen 3045826e3c
Merge pull request #1819 from mpilman/flatbuffers-fixes2
Flatbuffers fixes2
2019-07-19 16:33:50 -07:00
A.J. Beamon f6183df8b9
Merge pull request #1852 from vishesh/task/issue-1840-non-blocking-exclusion
fdbcli: Add `no_wait` option in `exclude` command to avoid blocking
2019-07-19 12:49:29 -07:00
Vishesh Yadav d9a8657096 fdbcli: Add `no_wait` option in `exclude` command to avoid blocking
RESOLVES #1840
2019-07-18 13:07:31 -07:00
Evan Tschannen 94c66f8d58
Merge pull request #1738 from bnamasivayam/consistency-check-disable
Disable/Re-enable consistency check through a database key.
2019-07-18 10:56:02 -07:00
mpilman 1ac2d01b03 Merge remote-tracking branch 'upstream/master' into flatbuffers-fixes2 2019-07-18 09:50:08 -07:00
Balachandar Namasivayam 406bcebdc4 Ratekeeper to throttle tpsLimit to 1 if it is not able to fetch storage server list for some configurable amount of time. 2019-07-17 18:08:17 -07:00
mpilman d5caf0c1b4 Merge branch 'flatbuffers-fixes2' of github.com:mpilman/foundationdb into flatbuffers-fixes2 2019-07-16 14:47:40 -07:00
A.J. Beamon d5051b08dd Make trace event field lengths (and total event sizes) default knobified and configurable. Add a transaction option to control the field length of transaction debug logging. Make the program start command line field less likely to be truncated. 2019-07-12 16:12:35 -07:00
Andrew Noyes 969957e619 Merge branch 'master' into change-connection-file 2019-07-12 11:39:19 -07:00
A.J. Beamon f4366e69ca Unknown options should not be used internally (i.e. underneath thread-safe API). This commit removes various checks that options exist and replaces them with an ASSERT. 2019-07-11 11:25:39 -07:00
Trevor Clinkenbeard 1582a2a24d Merge branch 'master' of https://github.com/apple/foundationdb into change-connection-file 2019-07-09 13:41:54 -07:00
A.J. Beamon a5a6f8431c Add a random UID to TransactionMetrics in case a client opens multiple connections and also a field to indicate whether the connection is internal. Convert some of the metrics to our Counter object instead of running totals. 2019-07-08 14:01:04 -07:00
Andrew Noyes 9ed8eb2cdb Explain strange use of literal byte strings 2019-07-05 14:07:02 -07:00
Andrew Noyes 889e153b81 Add object serializer flag to fdbcli 2019-07-05 14:07:02 -07:00
Balachandar Namasivayam 7489f83a7f Disable/Re-enable consistency check through a database key.
fdbcli has a new command 'consistencycheck' to disable/re-enable consistency check.
cluster_healthy metric in status becomes false if consistencycheck is disabled.
2019-06-20 21:38:45 -07:00
Andrew Noyes 02e173b601 Add changeConnectionFile method to Transaction
Also add tests
2019-06-11 13:58:22 -07:00
sramamoorthy 61e93a9304 Address review comments and minor fixes 2019-05-28 22:07:46 -07:00
sramamoorthy 69edefe68b Snapshot-based backup and restore implementation 2019-05-28 22:07:46 -07:00
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Austin Seipp 1ce585b5ce fdbcli: fix some print/scan format warnings
Signed-off-by: Austin Seipp <aseipp@pobox.com>
2019-05-06 13:35:29 -07:00
Evan Tschannen 2d5043c665 Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	versions.target
2019-04-30 18:27:04 -07:00
Evan Tschannen b2d19eebc4 fdbcli would return success even when configure failed for a variety of error types
the existing configure safety check would fail when attempting to change between three_datacenter and region configuration
2019-04-30 16:30:19 -07:00
Evan Tschannen 6220a5ce0f
Merge pull request #1370 from jzhou77/fix-unreferenced
Remove unused functions
2019-04-09 11:49:45 -07:00
A.J. Beamon dfad79d577 Update fdbcli/fdbcli.actor.cpp
Co-Authored-By: mpilman <markus@pilman.ch>
2019-04-05 13:12:19 -07:00
A.J. Beamon 9595b43a2d Update fdbcli/fdbcli.actor.cpp
Co-Authored-By: mpilman <markus@pilman.ch>
2019-04-05 13:12:19 -07:00
A.J. Beamon a05350bfa7 Update fdbcli/fdbcli.actor.cpp
Co-Authored-By: mpilman <markus@pilman.ch>
2019-04-05 13:12:19 -07:00
mpilman 1c16f87a4e Remove trace-calls to printable (in non-workloads) 2019-04-05 13:12:19 -07:00
Jingyu Zhou 3371cf22d4 Add manually triggered heap profiling
At client side:
fdb> profile
ERROR: Usage: profile <client|list|flow|heap>
fdb> profile heap 127.0.0.1:4500

On the server side:
$ HEAPPROFILE=/tmp/fdbserver bin/fdbserver -C ../test.cluster -p 127.0.0.1:4500
Starting tracking the heap
FDBD joined cluster.
Dumping heap profile to /tmp/fdbserver.0001.heap (1024 MB allocated cumulatively, 13 MB currently in use)
Dumping heap profile to /tmp/fdbserver.0002.heap (User triggered heap dump)
2019-04-03 16:00:54 -07:00
Evan Tschannen 39c595223b Merge branch 'release-6.1' 2019-04-02 22:30:02 -07:00
Evan Tschannen 628fec8c8b updated status with information about ongoing maintenance
clear the maintenance zone if a different storage server is detected failed
2019-04-02 14:15:51 -07:00
Evan Tschannen 72203ba47a Merge commit '56f3f0b1bc60604f965152d856ae29a591227703' 2019-04-01 18:45:38 -07:00
Evan Tschannen 781cf9b5a0 added the ability to make a zoneId for maintenance in fdbcli 2019-04-01 17:55:13 -07:00
Jingyu Zhou f7f8ddd894 Fix warnings on unused variables
Found by -Wunused-variable flag.
2019-04-01 14:00:20 -07:00
Vishesh Yadav 1ba0b4e682 fix: Parse IPv6 addresses correctly in status details 2019-04-01 12:58:26 -07:00
Jingyu Zhou 7c02ee6fdd Fix compiler warning about unreferenced exception variable 2019-03-26 13:43:47 -07:00
Andrew Noyes 555405d1ab Set TRACE_FORMAT network option in fdbcli
Previously --trace_format did not work since the network option has the
last say in what trace format to use.
2019-03-22 10:35:05 -07:00
Evan Tschannen eb54a700ba changed the old memory configuration to memory-1 2019-03-18 15:10:04 -07:00
Evan Tschannen a372c7cf18 configure memory now selects the ssd engine for transaction log spilling. Transaction log spilling is only used when the transaction logs cannot keep all of their unpopped mutations in memory. If we are only using this data structure because we do not have enough memory, it is much less safe to use the memory storage engine for this purpose. 2019-03-16 22:48:24 -07:00
Vishesh Yadav ed49d603a0 Allows cluster string to contain coordinators with different TLS states
During live TLS upgrades, we can hence switch one coordinator at a time
to TLS rather than all of them together.
2019-03-06 16:05:10 -08:00
Vishesh Yadav 57832e625d net: Support IPv6 #963
- NetworkAddress now contains IPAddress object which can be either
IPv4 or IPv6 address. 128bits are used even for IPv4 addresses,
however only 32bits are used when using/serializing IPv4 address.

- ConnectPacket is updated to store IPv6 address. Backward compatible
with old format since the first 32bits of IP address field is used
for serialization of IPv4.

- Mainly updates rest of the code to use IPAddress structure instead
of plain uint32_t.

- IPv6 address/port pairs should be represented as `[ip]:port` as per
convention. This applies to both cluster files and command line
arguments.
2019-03-04 14:12:41 -08:00
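The `[ip]:port` convention described in the IPv6 commit above can be illustrated with a minimal Python sketch. This is not FoundationDB's parser: the function name `parse_address` and the rule that brackets are required exactly when the address is IPv6 are assumptions made for illustration, and only the standard `ipaddress` module is used.

```python
import ipaddress


def parse_address(text):
    """Split 'ip:port' or '[ipv6]:port' into (ip_object, port).

    Hypothetical helper: it assumes brackets are mandatory for IPv6
    and not allowed for IPv4.
    """
    bracketed = text.startswith("[")
    if bracketed:
        # '[::1]:4500' -> host '::1', port '4500'
        host, sep, port = text[1:].partition("]:")
    else:
        # '127.0.0.1:4500' -> host '127.0.0.1', port '4500'
        host, sep, port = text.rpartition(":")
    if not sep:
        raise ValueError(f"expected ip:port or [ip]:port, got {text!r}")

    ip = ipaddress.ip_address(host)  # raises ValueError on a malformed address
    if bracketed != (ip.version == 6):
        raise ValueError(f"brackets must be used exactly for IPv6: {text!r}")

    port_number = int(port)  # raises ValueError on a non-numeric port
    if not 0 <= port_number <= 65535:
        raise ValueError(f"port out of range: {port_number}")
    return ip, port_number
```

Without the brackets, the colons inside an IPv6 address are indistinguishable from the colon that separates the port, so the sketch rejects a string such as `::1:4500` instead of guessing.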
Evan Tschannen b8910ba7cd Merge branch 'master' into feature-fix-force-recovery
# Conflicts:
#	fdbclient/ManagementAPI.actor.h
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/KillRegion.actor.cpp
2019-02-22 14:38:13 -08:00
mpilman 0bb60e5a3b Use proper fwd decl in NativeAPI
Also NativeAPI.h -> NativeAPI.actor.h
2019-02-19 15:16:59 -08:00
mpilman 3cb2391b58 use proper fwd declarations in ManagementAPI
Also ManagementAPI.h -> ManagementAPI.actor.h
2019-02-19 15:16:59 -08:00
Evan Tschannen 065a45e05f Merge branch 'master' into feature-fix-force-recovery
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/workloads/KillRegion.actor.cpp
2019-02-18 17:09:06 -08:00
Evan Tschannen 8f2af8bed1 fix: forced recoveries now require a target dcid which will become the new primary location. During the forced recovery, the configuration will be changed to make that location primary, and usable_regions will be set to 1. If the target dcid is already the primary location, the forced recovery will do nothing. This makes forced recoveries idempotent, so it is safe for the client to re-send forced recovery commands to the cluster controller.
fix: the cluster controller attempts to do a commit to determine if the cluster is alive, since its own internal recoveryState might not be up-to-date.

fix: forceMasterFailure on the cluster controller did not always cause the current master to be re-recruited
2019-02-18 14:54:28 -08:00
Vishesh Yadav e05b53d755 Merge remote-tracking branch 'apple/master' into task/tls-upgrade 2019-02-15 20:37:07 -08:00
Vishesh Yadav c03de6c7b6 Update CLI to take addresses with mixed TLS states 2019-02-15 20:23:07 -08:00
A.J. Beamon b435d51061 Merge branch 'master' into track-server-request-latencies 2019-02-14 08:07:32 -08:00
Andrew Noyes 067a445e06 Replace unused _ variables with wait(success(...)) 2019-02-12 17:30:30 -08:00
mpilman 6da5971e79 Guard all versions.h to not break old WIN32 build 2019-02-08 16:06:00 -08:00
mpilman 7e26b4ef0d Address comments from PR 2019-02-07 15:37:04 -08:00
mpilman 8a94d80deb fdbservice and fdbrpc now compiling 2019-02-07 15:37:04 -08:00
Balachandar Namasivayam 750124f083 Accounting only non-excluded processes in the user output if recruiting is going on. 2019-02-06 17:04:38 -08:00
Balachandar Namasivayam 9c19d9e9eb Improve status output to reflect that tlogs are recruited across availability zones according to the configured replication factor. 2019-02-04 18:14:00 -08:00
A.J. Beamon 2198d24ce1 Merge commit '3b2700d25334c53d13496ca16682642aac951beb' into track-server-request-latencies
# Conflicts:
#	fdbclient/MasterProxyInterface.h
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/ServerDBInfo.h
#	fdbserver/Status.actor.cpp
#	fdbserver/fdbserver.vcxproj
#	fdbserver/storageserver.actor.cpp
2019-01-24 11:43:26 -08:00
A.J. Beamon 8e05e95045 Added the ability to configure the latency band settings by setting a special key in \xff keyspace. 2019-01-18 16:18:34 -08:00
A.J. Beamon d4d5740282 * Add Optional.map and ErrorOr.map.
* Rename Optional/ErrorOr cast_to to castTo.
* Make printable(Optional<T>) templated rather than restricted to StringRef types.
* Fixes bug in (unused) ErrorOr.castTo where an ErrorOr that was not set would lose its error.
2019-01-11 09:03:38 -08:00
Andrew Noyes 7eb6765698 Mention that xml is the default 2019-01-03 08:48:31 -08:00
Andrew Noyes bce5b03340 Fix whitespace 2019-01-02 15:24:11 -08:00
anoyes 1bca665b29 Document --trace_format flag 2018-12-20 16:22:41 -08:00
anoyes 03b48fb452 s/--trace-format/--trace_format/ 2018-12-20 15:58:26 -08:00
anoyes b8df5acc15 Add --trace_format flag to fdbserver 2018-12-20 15:02:01 -08:00
Evan Tschannen d2d68aa171 Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/ManagementAPI.actor.cpp
#	versions.target
2018-12-03 18:26:52 -08:00
Evan Tschannen 6e94d7d71a changing the database configuration now checks that enough processes exist to support the new configuration 2018-11-30 18:52:24 -08:00
Evan Tschannen 4e54690005 Merge branch 'release-6.0'
# Conflicts:
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/MoveKeys.actor.cpp
2018-11-12 20:26:58 -08:00
Evan Tschannen ccdc83036d simplified region configuration protections 2018-11-12 17:45:20 -08:00
Evan Tschannen 6353a6724b strengthened the protections related to changing regions 2018-11-12 17:40:40 -08:00
Evan Tschannen 4b5d0b4e2c Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/AsyncFileBlobStore.actor.cpp
#	fdbclient/AsyncFileBlobStore.actor.h
#	fdbclient/BlobStore.actor.cpp
#	fdbclient/BlobStore.h
#	fdbclient/HTTP.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbclient/NativeAPI.actor.cpp
#	fdbrpc/LoadBalance.actor.h
#	fdbrpc/batcher.actor.h
#	fdbrpc/fdbrpc.vcxproj
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistributionTracker.actor.cpp
#	fdbserver/SimulatedCluster.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/masterserver.actor.cpp
2018-11-10 13:04:24 -08:00
Evan Tschannen a654183f63
Merge pull request #791 from ajbeamon/remove-cluster-from-iclientapi
Remove cluster from IClientApi (phase 2 of removing DB names)
2018-11-10 10:16:18 -08:00
Evan Tschannen c02690471d added protection against configuration changes which cannot be immediately reverted
the configure database workload tests region configurations
2018-11-04 19:53:55 -08:00
Robert Escriva 268093a96d Adjust all includes to be relative to the root.
Remove the use of relative paths.  A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h".  Adjust so that every include references such a header with the
latter form.

Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
A.J. Beamon 3eb4355a48 Some various cleanup and fixes. Added "Cluster" to TransactionMetrics trace event. 2018-09-25 15:06:19 -07:00
A.J. Beamon c831051474 This removes the idea of clusters from IClientApi. 2018-09-21 15:58:14 -07:00
Evan Tschannen 90301f497f Merge branch 'release-6.0'
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/FlowTransport.actor.cpp
#	fdbrpc/TLSConnection.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/Status.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/StatusWorkload.actor.cpp
#	versions.target
2018-09-05 16:06:33 -07:00
Evan Tschannen 4eaff42e4f
Merge pull request #712 from ajbeamon/remove-database-name-internal
Eliminate use of database names (phase 1)
2018-09-05 10:35:00 -07:00
Evan Tschannen 40f5dbe423 fixed issues from review, added a safeguard to prevent configuring a cluster to an invalid configuration 2018-09-04 22:16:35 -07:00
A.J. Beamon e1fafbf259 Add some missing actorcompiler.h includes 2018-08-22 09:40:45 -07:00
Evan Tschannen d8ea3dbf9a Added the ability to configure a cluster from a JSON file 2018-08-16 17:34:59 -07:00
A.J. Beamon 2a97139d5d This is the first step in eliminating the usage of database names in our code. The C API remains the same, but underneath that all usage of database names is eliminated. 2018-08-16 10:24:12 -07:00
Alex Miller 535b5701e5 Rewrite all `Void _ = wait(...)` -> `wait(...)`.
This takes advantage of the new actorcompiler functionality to avoid
having duplicate definitions of `Void _` when trying to feed the
un-actorcompiled source through clang.
2018-08-14 15:50:26 -07:00
Evan Tschannen 1c29275672 call all methods which could disable a trace event before it is initialized. In practice this means calling .error first, then .suppressFor, then all your details. 2018-08-01 14:30:57 -07:00
Evan Tschannen 507b3bacb0 fix: killing all tlogs in one region prevents the remote logs from recovering in that region; do not allow that to prevent us from configuring usable_regions=1.
added more recovery states.
2018-07-05 00:08:51 -07:00
Balachandar Namasivayam cbdf598fa2 Add force_recovery_with_data_loss to hidden command list. 2018-07-03 15:04:11 -07:00
Evan Tschannen e67f951c06 Merge branch 'master' into feature-remote-logs 2018-07-02 02:18:20 -04:00
Alvin Moore c3f88dbfe1 Merge branch 'master' of github.com:apple/foundationdb into tls-static 2018-07-01 23:13:57 -07:00
Evan Tschannen 7a12d3e130 added the (untested) ability to force a recovery to the remote datacenter, even if that results in data loss. If the DR lag is more than 1 week there could be potential data corruption if any primary storage servers are still alive. 2018-07-01 09:39:04 -04:00
Alvin Moore 45849d1f95 Added support for no-op legacy TLS options 2018-06-27 09:25:05 -07:00
Alvin Moore dd967bd9e2 Removed developmental debug messages 2018-06-26 14:44:21 -07:00
Alvin Moore ef8de426d3 Changed the TLS_DISABLED macro
Disable TLS within Windows until working
2018-06-26 12:08:32 -07:00
Evan Tschannen 8a8914f046 re-added the ability to configure the number of log routers. Many log routers are needed to get a sufficient number of sockets involved in copying data across the WAN 2018-06-22 00:04:00 -07:00
Alvin Moore f8ce1de601 Added support for compiling TLS into binaries 2018-06-20 09:21:23 -07:00
Richard Low 39894ea798 Merge remote-tracking branch 'apple/release-5.2' 2018-06-12 18:31:20 -07:00
Alex Miller cfa7fe8866 Identify processes with host:port regardless of if TLS is enabled or not.
This makes `kill` and `profile` behave like how `exclude` functions, and means
commands don't have to change depending on TLS status.

Verified via starting a TLS cluster and killing a process before and after this change.
2018-06-11 16:49:20 -07:00
A.J. Beamon e5488419cc Attempt to normalize trace events:
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.

Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.

This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
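The two naming rules in the trace-event commit above can be expressed as a small check. The following Python sketch is illustrative only and is not the check that was added to simulation: the function names are hypothetical, and restricting names to ASCII letters and digits is an assumption, since the commit message only specifies the leading uppercase character and the underscore limits.

```python
import re

# Detail names: leading uppercase letter, no underscores.
# Assumption: the remaining characters are ASCII letters or digits.
DETAIL_NAME = re.compile(r"[A-Z][A-Za-z0-9]*")

# Type names: same rule, plus at most one underscore, and the character
# right after the underscore must also be uppercase (Context_Type).
TYPE_NAME = re.compile(r"[A-Z][A-Za-z0-9]*(?:_[A-Z][A-Za-z0-9]*)?")


def valid_detail_name(name):
    """Return True if name follows the detail-name rule sketched above."""
    return DETAIL_NAME.fullmatch(name) is not None


def valid_type_name(name):
    """Return True if name follows the type-name rule sketched above."""
    return TYPE_NAME.fullmatch(name) is not None
```

Under these assumptions `valid_type_name("Context_Type")` is accepted, while `Context_type` and any name with two underscores are rejected.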