Commit Graph

1091 Commits

Author SHA1 Message Date
Stephen Atherton e495afdc6e Removed unnecessary c_str(). 2019-05-16 17:26:37 -07:00
Stephen Atherton 7b8e140788 Added UID field to fdbbackup status --json output. 2019-05-16 17:23:33 -07:00
Alvin Moore 94aed513c7 Switched Windows tools within projects to 2017 2019-05-16 15:05:11 -07:00
Evan Tschannen f3897238f8 added the ability to add a read conflict range on the metadata version key without the READ_SYSTEM_KEYS option 2019-05-15 10:13:38 -07:00
mpilman 46e7a0ca56 address reviews and make compile with `-Wunused-variable` 2019-05-13 14:15:23 -07:00
mpilman 96aaa31a6c Compiling on clang again 2019-05-13 14:15:23 -07:00
mpilman 06745568f7 Fixed Tag serialization 2019-05-13 14:15:23 -07:00
mpilman 44db3450ec Several flatbuffers bug fixes 2019-05-13 14:15:23 -07:00
mpilman 69fa3d3903 fixed compilation issues after rebase 2019-05-13 14:15:23 -07:00
mpilman 47fdb78782 use EnsureTable for enums of enums 2019-05-13 14:15:22 -07:00
mpilman 7affe73774 Added more low-level tests and implemented enums 2019-05-13 14:15:22 -07:00
mpilman 8df7818b17 Added a test to reproduce current failure
Currently, enabling the ObjectSerializer results
in failures. These are hard to debug. Therefore I added
a unit test that reproduces the problem.
2019-05-13 14:15:22 -07:00
mpilman 6afce01744 Implementation complete (not yet working) 2019-05-13 14:15:22 -07:00
mpilman 92bad76479 Wrap ClusterClientInterface into its own type
When a process joins a cluster it fetches the cluster
interface. However, not the whole interface is exposed
to the client. This mechanism relies on the fact that
the serializer keeps the field ordering and doesn't
verify the message before parsing it.

To make this work, we provide a client type with one
member (the ClusterInterface which is exposed to the
client and the server). This client interface has the
same FileIdentifier as the ClusterControllerFullInterface
which has the same first member. This works because
FlatBuffers allows for members to be missing.
2019-05-13 14:15:22 -07:00
mpilman 9eeb48c43d Allow to turn on object serializer
This commit includes functionality to turn on
the object serializer for network communication.
This is done the following way:

- On incoming connections, a process will detect
  whether the client supports the object serializer
  and will only serialize responses with it, if it does
- On outgoing connections, the command line flag is used
  to determine whether the object serializer should be used
  to send data.

This way, a cluster can run in mixed mode. To upgrade one
can upgrade one process at a time and set the flag one process
at a time.

This is how this is tested on the simulator:
- The command line flag can take three options: on, off,
  and random.
- For off, the object serializer will never we used.
- For on, the object serializer will be always used.
- For random, the simulator will flip a coin for each
  process it starts up.
2019-05-13 14:15:22 -07:00
mpilman ba83c458a6 types implemented 2019-05-13 14:15:22 -07:00
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Evan Tschannen 22499666d0 Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/LogRouter.actor.cpp
#	flow/Trace.cpp
#	versions.target
2019-05-08 18:19:35 -07:00
Austin Seipp b5cbffc1b8 fdbclient: fix some print/scan format warnings
Signed-off-by: Austin Seipp <aseipp@pobox.com>
2019-05-06 13:35:29 -07:00
Evan Tschannen 8590b710bf added additional logging on the logs and log routers 2019-05-02 17:24:39 -07:00
Jingyu Zhou e193cac5ef Merge remote-tracking branch 'apple/master' into tlog
Resolve Conflicts: fdbserver/MasterProxyServer.actor.cpp
2019-05-01 17:18:00 -07:00
Evan Tschannen 2d5043c665 Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	versions.target
2019-04-30 18:27:04 -07:00
Alex Miller 6124a7e145 Update fdbclient/ManagementAPI.actor.cpp 2019-04-30 16:30:19 -07:00
Evan Tschannen b2d19eebc4 fdbcli would return success even when configure failed for a variety of error types
the existing configure safety check would fail when attempting to change between three_datacenter and region configuration
2019-04-30 16:30:19 -07:00
Alex Miller 797d431934 Add an \xff keyrange that is backed by the txnStateStore. 2019-04-25 17:04:20 -07:00
Alvin Moore 28d96fe612 Merge pull request #1470 from mpilman/fixes/windows-parallel-build
Don't pass options file to coverage tool
2019-04-25 12:02:10 -07:00
Jingyu Zhou 5462f560e7 Add pseudo locality for log routers and tlogs
This changes the logic of pop operations from log routers (LG):
- LG pops tagLocalityLogRouterMapped from TLogs;
- TLog converts tagLocalityLogRouterMapped back to tagLocalityLogRouter before
  popping.

Later when we add more psuedo localities, the same pattern can be used.
2019-04-23 21:35:56 -07:00
Evan Tschannen 4fa1c008f9 Highly prioritize storageServerRejoin messages on the proxy, so that storage servers can rejoin the cluster even when a proxy is CPU saturated 2019-04-23 20:56:01 -07:00
A.J. Beamon e0f76edf77
Merge pull request #1471 from AlvinMooreSr/release-6.1-merge
Merge Release 6.1 Into Master
2019-04-23 11:08:21 -07:00
Jingyu Zhou d2b215b926 Refactor tag population of ServerCacheInfo 2019-04-22 11:55:04 -07:00
Jingyu Zhou 0b1984978a Small code refactoring. 2019-04-21 10:41:07 -07:00
A.J. Beamon 65c297bd75
Merge pull request #1465 from atn34/fix-open-for-ide-build
Fix OPEN_FOR_IDE -Wunused-variable warnings
2019-04-19 15:08:00 -07:00
Alvin Moore 2bea99591e Merge branch 'release-6.1' of copy of master
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
2019-04-17 15:51:48 -07:00
mpilman 8ef453eb2e Don't pass options file to coverage tool
This hopefully fixes a weird msbuild issue where files are generated
more than once. This issue is described here:
https://gitlab.kitware.com/cmake/cmake/issues/16767

This workaround makes coverage_tool not to use the generated options
files, which hopefully will result in msbuild not trying to generate
these files more than once.
2019-04-17 12:10:51 -07:00
Andrew Noyes 781b6ece77 Fix OPEN_FOR_IDE -Wunused-variable warnings
CC #1255, #1173
2019-04-16 15:28:01 -07:00
Andrew Noyes 6207d724f8 Fix all -Wunused-variable warnings 2019-04-15 18:13:00 -07:00
Evan Tschannen cd5c9d91fa
Merge pull request #1443 from etschannen/master
Merge 6.1 into master
2019-04-10 17:43:07 -07:00
Balachandar Namasivayam 04e9aa6afd For small clusters that are growing quickly, it could happen that the rateLimit is set to a low value and it would take very long to read the entire database. Fix this by setting the rateLimit to the maximum allowed value if reading the entire database is taking a long time. 2019-04-10 17:13:37 -07:00
Evan Tschannen 6220a5ce0f
Merge pull request #1370 from jzhou77/fix-unreferenced
Remove unused functions
2019-04-09 11:49:45 -07:00
Evan Tschannen 21c0ba555c Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	versions.target
2019-04-08 18:38:42 -07:00
A.J. Beamon 538b431656 Apply suggestions from code review 2019-04-08 14:55:58 -07:00
A.J. Beamon a7288e1325 Throw process_behind instead of future_version when all storage nodes on a team are behind. process_behind gets the same backoff behavior as not_committed. Add proxy_memory_limit_exceeded to the retryable predicate. 2019-04-08 14:21:24 -07:00
mpilman bdba8e22eb Added test and bugfixes 2019-04-08 11:05:29 -07:00
mpilman 207049e852 fixed serialization 2019-04-08 11:04:44 -07:00
mpilman 32393ec4c9 Prototype of local ratekeeper 2019-04-08 11:04:44 -07:00
Evan Tschannen 7fd0af1888 the default failure state for local addresses should be not failed 2019-04-08 10:43:48 -07:00
Evan Tschannen 3356ac27bf added three_data_hall_fallback configuration 2019-04-07 22:58:18 -07:00
mpilman d01cbf3455 Addressed code review comments 2019-04-05 13:12:20 -07:00
A.J. Beamon 45eb6b0973 Update fdbclient/BackupAgentBase.actor.cpp
Co-Authored-By: mpilman <markus@pilman.ch>
2019-04-05 13:12:19 -07:00
A.J. Beamon 389ec1c2ac Update fdbclient/BackupAgentBase.actor.cpp
Co-Authored-By: mpilman <markus@pilman.ch>
2019-04-05 13:12:19 -07:00
mpilman 39ecbedd74 Fixed compilation errors on OS X & gcc8 2019-04-05 13:12:19 -07:00
mpilman 1c16f87a4e Remove trace-calls to printable (in non-workloads) 2019-04-05 13:12:19 -07:00
mpilman ea67b742c7 Implemented Traceable for printable types 2019-04-05 13:12:19 -07:00
mpilman c008e16c81 Defer formatting in traces to make them cheaper
This is the first part of making `TraceEvent` cheaper. The main idea is
to defer calls to any code that formats string. These are the main
changes:

- TraceEvent::detail now takes a c-string instead of std::string for
  literals. This prevents unnecessary allocations if the trace is not
  going to be printed in the first place (for example for SevDebug).
  Before that `detail` expected a `std::string` as key, which mean that
  any string literal would be copied on each call.
- Templates Traceable and SpecialTraceMetricType. These templates can be
  specialized for any type that needs to be printed. The actual
  formatting will be deferred to after the `enabled` check. This
  provides two benefits: (1) if a TraceEvent is disabled, we don't pay
  for the formatting and (2) TraceEvent can trace types that it doesn't
  know about.
- TraceEvent::enabled will be set in the constructor if the Severity is
  passed. This will make sure that `TraceEvent::init` is not called.
- `TraceEvent::detail` will be inlined. So for disabled TraceEvent
  calls, a call to detail will only introduce a if-branch which is much
  cheaper than a function call.
2019-04-05 13:12:19 -07:00
Alex Miller 433060ef95
Update fdbclient/ClientWorkerInterface.h
Co-Authored-By: jzhou77 <jingyuzhou@gmail.com>
2019-04-03 20:05:19 -07:00
Jingyu Zhou 3371cf22d4 Add manually triggered heap profiling
At client side:
fdb> profile
ERROR: Usage: profile <client|list|flow|heap>
fdb> profile heap 127.0.0.1:4500

On the server side:
$ HEAPPROFILE=/tmp/fdbserver bin/fdbserver -C ../test.cluster -p 127.0.0.1:4500
Starting tracking the heap
FDBD joined cluster.
Dumping heap profile to /tmp/fdbserver.0001.heap (1024 MB allocated cumulatively, 13 MB currently in use)
Dumping heap profile to /tmp/fdbserver.0002.heap (User triggered heap dump)
2019-04-03 16:00:54 -07:00
Evan Tschannen 39c595223b Merge branch 'release-6.1' 2019-04-02 22:30:02 -07:00
Evan Tschannen a38c396283 made all maintenance transactions lock aware 2019-04-02 14:27:48 -07:00
Evan Tschannen 628fec8c8b updated status with information about ongoing maintenance
clear the maintenance zone if a different storage server is detected failed
2019-04-02 14:15:51 -07:00
Evan Tschannen d57e76f7e6 Merge commit '56f3f0b1bc60604f965152d856ae29a591227703' into feature-maintenance-zone 2019-04-01 18:58:45 -07:00
Evan Tschannen 72203ba47a Merge commit '56f3f0b1bc60604f965152d856ae29a591227703' 2019-04-01 18:45:38 -07:00
Evan Tschannen 8714394d42 increase the priority of the client’s version batch timeout, so that we prefer issuing the batch over other possible work 2019-04-01 18:37:40 -07:00
Evan Tschannen 980df10f5a Merge branch 'release-6.1' into feature-maintenance-zone 2019-04-01 18:22:37 -07:00
Evan Tschannen 781cf9b5a0 added the ability to make a zoneId for maintenance in fdbcli 2019-04-01 17:55:13 -07:00
Stephen Atherton bae815a777 Bug fix, starting a restore on a tag already in-use would spinloop forever and eventually run out of memory. 2019-04-01 15:00:24 -07:00
Jingyu Zhou 47b4b82628
Merge branch 'master' into fix-unreferenced 2019-04-01 14:07:19 -07:00
Jingyu Zhou 3f76be8f45 Merge remote-tracking branch 'apple/master' into fix-unreferenced 2019-04-01 14:00:43 -07:00
Jingyu Zhou f7f8ddd894 Fix warnings on unused variables
Found by -Wunused-variable flag.
2019-04-01 14:00:20 -07:00
Evan Tschannen a46620fbee Merge branch 'release-6.1' 2019-03-30 17:59:28 -07:00
Evan Tschannen 29a37beb20 fixed a valgrind correctness bug 2019-03-30 12:01:36 -07:00
Evan Tschannen d882c060bf Merge commit '5dd6396eed0de0dfea6cf9eecc307995eff5cedc' 2019-03-28 18:00:55 -07:00
Evan Tschannen b6008558d3 renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>()
eliminated an unnecessary copy from the proxy commit path
eliminated an unnecessary copy from buffered peek cursor
2019-03-28 11:52:50 -07:00
Evan Tschannen 99e71d561e
Merge pull request #1373 from etschannen/master
Merge 6.1 into master
2019-03-27 21:04:28 -07:00
Evan Tschannen 795ce9f137
Merge pull request #1369 from jzhou77/ratekeeper
Fix SchemaMismatch error
2019-03-27 21:01:37 -07:00
Evan Tschannen 836bb95a7a
Merge pull request #1372 from etschannen/master
Merge 6.1 into master
2019-03-27 21:00:49 -07:00
Evan Tschannen 34b9d5e722
Merge pull request #1364 from etschannen/feature-fast-serialize
A few performance optimizations
2019-03-27 20:57:25 -07:00
Evan Tschannen f1a4bdd70d changed failureMonitor to use an unordered_map 2019-03-27 19:17:08 -07:00
Jingyu Zhou b81de9831f Fix SchemaMismatch error
Add data_distributor and ratekeeper roles to schema.
2019-03-27 09:54:01 -07:00
Evan Tschannen 83b2ff8b08
Merge pull request #1366 from ajbeamon/docs-update-batch-priority
Update documentation about batch priority transactions
2019-03-26 16:08:27 -07:00
Evan Tschannen efa9b1cd73
Merge pull request #1359 from ajbeamon/fix-database-memory-leak
Avoiding holding references to ThreadSafeDatabase on the main thread.
2019-03-26 16:02:03 -07:00
A.J. Beamon f363bdb007 Update documentation about batch priority transactions 2019-03-26 15:45:38 -07:00
Jingyu Zhou 7c02ee6fdd Fix compiler warning about unreferenced exception variable 2019-03-26 13:43:47 -07:00
A.J. Beamon 1429ffe8ab Initialize the tm.tm_isdst field because it isn't set by strptime. 2019-03-26 09:00:45 -07:00
A.J. Beamon fe68a1f7fb Avoiding holding references to ThreadSafeDatabase on the main thread because this can cause a race with fdb_stop_network when it comes time to destroy it. 2019-03-25 16:11:50 -07:00
Trevor Clinkenbeard 4293a3e230 Generalized VersionedMap<K, T>::overheadPerItem calculation 2019-03-25 13:45:13 -07:00
Trevor Clinkenbeard 007abbc45b Added 96-byte FastAllocator
Since storage queue nodes account for a large portion of memory usage,
we can save space by only allocating 96 bytes instead of 128 bytes for
each node.
2019-03-25 13:44:39 -07:00
Evan Tschannen 5e03e178de
Merge pull request #1345 from ajbeamon/support-multiple-client-or-worker-issues
Add support for a client or worker having multiple issues.
2019-03-24 17:27:50 -07:00
Evan Tschannen 38ed21328a fix: the failure monitoring client did not update secondaryAddress correctly 2019-03-23 23:51:12 -07:00
Evan Tschannen 1fc6937802 changed NetworkAddressList to at most two addresses for performance 2019-03-23 17:54:46 -07:00
Evan Tschannen e3400c13ae fixed a performance regression related to broadcasting a read version to too many transactions simultaneously 2019-03-22 18:37:39 -07:00
Evan Tschannen e37e45723c fix: CompareAndClear does not coalesce with itself 2019-03-22 18:37:39 -07:00
Alec Grieser e6e2ea2af6
Merge remote-tracking branch 'upstream/master' into 00775-database-level-tr-options 2019-03-22 14:41:27 -04:00
A.J. Beamon 76e6a3f56d
Update fdbclient/BackupAgentBase.actor.cpp
Co-Authored-By: satherton <stevea@apple.com>
2019-03-22 11:02:38 -07:00
Alec Grieser 55a9db1994
spaces to tabs 🤮 2019-03-22 12:58:32 -04:00
A.J. Beamon 67cc274f10
add a missing "is" to parameter description
Co-Authored-By: alecgrieser <alloc@apple.com>
2019-03-22 12:57:01 -04:00
A.J. Beamon 4eb5715689 Add support for a client or worker having multiple issues. 2019-03-22 08:29:41 -07:00
Stephen Atherton cabe7ca844 Stopped using %z to parse timezone offset with strptime() because it only seems to work as expected on MacOS. Updated time input/output unit tests so that they don't assume what the local timezone is. 2019-03-21 19:38:07 -07:00
Alec Grieser 64e45e6826
retry limit and max delay transaction options are no longer reset after onError 2019-03-21 18:50:02 -04:00
Alec Grieser 22f592ce6e
reset the timeout only if the API version is less than 610 to allow transactions with longer timeouts than the database default 2019-03-21 16:47:12 -04:00
Alec Grieser 7c8a1c8db7
Revert "start the timeout actor only after the first read to allow transaction timeouts longer than the default db timeout"
This reverts commit df8826115d.
2019-03-21 14:45:43 -04:00