Commit Graph

488 Commits

Author SHA1 Message Date
sramamoorthy 17ecba8313 trace cleanup and other indentation changes 2019-05-28 22:07:46 -07:00
sramamoorthy 898bed66c1 Allow only whitelisted binary path for exec op 2019-05-28 22:07:46 -07:00
sramamoorthy c4d27ac9d2 bug fixes in SnapTest
Earlier the test was checking for the following condition:
durable version of storage > min version of tlog, but the
check has been modified to:
durable version of storage >= min version of tlog - 1.

Ensure that the pre-snap validate keys are exactly 1000 in
the case of commit retries.
2019-05-28 22:07:46 -07:00
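The relaxed condition described in c4d27ac9d2 can be illustrated with a minimal sketch. The function and parameter names below are hypothetical and are not taken from SnapTest; only the two comparisons come from the commit message.

```cpp
#include <cstdint>

// Hypothetical helper, not the SnapTest code: the old, stricter condition.
bool oldSnapCheck(int64_t storageDurableVersion, int64_t tlogMinVersion) {
    return storageDurableVersion > tlogMinVersion;
}

// Hypothetical helper: the relaxed condition from the commit message,
// durable version of storage >= min version of tlog - 1.
bool newSnapCheck(int64_t storageDurableVersion, int64_t tlogMinVersion) {
    return storageDurableVersion >= tlogMinVersion - 1;
}
```

With a TLog minimum version of 100, a storage durable version of 99 or 100 fails the old condition but passes the new one.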
sramamoorthy 281c785f94 '--restoring' cmd line arg removed for fdbserver
The '--restoring' command line option was introduced to tell the
simulated fdbserver to restore from a snapshot and restart the cluster.
As part of this change that option is removed and the restore
information is stored in restartInfo.ini.
2019-05-28 22:07:46 -07:00
sramamoorthy 6431513ad0 Fail exec req until the cluster is fully_recovered 2019-05-28 22:07:46 -07:00
sramamoorthy 4016f16c76 Fix a few compilation errors and bugs in rebase 2019-05-28 22:07:46 -07:00
sramamoorthy 3d5998e9dd tlog: when pops are disabled, store them & replay
TLogs disable pops while taking snapshots. Earlier, TLogs ignored
any pop requests they received while pops were disabled. With this
change, instead of ignoring such a pop, the TLog remembers the list
of pops in memory and replays them once popping is enabled
again.
2019-05-28 22:07:46 -07:00
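The store-and-replay behaviour described in 3d5998e9dd can be sketched roughly as follows. This is a minimal illustration, not the TLog code: `PopBuffer`, `Tag`, and `Version` are hypothetical names, and keeping only the highest version per tag is an assumption made here to keep the example short.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Version = int64_t;   // assumption: versions are non-negative integers
using Tag = std::string;   // assumption: a tag is identified by a string here

// Hypothetical sketch: remembers pops that arrive while popping is disabled.
class PopBuffer {
public:
    void disablePops() { popsDisabled_ = true; }

    // Re-enables popping and returns the remembered pops so the caller can replay them.
    std::vector<std::pair<Tag, Version>> enablePops() {
        popsDisabled_ = false;
        std::vector<std::pair<Tag, Version>> replay(pending_.begin(), pending_.end());
        pending_.clear();
        return replay;
    }

    // Returns true if the pop may be applied immediately, false if it was deferred.
    bool pop(const Tag& tag, Version upTo) {
        if (!popsDisabled_) return true;
        Version& remembered = pending_[tag];         // value-initialized to 0 on first use
        if (upTo > remembered) remembered = upTo;    // keep only the highest version per tag
        return false;
    }

private:
    bool popsDisabled_ = false;
    std::map<Tag, Version> pending_;
};
```

A caller would apply the pop directly when `pop` returns true, and apply each pair returned by `enablePops` once the snapshot has finished.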
sramamoorthy 4bc4c615da exec op to all tlog, restore change in test &other
- exec operation to go to all the TLogs
- minor bug fix in tlog
- restore implementation for the simulator
- restore snap UID to be stored in restartInfo.ini
- test cases added
- indentation and trace file fixes
2019-05-28 22:07:46 -07:00
sramamoorthy 72dd067173 Trace message changes and fix a few FIXMEs 2019-05-28 22:07:46 -07:00
sramamoorthy 69edefe68b Snapshot based backup and restore implementation 2019-05-28 22:07:46 -07:00
chaoguang 5350c2777a change g_random to deterministicRandom() 2019-05-28 18:37:55 -07:00
chaoguang a7920ef311 Merge branch 'master' of https://github.com/apple/foundationdb into MakoWorkload 2019-05-28 18:21:02 -07:00
chaoguang 7329466182 update comments, parameter names and descriptions 2019-05-28 15:43:41 -07:00
A.J. Beamon f417e60264 Merge branch 'merge-release-6.1-into-master' into thread-safe-random-number-generation
# Conflicts:
#	fdbserver/QuietDatabase.actor.cpp
2019-05-23 09:52:00 -07:00
A.J. Beamon d29c7e4c9b Merge branch 'release-6.1' into merge-release-6.1-into-master
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/QuietDatabase.actor.cpp
#	versions.target
2019-05-23 09:28:45 -07:00
A.J. Beamon 603721e125 Merge branch 'master' into thread-safe-random-number-generation
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/genericactors.actor.cpp
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DiskQueue.actor.cpp
#	fdbserver/workloads/BulkSetup.actor.h
#	flow/ActorCollection.actor.cpp
#	flow/Net2.actor.cpp
#	flow/Trace.cpp
#	flow/flow.cpp
2019-05-23 08:35:47 -07:00
chaoguang c527b1a6b1 renaming function, add comments, fix bugs. 2019-05-22 17:39:36 -07:00
chaoguang 57968d9df7 Merge branch 'master' of https://github.com/apple/foundationdb into MakoWorkload 2019-05-21 16:24:11 -07:00
chaoguang 0bbcc75e4b fix bug 2019-05-21 16:22:02 -07:00
Evan Tschannen 9604452e50 mistakenly changed a quiet database parameter 2019-05-21 15:17:46 -07:00
Evan Tschannen 4059d68348 fix: the tlog would not pop data from the disk queue after a storage server was removed, because the tag still exists in memory on the logs
fix: we could incorrectly make data durable if eraseMessagesFromMemory was in progress while running updatePersistentData
the quiet database check now ensures that tlogs have no more than 30 seconds of versions unpopped from the disk queue
2019-05-20 23:58:45 -07:00
chaoguang 12a51b2d39 fix bugs, update naming and comments, refine functions 2019-05-20 18:26:30 -07:00
Jingyu Zhou b8e7fc1b84 Refactor: add std:: qualifier and use emplace_back 2019-05-17 09:38:50 -10:00
chaoguang 6788c8eb7d update cleanup process 2019-05-15 16:17:01 -07:00
chaoguang 106bb7677d update 2019-05-15 12:58:12 -07:00
chaoguang 4c9cc44c73 add paras 2019-05-14 10:13:13 -07:00
mpilman 46e7a0ca56 address reviews and make compile with `-Wunused-variable` 2019-05-13 14:15:23 -07:00
mpilman 6afce01744 Implementation complete (not yet working) 2019-05-13 14:15:22 -07:00
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
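The pattern described in 5f55f3f613, a function that returns a thread_local generator which must be seeded on each thread before use, can be sketched as below. `SeededRandom` and `deterministicRandomSketch` are hypothetical names, and `std::mt19937` is a stand-in chosen for this example, not the generator the codebase uses.

```cpp
#include <cstdint>
#include <random>
#include <stdexcept>

// Hypothetical wrapper: a generator that refuses to produce values until seeded.
class SeededRandom {
public:
    void seed(uint32_t s) {
        engine_.seed(s);
        seeded_ = true;
    }
    bool seeded() const { return seeded_; }
    uint32_t next() {
        if (!seeded_) throw std::logic_error("generator used before being seeded on this thread");
        return static_cast<uint32_t>(engine_());
    }

private:
    std::mt19937 engine_;
    bool seeded_ = false;
};

// Each thread gets its own instance, so no locking is needed and one thread's
// draws cannot perturb another thread's sequence.
SeededRandom& deterministicRandomSketch() {
    thread_local SeededRandom rng;
    return rng;
}
```

Seeding the same value on a thread and drawing again reproduces the same sequence, which is the property a deterministic simulation relies on.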
Chaoguang 5678a7417e Mako Workload 2019-05-09 15:55:05 -07:00
Evan Tschannen 22499666d0 Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/LogRouter.actor.cpp
#	flow/Trace.cpp
#	versions.target
2019-05-08 18:19:35 -07:00
Balachandar Namasivayam d45e7bf0b1 Addressed review comments 2019-05-07 17:19:59 -07:00
Balachandar Namasivayam 5d824f5fbc Address review comments 2019-05-07 17:06:52 -07:00
Balachandar Namasivayam a0cc3d98a1 Add a workload to trigger repeated recoveries. 2019-05-06 18:16:44 -07:00
Austin Seipp bf378952cb fdbserver: fix some print/scan format warnings
Signed-off-by: Austin Seipp <aseipp@pobox.com>
2019-05-06 13:35:29 -07:00
A.J. Beamon e0f76edf77
Merge pull request #1471 from AlvinMooreSr/release-6.1-merge
Merge Release 6.1 Into Master
2019-04-23 11:08:21 -07:00
Andrew Noyes d1e86779a6 Address review comments 2019-04-18 08:48:27 -07:00
Andrew Noyes 5af8208c62 Fix JavaWorkload unused variable 2019-04-17 16:29:22 -07:00
Andrew Noyes ef04471a66 Fix more unused-variable warnings 2019-04-17 16:04:10 -07:00
Alvin Moore 2bea99591e Merge branch 'release-6.1' of copy of master
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
2019-04-17 15:51:48 -07:00
A.J. Beamon 43533b3d72 Don't validate the shard size estimate unless enough keys are sampled with a less than 100% probability. 2019-04-17 11:01:23 -07:00
Trevor Clinkenbeard 8a7d9afbe9 Fixed name of LocalRatekeeperWorkloadFactory 2019-04-16 10:36:09 -07:00
Andrew Noyes baa3e806ef Address review comments from #1446 2019-04-16 09:48:15 -07:00
Andrew Noyes 6207d724f8 Fix all -Wunused-variable warnings 2019-04-15 18:13:00 -07:00
Evan Tschannen cd5c9d91fa
Merge pull request #1443 from etschannen/master
Merge 6.1 into master
2019-04-10 17:43:07 -07:00
Balachandar Namasivayam 04e9aa6afd For small clusters that are growing quickly, it could happen that the rateLimit is set to a low value and it would take very long to read the entire database. Fix this by setting the rateLimit to the maximum allowed value if reading the entire database is taking a long time. 2019-04-10 17:13:37 -07:00
A.J. Beamon 058d028099
Merge pull request #1301 from mpilman/features/cheaper-traces
Defer formatting in traces to make them cheaper
2019-04-09 10:11:04 -07:00
Evan Tschannen 21c0ba555c Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	versions.target
2019-04-08 18:38:42 -07:00
Evan Tschannen d126730b4d fixed a spurious test error where process_behind was treated as an error 2019-04-08 17:09:54 -07:00
A.J. Beamon a7288e1325 Throw process_behind instead of future_version when all storage nodes on a team are behind. process_behind gets the same backoff behavior as not_committed. Add proxy_memory_limit_exceeded to the retryable predicate. 2019-04-08 14:21:24 -07:00
mpilman c45fe8c697 Fixed typo 2019-04-08 11:33:45 -07:00
Trevor Clinkenbeard b286102d34 Update fdbserver/workloads/LocalRatekeeper.actor.cpp
Co-Authored-By: mpilman <markus@pilman.ch>
2019-04-08 11:06:17 -07:00
mpilman d2e74cb2c0 Fix stupid rounding error 2019-04-08 11:05:29 -07:00
mpilman bdba8e22eb Added test and bugfixes 2019-04-08 11:05:29 -07:00
Andrew Noyes d7612a4426 Fix OPEN_FOR_IDE build errors 2019-04-05 16:30:42 -07:00
mpilman 4287b1d2a1 resolved minor merge issues 2019-04-05 13:12:19 -07:00
mpilman c008e16c81 Defer formatting in traces to make them cheaper
This is the first part of making `TraceEvent` cheaper. The main idea is
to defer calls to any code that formats strings. These are the main
changes:

- TraceEvent::detail now takes a c-string instead of std::string for
  literals. This prevents unnecessary allocations if the trace is not
  going to be printed in the first place (for example for SevDebug).
  Before that, `detail` expected a `std::string` as key, which meant that
  any string literal would be copied on each call.
- Templates Traceable and SpecialTraceMetricType. These templates can be
  specialized for any type that needs to be printed. The actual
  formatting will be deferred to after the `enabled` check. This
  provides two benefits: (1) if a TraceEvent is disabled, we don't pay
  for the formatting and (2) TraceEvent can trace types that it doesn't
  know about.
- TraceEvent::enabled will be set in the constructor if the Severity is
  passed. This will make sure that `TraceEvent::init` is not called.
- `TraceEvent::detail` will be inlined. So for disabled TraceEvent
  calls, a call to detail will only introduce an if-branch, which is much
  cheaper than a function call.
2019-04-05 13:12:19 -07:00
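The deferral idea in c008e16c81 can be sketched in a few lines. This is an illustration under stated assumptions, not the actual TraceEvent code: `TraceEventSketch`, `TraceableSketch`, `Point`, and `pointFormatCalls` are hypothetical names, and the real templates may differ in shape.

```cpp
#include <string>
#include <utility>
#include <vector>

// Counts how often Point is formatted, so the deferral can be observed.
int pointFormatCalls = 0;

struct Point {
    int x;
    int y;
};

// Hypothetical stand-in for a Traceable-style template: specialize it per type.
template <class T>
struct TraceableSketch;

template <>
struct TraceableSketch<Point> {
    static std::string toString(const Point& p) {
        ++pointFormatCalls;
        return "(" + std::to_string(p.x) + "," + std::to_string(p.y) + ")";
    }
};

class TraceEventSketch {
public:
    explicit TraceEventSketch(bool enabled) : enabled_(enabled) {}

    // The key is a C string, so a literal key costs no allocation when disabled.
    // Formatting happens only after the enabled check.
    template <class T>
    TraceEventSketch& detail(const char* key, const T& value) {
        if (enabled_) {
            fields_.emplace_back(key, TraceableSketch<T>::toString(value));
        }
        return *this;
    }

    const std::vector<std::pair<std::string, std::string>>& fields() const { return fields_; }

private:
    bool enabled_;
    std::vector<std::pair<std::string, std::string>> fields_;
};
```

A disabled event never calls `toString`, so the cost of `detail` reduces to the branch on `enabled_`. Because formatting goes through the specialization, the event class does not need to know about `Point` itself.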
Markus Pilman 101a05ae77
Merge branch 'master' into features/client-simulator 2019-04-03 10:03:56 -08:00
Evan Tschannen 39c595223b Merge branch 'release-6.1' 2019-04-02 22:30:02 -07:00
Evan Tschannen 628fec8c8b updated status with information about ongoing maintenance
clear the maintenance zone if a different storage server is detected failed
2019-04-02 14:15:51 -07:00
mpilman 371a41dbba Allow classPath to be modified at runtime 2019-04-02 11:56:40 -07:00
mpilman e19901186f Fixed buggy register preparation for natives 2019-04-02 11:56:03 -07:00
mpilman b148981bba Fixed compilation issues with char* 2019-04-01 14:29:45 -07:00
mpilman e23e63c6ac Implemented JavaWorkload
This change allows a user to write a workload in Java.

The way this is implemented is by creating a JVM within the
simulator and calling the corresponding workload class. A
workload can then run in the simulator or on a testing cluster.

If the workload is executed within the simulator, the resulting
test will not be deterministic anymore as it will execute in a
different thread (and even without that it is not clear whether
we could get determinism, as the JVM does many things that are
not deterministic).

This is intended to improve testing of the Java client. Layer
authors can also use the simulator to test their layers on a single
machine while still simulating failing machines etc.
2019-03-31 17:57:43 -07:00
Evan Tschannen d882c060bf Merge commit '5dd6396eed0de0dfea6cf9eecc307995eff5cedc' 2019-03-28 18:00:55 -07:00
Evan Tschannen b6008558d3 renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>()
eliminated an unnecessary copy from the proxy commit path
eliminated an unnecessary copy from buffered peek cursor
2019-03-28 11:52:50 -07:00
Evan Tschannen 836bb95a7a
Merge pull request #1372 from etschannen/master
Merge 6.1 into master
2019-03-27 21:00:49 -07:00
A.J. Beamon 71e2fdafb8 Changes to ratekeeper camel case 2019-03-27 08:24:25 -07:00
Jingyu Zhou 7c02ee6fdd Fix compiler warning about unreferenced exception variable 2019-03-26 13:43:47 -07:00
Jingyu Zhou 10988f89d9 Code refactoring for ConsistencyCheck.actor.cpp 2019-03-23 11:06:43 -07:00
Evan Tschannen 36ab852bb1 Merge branch 'master' into ratekeeper
# Conflicts:
#	fdbserver/ClusterController.actor.cpp
2019-03-22 18:41:00 -07:00
Evan Tschannen f9aad46573 made use_provisional_proxies a transaction option 2019-03-19 18:44:37 -07:00
Evan Tschannen eb54a700ba changed the old memory configuration to memory-1 2019-03-18 15:10:04 -07:00
Jingyu Zhou 254c78053c Fix a segfault error
After wait, ServerDBInfo may have changed. Using the old copy is wrong.
2019-03-15 22:11:13 -07:00
Jingyu Zhou 12ddd56698 Fix Ratekeeper and DataDistributor placement
Make sure both RateKeeper and DataDistributor are placed in the same data
center as the Master. Make sure only one RateKeeper is live in the cluster as
well.
2019-03-15 17:09:28 -07:00
Jingyu Zhou bb5686eb75 Fix monitoring of DD and RK 2019-03-15 16:02:17 -07:00
Jingyu Zhou 9f6fe5f649 Merge remote-tracking branch 'apple/master' into ratekeeper 2019-03-15 11:30:04 -07:00
Jingyu Zhou 40860e0093 Attempt to fix. 2019-03-15 11:29:04 -07:00
Jingyu Zhou 9e59c9c253 Check DataDistributor and RateKeeper fitness
Fail the test if they are not put in the best fitness.
2019-03-14 16:14:57 -07:00
Steve Atherton dbacfcbc82
Merge branch 'master' into feature-backup-json 2019-03-13 13:30:45 -07:00
Evan Tschannen e068c478b5 merge master 2019-03-12 18:31:25 -07:00
Steve Atherton 8aab719c22
Merge branch 'master' into feature-backup-json 2019-03-12 18:23:16 -07:00
Stephen Atherton f0eae0295f Merge branch 'master' of https://github.com/apple/foundationdb into feature-backup-json 2019-03-12 03:35:03 -07:00
Stephen Atherton e9b8bf601e Added backup status JSON output to backup workload to get sim coverage. 2019-03-12 03:34:38 -07:00
Evan Tschannen 2627bcd35e Merge branch 'master' into feature-metadata-version 2019-03-10 21:13:28 -07:00
Evan Tschannen 044b6b4f8a Merge branch 'master' into feature-degraded-tlog
# Conflicts:
#	fdbserver/ClusterController.actor.cpp
2019-03-08 22:50:41 -05:00
Evan Tschannen 710a64dc4e replaced std::pair<WorkerInterface,ProcessClass> with a struct named WorkerDetails 2019-03-08 11:25:07 -05:00
Balachandar Namasivayam f3391ea413
Merge pull request #1240 from satherton/feature-restore-by-timestamp
Restore by timestamp
2019-03-06 16:21:06 -08:00
Stephen Atherton 7778112f6a Bug fix, restore was using the destination cluster to look up timestamps when printing the backup description instead of (optionally) the original cluster which generated the backup. Made missing cluster file errors more clear. 2019-03-06 02:45:55 -08:00
anoyes 981426bac9 More ide fixes 2019-03-05 18:03:57 -08:00
Evan Tschannen 82d957e0bb
Merge pull request #1178 from vishesh/task/issue-963-IPv6
IPv6 Support
2019-03-05 17:14:16 -08:00
Steve Atherton 21f55e1878
Merge pull request #1190 from bnamasivayam/restore-multiple-ranges
Add support for restoring multiple ranges.
2019-03-05 10:15:55 -08:00
Evan Tschannen f1897f3eb6 Merge branch 'master' into feature-metadata-version
# Conflicts:
#	fdbclient/NativeAPI.actor.cpp
2019-03-04 21:06:16 -08:00
Evan Tschannen 69d7633d5b
Merge pull request #1217 from alexmiller-apple/tstlog-goodref
Spill-By-Reference TLog Part 4: Actually Usable Reference Spilling
2019-03-04 20:58:24 -08:00
Trevor Clinkenbeard 89cbb77b4e Merge branch 'master' of https://github.com/apple/foundationdb into lazily-fetch-health-metrics 2019-03-04 14:17:58 -08:00
Trevor Clinkenbeard 56ae46f89e Client lazily fetches health metrics from proxies 2019-03-04 14:16:39 -08:00
Vishesh Yadav cc9ad0e202 net: Use IPv6 in simulation testing #963
We will use IPv6 addresses 25% of the time
2019-03-04 14:12:45 -08:00
Vishesh Yadav 57832e625d net: Support IPv6 #963
- NetworkAddress now contains an IPAddress object, which can be either
an IPv4 or an IPv6 address. 128 bits are used even for IPv4 addresses,
however only 32 bits are used when using/serializing an IPv4 address.

- ConnectPacket is updated to store an IPv6 address. Backward compatible
with the old format, since the first 32 bits of the IP address field are used
for serialization of IPv4.

- Mainly updates the rest of the code to use the IPAddress structure instead
of plain uint32_t.

- IPv6 address/port pairs should be represented as `[ip]:port` as per
convention. This applies to both cluster files and command line
arguments.
2019-03-04 14:12:41 -08:00
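The `[ip]:port` convention mentioned in 57832e625d can be illustrated with a small formatter and parser. This is a sketch only: `formatEndpoint` and `parseEndpoint` are hypothetical helpers, they treat the address as opaque text rather than validating it, and they are not the NetworkAddress implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: wraps an address in brackets when it contains ':' (IPv6).
std::string formatEndpoint(const std::string& ip, uint16_t port) {
    const bool isV6 = ip.find(':') != std::string::npos;
    const std::string portText = std::to_string(port);
    return isV6 ? "[" + ip + "]:" + portText : ip + ":" + portText;
}

// Hypothetical helper: splits "ip:port" or "[ip]:port"; returns false on malformed input.
bool parseEndpoint(const std::string& text, std::string& ip, uint16_t& port) {
    std::string portText;
    if (!text.empty() && text[0] == '[') {
        const std::size_t close = text.find("]:");
        if (close == std::string::npos) return false;
        ip = text.substr(1, close - 1);
        portText = text.substr(close + 2);
    } else {
        const std::size_t colon = text.rfind(':');
        if (colon == std::string::npos) return false;
        ip = text.substr(0, colon);
        // An unbracketed address containing ':' is ambiguous, so reject it.
        if (ip.find(':') != std::string::npos) return false;
        portText = text.substr(colon + 1);
    }
    if (ip.empty() || portText.empty() || portText.size() > 5) return false;
    unsigned long value = 0;
    for (char c : portText) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + static_cast<unsigned long>(c - '0');
    }
    if (value > 65535) return false;
    port = static_cast<uint16_t>(value);
    return true;
}
```

The brackets matter because an IPv6 address itself contains colons: without them, `::1:4500` could not be split unambiguously into address and port.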
Alex Miller fb4cb8c3a8 Print out configuration changes in ConfigureTest. 2019-03-04 01:42:38 -08:00
Alex Miller 4d4e0a1d54 Fix the build on -O0.
C++ < 17 requires definitions of declared static constexpr variables.
2019-03-04 01:42:38 -08:00
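The language rule cited in 4d4e0a1d54 can be shown with a minimal sketch. `LimitsSketch` and `clampBatch` are hypothetical names, not the code that was fixed. Before C++17, a static constexpr data member that is odr-used needs an out-of-class definition. Optimized builds often fold the value away and hide the missing definition, which is plausibly why the failure showed up only at -O0.

```cpp
#include <algorithm>

struct LimitsSketch {
    static constexpr int kMaxBatch = 64;  // in-class declaration with initializer
};

// Out-of-class definition. Before C++17 this is required once the member is
// odr-used; since C++17 the member is implicitly inline and this line is redundant.
constexpr int LimitsSketch::kMaxBatch;

int clampBatch(int requested) {
    // std::min takes its arguments by const reference, which odr-uses kMaxBatch.
    return std::min(requested, LimitsSketch::kMaxBatch);
}
```

Without the definition line, a pre-C++17 unoptimized build of `clampBatch` may fail at link time with an undefined reference to `LimitsSketch::kMaxBatch`.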