Commit Graph

3528 Commits

Author SHA1 Message Date
Markus Pilman 6f71a811b6 fix memory leak 2021-04-28 09:27:11 -06:00
Markus Pilman 0ee0b8a76f
Fixed typo
Co-authored-by: Lukas Joswiak <lukas.joswiak@snowflake.com>
2021-04-28 09:25:28 -06:00
Markus Pilman 54919d4f3b Merge remote-tracking branch 'sfc/features/actor-lineage' into features/actor-lineage 2021-04-28 09:22:14 -06:00
Markus Pilman 4fab2ecd30 Merge remote-tracking branch 'origin/master' into features/actor-lineage 2021-04-28 09:20:54 -06:00
Markus Pilman 1e665044fe bugfix 2021-04-28 09:08:17 -06:00
RenxuanW 0145eea684 Make `MonitorLeaderForwarding` and `LeaderForwarding` trackLatest events. 2021-04-27 15:17:20 -07:00
A.J. Beamon 16dfb2b2f2 Keep connections older than 6.2 open indefinitely to avoid weird bugs around quickly closing the database. 2021-04-27 15:00:56 -07:00
Lukas Joswiak 5d0eaac3ea Remove old code 2021-04-27 11:40:02 -07:00
RenxuanW 2f3d70c084 Fix the logic of getting firstConsistentVersion.
First consistent version should be:

- In a logs-only restore, it is the begin version the user said to start applying logs for;
- In an inconsistent-snapshot-only restore, if all range files have the same version, then it is that version, otherwise unknown (use -1);
- If using both range files and logs, then it is the highest version of any range file in the RestoreSet’s ranges vector.
2021-04-27 11:27:57 -07:00
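The three cases listed in commit 2f3d70c084 above can be sketched as follows. This is a minimal Python illustration of the rule as the message states it, not the FoundationDB implementation. The function name, its parameters, and the `UNKNOWN_VERSION` constant are hypothetical names chosen for this example.

```python
from typing import Sequence

# Hypothetical name for the "unknown" sentinel; the commit message uses -1.
UNKNOWN_VERSION = -1


def first_consistent_version(
    range_file_versions: Sequence[int],
    uses_logs: bool,
    begin_version: int = UNKNOWN_VERSION,
) -> int:
    """Pick the first consistent version following the three cases in the message."""
    if not range_file_versions:
        if not uses_logs:
            # Not covered by the commit message; rejected here as an assumption.
            raise ValueError("a restore needs range files, logs, or both")
        # Logs-only restore: the version the user asked to start applying logs from.
        return begin_version
    if uses_logs:
        # Range files plus logs: the highest version of any range file.
        return max(range_file_versions)
    # Inconsistent-snapshot-only restore: known only if every file agrees.
    distinct = set(range_file_versions)
    return distinct.pop() if len(distinct) == 1 else UNKNOWN_VERSION
```

For example, `first_consistent_version([5, 7], uses_logs=False)` gives the unknown sentinel because the range files disagree, while the same files with `uses_logs=True` give the highest range file version.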
Lukas Joswiak d964b5ded0 clang-format 2021-04-27 10:41:48 -07:00
Lukas Joswiak 0ba5a8e9d1 Fix return key when sorting by time 2021-04-27 10:39:26 -07:00
Lukas Joswiak e163432303 Add filtering by wait state 2021-04-27 10:20:25 -07:00
Lukas Joswiak 10d5007e1a Cleanup 2021-04-27 09:59:10 -07:00
Markus Pilman 2d6fafde64 Implemented configuration 2021-04-27 10:26:42 -06:00
Markus Pilman 340f012e1a
Merge pull request #4695 from sfc-gh-etschannen/fix-rewrite-bme
rewrote tlog recruitment logic so that it is deterministic
2021-04-27 10:19:25 -06:00
Lukas Joswiak 7f9ee224a4 Refactor samples to include wait state 2021-04-26 22:50:44 -07:00
Lukas Joswiak 76acb0fcb9 Update date format to ISO 8601 2021-04-26 17:42:15 -07:00
A.J. Beamon 823873a9aa Address review comments:
Use nullptr instead of NULL
Use const& for a parameter
Add some comments
2021-04-26 14:39:27 -07:00
RenxuanW 719f810676 Rename incrementalBackupOnly to onlyAppyMutationLogs in all restore configs and functions. 2021-04-26 12:30:46 -07:00
Lukas Joswiak 6b81b7a04b Remove current lineage validity check 2021-04-26 11:04:36 -07:00
Andrew Noyes 656c9a6c47 Add benchmark and document entities touched 2021-04-26 17:46:35 +00:00
Evan Tschannen f1559a2203 use the stateless process class instead of master or resolution in simulation because it is the recommended process class, and the others are not deterministic when recruited in a constrained process situation 2021-04-26 09:49:26 -07:00
Lukas Joswiak e45faa3534 Fix a bug where deleting a key invalidated its memory which was later
read
2021-04-23 16:38:01 -07:00
A.J. Beamon a794fca932 Support 5.0 (and earlier) client versions by adding GRV probing for old versions. Update the C bindings implementation of get_server_protocol to convert the ProtocolVersion object into a uint64_t. Rename a misleading protocol version alias. 2021-04-23 15:00:21 -07:00
Andrew Noyes 6fc59379d8 Add /fdbclient/multiversionclient/ to ctest, and fix thread safety 2021-04-23 21:17:41 +00:00
Lukas Joswiak 9adce8456a Add invalid reference check 2021-04-23 14:06:23 -07:00
Lukas Joswiak 3cf2dd0fbe Remove TODO 2021-04-23 14:06:23 -07:00
Lukas Joswiak 25fb85a64c Add API to read samples from worker 2021-04-23 14:06:21 -07:00
Lukas Joswiak 52bba82e8e Add window size configuration key 2021-04-23 14:05:05 -07:00
Chaoguang Lin 039a7dc482 Merge branch 'master' of github.com:apple/foundationdb into refactor-fdbcli 2021-04-23 12:04:21 -07:00
Andrew Noyes 5489de985c
Merge pull request #4582 from sfc-gh-clin/add-dd-and-maintenance
Add dd and maintenance
2021-04-23 11:43:35 -07:00
Chaoguang Lin 185d08b5b8 Add comments for added actors 2021-04-23 11:13:08 -07:00
Markus Pilman 3e18b857a8 add command line args to configure profile ingestor 2021-04-23 11:02:53 -06:00
Chaoguang Lin de4753a5db Add a workaround to temporarily use the ryw to create a ThreadTransaction; Make sure we are using the same underlying ryw object 2021-04-23 01:32:30 -07:00
Markus Pilman adb0ce9776 address review comments 2021-04-22 17:52:27 -06:00
Markus Pilman a6a8a97e1f Merge remote-tracking branch 'sfc/features/actor-lineage' into features/actor-lineage-instrument 2021-04-22 17:50:36 -06:00
Markus Pilman 99c1edf87e Implemented fluentd functionality 2021-04-22 17:48:09 -06:00
RenxuanW 41ca11c3e5 Implement restoring an inconsistent snapshot as a real feature. 2021-04-22 13:53:37 -07:00
Steve Atherton 4ff714efbd
Merge pull request #4691 from RenxuanW/first-consistent-version
Add first consistent version in restore status.
2021-04-21 17:46:13 -07:00
RenxuanW bc43fa99ac Move commit to its own try loop. 2021-04-21 17:37:58 -07:00
Andrew Noyes 8a00c6cdf8 Add -Wshift-sign-overflow
This catches the bug fixed in #4656 at compile time
2021-04-21 23:49:26 +00:00
RenxuanW b90f61d740 Move commit to its own try loop. 2021-04-21 15:50:25 -07:00
Chaoguang Lin f0a236c544 Merge branch 'master' of github.com:apple/foundationdb into refactor-fdbcli 2021-04-21 15:36:11 -07:00
RenxuanW 7c4b5b0337 Add first consistent version in restore status.
First consistent version is the max of versions in RestoreFileSet.
2021-04-21 14:32:13 -07:00
Markus Pilman 3fcbed1fbd Merge remote-tracking branch 'sfc/features/actor-lineage' into features/actor-lineage-fluentd 2021-04-21 14:58:11 -06:00
Markus Pilman 80e15e8768 started implementation 2021-04-21 14:56:02 -06:00
A.J. Beamon f485d7fa5e Fix comment typo 2021-04-21 12:25:03 -07:00
A.J. Beamon b9d8302ac7
Merge pull request #4540 from dlambrig/batch
Log latency metrics for batch GRV requests
2021-04-21 10:23:37 -07:00
RenxuanW ba60f18ebf Merge remote-tracking branch 'upstream/master' into backup-agent 2021-04-20 22:54:47 -07:00
Lukas Joswiak 36b1ab7ba5 Detach profiler thread instead of joining it 2021-04-20 22:05:16 -07:00
Lukas Joswiak 8b280f5be6 Remove old includes 2021-04-20 17:55:27 -07:00
Lukas Joswiak 15336ca274 Add callback for specific global configuration key changes 2021-04-20 17:51:38 -07:00
A.J. Beamon 9e89159efb Don't use DLDatabase objects before they are ready (applicable for API versions < 610). Fix reference counting of DLDatabase objects to avoid leaking the underlying database handle. Update release notes to note that clients older than 6.2 still create extra connections. 2021-04-20 16:21:01 -07:00
Lukas Joswiak 115efaabc3 Move profiler start function 2021-04-20 15:31:13 -07:00
A.J. Beamon eaaae2e16d Merge branch master into 'feature-mvc-monitor-protocol-version' 2021-04-20 15:07:02 -07:00
Lukas Joswiak 2357177722 Add bool support to global configuration 2021-04-20 15:05:51 -07:00
Lukas Joswiak 66d88e5f12
Merge branch 'master' into fixes/backup-delete 2021-04-20 14:49:26 -07:00
Markus Pilman d76b32da18 Annotate read paths on the server side 2021-04-20 15:10:01 -06:00
Chaoguang Lin af387e1519 Add check to make sure maintenance time is non-negative and update the documentation 2021-04-20 14:09:52 -07:00
Chaoguang Lin 2e825908dc Add check to make sure maintenance time is positive and update the documentation 2021-04-20 14:04:00 -07:00
Lukas Joswiak 8b2a72fea2 Add option to clear destination range before backup 2021-04-20 12:24:17 -07:00
Trevor Clinkenbeard 81fbe9ceaa
Merge pull request #4684 from sfc-gh-satherton/restore-range-fixes
Restore target range handling bug fixes
2021-04-20 11:57:13 -07:00
Evan Tschannen 83097c92cf
Merge pull request #4602 from sfc-gh-ljoswiak/fixes/applied-version
Fix early cutoff when DR copies logs
2021-04-20 11:10:30 -07:00
Lukas Joswiak 38d780e847 Add buggify 2021-04-20 09:35:40 -07:00
Lukas Joswiak 5d0d837268 Fix version cutoff 2021-04-20 09:35:40 -07:00
Steve Atherton 75425b5a24
Merge pull request #4620 from RenxuanW/renxuan/first-pr
Control backup's initial snapshot interval via backup cmd argument.
2021-04-19 23:59:02 -07:00
Lukas Joswiak c81e1e9519 Add sampling profiler frequency to global config 2021-04-19 22:46:57 -07:00
Steve Atherton 3f54a4a6dc Throw an error if an empty range set is passed to restore(). 2021-04-19 21:52:38 -07:00
Steve Atherton c2c9ca4362 Assert was incorrect. Restore ranges must begin with the restore prefix to remove. 2021-04-19 17:01:20 -07:00
Chaoguang Lin b34825a0e6 Merge branch 'master' of github.com:apple/foundationdb into add-dd-and-maintenance 2021-04-19 14:52:10 -07:00
RenxuanW 491fc6d69e Merge remote-tracking branch 'upstream/master' into backup-agent 2021-04-19 14:34:27 -07:00
RenxuanW ab4c5ff90e For better readability 2021-04-19 14:06:50 -07:00
RenxuanW 03c031a09d Update getCurrentVersion_impl
- If the restore is in the running state, then the current version is the getApplyBeginVersion()
- If the restore is in the completed state, the current version is the restore target version which comes from the restoreVersion() property.
- If the restore is in any other state, the current version can be reported as -1 as you have done.
2021-04-19 13:43:51 -07:00
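The state-dependent rule described in commit 03c031a09d above can be sketched as a small lookup. This is an illustrative Python sketch only. The `RestoreState` enum and the `current_version` function are hypothetical names; the two version arguments stand in for the values the message attributes to `getApplyBeginVersion()` and `restoreVersion()`.

```python
from enum import Enum


class RestoreState(Enum):
    """Hypothetical subset of restore states, for illustration only."""
    RUNNING = "running"
    COMPLETED = "completed"
    ABORTED = "aborted"


def current_version(
    state: RestoreState,
    apply_begin_version: int,
    restore_version: int,
) -> int:
    """Return the version to report in the restore status for a given state."""
    if state is RestoreState.RUNNING:
        # Running: report the apply begin version.
        return apply_begin_version
    if state is RestoreState.COMPLETED:
        # Completed: report the restore target version.
        return restore_version
    # Any other state: the current version is reported as -1.
    return -1
```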
Markus Pilman 863e262302
Merge pull request #4581 from sfc-gh-anoyes/anoyes/improve-ryw-disable-doc
Document that ryw disable can only be set at beginning of transaction
2021-04-19 14:38:06 -06:00
Markus Pilman d0c9ded6aa Merge remote-tracking branch 'sfc/features/actor-lineage-collector' into features/actor-lineage-collector 2021-04-19 13:11:51 -06:00
Markus Pilman f8d2bca6a4 address review comments 2021-04-19 13:10:27 -06:00
Markus Pilman 51ce278d2d
Merge branch 'features/actor-lineage' into features/actor-lineage-collector 2021-04-19 12:55:29 -06:00
Markus Pilman 09ddcb3bae remove old sample thread 2021-04-19 11:55:35 -06:00
Markus Pilman 7307750e5e Merge remote-tracking branch 'origin/master' into features/actor-lineage 2021-04-19 11:29:52 -06:00
Markus Pilman c8e9e2fab4 Merge remote-tracking branch 'sfc/features/actor-lineage' into features/actor-lineage-collector 2021-04-16 17:33:40 -06:00
Markus Pilman 336a429be1 first version of profiler 2021-04-16 17:32:53 -06:00
Lukas Joswiak bb5539bb70 Initialize version field 2021-04-16 14:08:36 -07:00
Lukas Joswiak 5c33c7c4f5 Remove TODO 2021-04-15 13:54:49 -07:00
Lukas Joswiak 551268b0f2 Add well known endpoint for worker communication 2021-04-15 13:50:50 -07:00
A.J. Beamon 486260e944 Fix infinite loop with stable interface protocol monitoring. Fix case where getting an error with a network option didn't properly terminate the database connection. Reduce option lock critical section. 2021-04-15 13:36:31 -07:00
A.J. Beamon b2d6930103 The multi-version client monitors the cluster's protocol version and only activates the client library that can connect. 2021-04-15 11:45:14 -07:00
RenxuanW 0378dc0a50 Report the current version in the restore status. 2021-04-14 22:19:39 -07:00
Steve Atherton 3aa237b298 Merge commit 'd51f94f16ee903aa249ddaac0047c88cafac1e89' into redwood-improvements 2021-04-14 18:23:48 -07:00
Markus Pilman d51f94f16e
Merge pull request #4330 from sfc-gh-ljoswiak/features/global-configuration
Add global configuration framework implementation
2021-04-14 16:10:08 -06:00
RenxuanW 97b995fb4f
Update fdbclient/FileBackupAgent.actor.cpp
Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2021-04-14 13:49:59 -07:00
RenxuanW a285d6019e We cannot put 2 Future functions in the same wait if the second one uses the first's result.
Before this change:
20210414-180825-renxuan-7451fad7aed4f0c7           compressed=True data_size=22960315 duration=732 ended=146 fail=10 fail_fast=10 max_runs=100000 pass=46 priority=100 remaining=0 runtime=0:01:12 sanity=False started=147 stopped=20210414-180937 submitted=20210414-180825 timeout=5400 username=renxuan

After this change:
20210414-192849-renxuan-cbe0f71ad5c48286           compressed=True data_size=22959419 duration=4261266 ended=106778 fail=1 fail_fast=10 max_runs=100000 pass=99999 priority=100 remaining=0 runtime=0:24:49 sanity=False started=106963 stopped=20210414-195338 submitted=20210414-192849 timeout=5400 username=renxuan
2021-04-14 13:10:56 -07:00
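The point made in commit a285d6019e above, that two futures cannot share one wait when the second depends on the first's result, can be illustrated with an analogy. The sketch below uses Python's `asyncio` rather than Flow, so it shows the general idea and not the actual change; `fetch_version` and `fetch_label` are hypothetical coroutines invented for this example.

```python
import asyncio


async def fetch_version() -> int:
    """Hypothetical first step that produces a value."""
    await asyncio.sleep(0)
    return 42


async def fetch_label(version: int) -> str:
    """Hypothetical second step that needs the first step's result."""
    await asyncio.sleep(0)
    return f"v{version}"


async def dependent_steps() -> str:
    # The second call takes the first call's result as input, so the two
    # cannot be started together and awaited in a single combined wait.
    # They are awaited one after the other instead.
    version = await fetch_version()
    return await fetch_label(version)


result = asyncio.run(dependent_steps())
```

Independent operations could still be awaited together (for instance with `asyncio.gather`); the sequential form is needed only where a data dependency exists.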
A.J. Beamon 3ed0d614d2 Move fdb_get_server_protocol to be a function on the database object. Add an argument for expected_version that can be used to signal that the function shouldn't return unless the protocol version is different. 2021-04-14 12:50:30 -07:00
RenxuanW 9737212e51 The default value of the first snapshot interval should be 0 rather than -1. 2021-04-14 10:56:42 -07:00
Lukas Joswiak 51e4c19675 Add migration for client profiling keys 2021-04-14 10:56:33 -07:00
Lukas Joswiak 2594d91f11 Update casing 2021-04-14 10:56:33 -07:00
Lukas Joswiak 7de23918c0 Add comments, fix erase bug, make optimizations 2021-04-14 10:56:33 -07:00
Lukas Joswiak c38ddf5eb7 Add comments 2021-04-14 10:56:33 -07:00
Lukas Joswiak 7ba7257cd2 Store global config data on heap 2021-04-14 10:56:33 -07:00
Lukas Joswiak 1c60653c2a Add fix to conditionally set global config history 2021-04-14 10:56:33 -07:00
Lukas Joswiak aa0014ab6e Fix version serialization 2021-04-14 10:56:33 -07:00
Lukas Joswiak 6de28dd916 clang-format 2021-04-14 10:56:33 -07:00
Lukas Joswiak 1260385965 Use object to wrap global configuration history 2021-04-14 10:56:32 -07:00
Lukas Joswiak 1c84c04ffc Add global configuration prefix function 2021-04-14 10:56:32 -07:00
Lukas Joswiak 388344c31e Better estimation for arena size 2021-04-14 10:56:32 -07:00
Lukas Joswiak b7cd8175be Add arena per object in global config 2021-04-14 10:56:32 -07:00
Lukas Joswiak e5e48da5ce Revert removal of history size check 2021-04-14 10:56:32 -07:00
Lukas Joswiak e9e2ca54d6 Assert history contains data 2021-04-14 10:56:32 -07:00
Lukas Joswiak 70c4bbe119 Fix clear range persistence issue 2021-04-14 10:56:32 -07:00
Lukas Joswiak 4a799baa1d Add clear range for global configuration 2021-04-14 10:56:32 -07:00
Lukas Joswiak c3f68831af Move existing ClientDBInfo variables to global configuration 2021-04-14 10:56:32 -07:00
Lukas Joswiak 80c6048a01 Naming fixes 2021-04-14 10:56:32 -07:00
Lukas Joswiak 9587318696 Fix crash when history size is 0
This shouldn't happen in normal operation (if ClientDBInfo has been
updated, that means at least one item should have been added to the
history). But there is old functionality that uses other ClientDBInfo
fields to send updates to all nodes, and until this functionality is
removed this check needs to be here.
2021-04-14 10:56:32 -07:00
Lukas Joswiak 2acefa2c82 Add double and float support to tuples
Note that this functionality is copied from bindings/flow/Tuple.cpp.
These classes should eventually be combined (see #4351).
2021-04-14 10:56:32 -07:00
Lukas Joswiak 96732810ff Move actor implementation 2021-04-14 10:56:32 -07:00
Lukas Joswiak c9b0d3dd4e Fix memory leak
The map containing global configuration data had keys of type StringRef,
referencing data allocated in history arenas. When the old history
was deleted, this memory was no longer valid and some keys would point
to garbage memory.
2021-04-14 10:56:32 -07:00
Lukas Joswiak 7bb0b3d899 Use commit version for global configuration updates
FIXME: There is a memory issue where the underlying data for values set
in the `data` field of GlobalConfig will be freed shortly after being
set.
2021-04-14 10:56:32 -07:00
Lukas Joswiak 9e20b08976 Add float and double parsing 2021-04-14 10:56:32 -07:00
Lukas Joswiak f1415412f1 Add global configuration framework implementation 2021-04-14 10:56:32 -07:00
RenxuanW a0430536f1 Remove knob BACKUP_INIT_SNAPSHOT_INTERVAL_SEC. 2021-04-14 10:41:41 -07:00
A.J. Beamon eab468fecc Remove extra line caused by commit issue 2021-04-14 09:32:48 -07:00
Markus Pilman 1d362d8869 Merge remote-tracking branch 'origin/master' into features/actor-lineage 2021-04-14 09:55:04 -06:00
RenxuanW ebf37594f7 Change initialSnapshotIntervalSeconds from knob to a backup argument. 2021-04-13 19:22:13 -07:00
Chaoguang Lin 201ca568f9 Merge branch 'master' of github.com:apple/foundationdb into refactor-fdbcli 2021-04-13 13:51:59 -07:00
A.J. Beamon feede1d2f6 Fix line length of test macro + comments to be within the 120 character limit 2021-04-13 10:48:52 -07:00
RenxuanW c8b27e71c5 Revert TraceEvent
We've found the problem (issue #4640), so we no longer need the TraceEvent.
2021-04-12 15:03:47 -07:00
Markus Pilman ec95b649b0 Any can't be used as an index type 2021-04-12 09:51:59 -06:00
Markus Pilman eb2fe0dbcf added serializable containers 2021-04-12 09:48:53 -06:00
Markus Pilman 2efcf8efec Merge remote-tracking branch 'sfc/features/actor-lineage' into features/actor-serialization 2021-04-12 09:34:03 -06:00
RenxuanW dc00d99626 Log FileBackupLogRangeStart before calling getLogRanges() in .
It will tell us if or why this function is legitimately trying to use too much ram. getLogRange() should normally return about 20 items in the result. If the inputs are trash, it could return far more.

If it isn’t the case, then there’s something else wrong that has corrupted something such that when we try to allocate memory.
2021-04-09 15:02:45 -07:00
Markus Pilman 6656557b6a made internal collect method private 2021-04-09 15:25:11 -06:00
Markus Pilman 8a6473c08a
Apply suggestions from code review
Co-authored-by: Lukas Joswiak <lukas.joswiak@snowflake.com>
2021-04-09 15:23:42 -06:00
Markus Pilman 20d98421af fix compiler errors 2021-04-09 15:16:07 -06:00
Markus Pilman 2064903705 collect and serialize 2021-04-09 14:25:11 -06:00
Steve Atherton 5e6655f111 Added temp space to StorageBytes. 2021-04-07 23:56:20 -07:00
Steve Atherton f8786da688 Added StorageByte::toString() and printed it in Redwood direct perf test. 2021-04-07 20:14:16 -07:00
Lukas Joswiak 83cf965875 Add global variable to fetch each type of sample 2021-04-07 15:38:01 -07:00
Lukas Joswiak 130e520ad7 Use object lifetimes instead of function calls 2021-04-07 13:27:31 -07:00
Lukas Joswiak d60011aa74 Update annotation class name 2021-04-07 13:27:31 -07:00
Lukas Joswiak d6c4aa67d7 Sample actors waiting on network 2021-04-07 13:27:31 -07:00
RenxuanW 551dfa6ad8 Move description of RESTORE_IGNORE_LOG_FILES to header file. 2021-04-07 10:21:24 -07:00
RenxuanW fadc9cccee Use knob RESTORE_IGNORE_LOG_FILES in restore.
Rename IGNORE_LOG_FILES to RESTORE_IGNORE_LOG_FILES. Also, this knob should be used in regular restore, not parallel restore.
2021-04-07 10:05:56 -07:00
RenxuanW 1b6ad42db8 Use a knob to completely ignore log files 2021-04-06 19:07:01 -07:00
Dan Lambright cabf192f57 Respond to review comments 3/23 2021-04-06 13:05:09 -04:00
Dan Lambright 48a475366c Log latency metrics for batch GRV requests 2021-04-06 13:05:09 -04:00
RenxuanW 7e1d60c924 Format BACKUP_INIT_SNAPSHOT_INTERVAL_SEC like other knobs. 2021-04-05 22:53:36 -07:00
RenxuanW edb3dd4414 Control backup's initial snapshot interval via knob. 2021-04-05 18:25:18 -07:00
Vishesh Yadav 91afa3cc9c
Merge pull request #4515 from halfprice/zhewu/update-fault-tolerance-calculation
Populate min_replicas_remaining fields in the cluster status for all regions. This gives a better picture of the data replicas that currently exist in the database.
2021-03-31 13:06:17 -07:00
Chaoguang Lin 107f66e4e1 Merge branch 'master' of github.com:apple/foundationdb into refactor-fdbcli 2021-03-30 16:35:41 -07:00
Steve Atherton 7d44af3fec
Merge pull request #4568 from jzhou77/master
Use the restored range in the actual restore
2021-03-30 09:58:31 -07:00
Jingyu Zhou 4c73e838ab Fix a comment on backup log data location 2021-03-30 09:26:48 -07:00