Commit Graph

1463 Commits

Author SHA1 Message Date
sfc-gh-tclinkenbeard 2673a727ac Merge remote-tracking branch 'origin/main' into main-gtt-forget-old-ss 2023-07-09 12:57:23 -07:00
Ata E Husain Bohra 7779c908b3
EaR: Remove usage of ENABLE_CONFIGURABLE_ENCRYPTION knob (#10570)
Description

Configurable encryption has been checked in and tested in simulation
for more than a month. To also avoid the penalty of accessing KNOBS in
the inline commit path, this patch retires the knob and makes
ConfigurationEncryption the default EaR mode for FDB.

BlobCipher still supports the old header format and encryption semantics;
the dead code will be removed in a follow-up PR.

Testing

devRunCorrectness - 100K
2023-06-30 17:48:09 -07:00
sfc-gh-tclinkenbeard ba74cd25f3 Add ExpectStableThroughput test 2023-06-29 22:19:39 -07:00
A.J. Beamon 155c03f6fe Decrease transaction rate of backup correctness clean to speed it up 2023-06-23 16:14:03 -07:00
A.J. Beamon 1af1346cde Change a buggify and decrease the number of transactions per second so that the dr upgrade test runs faster 2023-06-23 09:03:34 -07:00
A.J. Beamon cc68320333 Add testing that a metacluster can be used after a restore. Fix some bugs found by this related to the restore ID and tenant tombstones. 2023-06-21 08:59:22 -07:00
Zhe Wu 5c8a163c72
Update main branch to 7.4 (#10459)
* Update main branch to 7.4

* Update API version to 740

* Make fdb_c_client_config_tests.py pass after API version update

* Remove from_7.3.0_until_7.4.0 and add from_7.3.0

* Update tests in fdb_c_client_config_tests.py
2023-06-15 10:19:39 +02:00
Ata E Husain Bohra bfbf8cd053
EaR: Update KMS URL refresh policy and fix bugs (#10382)
* EaR: Update KMS URL refresh policy and fix bugs

Description

RESTKmsConnector implements discovery and refresh semantics, i.e.
on bootstrap it discovers KMS URLs and periodically refreshes the
URLs (to handle the server upgrade scenario). The current implementation
caches the URLs in a min-heap. While serving a request, the actor
pops elements from the min-heap and attempts to connect to the server;
on failure, the URL is temporarily stored in a stack, and at the end of
request processing the stack is merged back into the heap.
The code doesn't work as expected if multiple requests consume
the heap concurrently, causing the following issues:
1. The min-heap can retain old URLs that were replaced by the latest refresh (stack merge)
2. The URL discovery file is read more often than expected, as multiple requests can
empty the heap, causing the code to read URLs from the file.

The patch proposes the following policy to cache URLs and maintain their priority:
1. Unresponsiveness penalty: a flaky KMS connection or overload can cause
requests to time out or fail; each such instance updates the unresponsiveness
penalty of the associated URL context. Further, the penalty is time-bound and
decays over time.
2. Cached URLs are sorted once a failure is encountered; the priority followed
is:
2.1. Server(s) with an unresponsiveness penalty are least preferred
2.2. Server(s) with high total-failures are less preferred
2.3. Server(s) with high total-malformed responses are less preferred.
3. Update RESTClient to throw 'retryable' errors up to the client, such as
'connection_failed' and/or 'timeout'
4. Extend RESTUrl to support the IPv6 format.
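
The ordering in item 2 can be illustrated with a minimal Python sketch. This is not the RESTKmsConnector code: `UrlContext`, `sort_urls`, the field names, and the exponential half-life decay are all assumptions made for illustration. The description above only says that the penalty is time-bound and decays, and that URLs are ordered by penalty, then total failures, then total malformed responses.

```python
from dataclasses import dataclass

# Hypothetical half-life; the commit message does not say how the penalty decays.
PENALTY_HALF_LIFE_S = 60.0


@dataclass
class UrlContext:
    """Per-URL bookkeeping (hypothetical names, not the FDB structure)."""
    url: str
    penalty: float = 0.0
    penalty_updated_at: float = 0.0
    total_failures: int = 0
    total_malformed: int = 0

    def current_penalty(self, now: float) -> float:
        # Assumed exponential decay: the penalty halves every half-life.
        elapsed = max(0.0, now - self.penalty_updated_at)
        return self.penalty * 0.5 ** (elapsed / PENALTY_HALF_LIFE_S)

    def record_unresponsive(self, now: float) -> None:
        # Fold the decayed penalty forward, then add one unit for this event.
        self.penalty = self.current_penalty(now) + 1.0
        self.penalty_updated_at = now


def sort_urls(urls: list, now: float) -> None:
    """Sort in place, most preferred first: lowest penalty, then fewest
    total failures, then fewest malformed responses."""
    urls.sort(key=lambda u: (u.current_penalty(now),
                             u.total_failures,
                             u.total_malformed))


if __name__ == "__main__":
    a = UrlContext("https://kms-a")
    b = UrlContext("https://kms-b")
    c = UrlContext("https://kms-c")
    a.record_unresponsive(now=0.0)  # a was recently unresponsive
    b.total_failures = 3            # b has failed before, no active penalty
    urls = [a, b, c]
    sort_urls(urls, now=1.0)
    print([u.url for u in urls])
```

In this sketch, the tie-breakers only matter when penalties are exactly equal, which in practice means servers with no active penalty at all.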

Testing

RESTUnit - 100K (new test added for coverage)
devRunCorrectness
2023-06-14 08:06:39 -07:00
dependabot[bot] e8266ad473
Bump cryptography from 39.0.1 to 41.0.0 in /tests/authorization (#10398)
Bumps [cryptography](https://github.com/pyca/cryptography) from 39.0.1 to 41.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/39.0.1...41.0.0)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-09 10:39:16 -07:00
dependabot[bot] 4184621151
Bump cryptography from 39.0.1 to 41.0.0 in /tests/TestRunner (#10397)
Bumps [cryptography](https://github.com/pyca/cryptography) from 39.0.1 to 41.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/39.0.1...41.0.0)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-09 10:38:58 -07:00
A.J. Beamon 4e49f5b26d
Merge pull request #10435 from sfc-gh-jslocum/disable_low_value_tests
removing/disabling explicit test files for quick running unit tests, …
2023-06-08 08:19:22 -07:00
Xiaoxi Wang 85a9f01554 fix format issue; combine mock DD tests into 1 toml file; temporarily
disable add storage servers in MockReadWrite test
2023-06-07 22:01:33 -07:00
Xiaoxi Wang e0bf36a14e Only enable mock DD related tests w/o RocksDB 2023-06-07 22:01:33 -07:00
Xiaoxi Wang c307795301 Fix merge shard bug in mock finishMoveKeys; And testRawFinishMoveKeys bug in workload IDDTxnProcessorApiCorrectness.actor.cpp 2023-06-07 22:01:33 -07:00
Xiaoxi Wang b09483a644 addStoragePerProcess method. Add testClass for mock dd test. 2023-06-07 22:01:33 -07:00
Xiaoxi Wang e139f5bf90 Correctly handle buggify errors in MGSWaitStorageMetrics 2023-06-07 22:01:33 -07:00
Xiaoxi Wang b444eb1f22 Make DataDistributor use the configuration object in DDSharedContext; Change Mock test config 2023-06-07 22:01:33 -07:00
Xiaoxi Wang ac16dbd0d8 Fix mock DD incompatible places 2023-06-07 22:01:33 -07:00
Xiaoxi Wang bef639ab81 change how MockDataDistributor start 2023-06-07 22:01:33 -07:00
Josh Slocum 07dd44d659 removing/disabling explicit test files for quick running unit tests, and converting actor fuzz to a unit test 2023-06-07 16:00:46 -05:00
Jingyu Zhou 36f2f9015e
Merge pull request #10357 from w41ter/main
Fix restore range loss
2023-05-31 15:43:25 -07:00
w41ter f7e88aeb23 Fix test by disabling tenant 2023-05-31 11:10:11 +08:00
Aaron Molitor a718a31dd7 update links to foundationdb.org to reference GitHub 2023-05-30 10:15:20 -05:00
w41ter abd23958c2 Fix restore range loss 2023-05-29 11:39:07 +08:00
He Liu 6a18c1fa4c Revert unintended changes to APICorrectnessTest. 2023-05-24 14:53:05 -07:00
Yanqin Jin 16df5a8517
Make redwood tests terminate after certain amount of time (#10032)
This PR avoids "external timeout" for redwood correctness tests.

Update the logic in fdbserver.actor.cpp so that -1 instead of 0 is considered a noUnseed. If "noUnseed == true", then -1 will be logged as "RandomUnseed" at the end of the trace.

Tweak the finish condition of redwood unit tests so that if wall clock time reaches a certain threshold, the test finishes and noUnseed is set to true.
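
The finish condition described above can be sketched in Python as follows. This is an illustrative sketch only, not the fdbserver.actor.cpp logic: `run_with_wall_clock_limit`, `step`, `compute_unseed`, and the shape of the returned dictionary are hypothetical. Only the -1 sentinel and the "RandomUnseed" and noUnseed names come from the description.

```python
import time


def run_with_wall_clock_limit(step, compute_unseed, max_wall_seconds,
                              clock=time.monotonic):
    """Call step() until it reports completion or the wall-clock budget is spent.

    step() returns True when the test has finished normally.
    On timeout the run stops early, noUnseed is set, and -1 is
    reported as the unseed (the sentinel described above).
    """
    start = clock()
    while not step():
        if clock() - start >= max_wall_seconds:
            return {"noUnseed": True, "RandomUnseed": -1}
    return {"noUnseed": False, "RandomUnseed": compute_unseed()}


if __name__ == "__main__":
    remaining = [3]

    def step():
        # Hypothetical unit of test work: finishes after three calls.
        remaining[0] -= 1
        return remaining[0] <= 0

    print(run_with_wall_clock_limit(step, lambda: 12345, max_wall_seconds=5.0))
```

Passing the clock as a parameter keeps the timeout path testable without real waiting.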
2023-05-23 21:29:45 -07:00
He Liu 8ad7ec6fdf
Psm ss (#9817)
* Update NativeAPI getCheckpointForRange().

* Implemented checkpoint in SS.

* clean up.

* Disabled StorageServerCheckpointTest.

* Serialized checkpoint creation and deletion.

Simplified checkpoint GC, via deleting CheckpointMetaData::dir.

* Fixed PhysicalShardMove test, where the fetchCheckpoint target range was set incorrectly.

* Minor improvements on CheckpointMetaData and DataMoveMetaData.

* fmt.

* Optimized PhysicalShardMove test

cleanup.

* Refactored ShardedRocks checkpoint/restore for psm.

* Complete ShardedRocks::restore.

* dismiss operation_obsolete, and throw actor_cancelled.

* Validate checkpoint when !asKeyValues.

* fmt.

* Don't read from uninitialized physical shard.

* Resolved comments.

* cleanup.

* Added verify_checksum_before_restore for ShardedRocks.

* Added ShardedRocksDB checkpoint/restore unit test.

* Populate CheckpointMetaData::dir in RocksDB.

* Rename MovingIn as Adding.

* Added StorageServerUtils.

* Added physical shard move in SS.

* Fix on ApplyMetaData, doFetchFile error handling etc.

* Debugging incorrect shard size.

* Create/delete checkpoints only when Physical shard move is enabled.

* Added back SHARD_ENCODE_LOCATION_METADATA.

* Fixed incorrect bytesSample issue.

Essentially dedicated CheckpointRocksDBCF to the key-value based checkpoint; a new format will be needed for the file-based checkpoint.

* Cleanup.

* Cleanup & compile rocksdb with 8.1 branch.

* clean up.

* clean up.

* Allowed request_maybe_delivered error type in FetchShard.

* Added FDBRocksDBVersion.h.

* Fixed stuck fetchShard.

* Don't create checkpoint on TSS.

* Upgrade to RocksDB 8.1.1

* Cleanup.

* Fixed accidentally deleted db_path and name fields.

* Improved trace event.

* Removed redundancies from previous ShardedRocksDB.

* Cleanup.

* cleanup.

* cleanup.

* rename `state`.

* Cleanup.

* Removed excessive TraceEvent.

* * Fixed shardMap race condition on different threads
* Added *Stats, logging data move rates.
* Added `DD_PHYSICAL_SHARD_MOVE_PROBABILITY` to support hybrid data move.

* Resolved comments.

* fmt.

* Use physical shard move in PhysicalShardMoveTest.

* Enforce physical-shard-move for PhysicalShardMoveTest.

* fmt
2023-05-23 11:18:35 -07:00
Josh Slocum 629b068145
Bg tenant metadata restarting (#10235)
* making blob metadata optionally deterministic across runs

* Non restarting test passes after refactor

* adding downgrade version test

* formatting
2023-05-23 11:24:13 -05:00
Jingyu Zhou 8878de8c8f
Merge pull request #10288 from sfc-gh-yajin/update-test-pattern-1
Fix a test pattern so that simulator tests do not run non-sim tests
2023-05-19 16:34:01 -07:00
Hui Liu 7ca13d8f9c
support blob restore in fdbrestore (#10248) 2023-05-19 14:45:14 -07:00
Zhe Wu 93ad70db38
Merge pull request #10263 from halfprice/zhewu/gc-generation-using-recoverat
GC earlier TLog generation using each generation's `recover at` version instead of `start version`
2023-05-19 12:07:02 -07:00
Yanqin Jin 92873f2e1f Fix a test pattern so that simulator tests do not run non-sim tests (#256) 2023-05-18 21:48:47 -07:00
Jingyu Zhou 4641f17808 Ignore noSim/PerfShardedRocksDBTest.toml for Valgrind build 2023-05-18 16:09:41 -07:00
He Liu a5f639f859
Fix psm test (#10273) 2023-05-18 14:54:26 -07:00
Ata E Husain Bohra e25b9ff686
EaR: REST based Simulated KMS Vault request handler interface (#10240)
* EaR: REST based Simulated KMS Vault request handler interface

Description

  diff-1: Address review comments
             Improve unit test case coverage
  diff-2: Extend RESTKmsConnectorUtil to generate HTTP::Header

EaR simulation testing is currently driven through the SimKmsConnector
interface, which exposes endpoints invoked directly by EKP to fetch
encryption keys. This approach does not exercise the RESTKms communication
path. The FDB codebase was recently extended with an HTTPServer
interface; its absence had been the gap preventing end-to-end testing of
EaR code.

The patch proposes the following changes:
1. Refactor RESTKmsConnector to move common code and definitions
to the RESTKmsConnectorUtil namespace
2. Introduce RESTSimKmsVault, which accepts HTTP-format requests and
provides appropriate HTTP responses.

Testing

RESTUnit          100K + 5k valgrind
devRunCorrectness 100K

2023-05-17 12:38:09 -07:00
Zhe Wu 3b651697c5 Update documents 2023-05-16 21:11:48 -07:00
Zhe Wu 1eae833ae2 test record_recover_at_in_cstate and track_tlog_recovery in restart test 2023-05-16 13:37:42 -07:00
Zhe Wu 0bdfe1889b Add recovered at in CSTATE, and use a knob to guard the use of it 2023-05-16 12:47:00 -07:00
Sam Gwydir 6c16875c34
Add networkoption to disable non-TLS connections (#9984)
* Add networkoption to disable non-TLS connections

* add disable plaintext connection to fdbserver

* python doc

* Formatting

* Add tls disable plaintext connection to client api test

* review

* fix negative test

* formatting

* add TLS support to c client config tests

Adds support for TLS in the client and server separately

* add tests for disable_plaintext_connections

Test TLS and Plaintext Clusters and Clients

* Fix documentation

* Rename option to indicate it is client-only

* clearer formatting

* default to allowing plaintext connections

* add SetTLSDisablePlaintextConnection to go bindings
2023-05-13 00:14:11 +02:00
Josh Slocum 9c081f8a08
Sim http server improvements (#10217)
* Passes existing tests

* adding http unit test for wrong md5 sum

* Added new HTTPKeyValueStore workload to test long-running http clients

* fixing warnings
2023-05-12 16:33:32 -05:00
Yao Xiao bdb54c5375 initialize counter 2023-05-10 10:23:01 -07:00
Yao Xiao 1c018f066d Fix build error. 2023-05-10 10:23:01 -07:00
Yao Xiao 2d1b5d02e2 Range deletion memory usage improvements (#10048) 2023-05-10 10:23:01 -07:00
Zhe Wu a6d6c70aad
Merge pull request #10103 from halfprice/zhewu/update-main-to-7.4
Bring main branch to 7.3
2023-05-10 09:53:27 -07:00
Zhe Wu 41d3259ead revert 7.3 to 7.4 restart tests 2023-05-09 21:14:53 -07:00
Zhe Wu 761cdbc019 Bring main to 7.3 2023-05-09 21:14:16 -07:00
Josh Slocum eeae0cbe93
fixing logical merge conflict that dropped tenant modes (#10190) 2023-05-09 16:36:04 -05:00
Josh Slocum 9a2365daa8
fixing bugs with tenant_mode required on external clients and changin… (#10183)
* fixing bugs with tenant_mode required on external clients and changing test to find them

* Update fdbcli/BlobKeyCommand.actor.cpp

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>

---------

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2023-05-09 13:41:58 -05:00
Chaoguang Lin 50df914c80
Add snapshot restarting tests back to correctness on main (#10164)
* Enable fault injection;
Ignore MAX_COORDINATOR_SNAPSHOT_FAULT_TOLERANCE in simulation
Fix the issue when there's no fdb.cluster file and disable randomMoveKeys in SnapTest workload

* Move snap tests to from_7.3.0

* Add comments for the change

* Change SevError to SevWarn when Tlog process is not found in the worker map
2023-05-09 11:02:27 -07:00
Josh Slocum 6be0c74d5b
Adding explicit blob range mutation log to handle large number of ranges (#10174)
* Adding explicit blob range mutation log to handle large number of ranges

* fixing ide build
2023-05-09 11:30:04 -05:00