* Make CodeProbeImpl::_hitCount atomic
* Structure access to TraceLog::logTraceEventMetrics so that it is written before a trace log is opened and only read from one thread after it is opened.
* Fix condition in assert
* Rename TraceLog::log to logMetrics and move initialization of trace log metrics into TraceLog::open
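A minimal sketch of the atomic hit counter idea from the first bullet. `ProbeSketch` and `hitFromThreads` are hypothetical names for illustration, not the actual CodeProbeImpl code, and the relaxed memory ordering is an assumption about what a plain counter needs:

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a code probe; not the real CodeProbeImpl.
struct ProbeSketch {
    // An atomic counter makes concurrent increments well-defined.
    std::atomic<uint64_t> hitCount{0};

    void hit() { hitCount.fetch_add(1, std::memory_order_relaxed); }
    uint64_t hits() const { return hitCount.load(std::memory_order_relaxed); }
};

// Hit one probe from several threads and return the final count.
uint64_t hitFromThreads(unsigned threads, uint64_t hitsPerThread) {
    ProbeSketch probe;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&probe, hitsPerThread] {
            for (uint64_t i = 0; i < hitsPerThread; ++i) {
                probe.hit();
            }
        });
    }
    // join() synchronizes with each worker, so all increments are visible here.
    for (auto& w : workers) {
        w.join();
    }
    return probe.hits();
}
```

With a plain integer counter the same loop would be a data race; with the atomic, no increments are lost.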
---------
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
The blob worker needs more time to catch up, about 388s in the failed simulation
test.
Reproduction:
seed: -f ./tests/slow/BlobGranuleVerifyLargeClean.toml -s 4068151139 -b on
commit: 3bdd71cb0 at release-7.3 branch
build: gcc
* Test watch cleanup on cancel
* Fix clearing the database in Java integration tests
* Always cancel the futures wrapped by MVC abortable futures
* More tests for watch cleanup
* Fix clearing the database in some Java integration tests
* improve audit throughput
* if ssshard fails to do audit due to ssi failure, then global retry is required
* fix a trace event name
* fix budget release in doAudit
* avoid throttling in general simulation tests
* fix doAuditOnStorageServer throw error
* avoid starting a task that has been completed
* when ddaudit ssshard failed, check if ssi is removed, if yes, silently exit
* fix trace detail name of AuditUtilStorageServerRemovedEnd event
* redo schedule in doAuditOnStorageServer
* schedule does not wait doAudit
* remove TESTING_AUDIT_STORAGE_THROTTLING
* ssaudit stops proceeding if ddauditstate is not in running phase
* make tester audit storage only happen in simulation, and randomly set CONCURRENT_AUDIT_TASK_COUNT_MAX
* Remove duplicate getRange() for DB handles and update existing GetRange to accept DB handles.
* Initial progress checkpoint on new ConsistencyScan role.
* Updated TODOs, finished most if not all state updates.
* placeholder
* Add more TODOs, documentation and comment improvements.
* Checkpoint round state to avoid advancing progress if commit fails.
* Bug fix, check is supposed to be for overlap, not lack of overlap.
* Added more TODOs and added faked read results / exceptions and faked DB size retrieval to prove the consistencyScanCore logic works.
* Update JSON schemas and command help.
* Add comment about lifetime stats reset.
* More TODO comments and some renames for clarity, some bug fixes.
* properly stopping consistency scan in simulation so that it doesn't run forever and cause quiet database to fail
* removing trailing comma from consistency_scan json schema
* Making CC inconsistency not an error if it's intentional tss corruption
* consistency scan actually reads storage locations
* added check that consistency scan actually completes a round in simulation, fixed bug and added debugging around consistency scan getting stuck
* made consistency scan properly fetch database size
* refactoring data check to be used in both consistency scan and consistency check
* checking that consistency scan always completes at least one round and doesn't get stuck
* cleanup
* fixing ide build
* consistencyscan fdbcli command wasn't actually changing db state
* consistencyscan fdbcli command always said enabled even when it wasn't
---------
Co-authored-by: Steve Atherton <steve.atherton@snowflake.com>
* EaR: REST based Simulated KMS Vault request handler interface
Description
diff-1: Address review comments
Improve unit test case coverage
diff-2: Extend RESTKmsConnectorUtil to generate HTTP::Header
EaR simulation testing is currently driven by the SimKmsConnector
interface, which exposes endpoints invoked directly by EKP to fetch
encryption keys. This approach avoids testing the RESTKms communication
path. The FDB codebase was recently extended with an HTTPServer
interface; its absence was the gap preventing end-to-end testing of
EaR code.
This patch proposes the following changes:
1. Refactor RESTKmsConnector to move common code and definitions
to the RESTKmsConnectorUtil namespace
2. Introduce RESTSimKmsVault, which accepts HTTP format requests and
provides an appropriate HTTP response.
Testing
RESTUnit 100K + 5k valgrind
devRunCorrectness 100K
* when triggering doAuditOnStorageServer, check remainingBudgetForAuditTasks
* add trace event of audit progress
* address comments
* code clean up
* make dispatch and schedule audit be more clear
* make dispatch and schedule audit be more clear 2
* make dispatch and schedule audit be more clear 3
* address comments
* Add network option to disable non-TLS connections
* add disable plaintext connection to fdbserver
* python doc
* Formatting
* Add tls disable plaintext connection to client api test
* review
* fix negative test
* formatting
* add TLS support to C client config tests
Adds support for TLS in the client and server separately
* add tests for disable_plaintext_connections
Test TLS and Plaintext Clusters and Clients
* Fix documentation
* Rename option to indicate it is client-only
* clearer formatting
* default to allowing plaintext connections
* add SetTLSDisablePlaintextConnection to Go bindings
* clean up old audit metadata
* change comments
* fix audit cleanup rule as PR description claims and reduce timeout of auditStorageCorrectness in tester
* address comment
* clear audit metadata should not throw error
* cleanup progress metadata by type
* control number of AuditStatistic events
* carefully persist new audit state
* add unit tests and fix issues
* cleanup
* allow audits of different types to run concurrently and fix some bugs in auditutil
* fix ci issue and nits
* Added `get_audit_status checkmigration` to print out the number of data shards and `physical shards`, so that we know the progress of migration to `shard_encode_location_metadata`
* Fixed print format.
* Addressed comments.
* fixing bugs with tenant_mode required on external clients and changing test to find them
* Update fdbcli/BlobKeyCommand.actor.cpp
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
---------
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>