Commit Graph

2626 Commits

Author SHA1 Message Date
Dennis Zhou 080acd811f
tuple: user defined type support (#8300)
* tuple: add support for usertype as last part of tuple

The expectation is that user-defined types come at the end, as there is no
delimiter standard. In that case, if we encounter a user-defined type,
we append it to the end of the tuple and allow users to transform that
type by parsing the raw string returned.

* blob: use Tuple::unpackUserType() for key alignment

We want blob granule key alignment to be able to understand user data
types. This upgrades support and allows user data types to be the last
key in a blob granule and for us to be able to parse that properly.
2022-09-26 09:18:37 -07:00
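A minimal sketch of the idea in commit 080acd811f above, not the actual `Tuple::unpackUserType()` code. It assumes a toy tuple format with only byte strings (type code `0x01`, `0x00` terminator, embedded `0x00` escaped as `0x00 0xFF`) and treats any type code in `0x40`-`0x4f` as a user-defined type whose payload is simply the rest of the buffer. The function name and the returned shape are hypothetical.

```python
BYTES_CODE = 0x01
USER_TYPE_MIN, USER_TYPE_MAX = 0x40, 0x4F  # assumed user type code range


def unpack_with_user_type(packed: bytes) -> list:
    """Decode byte-string elements; a trailing user type is returned raw."""
    elements = []
    i = 0
    while i < len(packed):
        code = packed[i]
        if USER_TYPE_MIN <= code <= USER_TYPE_MAX:
            # No delimiter standard exists for user types, so everything
            # after the type code is handed back for the caller to parse.
            elements.append(("user", code, packed[i + 1:]))
            break
        if code != BYTES_CODE:
            raise ValueError(f"unsupported type code {code:#x}")
        i += 1
        out = bytearray()
        while True:
            if i >= len(packed):
                raise ValueError("unterminated byte string")
            b = packed[i]
            if b == 0x00:
                if i + 1 < len(packed) and packed[i + 1] == 0xFF:
                    out.append(0x00)  # escaped NUL inside the string
                    i += 2
                    continue
                i += 1  # terminator
                break
            out.append(b)
            i += 1
        elements.append(bytes(out))
    return elements
```

Because the user type has no terminator, it can only be the last element; anything after it would be swallowed into the raw payload.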
sfc-gh-tclinkenbeard 7fc5c196c4 Make read and write quotas fungible 2022-09-25 21:00:11 -07:00
sfc-gh-tclinkenbeard 985958c260 Add rare code probe decoration 2022-09-25 15:28:32 -07:00
Dan Adkins df2c1374cb
Correct a couple of comments in simulator and simulated workloads. (#8310) 2022-09-24 22:52:13 -07:00
Markus Pilman 5774249e5b
Revert "[DRAFT] Redwood PriorityMultiLock enable different launch limits to be specified based on different priority level." 2022-09-23 12:22:47 -06:00
Markus Pilman 82d8c17d00
Merge pull request #8272 from sfc-gh-dadkins/sfc-gh-dadkins/bugs/reboot-not-kill
Downgrade kill to reboot in Rollback workload.
2022-09-23 11:32:04 -06:00
Markus Pilman fe4c33fabb
Merge pull request #8302 from sfc-gh-huliu/testfix2
disable MoveKeysWorkload for a few tests that need to manipulate dd
2022-09-23 11:02:04 -06:00
Hui Liu 9fd16bd38e disable MoveKeysWorkload for a few tests that need to manipulate dd 2022-09-23 08:56:27 -07:00
Hui Liu c9b4fc5761 disallow MoveKeysWorkload running in parallel 2022-09-22 17:00:01 -07:00
Josh Slocum 430f6e9670
Fix purge at latest racing with other checking threads at the end of BlobGranuleVerify (#8281) 2022-09-22 16:32:33 -07:00
Jon Fu 2eea93f170 move cluster assignment into loop to test error case 2022-09-22 14:43:08 -07:00
Jon Fu e342a9db43 Merge branch 'main' of github.com:apple/foundationdb into metacluster-assigned-cluster 2022-09-22 14:39:27 -07:00
Jon Fu 8a6e68cf63 adjust check if existing entry cluster does not match 2022-09-22 14:39:03 -07:00
A.J. Beamon bd006526d6
Merge pull request #8251 from sfc-gh-ajbeamon/fdbcli-tenant-group-metadata
Fdbcli command to get tenant group metadata
2022-09-22 14:17:08 -07:00
A.J. Beamon 97a325adab Add an fdbcli command to get tenant group metadata 2022-09-22 13:24:21 -07:00
Xiaoxi Wang be1cc6c111
Update fdbserver/workloads/IDDTxnProcessorApiCorrectness.actor.cpp
Co-authored-by: Bharadwaj V.R <bharadwaj.vr@snowflake.com>
2022-09-22 13:06:21 -07:00
Steve Atherton 7831f6b2f1
Merge pull request #7578 from sfc-gh-fzhao/RedwoodIOLaunchLimit
[DRAFT] Redwood PriorityMultiLock enable different launch limits to be specified based on different priority level.
2022-09-22 12:29:17 -07:00
Nim Wijetunga 37b93a6232
fix tests (#8286) 2022-09-22 11:34:06 -07:00
Jon Fu b559be1184 assign cluster outside of retry loop 2022-09-22 11:22:41 -07:00
Jon Fu e91afa15b6 Merge branch 'main' of github.com:apple/foundationdb into metacluster-assigned-cluster 2022-09-22 11:14:27 -07:00
A.J. Beamon fda0d7223d Update backup to include system key ranges needed for tenants. Run simulated backup tests with tenants. 2022-09-22 10:00:13 -07:00
Steve Atherton 04b4960786 Merge branch 'main' into fzhao/RedwoodIOLaunchLimit
# Conflicts:
#	fdbserver/VersionedBTree.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/ReadWrite.actor.cpp
2022-09-22 00:39:51 -07:00
Jon Fu 7a09b701cc
Merge pull request #8141 from sfc-gh-jfu/network-disable-bypass
Introduce network option for disabling mvc bypass
2022-09-21 17:33:48 -07:00
Hui Liu bf943bfd57
Merge pull request #8273 from sfc-gh-huliu/testfix
disable MoveKeysWorkload when running SpecialKeySpaceCorrectness test
2022-09-21 17:32:44 -07:00
Jon Fu 937fb38dcc address review comments 2022-09-21 16:23:12 -07:00
Hui Liu f115599a67 disable MoveKeysWorkload when running SpecialKeySpaceCorrectness test 2022-09-21 16:12:20 -07:00
Josh Slocum 4794ebabcc
Merge pull request #8262 from sfc-gh-dzhou/purge-fixes
blob: java api fixes (purge + verify)
2022-09-21 17:25:54 -05:00
Dan Adkins 5aa9a36559 Downgrade kill to reboot in Rollback workload.
Indiscriminately killing machines can lead to unrecoverable clusters,
even if we avoid killing the protected coordinators.
2022-09-21 15:22:31 -07:00
Dennis Zhou 4ea4546cb6 blob/java: verifyBlobRange() with latestVersion 2022-09-21 14:07:16 -07:00
Dennis Zhou e353169a50 blob: teach purge about latestVersion
This teaches purgeBlobGranules about latestVersion and rejects
versions <= 0.
2022-09-21 14:04:58 -07:00
Josh Slocum 6270016bed
Seq insert perf fixes main (#8264)
* Force flushing granules post-split to guarantee parent feeds get cleaned up

* fixing bug and cleaning up split finalize code
2022-09-21 12:36:02 -07:00
Jon Fu 0343ca9c53 Merge branch 'main' of github.com:apple/foundationdb into network-disable-bypass 2022-09-21 11:33:27 -07:00
Hui Liu c237186ec3
Merge pull request #8252 from sfc-gh-huliu/fixtest
Fix correctness test: skip lock check in MoveKeysWorkload
2022-09-20 16:34:22 -07:00
Trevor Clinkenbeard 3b5117ca92
Merge pull request #8233 from sfc-gh-mpilman/features/disable-failure-injection-for-workload
Allow workloads to disable some failures
2022-09-20 16:17:35 -07:00
Nim Wijetunga eadb769cfa
Encrypt Backup Mutation Log (#8159)
* encrypt backup mutation log

* format

* address pr comments

* format

* fix bug

* revert knobs

* address pr comments
2022-09-20 15:43:39 -07:00
Chaoguang Lin 9628561235
Add DataDistributionMetrics workload into correctness packages, (#8237)
which makes the code probe hit in nightly tests.
2022-09-20 15:33:15 -07:00
Hui Liu 52f1bef8ec Fix correctness test: skip lock check in MoveKeysWorkload 2022-09-20 15:16:56 -07:00
Jon Fu 4bbc2ad597 Merge branch 'main' of github.com:apple/foundationdb into metacluster-assigned-cluster 2022-09-20 09:34:06 -07:00
Jon Fu 1f778f9d76 Merge branch 'main' of github.com:apple/foundationdb into network-disable-bypass 2022-09-20 09:30:53 -07:00
Andrew Noyes 4ac5498f2e Fix a UBSAN diagnostic caused by use of uninitialized memory
The diagnostic is

fdbserver/workloads/DiskFailureInjection.actor.cpp:57:64: runtime error: load of value 208, which is not a valid value for type 'bool'

The fix is simply to initialize the default value of verificationMode
2022-09-20 08:54:45 -07:00
Markus Pilman 62ac01b481 Allow workloads to disable some failures 2022-09-19 16:42:55 -06:00
Chaoguang Lin 125137b987
Change the special key space correctness workload to hit code probe (#8214) 2022-09-19 15:01:21 -07:00
sfc-gh-tclinkenbeard 0ec06f5675 Merge remote-tracking branch 'origin/main' into guard-gsimulator 2022-09-19 13:16:48 -07:00
Josh Slocum 0f3f493c28
Merge pull request #8218 from sfc-gh-jslocum/aligned_purge
fixes for non-aligned blob range calls
2022-09-19 15:00:02 -05:00
A.J. Beamon 4fd64630e8 Convert literal string ref instances to use _sr suffix 2022-09-19 11:35:58 -07:00
Markus Pilman e1627e0a78 Merge remote-tracking branch 'origin/main' into features/always-inject-faults 2022-09-19 09:38:55 -06:00
Josh Slocum 88f88707f5 fixes for non-aligned blob range calls 2022-09-16 19:06:15 -05:00
Markus Pilman ef2d301305 Merge remote-tracking branch 'origin/main' into features/always-inject-faults 2022-09-16 16:41:16 -06:00
Jon Fu 1abac8ea9f check on shared state ptr in native api and add to test spec in api tester 2022-09-16 15:11:33 -07:00
Markus Pilman 9441795c8e address review comments 2022-09-16 14:40:10 -06:00
sfc-gh-ngoyal 1bd97fe628
Recruit new singleton for consistency checker. (#5804)
* Recruit new singleton for consistency checker.

* Recruit the consistency checker only if enabled.

* Add a yield in monitorConsistencyChecker().

* Minor fixes.

* Consistency check workload enhancements.

* Minor fixes and clarifications.

* clang format

* Clang format.

* Minor fixes, cleanup, debug tracing.

* Misc.

* Move the consistency scan information from dbconfig to a key backed object.

* Move consistency scan config out of db config to a state object and feature rename.

* ConsistencyCheck workload refactor.

* devFormat

* Update fdbcli/ConsistencyScanCommand.actor.cpp

* Review Comments.

Co-authored-by: negoyal <neelam.goyal@gmail.com>
Co-authored-by: Ata E Husain Bohra <ata.husain@snowflake.com>
2022-09-16 09:03:06 -07:00
Xiaoxi Wang 7169a8b132 merge upstream/main 2022-09-15 21:27:26 -07:00
Fuheng Zhao ac65c3f569 merge upstream main 2022-09-15 14:19:19 -07:00
He Liu 6f7968b618 Merge branch 'main' of https://github.com/apple/foundationdb into validate-data-consistency 2022-09-15 10:15:33 -07:00
sfc-gh-tclinkenbeard b9fbd8c130 Fix -Wlogical-op-parentheses warning 2022-09-15 09:00:33 -07:00
sfc-gh-tclinkenbeard 82adc1e856 Make g_simulator a pointer 2022-09-15 09:00:33 -07:00
Lukas Joswiak cfbd04ae78 Add explicit check for a simulated network 2022-09-14 14:14:49 -07:00
Jon Fu 51b9861354 check data cluster entry for capacity and account for different thrown errors 2022-09-14 12:57:46 -07:00
Xiaoxi Wang 1f1a66be39 merge upstream/main 2022-09-14 12:32:28 -07:00
Xiaoxi Wang 674da807ad add readMoveKeysLock method and catch movekeys_conflict() in the test workload 2022-09-14 12:17:47 -07:00
Ata E Husain Bohra d2b82d2c46
Introduce "default encryption domain" (#8139)
* Introduce "default encryption domain"

Description

In the current FDB native encryption data-at-rest implementation,
an entity being encrypted (mutation, KV and/or file) is categorized
into one of the following encryption domains:
1. Tenant domain, where Encryption domain == Tenant boundaries
2. FDB system keyspace - FDB metadata encryption domain
3. FDB Encryption Header domain - used to generate a digest for the
plaintext EncryptionHeader.

The scheme doesn't support encryption if an entity can't be categorized
into any of the above encryption domains; for instance, non-tenant
mutations are NOT supported.

This patch extends encryption support to mutations for which the corresponding
Tenant information can't be obtained (key length shorter than TenantPrefix)
and/or mutations that do not belong to any valid Tenant
(FDB management cluster data) by mapping such mutations to a
"default encryption domain".

TODO

The CommitProxy-driven TLog encryption implementation requires every transaction
mutation to contain 1 KV, not crossing Tenant boundaries. The only exception to
this rule is ClearRange mutations. For now, ClearRange mutations are mapped
to the 'default encryption domain'; a subsequent patch shall propose appropriate
handling for ClearRange mutations.

Testing

devRunCorrectness - 100k
2022-09-14 10:58:32 -07:00
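The domain mapping described in commit d2b82d2c46 above might be sketched as follows. This is an illustration only, not the FDB implementation: the 8-byte tenant prefix length, the `\xff` system-key prefix check, the domain labels, and the function name are all assumptions made for the example.

```python
TENANT_PREFIX_LEN = 8          # assumed prefix length, for illustration only
SYSTEM_KEY_PREFIX = b"\xff"    # assumed marker for the system keyspace
SYSTEM_DOMAIN = "system"
DEFAULT_DOMAIN = "default"


def encryption_domain(key: bytes, known_tenant_prefixes: set) -> str:
    """Pick an encryption domain for a single-key mutation."""
    if key.startswith(SYSTEM_KEY_PREFIX):
        return SYSTEM_DOMAIN
    if len(key) < TENANT_PREFIX_LEN:
        # Too short to carry a tenant prefix: no tenant can be derived.
        return DEFAULT_DOMAIN
    prefix = key[:TENANT_PREFIX_LEN]
    if prefix in known_tenant_prefixes:
        return "tenant:" + prefix.hex()
    # Key does not belong to any valid tenant.
    return DEFAULT_DOMAIN
```

The point of the fallback is that every mutation gets some domain, so encryption no longer has to reject non-tenant data.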
Xiaoxi Wang b1e48e06f8 remove meaningless comparison in InitDataDistribution 2022-09-14 10:56:00 -07:00
He Liu 28e5a70dbe Clean up SS validate storage. 2022-09-14 10:53:54 -07:00
Jon Fu e3f54dba2f throw invalid_tenant_configuration and add to metacluster management test 2022-09-14 10:22:51 -07:00
Xiaoxi Wang 85f119c13b fix destination team issue 2022-09-14 00:17:00 -07:00
Xiaoxi Wang 46eadd53cf finish getInitialDataDistribution test verification 2022-09-13 22:43:15 -07:00
Lukas Joswiak 424bb87f3e Apply feedback 2022-09-13 16:53:54 -07:00
Lukas Joswiak 2a26777d64 Disable coordinator change when using simple configuration database 2022-09-13 16:53:54 -07:00
Lukas Joswiak 74ac617a34 Add support for changing coordinators to the configuration database
Configuration database data lives on the coordinators. When a change
coordinators command is issued, the data must be sent to the new
coordinators to keep the database consistent.
2022-09-13 16:53:54 -07:00
Trevor Clinkenbeard b641bd6c04
Merge pull request #8168 from sfc-gh-tclinkenbeard/add-doappendiffits-unittest
Add `/Atomic/DoAppendIfFits` unit test
2022-09-13 15:18:59 -07:00
Xiaoxi Wang bafcacc1eb fix MockDDTxnProcessor::getInitialDataDistribution segment fault in assertion part; IDDTxnProcessorApiWorkload get mock initData finish 2022-09-13 14:39:12 -07:00
Xiaoxi Wang 2ae01bdf2d
Merge pull request #8162 from sfc-gh-xwang/feature/main/moveKey
Implement txnProcessor->moveKeys(const MoveKeysParams& params)
2022-09-13 14:11:49 -07:00
He Liu 958b28497e Merge branch 'main' of https://github.com/apple/foundationdb into validate-data-consistency 2022-09-13 13:55:01 -07:00
Markus Pilman cebf49798e Merge remote-tracking branch 'origin/main' into features/always-inject-faults 2022-09-13 12:36:12 -06:00
sfc-gh-tclinkenbeard 2bea5b88bf Add /Atomic/DoAppendIfFits unit test 2022-09-13 11:35:39 -07:00
Xiaoxi Wang eaff46cd27 IDDTxnProcessorApiWorkload setup - Get Initial DD in real cluster 2022-09-13 11:04:23 -07:00
Trevor Clinkenbeard 9f1e04cf9f
Merge pull request #8147 from sfc-gh-tclinkenbeard/improve-backup-code-coverage
Check backup agent activity in BackupAndRestoreCorrectnessWorkload
2022-09-13 09:32:40 -07:00
Markus Pilman 3e31820078 fix compilation issues 2022-09-13 10:29:29 -06:00
Markus Pilman 6e97bfb15a Added RandomMoveKeys 2022-09-12 17:20:57 -06:00
Markus Pilman 389ea4c952 added additional workloads 2022-09-12 17:16:32 -06:00
Markus Pilman acd24d6c81 Merge remote-tracking branch 'origin/main' into features/always-inject-faults 2022-09-12 16:44:16 -06:00
Xiaoxi Wang 949b1c1af9 Add MoveKeysParams struct; use txnProcessor->moveKeys() 2022-09-12 15:40:18 -07:00
He Liu 56cf51e2a0 Audit allKeys in auditStorage test. 2022-09-12 14:32:34 -07:00
Markus Pilman 59ce49913a
Merge pull request #8146 from sfc-gh-tclinkenbeard/improve-code-coverage
Increase the number of unit tests run in `RandomUnitTests.toml`
2022-09-12 15:10:47 -06:00
Fuheng Zhao ee99de7cf3
Merge branch 'apple:main' into RedwoodIOLaunchLimit 2022-09-12 10:58:12 -07:00
sfc-gh-tclinkenbeard 924c198a5b Run 10 unit tests within RandomUnitTests.toml 2022-09-11 00:36:18 -07:00
sfc-gh-tclinkenbeard cc34658a32 Check backup agent activity in BackupAndRestoreCorrectnessWorkload 2022-09-10 23:43:11 -07:00
Yi Wu d831c87d14
Add encryption metrics (#8070)
Adding the following metrics:
* BlobCipherKeyCache hit/miss
* EKP: KMS requests latencies
* Each component that uses encryption now needs to pass a UsageType enum to the encryption helper methods (GetEncryptCipherKeys/GetLatestEncryptCipherKey/encrypt/decrypt); those methods log cipher key fetch latency samples and encryption/decryption CPU times accordingly.
2022-09-09 18:43:09 -07:00
Jon Fu bfd2d138f7 set assignedcluster in tenantmanagement workload 2022-09-09 16:10:42 -07:00
He Liu e6846e1ed5 Handle auditStorage API errors. 2022-09-09 15:04:34 -07:00
Jon Fu 7381cf6ef4 update test assertions 2022-09-09 12:11:02 -07:00
Jon Fu 328e4b3bf9 use state variable for error 2022-09-09 10:15:25 -07:00
Jon Fu 75a096a5e5 Merge branch 'main' of github.com:apple/foundationdb into jfu-tenant-special-key-space 2022-09-09 10:12:19 -07:00
Jon Fu 8a24afb152 fix miscellaneous errors in special keys tests 2022-09-09 10:07:37 -07:00
He Liu 9e911956d9 Handle GetKeyValuesReply errors. 2022-09-08 15:41:22 -07:00
Junhyun Shim bc47f90aff
Merge pull request #8125 from sfc-gh-jshim/knob-tokenless-tenant-access
Add a knob to allow token-less tenant data access for untrusted clients
2022-09-08 17:52:32 +02:00
A.J. Beamon 726d5215a0
Remove API 720 guards for tenants (experimental feature) and the cluster ID special keys (#8108)
* Remove API 720 guards for tenants (experimental feature) and the cluster ID special keys (no need to guard)

* Enable the relaxed special key access in transactions that need to use special key-space APIs introduced in 7.2
2022-09-08 17:22:36 +02:00
Junhyun Shim 3023096962 Add a knob to allow token-less tenant data access for untrusted clients 2022-09-08 14:53:01 +02:00
He Liu 071ed41caa ManagementAPI auditStorage passed test. 2022-09-07 22:34:14 -07:00
Dennis Zhou 7fc4a92d09
Merge pull request #8089 from sfc-gh-jslocum/unblobbify_idempotent
Blob range idempotency test and fixes
2022-09-07 17:31:05 -07:00
Jon Fu da7ce5231c Merge branch 'main' of github.com:apple/foundationdb into jfu-tenant-special-key-space 2022-09-07 13:30:08 -07:00
Jon Fu 255795eac0 update test workload and reserve space for vector copies 2022-09-07 13:25:38 -07:00
He Liu 16e31c6bd6 Make AuditStorage more reliable on SS. 2022-09-07 09:40:19 -07:00
He Liu ad11c6e82d Cleanup. 2022-09-06 19:00:09 -07:00
Jon Fu dbb6357371 add conflict range tests and change tenant prefix code to work with RYW 2022-09-06 16:55:57 -07:00
Andrew Noyes fbf5830bb2 Rollback #7374
Valgrind is complaining about use of uninitialized memory in
absl::GetStackTrace, and the cases where it complains the backtraces are
incomplete. Note: this means that jemalloc heap profiling no longer
works out of the box. Advanced users who want to enable jemalloc heap
profiling will now have to revert this change and build from source.
2022-09-06 16:55:18 -07:00
Fuheng Zhao 91219a4a28 sync 2022-09-06 09:37:00 -07:00
He Liu 6f4d09f947 Replace Reference<> with std::shared_ptr<>. 2022-09-05 12:45:14 -07:00
Josh Slocum 94e0c83214 Blob range idempotency test and fixes 2022-09-02 15:53:06 -05:00
He Liu 033741daab Audit should always complete, any failures are retried. 2022-09-02 09:11:19 -07:00
He Liu 0bbce98da2
Disable shard aware (#8072)
* Removed STORAGE_SERVER_SHARD_AWARE knob.

* Fixed PhysicalShardMove test.

Co-authored-by: He Liu <heliu@apple.com>
2022-09-02 09:07:34 -07:00
Dennis Zhou 80a0816157
flow: switch from hard coded to ApiVersion like ProtocolVersion (#8071)
* flow: add ApiVersion to replace hard coding api version

Instead of hard coding api value, let's rely on feature versions akin to
ProtocolVersion.

* ApiVersion: remove use of -1 for latest and use LATEST_VERSION
2022-09-02 09:28:13 +02:00
Fuheng Zhao c7f7544231 Merge remote-tracking branch 'upstream/main' into RedwoodIOLaunchLimit 2022-09-01 14:53:11 -07:00
He Liu cade0baf7e Added Interfaces in ClusterInterface and DataDistributorInterface for
audit.
2022-09-01 11:03:34 -07:00
Nim Wijetunga 72ccc681be move tenant entry cache to client 2022-09-01 10:10:47 -07:00
Fuheng Zhao 0aa096dc17 sync with upstream main 2022-08-31 15:46:39 -07:00
Jon Fu 4733e5a096 add extra test cases for set/clear/commit 2022-08-31 12:49:42 -07:00
Yi Wu 49503987cc
Support Redwood encryption (#7376)
A new knob `ENABLE_STORAGE_SERVER_ENCRYPTION` is added; despite its name, it is currently supported only by Redwood. The knob is meant to be used only in tests, to test encryption in individual components; otherwise, encryption should be enabled through the general `ENABLE_ENCRYPTION` knob.

Under the hood, a new `Encryption` encoding type is added to `IPager`, which uses AES-256 to encrypt a page. With this encoding, `BlobCipherEncryptHeader` is inserted into the page header for encryption metadata. Moreover, since we compute and store a SHA-256 auth token with the encryption header, we rely on it to checksum the data (and the encryption header), and skip the standard xxhash checksum.

`EncryptionKeyProvider` implements the `IEncryptionKeyProvider` interface to provide encryption keys; it uses the existing `getLatestEncryptCipherKey` and `getEncryptCipherKey` actors to fetch encryption keys from either the local cache or the EKP server. If multi-tenancy is used, when writing a new page, `EncryptionKeyProvider` checks whether the page contains only data for a single tenant; if so, it fetches the tenant-specific encryption key; otherwise the system encryption key is used. The tenant check is done by extracting the tenant id from the page bound key prefixes. `EncryptionKeyProvider` also holds a reference to the `tenantPrefixIndex` map maintained by the storage server, which is used to check whether a tenant exists and to get the tenant name in order to get the encryption key.
2022-08-31 12:19:55 -07:00
Dennis Zhou 046141b5be blob: only allow unblobbification on aligned ranges
If unblobbify request comes in, reject it if the start and end do not
align to a blob granule boundary.
2022-08-31 09:23:33 -07:00
Jingyu Zhou 7fcf839068
Merge pull request #8047 from jzhou77/main
Fix a race where the check happened before the last configure command
2022-08-30 15:27:31 -07:00
Jingyu Zhou 1b3e986dc0 Fix a race where the check happened before the last configure command
By adding a delay before checking starts.
2022-08-30 13:42:52 -07:00
Josh Slocum 058c720ef3
Fixing granule mismatch bug caused by race in change feed fetch (#8019)
* fixing another feed fetch race causing incorrect data

* limiting size of final data check in granule workload to not get too large of a mapping
2022-08-29 17:25:34 -05:00
Jon Fu 479d774e79 set raw access for certain management API functions and update special key test 2022-08-29 11:45:56 -07:00
A.J. Beamon 2907d2d4dd
Merge pull request #8004 from sfc-gh-ajbeamon/fix-ub
Fix some undefined behavior in RK and a unit test
2022-08-29 09:16:11 -07:00
A.J. Beamon 0e782412a8 Fix some undefined behavior: 1) a unit test was not initializing members of the WorkloadContext it was using, and 2) very large ratekeeper limits for batch priority were overflowing the types used to log them 2022-08-26 14:17:01 -07:00
sfc-gh-tclinkenbeard e4bfb3e28c Merge remote-tracking branch 'origin/main' into ignore-test-encryption-file 2022-08-26 12:53:31 -07:00
Josh Slocum 46b02cab49
Blob granule summary implementation (in native client) (#7981)
* implemented blob granule summary call in native client

* clean up prints
2022-08-26 14:04:59 -05:00
sfc-gh-tclinkenbeard bad14b67fb Ignore test encryption file when re-running simulation test 2022-08-25 16:45:13 -07:00
Jon Fu 337a2f8130 try to account for errors in fuzzapicorrectness and add test case in special key test 2022-08-25 15:28:06 -07:00
Ata E Husain Bohra 00fe4863b6
Implement TenantCacheEntry in-memory cache (#7801)
* Implement TenantCacheEntry in-memory cache

Description

  diff-4: TraceEvent usage improvements 
  diff-3: Address review comments
  diff-2: Add APIs to read counter values, test improvements
  diff-1: Address review comments

Major changes include:
1. Implements an actor that enables in-memory caching of
TenantCacheEntry objects, allowing the caller to embed custom
information along with a TenantCacheEntry.
2. The cache follows read-through cache semantics, where an entry
is loaded from the underlying database on a miss.
3. The cache implements a "periodic poller" to refresh known Tenants
by consulting the database. Once a database keyrange-watch feature is
available, the cache shall be updated.

Bonus:
Implement a 'recurringAsync' addition to genericActors, allowing the caller
to schedule a periodic task by registering an "actor functor"; the routine
'waits' for the actor, unlike the existing 'recurring' implementation.

Testing

TenantEntryCache workload
devCorrectnessRun - 100K
2022-08-25 11:42:26 -07:00
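The read-through and periodic-refresh behavior described in commit 00fe4863b6 above can be sketched roughly as below. This is a simplified, synchronous illustration, not the TenantEntryCache actor: the class name, the `load` callback, and the `refresh()` method (standing in for one tick of the poller) are hypothetical.

```python
class ReadThroughCache:
    """Toy read-through cache with hit/miss counters and a refresh tick."""

    def __init__(self, load):
        self._load = load      # callable(name) -> entry, or None if absent
        self._entries = {}
        self.hits = 0
        self.misses = 0

    def get(self, name):
        if name in self._entries:
            self.hits += 1
            return self._entries[name]
        # Miss: fall through to the backing store and remember the result.
        self.misses += 1
        entry = self._load(name)
        if entry is not None:
            self._entries[name] = entry
        return entry

    def refresh(self):
        """One poller tick: re-read every known entry, dropping deleted ones."""
        for name in list(self._entries):
            entry = self._load(name)
            if entry is None:
                del self._entries[name]
            else:
                self._entries[name] = entry
```

For example, with a plain dict as the backing store, `ReadThroughCache(db.get)` serves the first `get` from the dict and later ones from memory until `refresh()` picks up changes.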
Josh Slocum e04e3885b9
Adding BlobRange test (#7868)
* Adding BlobRange test

* refactored blobrange test to use new db functions, and fixed several bugs

* bug fixes for blob manager and verifyBlobRange

* More range unaligned fixes

* cleaning up test and disabling tests that don't work yet for now

* removing overzealous assert in blob manager

* more fixes for overzealous assert

* cleanup and renaming test

* adding chaos to blob ranges test
2022-08-24 11:30:37 -05:00
Jon Fu 9eb313cb4b
Merge pull request #7791 from sfc-gh-jfu/jfu-metacluster-rename
Support tenant renaming in metacluster
2022-08-24 08:51:54 -07:00
Nim Wijetunga a857609478 refactor ekp interface 2022-08-23 23:04:12 -07:00
Jon Fu 8f011c9f73 Merge branch 'main' of github.com:apple/foundationdb into jfu-metacluster-rename 2022-08-22 12:04:18 -07:00
Jon Fu 8177e3d883 remove error checks that should not occur in test workload 2022-08-22 11:12:43 -07:00
Josh Slocum 5059a417aa
More improved purge tests (#7923)
* Added purgeAtLatest to BlobGranuleVerifier

* Also checking for merge resnapshot/fully delete granule purge races

* error check and count for purgeAtLatest

* changing test defaults back

* adding final data check after final availability check in blob granule verifier
2022-08-19 16:22:21 -05:00
He Liu aa5c3b7e9d Added ValidateStorage test. 2022-08-18 13:53:42 -07:00
Josh Slocum a42e81a795
Bg bug fixes3 (#7904)
* Fixing merge boundary recovery

* fixing an edge case in blob manager repeat recruitment

* fixing a race between tenant loading and key alignment

* formatting
2022-08-18 12:42:53 -05:00
He Liu 13cd4b973c Added ValidateStorage.actor.cpp 2022-08-17 17:09:21 -07:00
Josh Slocum 21298d418f
Improved purge tests (#7911)
* doing force purge at the end of the test if it didn't happen during

* better checking for granule metadata left over after purge

* retrying for in-flight granules that weren't known before purge

* letting test pass for a hard to fix purge case

* changed post-test force purge to only be after final availability check
2022-08-17 13:40:52 -07:00
Jon Fu 2836852a7f Merge branch 'main' of github.com:apple/foundationdb into jfu-metacluster-rename 2022-08-16 16:14:28 -07:00
Dennis Zhou 750c5a9ca0
Merge pull request #7835 from sfc-gh-dzhou/blobapi
blob: java bindings for various blob management functions
2022-08-16 15:11:22 -07:00
Josh Slocum d6e788affc
More force purge fixes (#7902)
* Made force purge more robust to split+merge races

* fix flush/force purge change feed leak

* Better fix for assign while force purge races

* more merge/force purge races

* cleanup
2022-08-16 15:49:59 -05:00
Dennis Zhou 1c2109dcbd blob: add rangeLimit to getBlobGranuleRanges() 2022-08-16 13:29:23 -07:00
Dennis Zhou 5085c21d0d blob: check if blob range exists and fail accordingly 2022-08-15 16:25:36 -07:00
Dennis Zhou 736ef4f2c9 blob: move blobrange command implementation to native api 2022-08-15 16:25:36 -07:00
Markus Pilman 072c10ed44
Merge pull request #7823 from sfc-gh-anoyes/anoyes/improve-determinism
Improve determinism (based on JOSHUA_SEED)
2022-08-15 14:09:49 -06:00
Markus Pilman eb16947860
Merge pull request #7844 from jzhou77/grv-error
Buggify GRV proxy to return errors and fix bugs found
2022-08-15 13:59:21 -06:00
Ata E Husain Bohra 03435b5133
Update BlobCipher cache to respect EKP/KMS cipherKey TTL (#7885)
Description

FDB native encryption data-at-rest supports in-memory caching of two types
of cipher keys:
1. Revocable keys - with a definite expiry (future timestamp)
2. Non-revocable keys - with or without expiry timestamp and/or
refreshAt timestamp.

The patch updates the BlobCipherKey in-memory cache to respect the EKP/KMS
supplied 'refreshAt' and 'expireAt' timestamps. GetLatestCipher
validates 'cipher key freshness', and GetCipherKey checks
for 'cipher key liveness', before replying with details to the caller.

The patch also optimizes BlobCipher module logging by taking the
following measures:
1. BLOB_CIPHER_DEBUG macro to guard spammy log messages needed
mostly for debugging failures.
2. Minimize log volume by logging cipherKey details only for new
keys added to the cache; key refreshes are not logged.
3. Categorize logs into debug, info and warn on a per-use-case basis

Testing

devRunCorrectness - 100K
EncryptOps.toml - 100K
2022-08-15 11:17:26 -07:00
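One way to read the freshness and liveness checks in commit 03435b5133 above is sketched here. It is an assumption-laden illustration, not the BlobCipher cache: the class and method names are hypothetical, and "freshness" is taken to mean "not past refreshAt and not expired" while "liveness" is taken to mean "not past expireAt".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CipherKey:
    key_id: int
    refresh_at: Optional[float]  # None: never needs a refresh
    expire_at: Optional[float]   # None: never expires

    def needs_refresh(self, now: float) -> bool:
        return self.refresh_at is not None and now >= self.refresh_at

    def is_expired(self, now: float) -> bool:
        return self.expire_at is not None and now >= self.expire_at


class CipherKeyCache:
    def __init__(self):
        self._keys = {}
        self._latest_id = None

    def insert(self, key: CipherKey) -> None:
        self._keys[key.key_id] = key
        self._latest_id = key.key_id

    def get_latest(self, now: float) -> Optional[CipherKey]:
        """Key for new encryption: must be fresh, else the caller refetches."""
        key = self._keys.get(self._latest_id)
        if key is None or key.needs_refresh(now) or key.is_expired(now):
            return None
        return key

    def get_by_id(self, key_id: int, now: float) -> Optional[CipherKey]:
        """Key for decryption: stale is acceptable, expired is not."""
        key = self._keys.get(key_id)
        if key is None or key.is_expired(now):
            return None
        return key
```

Returning `None` here stands in for "go back to the EKP/KMS"; a real cache would distinguish a miss from an expired key.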
Jon Fu 0c85efee43 Merge branch 'main' of github.com:apple/foundationdb into jfu-metacluster-rename 2022-08-14 11:34:16 -07:00
Jingyu Zhou 8fb6d59e94 Use error_code_grv_proxy_memory_limit_exceeded
instead of error_code_proxy_memory_limit_exceeded
2022-08-12 11:36:17 -07:00
Jingyu Zhou f912a74ec1 Fix TenantManagementWorkload.actor.cpp 2022-08-12 11:13:32 -07:00
Jingyu Zhou 6122cb6acd Fix LocalRatekeeper workload 2022-08-12 11:13:32 -07:00
Jingyu Zhou 0b735b6b11 Fix SpecialKeySpaceCorrectness workload 2022-08-12 11:13:32 -07:00
Jingyu Zhou 4497fe4ccf Fix FuzzApiCorrectness workload 2022-08-12 11:13:32 -07:00
Jingyu Zhou e5da35d7bf Fix TenantManagementWorkload with GRV errors 2022-08-12 11:13:32 -07:00
Josh Slocum 7c155f4521
Granule force purging (#7846)
* Granule purge cannot delete history entry for fully deleting granule until all children are completely done splitting

* Several purging fixes related to granule history

* Fixed typo in refactor

* fixing memory model for purgeRange

* formatting

* weakening granule purge test for now

* cleanup

* First version of force purging granules

* fixing issue in BW range assignment reporting

* Fixing incorrect assert with force purging

* Error handling when checking force purged state

* fixed force purging and recover/reassign range races and check

* Handling force purge + boundary change race

* more places to check for force purged status

* fixed manager restart in the middle of force purge bug

* fixing same-BM purge and assignment races in all cases

* weakening orphaned granule history check a bit because of difficult to solve races

* fixing txn options on retry

* loading force purged ranges at start to avoid resuming a merge that is being force purged

* cleanup

* Enabling purging in granule tests, and adding check for leaked change feeds in force purge

* formatting

* missed parameter in merge conflicts

* Fixing leaked change feed race with merge and force purge

* adding change feed cleanup when new blob manager recovers in-progress merge that raced with force purge

* added forcepurge fdbcli command
2022-08-11 15:22:32 -07:00
Jon Fu fe350f4514 remove debug traces 2022-08-11 14:28:05 -07:00
Jon Fu c1ce4b361b address code review comments 2022-08-11 14:25:14 -07:00
Jon Fu 0f90ecf6e1 basic breakpoints for debugging and fixes for bugs in concurrent delete + rename operations 2022-08-10 14:05:37 -07:00
Jon Fu 2d3614f830 add check for no cluster capacity in concurrency tenant workload 2022-08-09 13:41:25 -07:00
Jon Fu ea69fac3d4 account for cluster capacity error in tenant management workload 2022-08-09 13:28:26 -07:00
Jon Fu 6386f550d4 Merge branch 'main' of github.com:apple/foundationdb into jfu-metacluster-rename 2022-08-09 13:13:06 -07:00
Jon Fu 74df84f686 Merge branch 'main' of github.com:apple/foundationdb into jfu-metacluster-rename 2022-08-08 17:49:15 -07:00
Markus Pilman bef5b87472 add clogging workload 2022-08-08 14:30:41 -06:00
A.J. Beamon 1569f033c8 Reduce the number of extra databases to prevent using too many files 2022-08-08 12:47:35 -07:00
A.J. Beamon 91ccb82605 Fix: Decommissioning a metacluster could fail if some global state fields were set in the metacluster. Update the metacluster management workload to decommission the metacluster at the end. 2022-08-08 12:47:35 -07:00
Andrew Noyes 3ba2cd2fc9 Sort unit tests before selecting random unit test
This makes it so that the selected unit test does not depend on static
initialization order
2022-08-08 10:06:34 -07:00
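The determinism fix in commit 3ba2cd2fc9 above comes down to imposing a stable order before the random draw. A minimal sketch, assuming unit tests are identified by name and the seed is supplied by the caller; the function name is hypothetical:

```python
import random


def pick_unit_test(registered_names, seed):
    """Choose one test so the result depends only on the seed and the set
    of names, not on the order in which the tests were registered."""
    ordered = sorted(registered_names)
    return random.Random(seed).choice(ordered)
```

Without the `sorted()` call, two builds that register the same tests in a different static-initialization order could pick different tests for the same seed.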
Markus Pilman 956326dd5f
Merge pull request #7804 from sfc-gh-ajbeamon/move-max-tenants-buggify
Don't buggify max tenants per cluster globally; instead buggify it in specific tests
2022-08-08 10:32:02 -06:00
Jingyu Zhou 98c4c22d92
Merge pull request #7809 from sfc-gh-ajbeamon/test-harness-report-unit-tests
Report the unit tests being run in test harness's summarized output
2022-08-08 09:26:05 -07:00
Vaidas Gasiunas 79571dd2b4
Testing upgrades to a future version of FDB (#7780)
* Enable configuring the next future protocol version as the current protocol version in FDB client, fdbserver, and fdbcli

* Auto format python files used in upgrade tests

* Add a test for upgrading to a future FDB version

* Emphasize that the options for using future protocol version are intended for test purposes only

* Make the global variable for current protocol version visible only locally

* Refactoring to avoid using currentProtocolVersion() in static initialization

* Update go bindings
2022-08-08 17:29:49 +02:00
A.J. Beamon b42cb48dd4 Report the unit tests being run in test harness 2022-08-07 07:37:29 -07:00
A.J. Beamon b33a9da7ce Don't buggify max tenants per cluster globally; instead buggify it in specific tests 2022-08-06 06:10:45 -07:00
Josh Slocum f866ffc36b
Better granule conversion (#7787)
* better check for granule-ification

* Handling blob granule initial split too large

* Re-evaluating split size if too large, even if read doesn't get transaction_too_old

* reworked to have blob worker propose split key

* New GranuleStatusReply to avoid seqno issue stream side effects

* Handling retries on reevaluateInitialSplit properly

* Waiting for stream to be initialized

* Checking reevaluate split for additional split points beyond proposed

* Fixing more races in reevaluate initial split

* properly handling cleaning up old change feed after split re-evaluate

* fixing granule conversion bug with hard boundaries

* fixing clear and merge check race with cycle test

* refactor missed knob check for clearAndMerge

* Fixing formatting

* review comments and improving large range conversion

* fixing typo

* more formatting
2022-08-05 18:12:17 -05:00
Jon Fu cc85973957 Merge branch 'main' of github.com:apple/foundationdb into jfu-metacluster-rename 2022-08-05 12:38:43 -07:00
A.J. Beamon c8451d5f97
Merge pull request #7794 from sfc-gh-ajbeamon/fix-permanently-failed-tenant-creation
Fix: tenant creation in metacluster failure
2022-08-05 12:26:35 -07:00
Markus Pilman f716dfd772 Merge remote-tracking branch 'origin/main' into features/always-inject-faults 2022-08-05 10:56:12 -06:00
Josh Slocum b2835921ba
Using knownBlobRanges for blob granule ranges whether tenants are enabled or not (#7788)
* Using knownBlobRanges for blob granule ranges whether tenants are enabled or not

* Effectively disabled blob granule tests when tenants enabled to fix ctest
2022-08-05 11:46:09 -05:00
A.J. Beamon a7653b2859 Fix: when a tenant creation permanently failed on the data cluster and started over, it could incorrectly fail with a tenant already exists error if the subsequent retry successfully committed during a commit_unknown_result error. Also expand the tenant management concurrency workload to include configure operations. 2022-08-05 06:46:46 -07:00
Jon Fu 7a45aba74d Merge branch 'main' of github.com:apple/foundationdb into jfu-metacluster-rename 2022-08-04 16:36:09 -07:00
Jon Fu 8da261d28c add some rename ops to other test workloads 2022-08-04 16:35:44 -07:00
A.J. Beamon ff23d5994e
Merge pull request #7729 from sfc-gh-ajbeamon/feature-metacluster
Metacluster
2022-08-04 15:29:44 -07:00
Jon Fu 947d584b7e Merge branch 'feature-metacluster' of github.com:sfc-gh-ajbeamon/foundationdb into jfu-metacluster-rename 2022-08-04 14:52:49 -07:00
Jon Fu b3e129ac63 fix problem with commit_unknown_result on data cluster for metacluster rename 2022-08-04 14:51:09 -07:00
Fuheng Zhao d5c3679046 merge upstream main and resolve conflicts 2022-08-04 12:15:00 -07:00
A.J. Beamon 48f149a62e Fix a small test bug and remove some delays that don't seem to be needed anymore 2022-08-04 05:57:07 -07:00
A.J. Beamon fbe1a4a69a Use multiple databases in the metacluster management test. Fix a test bug as well as some issues with setting up multiple extra databases. 2022-08-03 19:10:34 -07:00
Jon Fu 0324600b3f Merge branch 'feature-metacluster' of github.com:sfc-gh-ajbeamon/foundationdb into jfu-metacluster-rename 2022-08-03 17:31:43 -07:00
Markus Pilman 2f7e6ee278 don't do failure injection for change config and quiescence tests 2022-08-03 17:04:06 -06:00
Josh Slocum 7f45cccb56
More granule purging fixes (#7756)
* Granule purge cannot delete history entry for fully deleting granule until all children are completely done splitting

* Several purging fixes related to granule history

* Fixed typo in refactor

* fixing memory model for purgeRange

* formatting

* weakening granule purge test for now

* cleanup

* review comments
2022-08-03 16:43:27 -05:00
A.J. Beamon 41af66bd4e Add a tenant consistency check and use it in the various tenant workloads 2022-08-03 13:33:45 -07:00
Jon Fu 1b59ff2b03 Merge branch 'feature-metacluster' of github.com:sfc-gh-ajbeamon/foundationdb into jfu-metacluster-rename 2022-08-03 12:21:50 -07:00
Jon Fu bceb44f5a1 address code review comments 2022-08-03 12:03:44 -07:00
A.J. Beamon 8e777a6330 Detect and handle inverted ranges in the get cluster list test. Remove some unused code. 2022-08-03 09:09:36 -07:00
Trevor Clinkenbeard edf4e60fa9
Merge pull request #7631 from sfc-gh-tclinkenbeard/global-tag-throttling5
Improvements to `GlobalTagThrottler`
2022-08-02 16:04:20 -07:00
Markus Pilman 31f05f2fdf Separated normal workloads and failure injection
Also first implementation with machine attrition
2022-08-02 17:04:02 -06:00
Fuheng Zhao dccf0c5fd7 add readType to other requests 2022-08-02 15:47:15 -07:00
Jon Fu 1fe1d04a6e Merge branch 'feature-metacluster' of github.com:sfc-gh-ajbeamon/foundationdb into jfu-metacluster-rename 2022-08-02 14:46:20 -07:00
Jon Fu 9805cef57c add debug messages for correctness checking 2022-08-02 14:35:38 -07:00
Dennis Zhou b34a54fa7f
blob: allow for alignment of granules to tuple boundaries (#7746)
* blob: read TenantMap during recovery

Future functionality in the blob subsystem will rely on the tenant data
being loaded. This fixes this issue by loading the tenant data before
completing recovery such that continued actions on existing blob
granules will have access to the tenant data.

Example scenario with failover, splits are restarted before loading the
tenant data:
BM - BlobManager
epoch 3:                        epoch 4:
  BM record intent to split.
  Epoch fails.
                                BM recovery begins.
  BM fails to persist split.
                                BM recovery finishes.
                                BM.checkBlobWorkerList()
                                  maybeSplitRange().
                                BM.monitorClientRanges().
                                  loads tenant data.

bin/fdbserver -r simulation -f tests/slow/BlobGranuleCorrectness.toml \
    -s 223570924 -b on  --crash --trace_format json

* blob: add tuple key truncation for blob granule alignment

FDB has a backup system available using the blob manager and blob
granule subsystem. If we want to audit the data in the blobs, it's a lot
easier if we can align them to something meaningful.

When a blob granule is being split, we ask the storage metrics system
for split points as it holds approximate data distribution metrics.
These keys are then processed to determine if they are a tuple and
should be truncated according to the new knob,
BG_KEY_TUPLE_TRUNCATE_OFFSET.

Here we keep all aligned keys together in the same granule even if it is
larger than the allowed granule size. The following commit will address
this by adding merge boundaries.
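
A rough sketch of the truncation step, under stated assumptions: keys are modeled as already-decoded Python tuples, and the knob is treated as the number of leading tuple elements to keep. Both are assumptions made for illustration; the actual semantics of BG_KEY_TUPLE_TRUNCATE_OFFSET are not described here, and `align_split_points` is a hypothetical name.

```python
def align_split_points(split_points, truncate_offset):
    """Truncate each proposed split key and drop duplicates.

    split_points: sorted list of tuples (hypothetical stand-in for decoded tuple keys).
    truncate_offset: number of leading elements to keep (assumed meaning of the knob).
    """
    aligned = []
    for key in split_points:
        truncated = key[:truncate_offset]
        # Keys sharing a truncated prefix collapse into one split point,
        # so all of them stay together in the same granule.
        if not aligned or aligned[-1] != truncated:
            aligned.append(truncated)
    return aligned
```

Because several proposed split points can collapse into one, a granule may end up larger than the size target, which matches the caveat in the commit message.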

* blob: minor clean ups in merging code

1. Rename mergeNow -> seen. This is more in line with clocksweep naming
   and removes the confusion between mergeNow and canMergeNow.
2. Make clearMergeCandidate() reset to MergeCandidateCannotMerge to make
   a clear distinction what we're accomplishing.
3. Rename canMergeNow() -> mergeEligble().

* blob: add explicit (hard) boundaries

Blob ranges can be specified either through explicit ranges or at the
tenant level. Right now this is managed implicitly. This commit aims to
make it a little more explicit.

Blobification begins in monitorClientRanges() which parses either the
explicit blob ranges or the tenant map. As we do this and add new
ranges, let's explicitly track what is a hard boundary and what isn't.

When blob merging occurs, we respect this boundary. When a hard boundary
is encountered, we submit the found eligible ranges and start looking
for a new range beginning with this hard boundary.
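
The scan described above can be sketched as follows. This is an illustrative model, not the blob manager's code: granules are plain `(begin, end, eligible)` triples, hard boundaries are a set of keys, and `group_merge_candidates` is a hypothetical name.

```python
def group_merge_candidates(granules, hard_boundaries):
    """Group adjacent merge-eligible granules without crossing a hard boundary.

    granules: list of (begin_key, end_key, eligible), sorted and contiguous.
    hard_boundaries: set of keys that no merged range may span.
    Returns a list of runs; each run is a list of (begin_key, end_key) with
    at least two granules, since a single granule has nothing to merge with.
    """
    runs = []
    current = []
    for begin, end, eligible in granules:
        # A hard boundary at this granule's start ends the current run.
        if begin in hard_boundaries and current:
            if len(current) >= 2:
                runs.append(current)
            current = []
        if eligible:
            current.append((begin, end))
        else:
            # An ineligible granule also ends the current run.
            if len(current) >= 2:
                runs.append(current)
            current = []
    if len(current) >= 2:
        runs.append(current)
    return runs
```

For four eligible granules a-b, b-c, c-d, d-e with a hard boundary at "c", the sketch yields two runs, one on each side of the boundary.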

* blob: create BlobGranuleSplitPoints struct

This is a setup for the following commit. Our goal here is to provide a
structure for split points to be passed around. The need is for us to be
able to carry uncommitted state until it is committed and we can apply
these mutations to the in-memory data structures.

* blob: implement soft boundaries

An earlier commit establishes the need to create data boundaries within
a tenant. The reality is we may encounter a set of keys that degenerate
to the same key prefix. We'll need to be able to split those across
granules, but we want to ensure we merge the split granules together
before merging with other granules.

This adds new BlobGranuleMergeBoundary items to the
BlobGranuleSplitPoints state. Each BlobGranuleMergeBoundary records
whether it is a left or right boundary. As with hard boundaries, this
information is used to force like granules to merge first.

We read the BlobGranuleMergeBoundary map into memory at recovery.
2022-08-02 16:06:25 -05:00