6610 Commits

Author SHA1 Message Date
A.J. Beamon 7284e691fb Fix a few minor restore bugs and add a dry-run mode. Some improvements to the fdbcli output. 2023-02-14 12:28:55 -08:00
A.J. Beamon f3b58a063f Fix some merge issues and review comments 2023-02-13 15:32:44 -08:00
A.J. Beamon 958ff862e0 Fix some merge issues 2023-02-13 12:59:48 -08:00
A.J. Beamon 98407809d9 Merge branch 'main' into metacluster-mgmt-restore
# Conflicts:
#	fdbcli/MetaclusterCommands.actor.cpp
#	fdbclient/Metacluster.cpp
#	fdbclient/include/fdbclient/MetaclusterManagement.actor.h
#	fdbserver/workloads/MetaclusterManagementWorkload.actor.cpp
#	tests/CMakeLists.txt
2023-02-13 12:30:33 -08:00
A.J. Beamon 0127dd4b5a
Merge pull request #9356 from sfc-gh-ajbeamon/metacluster-concurrency-testing
Add metacluster concurrency test and fix various bugs that it found
2023-02-13 11:57:47 -08:00
Steve Atherton 41fa3eada9
Merge branch 'main' into add-redwood-slack-knob 2023-02-12 19:31:20 -08:00
A.J. Beamon a261c1d94c Run tenant management concurrency alongside metacluster management concurrency. Fix a few issues where performing tenant operations returned undesirable errors when the associated cluster was removed. 2023-02-11 19:46:47 -08:00
Xiaoxi Wang 93f892c085
Merge pull request #9340 from sfc-gh-xwang/fix/main/tenantList
fix the way verifyListFilter detects tenant state changes
2023-02-11 17:20:46 -08:00
A.J. Beamon e6021f8326 Add Jon's metacluster concurrency test and fix various bugs that it found 2023-02-11 15:15:32 -08:00
Xiaoxi Wang 21a2378de5
Merge pull request #9298 from sfc-gh-xwang/feature/main/clearRange
Split raw clear ranges across tenants in required mode
2023-02-11 14:29:46 -08:00
Xiaoxi Wang a9c7632c83 Merge branch 'main' of https://github.com/apple/foundationdb into fix/main/tenantList 2023-02-11 13:54:27 -08:00
A.J. Beamon ee1b48323d
Merge pull request #9346 from sfc-gh-nwijetunga/nim/global-tenant-ids
Support for Two Byte Prefix for Tenant IDs
2023-02-11 11:31:24 -08:00
A.J. Beamon 4579a4319d Merge branch 'main' into storage-quota-in-tenant-metadata-space 2023-02-11 09:04:15 -08:00
Xiaoxi Wang a0f7943fc3 simplify implementation of lowerBoundTenantId and withinSingleTenant 2023-02-10 22:14:59 -08:00
Nim Wijetunga 9e5c61e127 address pr comments 2023-02-10 15:56:41 -08:00
Jingyu Zhou 814350c4e6
Merge pull request #9338 from jzhou77/fix
Fix DD stuck when remote DC is dead
2023-02-10 15:19:17 -08:00
Nim Wijetunga de9eef72ff address pr comments 2023-02-10 13:49:15 -08:00
Xiaoxi Wang bb8d96c026 Merge branch 'main' of https://github.com/apple/foundationdb into feature/main/clearRange 2023-02-10 12:30:16 -08:00
Xiaoxi Wang ffadea08cb change isSingleTenant check; add unit tests 2023-02-10 12:29:38 -08:00
Jingyu Zhou 5232a21005
Merge pull request #9344 from sfc-gh-akejriwal/valgrind 2023-02-10 12:04:31 -08:00
Jingyu Zhou 6c4a9b5f23 Fix DD stuck when remote DC is dead
When the remote DC is down, the remote team collection of DD can still be
initializing, waiting for the remote to recover (all_tlog_recruited state).
However, the getTeam request can already be served by the remote team
collection. So a RelocateShard (a data movement such as a split or move) gets a
team for the remote DC, but the data movement can't make progress on that team
because the remote DC hasn't recovered yet. Because the data movement is stuck,
the primary cannot reach the "storage_recovered" state and stays in the
accepting_commit state.

The specific test failure: slow/ApiCorrectness.toml -s 339026305 -b on
at commit: 0edd899d65

In this test, the primary DC has 1 SS killed, and the remote DC has 2 TLogs and
2 SSes killed. So the remote is dead: the remaining 2 SSes can't make progress
because of the loss of the 2 TLogs. repairDeadDatacenter() can't reach the
"storage_recovered" state because DD fails to move shards away from the killed
SS in the primary.

The fix is to exclude all of the remote in repairDeadDatacenter(), which tells
DD to mark all SSes in the remote as unhealthy. Another fix is to return empty
results for a getTeam request if the remote team collection is not ready. This
allows the data movement to continue; essentially, the remote team is not
changed for the data movement.
2023-02-10 11:11:07 -08:00
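
The second fix described in commit 6c4a9b5f23 (returning empty results for a getTeam request while the remote team collection is not ready) can be illustrated with a small model. This is a minimal Python sketch, not FoundationDB code: `TeamCollection`, `get_team`, and `pick_destinations` are hypothetical names invented here, and the real logic lives in the C++ data distributor.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TeamCollection:
    """Hypothetical stand-in for a DD team collection (not FoundationDB's class)."""
    name: str
    ready: bool
    teams: List[str] = field(default_factory=list)

    def get_team(self) -> Optional[str]:
        # Assumption: a collection that is still initializing answers with
        # "no team" instead of handing out a team that cannot make progress.
        if not self.ready or not self.teams:
            return None
        return self.teams[0]


def pick_destinations(primary: TeamCollection, remote: TeamCollection) -> List[str]:
    """Collect destination teams, skipping any collection that offers none."""
    destinations = []
    for collection in (primary, remote):
        team = collection.get_team()
        if team is not None:
            destinations.append(team)
    return destinations
```

In this model, a relocation requested while the remote collection is not ready proceeds with the primary team only, which mirrors the commit's statement that the remote team is left unchanged for the data movement.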
A.J. Beamon 13eee09ce8 Merge branch 'main' into metacluster-mgmt-restore 2023-02-10 10:58:01 -08:00
A.J. Beamon 0e078435ab Remove unnecessary try/catch 2023-02-10 10:57:37 -08:00
A.J. Beamon 4b13c9c211 Make a few minor fixes, refactor some code for clarity, and improve throughput of repopulating a management cluster 2023-02-10 10:41:55 -08:00
Ata E Husain Bohra ce49bfb8ac
EaR: Fix RandomUnitTest (#9339)
Description

Set the `enable_configurable_encryption` knob in the unit test so that
RandomUnitTest runs succeed

Testing

BlobCipherUnitTest
EncryptionOps
RandomUnitTest
2023-02-10 10:35:08 -08:00
Nim Wijetunga 8a3f3ea674 clean up code 2023-02-10 01:01:16 -08:00
Xiaoxi Wang 09da7efdc0 handle clear range when tenantMap.size() == 0 2023-02-09 22:28:18 -08:00
Ata E Husain Bohra f30c5a13ac
EaR: Configurable Encryption feature support for BlobGranules (#9343)
Description

This patch updates the BlobGranule encryption code to support configurable
encryption semantics

Testing

BlobGranuleCorrectness* - 100K
2023-02-09 21:13:56 -08:00
Nim Wijetunga fed650894d working version 2023-02-09 21:10:40 -08:00
Ankita Kejriwal f5a01ebac1 Add a default value for version in `WaitMetricsRequest` 2023-02-09 19:37:45 -08:00
Xiaoxi Wang 53923c77cb Merge branch 'main' of https://github.com/apple/foundationdb into fix/main/tenantList 2023-02-09 17:27:54 -08:00
A.J. Beamon 788095536b Merge branch 'main' into storage-quota-in-tenant-metadata-space 2023-02-09 16:52:44 -08:00
A.J. Beamon f9a68056ac Add support for modifying a data cluster that is being restored so that we can manage conflicts 2023-02-09 15:33:40 -08:00
Nim Wijetunga b7ef50d1f8 initial commit 2023-02-09 14:32:29 -08:00
Hui Liu cb9d4d8bb5
Merge pull request #9276 from sfc-gh-huliu/manifest
Split blob manifest as segments when writing
2023-02-09 13:54:02 -08:00
Xiaoxi Wang 47ac2a8fb2 fix the way verifyListFilter detects tenant state changes 2023-02-09 11:34:14 -08:00
sfc-gh-tclinkenbeard 31c3365215 Increase default value for MAX_TRANSACTION_TAG_LENGTH 2023-02-09 11:31:10 -08:00
Hui Liu 6b6959d35f Split blob manifest as segments when writing 2023-02-09 11:26:19 -08:00
A.J. Beamon 2d59c5681d Bug fixes and test improvements for management cluster restoration 2023-02-09 08:42:23 -08:00
A.J. Beamon 1815254b76
Merge pull request #9333 from sfc-gh-xwang/fix/main/tenantList
Fix list tenant name bug
2023-02-09 08:33:23 -08:00
Ata E Husain Bohra 9c649d7880
EaR: Configurable encryption framework (#9271)
* EaR: Configurable encryption framework

Description

The EaR implementation only supports a fixed-size on-disk encryption header format.
One drawback of this scheme is that introducing a newer encryption scheme, or
updating the header format in the future, may incur data migration restrictions.
Major changes proposed in the patch include:
1. Flexible encryption header format allowing the following:
 1.1. Header flags (metadata) can evolve separately from the encryption algorithm.
 1.2. Algorithm-specific encryption header to allow future extensions.
2. Update the BlobCipher encryption/decryption util classes to work with the newer
encryption header format.
3. Continue supporting multiple encryption authentication schemes, such as
HMAC-SHA and AES-CMAC, as well as no encryption authentication.
4. Refactor the BlobCipher unit test to enable testing of the new format.
5. Configuration knobs to control encryption header flags and algorithm
versions.

Note:
The on-disk header storage footprint savings due to the newer scheme are as follows:
1. No encryption authentication: 54% smaller compared to the existing implementation.
2. AES-CMAC: 16% smaller compared to the existing implementation.
3. HMAC-SHA encryption authentication: almost the same size.


Testing

BlobCipherTest
EncryptionOpsTest
2023-02-08 22:51:05 -08:00
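
The idea in commit 9c649d7880 of a flexible header, where the flags section evolves separately from an algorithm-specific section, can be sketched in a few lines. This is only an illustration under stated assumptions: the byte layout, field names, `AuthMode` values, and the 16-byte IV are invented here and are not the on-disk format used by FoundationDB. The tag lengths assume AES-CMAC (16 bytes) and HMAC with SHA-256 (32 bytes).

```python
import struct
from enum import IntEnum


class AuthMode(IntEnum):
    """Hypothetical authentication modes; the numeric values are made up."""
    NONE = 0
    AES_CMAC = 1
    HMAC_SHA = 2


# Tag length per mode in bytes (assumes HMAC-SHA means HMAC-SHA-256).
AUTH_TAG_LEN = {AuthMode.NONE: 0, AuthMode.AES_CMAC: 16, AuthMode.HMAC_SHA: 32}

FLAGS_FMT = "<BBB"  # flags version, algorithm header version, auth mode
FLAGS_LEN = struct.calcsize(FLAGS_FMT)
IV_LEN = 16  # assumed IV size for the algorithm-specific section


def encode_header(flags_version, algo_version, auth_mode, iv, auth_tag=b""):
    """Serialize the flags section followed by the algorithm-specific section."""
    if len(iv) != IV_LEN:
        raise ValueError("unexpected IV length")
    if len(auth_tag) != AUTH_TAG_LEN[auth_mode]:
        raise ValueError("auth tag length does not match auth mode")
    flags = struct.pack(FLAGS_FMT, flags_version, algo_version, int(auth_mode))
    return flags + iv + auth_tag


def decode_header(buf):
    """Read the flags first, then use them to size the algorithm section."""
    flags_version, algo_version, mode = struct.unpack_from(FLAGS_FMT, buf, 0)
    auth_mode = AuthMode(mode)  # raises ValueError for an unknown mode
    if len(buf) != FLAGS_LEN + IV_LEN + AUTH_TAG_LEN[auth_mode]:
        raise ValueError("header length does not match auth mode")
    iv = buf[FLAGS_LEN:FLAGS_LEN + IV_LEN]
    auth_tag = buf[FLAGS_LEN + IV_LEN:]
    return flags_version, algo_version, auth_mode, iv, auth_tag
```

Because the decoder sizes the rest of the header from the flags, a header without authentication carries no tag bytes at all, which is the kind of variable footprint the commit's note describes. The sizes produced by this sketch are not meant to reproduce the percentages quoted there.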
Xiaoxi Wang fcebcbcf72 reset TenantManagement listTenantMetadataTransaction 2023-02-08 21:11:43 -08:00
Xiaoxi Wang e5d3fd2e96 add debug events; fix list tenant name bug 2023-02-08 20:37:27 -08:00
Xiaoxi Wang a77bf236c7 fix unit test bug 2023-02-08 16:09:32 -08:00
Xiaoxi Wang e75f38a6fc extract pushToBackupMutations method and add single tenant range validation 2023-02-08 14:35:35 -08:00
Jingyu Zhou 1a9aed795f
Merge pull request #9327 from sfc-gh-tclinkenbeard/enable-gtt-by-default
Enable `GLOBAL_TAG_THROTTLING` by default
2023-02-08 14:03:39 -08:00
Josh Slocum 1e5bac6238
fixing fault injection stalling change feed fetch (#9316) 2023-02-08 15:49:56 -06:00
Xiaoxi Wang 1ff7865ee2 Merge branch 'main' of https://github.com/apple/foundationdb into feature/main/clearRange 2023-02-08 13:38:55 -08:00
sfc-gh-tclinkenbeard 484dc4f74c Enable GLOBAL_TAG_THROTTLING by default 2023-02-08 11:34:31 -08:00
Ankita Kejriwal b9630e57d9
Merge pull request #9291 from sfc-gh-akejriwal/debug2
Update storage metrics functions to use the version at which the tenant was read
2023-02-08 11:33:33 -08:00