Commit Graph

356 Commits

Author SHA1 Message Date
Evan Tschannen 8872e5a462
Merge pull request #9347 from sfc-gh-etschannen/feature-change-feed-cache
added a disk to blob workers
2023-02-24 13:59:03 -08:00
Nim Wijetunga 29819b0645
Change Feed Bug Fix + Encryption Asserts (#9457)
* add encryption asserts

* modify function name

* address pr comments

* address pr comments

* Trigger Build
2023-02-23 19:33:25 -08:00
Evan Tschannen 8129381689 merge in main 2023-02-21 12:06:35 -08:00
Evan Tschannen 4f9e86b0a4 fixed two bugs that prevented the blob manager from properly loading worker affinity 2023-02-20 16:47:26 -08:00
Hui Liu aa1d983132 Truncate logs after force-flushing cold blob granules 2023-02-17 10:17:04 -08:00
Evan Tschannen 20bc868ee0 merge in main 2023-02-13 12:41:31 -08:00
Evan Tschannen bad8b2fad4 blob workers reboot with a different ID and register in the database their previous ID 2023-02-12 10:44:53 -08:00
A.J. Beamon 72c5abc0f5 Refactor storage quotas to store them in a key backed map in the tenant metadata space 2023-01-25 20:48:17 -08:00
Hui Liu e1b06a62f9 Add tenant metadata ranges to manifest backup 2023-01-24 17:09:04 -08:00
Hui Liu 36e8e5a3bb
Merge pull request #9176 from sfc-gh-huliu/restoreversion
Restore to a previous version
2023-01-19 17:37:23 -08:00
Hui Liu c85b984c3a blobrestore to a previous point in time 2023-01-19 14:45:01 -08:00
Nim Wijetunga 330ac71630
Tenant Deletion Support for Backup Mutation Log (#9103)
tenant deletion support for backup mutation log
2023-01-18 15:11:58 -08:00
He Liu 00203c8732
Validate Storage part II (#8471)
* Implemented AuditUtils.actor.cpp

Moved AuditUtils to fdbserver/

* Persist AuditStorageState.

* Passed persisted AuditStorageState test.

* Added audit_storage_error to indicate a corruption is caught.

Throw/Send audit_storage_error when there is a data corruption.

Added doAuditStorage() for resuming Audit.

* Load and resume AuditStorage when DD restarts.

* Generate audit id monotonically.

* Fixed minor issue AuditId/Type was not set.

* Adding getLatestAuditStates.

* Improved persisted errors and added AuditStorageCommand.actor.cpp for
fdbcli.

* Added `audit_storage` fdbcli command.

* fmt.

* Fixed null shared_ptr issue.

* Improve audit data.

* Change DDAuditFailed to SevWarn.

* Sev.

* set SERVE_AUDIT_STORAGE_PARALLELISM to 1.

* Moved AuditUtils* to fdbclient/.

* Added getAuditStatus fdbcli command.

* Refactor audit storage fdb cli commands.

* Added auditStorage in sim.

* Cleanup.

* Resolved comments.

* Resolved comments.

* Test disabling audit for sims.

* Cleanup.

Co-authored-by: He Liu <heliu@apple.com>
2023-01-15 21:46:14 -08:00
Xiaoxi Wang 5de0c87654 add comments; remove unnecessary actor suffix; code format 2023-01-02 23:59:46 -08:00
Xiaoxi Wang bbcb3cc018 extract KeyBackedConfig, StorageWiggleData class; solve template resolution problem; solve MV txn and native api conflict by splitting RunTransaction file 2023-01-02 23:34:39 -08:00
Xiaoxi Wang ccc494319c perpetual wiggle key functions 2022-12-08 16:46:05 -05:00
Nim Wijetunga 97713cadff
Update Encryption Mode DB Config Values (#8839)
* add encryption db config

* address pr comments

* address pr comments

* add comments

* add comment

* modify check

* change condition

* address pr comments

* simplify check

* address pr comments
2022-11-22 16:33:59 -08:00
Trevor Clinkenbeard cfd8396c8c
Merge pull request #8786 from sfc-gh-akejriwal/tenantgroup
Update the storage quota mechanisms to set quota on tenant groups instead of tenants
2022-11-11 13:12:28 -08:00
Hui Liu 5834517570 Add fdbcli blobrestore to start the full restore 2022-11-11 08:32:23 -08:00
Ankita Kejriwal abc4b45af1 Set the storage quota on tenant groups instead of tenants
Update all the relevant data structures and monitors accordingly.
2022-11-10 18:56:43 -08:00
Lukas Joswiak a8f8757f77 Rename cluster ID key
In FDB 7.1, this key was stored in the txnStateStore. In 7.2, it has
been moved to the database. This was causing protocol compatibility
issues during upgrades, so we need to rename the key.
2022-10-27 13:56:13 -07:00
Lukas Joswiak bba05b7c9b Move cluster ID from txnStateStore to the database
The cluster ID is now stored in the database instead of in the
txnStateStore. The cluster controller will read it on boot and send it
to all processes to persist.
2022-10-27 13:56:13 -07:00
Jingyu Zhou a8391caf23 Revert "Data loss protection v2" 2022-10-20 18:09:58 -05:00
Lukas Joswiak 9c847a20e8 Rename cluster ID key
In FDB 7.1, this key was stored in the txnStateStore. In 7.2, it has
been moved to the database. This was causing protocol compatibility
issues during upgrades, so we need to rename the key.
2022-10-18 21:37:42 -07:00
Lukas Joswiak 7342672c11 Move cluster ID from txnStateStore to the database
The cluster ID is now stored in the database instead of in the
txnStateStore. The cluster controller will read it on boot and send it
to all processes to persist.
2022-10-18 21:37:42 -07:00
Jingyu Zhou df5825ff65
Merge pull request #8398 from sfc-gh-anoyes/anoyes/idempotency-id2
Initial work for automatic idempotency
2022-10-13 13:07:14 -07:00
Andrew Noyes e77e6e51ec Use StringRef 2022-10-11 13:46:39 -07:00
Andrew Noyes d8fce2525d IdempotencyId -> IdempotencyIdRef 2022-10-11 13:46:39 -07:00
Andrew Noyes 6d1707a8b8 Write and inspect key value format 2022-10-11 13:46:39 -07:00
He Liu b52edd8658 Merge branch 'main' of https://github.com/apple/foundationdb into validate-data-consistency 2022-10-10 11:00:05 -07:00
He Liu 0e69b19a6f Added AuditUtils.actor.h 2022-09-19 12:15:49 -07:00
A.J. Beamon 4fd64630e8 Convert literal string ref instances to use _sr suffix 2022-09-19 11:35:58 -07:00
sfc-gh-ngoyal 1bd97fe628
Recruit new singleton for consistency checker. (#5804)
* Recruit new singleton for consistency checker.

* Recruit the consistency checker only if enabled.

* Add a yield in monitorConsistencyChecker().

* Minor fixes.

* Consistency check workload enhancements.

* Minor fixes and clarifications.

* clang format

* Clang format.

* Minor fixes, cleanup, debug tracing.

* Misc.

* Move the consistency scan information from dbconfig to a key backed object.

* Move consistency scan config out of db config to a state object and feature rename.

* ConsistencyCheck workload refactor.

* devFormat

* Update fdbcli/ConsistencyScanCommand.actor.cpp

* Review Comments.

Co-authored-by: negoyal <neelam.goyal@gmail.com>
Co-authored-by: Ata E Husain Bohra <ata.husain@snowflake.com>
2022-09-16 09:03:06 -07:00
Lukas Joswiak 74ac617a34 Add support for changing coordinators to the configuration database
Configuration database data lives on the coordinators. When a change
coordinators command is issued, the data must be sent to the new
coordinators to keep the database consistent.
2022-09-13 16:53:54 -07:00
Ata E Husain Bohra 28e608e717
Encryption data at-rest db-config (#7929)
* Encryption data at-rest db-config

Description

 diff-1: Handle 'force' updates to encryption_at_rest db-config

Major changes proposed:
1. Introduce 'encryption_data_at_rest_mode' 'configure new'
option to enable Encryption data at-rest. The feature is disabled
by default.
2. The configuration is meant to be set at the time of database
creation; additional checks to prevent updating the config will be
added in a subsequent PR.
3. DatabaseConfiguration validity check to account for "tenant_mode"
set to `required` if Encryption data at-rest is selected given
EncryptionDomain matches Tenant boundaries.

Testing

devCorrectness - 100K
2022-09-02 14:11:38 -07:00
Dennis Zhou 15c7331259 blob: fix blobRangeInactive = StringRef() 2022-08-31 09:23:33 -07:00
Dennis Zhou 8c33aa7b1d blob: create named values for blobRangeActive/blobRangeInactive
blobRangeActive = LiteralStringRef("1")
blobRangeInactive = LiteralStringRef("")
2022-08-15 16:25:36 -07:00
Josh Slocum 7c155f4521
Granule force purging (#7846)
* Granule purge cannot delete history entry for fully deleting granule until all children are completely done splitting

* Several purging fixes related to granule history

* Fixed typo in refactor

* fixing memory model for purgeRange

* formatting

* weakening granule purge test for now

* cleanup

* First version of force purging granules

* fixing issue in BW range assignment reporting

* Fixing incorrect assert with force purging

* Error handling when checking force purged state

* fixed force purging and recover/reassign range races and check

* Handling force purge + boundary change race

* more places to check for force purged status

* fixed manager restart in the middle of force purge bug

* fixing same-BM purge and assignment races in all cases

* weakening orphaned granule history check a bit because of difficult to solve races

* fixing txn options on retry

* loading force purged ranges at start to avoid resuming a merge that is being force purged

* cleanup

* Enabling purging in granule tests, and adding check for leaked change feeds in force purge

* formatting

* missed parameter in merge conflicts

* Fixing leaked change feed race with merge and force purge

* adding change feed cleanup when new blob manager recovers in-progress merge that raced with force purge

* added forcepurge fdbcli command
2022-08-11 15:22:32 -07:00
Vaidas Gasiunas 79571dd2b4
Testing upgrades to a future version of FDB (#7780)
* Enable configuring the next future protocol version as the current protocol version in FDB client, fdbserver, and fdbcli

* Auto format python files used in upgrade tests

* Add a test for upgrading to a future FDB version

* Emphasize that the options for using future protocol version are intended for test purposes only

* Make the global variable for current protocol version visible only locally

* Refactoring to avoid using currentProtocolVersion() in static initialization

* Update go bindings
2022-08-08 17:29:49 +02:00
Dennis Zhou b34a54fa7f
blob: allow for alignment of granules to tuple boundaries (#7746)
* blob: read TenantMap during recovery

Future functionality in the blob subsystem will rely on the tenant data
being loaded. This fixes this issue by loading the tenant data before
completing recovery such that continued actions on existing blob
granules will have access to the tenant data.

Example scenario with failover, splits are restarted before loading the
tenant data:
BM - BlobManager
epoch 3:                        epoch 4:
  BM record intent to split.
  Epoch fails.
                                BM recovery begins.
  BM fails to persist split.
                                BM recovery finishes.
                                BM.checkBlobWorkerList()
                                  maybeSplitRange().
                                BM.monitorClientRanges().
                                  loads tenant data.

bin/fdbserver -r simulation -f tests/slow/BlobGranuleCorrectness.toml \
    -s 223570924 -b on  --crash --trace_format json

* blob: add tuple key truncation for blob granule alignment

FDB has a backup system available using the blob manager and blob
granule subsystem. If we want to audit the data in the blobs, it's a lot
easier if we can align them to something meaningful.

When a blob granule is being split, we ask the storage metrics system
for split points as it holds approximate data distribution metrics.
These keys are then processed to determine if they are a tuple and
should be truncated according to the new knob,
BG_KEY_TUPLE_TRUNCATE_OFFSET.

Here we keep all aligned keys together in the same granule even if it is
larger than the allowed granule size. The following commit will address
this by adding merge boundaries.

* blob: minor clean ups in merging code

1. Rename mergeNow -> seen. This is more in line with clocksweep naming
   and removes the confusion between mergeNow and canMergeNow.
2. Make clearMergeCandidate() reset to MergeCandidateCannotMerge to make
   a clear distinction what we're accomplishing.
3. Rename canMergeNow() -> mergeEligble().

* blob: add explicit (hard) boundaries

Blob ranges can be specified either through explicit ranges or at the
tenant level. Right now this is managed implicitly. This commit aims to
make it a little more explicit.

Blobification begins in monitorClientRanges() which parses either the
explicit blob ranges or the tenant map. As we do this and add new
ranges, let's explicitly track what is a hard boundary and what isn't.

When blob merging occurs, we respect this boundary. When a hard boundary
is encountered, we submit the found eligible ranges and start looking
for a new range beginning with this hard boundary.

* blob: create BlobGranuleSplitPoints struct

This is a setup for the following commit. Our goal here is to provide a
structure for split points to be passed around. The need is for us to be
able to carry uncommitted state until it is committed and we can apply
these mutations to the in-memory data structures.

* blob: implement soft boundaries

An earlier commit establishes the need to create data boundaries within
a tenant. The reality is we may encounter a set of keys that degenerate
to the same key prefix. We'll need to be able to split those across
granules, but we want to ensure we merge the split granules together
before merging with other granules.

This adds new BlobGranuleMergeBoundary items to the
BlobGranuleSplitPoints state. BlobGranuleMergeBoundary contains state
saying if it is a left or right boundary. This information is used, as
with hard boundaries, to force merging of like granules first.

We read the BlobGranuleMergeBoundary map into memory at recovery.
2022-08-02 16:06:25 -05:00
A.J. Beamon a64693518a Add support for tenant groups 2022-07-26 09:04:29 -07:00
He Liu 7a8be255cd
Ss shard management (#7340)
* Storage server shard management with physical shards.

* Cleanup.

* Resolved comments.

* Added `UnlimintedCommitBytes`.

Co-authored-by: He Liu <heliu@apple.com>
2022-07-22 09:30:44 -07:00
A.J. Beamon 4d036ae339
Merge pull request #7626 from sfc-gh-ajbeamon/tenant-metadata-change
Some changes to the tenant metadata
2022-07-20 14:50:00 -07:00
Josh Slocum 78b6a96006 Merge branch 'main' into granule_merging_batch 2022-07-20 07:42:26 -05:00
A.J. Beamon 537ceff8ac Remove the ability to configure a tenant subspace. Rename the prefixes used for tenant metadata. 2022-07-19 14:32:05 -07:00
Trevor Clinkenbeard 1eec6f993d
Merge pull request #7572 from sfc-gh-akejriwal/tenantquota
Introduce storage quotas per tenant
2022-07-18 20:00:22 -07:00
Ankita Kejriwal 2757262ec9
Simplify storageQuotaKey() function implementation
Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2022-07-18 15:56:11 -07:00
Josh Slocum 08186f9245 More efficient merge intent and granule history serialization 2022-07-15 20:38:06 -05:00
Ata E Husain Bohra 24b2de8de8 BlobFile Encryption and compression support
Description

Testing
2022-07-14 17:04:14 -07:00
Ankita Kejriwal bb05321d24 Introduce storage quotas per tenant.
This change adds:
* ability to store the mapping from tenants to quota in the system keyspace,
* a setter and getter function
* a new workload to test this functionality

FDBCORE-2437
2022-07-14 16:35:12 -07:00