331 Commits

Author SHA1 Message Date
Jingyu Zhou df5825ff65
Merge pull request #8398 from sfc-gh-anoyes/anoyes/idempotency-id2
Initial work for automatic idempotency
2022-10-13 13:07:14 -07:00
Andrew Noyes e77e6e51ec Use StringRef 2022-10-11 13:46:39 -07:00
Andrew Noyes d8fce2525d IdempotencyId -> IdempotencyIdRef 2022-10-11 13:46:39 -07:00
Andrew Noyes 6d1707a8b8 Write and inspect key value format 2022-10-11 13:46:39 -07:00
He Liu b52edd8658 Merge branch 'main' of https://github.com/apple/foundationdb into validate-data-consistency 2022-10-10 11:00:05 -07:00
He Liu 0e69b19a6f Added AuditUtils.actor.h 2022-09-19 12:15:49 -07:00
A.J. Beamon 4fd64630e8 Convert literal string ref instances to use _sr suffix 2022-09-19 11:35:58 -07:00
sfc-gh-ngoyal 1bd97fe628
Recruit new singleton for consistency checker. (#5804)
* Recruit new singleton for consistency checker.

* Recruit the consistency checker only if enabled.

* Add a yield in monitorConsistencyChecker().

* Minor fixes.

* Consistency check workload enhancements.

* Minor fixes and clarifications.

* clang format

* Clang format.

* Minor fixes, cleanup, debug tracing.

* Misc.

* Move the consistency scan information from dbconfig to a key backed object.

* Move consistency scan config out of db config to a state object and feature rename.

* ConsistencyCheck workload refactor.

* devFormat

* Update fdbcli/ConsistencyScanCommand.actor.cpp

* Review Comments.

Co-authored-by: negoyal <neelam.goyal@gmail.com>
Co-authored-by: Ata E Husain Bohra <ata.husain@snowflake.com>
2022-09-16 09:03:06 -07:00
Lukas Joswiak 74ac617a34 Add support for changing coordinators to the configuration database
Configuration database data lives on the coordinators. When a change
coordinators command is issued, the data must be sent to the new
coordinators to keep the database consistent.
2022-09-13 16:53:54 -07:00
Ata E Husain Bohra 28e608e717
Encryption data at-rest db-config (#7929)
* Encryption data at-rest db-config

Description

 diff-1: Handle 'force' updates to encryption_at_rest db-config

Major changes proposed:
1. Introduce the 'encryption_data_at_rest_mode' 'configure new'
option to enable Encryption data at-rest. The feature is disabled
by default.
2. The configuration is meant to be set at the time of database
creation; additional checks will be done to avoid updating the config
in a subsequent PR.
3. DatabaseConfiguration validity check to account for "tenant_mode"
set to `required` if Encryption data at-rest is selected given
EncryptionDomain matches Tenant boundaries.

Testing

devCorrectness - 100K
2022-09-02 14:11:38 -07:00
Dennis Zhou 15c7331259 blob: fix blobRangeInactive = StringRef() 2022-08-31 09:23:33 -07:00
Dennis Zhou 8c33aa7b1d blob: create named values for blobRangeActive/blobRangeInactive
blobRangeActive = LiteralStringRef("1")
blobRangeInactive = LiteralStringRef("")
2022-08-15 16:25:36 -07:00
Josh Slocum 7c155f4521
Granule force purging (#7846)
* Granule purge cannot delete history entry for fully deleting granule until all children are completely done splitting

* Several purging fixes related to granule history

* Fixed typo in refactor

* fixing memory model for purgeRange

* formatting

* weakening granule purge test for now

* cleanup

* First version of force purging granules

* fixing issue in BW range assignment reporting

* Fixing incorrect assert with force purging

* Error handling when checking force purged state

* fixed force purging and recover/reassign range races and check

* Handling force purge + boundary change race

* more places to check for force purged status

* fixed manager restart in the middle of force purge bug

* fixing same-BM purge and assignment races in all cases

* weakening orphaned granule history check a bit because of difficult to solve races

* fixing txn options on retry

* loading force purged ranges at start to avoid resuming a merge that is being force purged

* cleanup

* Enabling purging in granule tests, and adding check for leaked change feeds in force purge

* formatting

* missed parameter in merge conflicts

* Fixing leaked change feed race with merge and force purge

* adding change feed cleanup when new blob manager recovers in-progress merge that raced with force purge

* added forcepurge fdbcli command
2022-08-11 15:22:32 -07:00
Vaidas Gasiunas 79571dd2b4
Testing upgrades to a future version of FDB (#7780)
* Enable configuring the next future protocol version as the current protocol version in FDB client, fdbserver, and fdbcli

* Auto format python files used in upgrade tests

* Add a test for upgrading to a future FDB version

* Emphasize that the options for using future protocol version are intended for test purposes only

* Make the global variable for current protocol version visible only locally

* Refactoring to avoid using currentProtocolVersion() in static initialization

* Update go bindings
2022-08-08 17:29:49 +02:00
Dennis Zhou b34a54fa7f
blob: allow for alignment of granules to tuple boundaries (#7746)
* blob: read TenantMap during recovery

Future functionality in the blob subsystem will rely on the tenant data
being loaded. This fixes this issue by loading the tenant data before
completing recovery such that continued actions on existing blob
granules will have access to the tenant data.

Example scenario with failover, splits are restarted before loading the
tenant data:
BM - BlobManager
epoch 3:                        epoch 4:
  BM record intent to split.
  Epoch fails.
                                BM recovery begins.
  BM fails to persist split.
                                BM recovery finishes.
                                BM.checkBlobWorkerList()
                                  maybeSplitRange().
                                BM.monitorClientRanges().
                                  loads tenant data.

bin/fdbserver -r simulation -f tests/slow/BlobGranuleCorrectness.toml \
    -s 223570924 -b on  --crash --trace_format json

* blob: add tuple key truncation for blob granule alignment

FDB has a backup system available using the blob manager and blob
granule subsystem. If we want to audit the data in the blobs, it's a lot
easier if we can align them to something meaningful.

When a blob granule is being split, we ask the storage metrics system
for split points as it holds approximate data distribution metrics.
These keys are then processed to determine if they are a tuple and
should be truncated according to the new knob,
BG_KEY_TUPLE_TRUNCATE_OFFSET.

Here we keep all aligned keys together in the same granule even if it is
larger than the allowed granule size. The following commit will address
this by adding merge boundaries.

* blob: minor clean ups in merging code

1. Rename mergeNow -> seen. This is more in line with clocksweep naming
   and removes the confusion between mergeNow and canMergeNow.
2. Make clearMergeCandidate() reset to MergeCandidateCannotMerge to make
   a clear distinction what we're accomplishing.
3. Rename canMergeNow() -> mergeEligble().

* blob: add explicit (hard) boundaries

Blob ranges can be specified either through explicit ranges or at the
tenant level. Right now this is managed implicitly. This commit aims to
make it a little more explicit.

Blobification begins in monitorClientRanges() which parses either the
explicit blob ranges or the tenant map. As we do this and add new
ranges, let's explicitly track what is a hard boundary and what isn't.

When blob merging occurs, we respect this boundary. When a hard boundary
is encountered, we submit the found eligible ranges and start looking
for a new range beginning with this hard boundary.

* blob: create BlobGranuleSplitPoints struct

This is a setup for the following commit. Our goal here is to provide a
structure for split points to be passed around. The need is for us to be
able to carry uncommitted state until it is committed and we can apply
these mutations to the in-memory data structures.

* blob: implement soft boundaries

An earlier commit establishes the need to create data boundaries within
a tenant. The reality is we may encounter a set of keys that degenerate
to the same key prefix. We'll need to be able to split those across
granules, but we want to ensure we merge the split granules together
before merging with other granules.

This adds to the BlobGranuleSplitPoints state of new
BlobGranuleMergeBoundary items. BlobGranuleMergeBoundary contains state
saying if it is a left or right boundary. This information is used to,
like hard boundaries, force merging of like granules first.

We read the BlobGranuleMergeBoundary map into memory at recovery.
2022-08-02 16:06:25 -05:00
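
A rough illustration of the hard-boundary rule described in the commit above ("When a hard boundary is encountered, we submit the found eligible ranges and start looking for a new range beginning with this hard boundary"). This is a minimal Python sketch, not FoundationDB's C++ implementation. The function name `group_merge_candidates`, the `(begin, end)` tuple representation of a granule, and the assumption that every granule passed in is already merge-eligible are hypothetical and chosen only for illustration.

```python
def group_merge_candidates(granules, hard_boundaries):
    """Group adjacent granules into merge candidates without crossing a hard boundary.

    granules: list of (begin_key, end_key) tuples, sorted and contiguous.
              Assumption: all of them are already eligible for merging.
    hard_boundaries: set of keys that a merged range must not span.
    """
    groups = []
    current = []
    for begin, end in granules:
        # A granule that starts at a hard boundary closes the running group
        # and begins a new one at that boundary.
        if current and begin in hard_boundaries:
            groups.append(current)
            current = []
        current.append((begin, end))
    if current:
        groups.append(current)
    return groups


# Hypothetical usage: "c" is a hard boundary, so no group spans it.
candidates = group_merge_candidates(
    [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")],
    hard_boundaries={"c"},
)
```

A group containing a single granule has nothing to merge with; a caller would presumably skip such groups.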
A.J. Beamon a64693518a Add support for tenant groups 2022-07-26 09:04:29 -07:00
He Liu 7a8be255cd
Ss shard management (#7340)
* Storage server shard management with physical shards.

* Cleanup.

* Resolved comments.

* Added `UnlimintedCommitBytes`.

Co-authored-by: He Liu <heliu@apple.com>
2022-07-22 09:30:44 -07:00
A.J. Beamon 4d036ae339
Merge pull request #7626 from sfc-gh-ajbeamon/tenant-metadata-change
Some changes to the tenant metadata
2022-07-20 14:50:00 -07:00
Josh Slocum 78b6a96006 Merge branch 'main' into granule_merging_batch 2022-07-20 07:42:26 -05:00
A.J. Beamon 537ceff8ac Remove the ability to configure a tenant subspace. Rename the prefixes used for tenant metadata. 2022-07-19 14:32:05 -07:00
Trevor Clinkenbeard 1eec6f993d
Merge pull request #7572 from sfc-gh-akejriwal/tenantquota
Introduce storage quotas per tenant
2022-07-18 20:00:22 -07:00
Ankita Kejriwal 2757262ec9
Simplify storageQuotaKey() function implementation
Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2022-07-18 15:56:11 -07:00
Josh Slocum 08186f9245 More efficient merge intent and granule history serialization 2022-07-15 20:38:06 -05:00
Ata E Husain Bohra 24b2de8de8 BlobFile Encryption and compression support
Description

Testing
2022-07-14 17:04:14 -07:00
Ankita Kejriwal bb05321d24 Introduce storage quotas per tenant.
This change adds:
* ability to store the mapping from tenants to quota in the system keyspace,
* a setter and getter function
* a new workload to test this functionality

FDBCORE-2437
2022-07-14 16:35:12 -07:00
Josh Slocum 0b0ac16a4c Merge branch 'main' into granule_merging 2022-07-12 09:09:30 -05:00
He Liu bc5bfaffda
Shard based move (#6981)
* Shard based move.

* Clean up.

* Clear results on retry in getInitialDataDistribution.

* Remove assertion on SHARD_ENCODE_LOCATION_METADATA for compatibility.

* Resolved comments.

Co-authored-by: He Liu <heliu@apple.com>
2022-07-07 20:49:16 -07:00
A.J. Beamon 26b35c07cd Refactor how tenant map entries are encoded and decoded. Add a specific version to the encoding that matches the version used when this feature was introduced (and the only version in which it was used). 2022-06-29 10:58:58 -07:00
Josh Slocum d6920cde28 Implemented blob granule merging 2022-06-09 10:50:53 -05:00
Xiaoxi Wang b0c26e93b2 remove size() method 2022-05-13 12:55:19 -07:00
Xiaoxi Wang 5bb70dda91 add remainedBytes method 2022-05-06 10:39:45 -07:00
Xiaoxi Wang 7ce53ca164 add SkewReadWriteWorkload 2022-05-05 23:53:51 -07:00
Bharadwaj V.R 9f11a8e6a4
Update fdbclient/SystemData.cpp
Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2022-04-12 10:19:38 -07:00
Bharadwaj V.R 9d00eacfb2
Merge branch 'apple:main' into block-down 2022-04-10 21:50:53 -07:00
Josh Slocum 6276cebad9
Blob integration (#6808)
* Fixing leaked stream with explicit notify failed before destructor

* better logic to prevent races in change feed fetching

* Found new race that makes assert incorrect

* handle server overloaded in initial read from fdb

* Handling more blob error types in granule retry

* Fixing rollback metadata problem, added better debugging

* Fixing version race when fetching change feed metadata

* Better racing split request handling

* fixing assert

* Handle change feed popped check in the blob worker

* fix: do not use a RYW transaction for a versionstamp because of randomize API version (#6768)

* more merge conflict issues

* Change feed destroy fixes

* Fixing change feed destroy and move race

* Check error condition in BG file req

* Using relative endpoints for blob worker interface

* Fixing bug in previous fix

* More destroy and move race fixes

* Don't update empty version on destroy in case it gets rolled back. moved() and removing will take care of ensuring it is not read

* Bug fix (#6796)

* fix: do not use a RYW transaction for a versionstamp because of randomize API version

* fix: if the initialSnapshotVersion was pruned, granule history was incorrect

* added a way to compress null bytes in printable()

* Fixing durability issue with moving and destroying change feeds

* Adding fix for not fully deleting files for a granule that child granules need to re-snapshot

* More destroy and move races

* Fixing change feed destroy and pop races

* Renaming bg prune to purge, and adding a C api and unit test for it

* more cleanup

* review comments

* Observability for granule purging

* better handling for change feed not registered

* Fixed purging bugs (#6815)

* fix: do not use a RYW transaction for a versionstamp because of randomize API version

* fix: if the initialSnapshotVersion was pruned, granule history was incorrect

* added a way to compress null bytes in printable()

* fixed a few purging bugs

Co-authored-by: Evan Tschannen <evan.tschannen@snowflake.com>
2022-04-08 14:15:25 -07:00
Lukas Joswiak 73a7c32982
Add fdbcli command to read/write version epoch (#6480)
* Initialize cluster version at wall-clock time

Previously, new clusters would begin at version 0. After this change,
clusters will initialize at a version matching wall-clock time. Instead
of using the Unix epoch (or Windows epoch), FDB clusters will use a new
epoch, defaulting to January 1, 2010, 01:00:00+00:00. In the future,
this base epoch will be modifiable through fdbcli, allowing
administrators to advance the cluster version.

Basing the version off of time allows different FDB clusters to share
data without running into version issues.

* Send version epoch to master

* Cleanup

* Update fdbserver/storageserver.actor.cpp

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>

* Jump directly to expected version if possible

* Fix initial version issue on storage servers

* Add random recovery offset to start version in simulation

* Type fixes

* Disable reference time by default

Enable on a cluster using the fdbcli command `versionepoch add 0`.

* Use correct recoveryTransactionVersion when recovering

* Allow version epoch to be adjusted forwards (to decrease the version)

* Set version epoch in simulation

* Add quiet database check to ensure small version offset

* Fix initial version issue on storage servers

* Disable reference time by default

Enable on a cluster using the fdbcli command `versionepoch add 0`.

* Add fdbcli command to read/write version epoch

* Cause recovery when version epoch is set

* Handle optional version epoch key

* Add ability to clear the version epoch

This causes version advancement to revert to the old methodology, in which
versions attempt to advance by about a million versions per second
instead of trying to match the clock.

* Update transaction access

* Modify version epoch to use microseconds instead of seconds

* Modify fdbcli version target API

Move commands from `versionepoch` to `targetversion` top level command.

* Add fdbcli tests for

* Temporarily disable targetversion cli tests

* Fix version epoch fetch issue

* Fix Arena issue

* Reduce max version jump in simulation to 1,000,000

* Rework fdbcli API

It now requires two commands to fully switch a cluster to using the
version epoch. First, enable the version epoch with `versionepoch
enable` or `versionepoch set <versionepoch>`. At this point, versions
will be given out at a faster or slower rate in an attempt to reach the
expected version. Then, run `versionepoch commit` to perform a one time
jump to the expected version. This is essentially irreversible.

* Temporarily disable old targetversion tests

* Cleanup

* Move version epoch buggify to sequencer

This will cause some issues with the QuietDatabase check for the version
offset - namely, it won't do anything, since the version epoch is not
being written to the txnStateStore in simulation. This will get fixed in
the future.

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2022-04-08 12:33:19 -07:00
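
The version epoch idea in the commit above can be sketched as simple arithmetic. The commit message gives the default epoch (January 1, 2010, 01:00:00+00:00), the rate of about a million versions per second, and the switch to microsecond resolution. The rest is an assumption for illustration: the names `expected_version` and `version_offset` are hypothetical, and the sketch assumes the target version is exactly the number of microseconds elapsed since the epoch. It is not FoundationDB's implementation.

```python
from datetime import datetime, timedelta, timezone

# Default epoch as stated in the commit message.
DEFAULT_VERSION_EPOCH = datetime(2010, 1, 1, 1, 0, 0, tzinfo=timezone.utc)
# "about a million versions per second", per the commit message.
VERSIONS_PER_SECOND = 1_000_000


def expected_version(now, epoch=DEFAULT_VERSION_EPOCH):
    """Version a cluster would target at wall-clock time `now` (assumed model)."""
    elapsed_us = (now - epoch) // timedelta(microseconds=1)
    return elapsed_us * VERSIONS_PER_SECOND // 1_000_000


def version_offset(current_version, now, epoch=DEFAULT_VERSION_EPOCH):
    """How far the current version lags (positive) or leads (negative) the target."""
    return expected_version(now, epoch) - current_version


# Hypothetical usage: a cluster still at version 0, two seconds after the epoch.
lag = version_offset(0, DEFAULT_VERSION_EPOCH + timedelta(seconds=2))
```

Under this model, a positive offset corresponds to the case where versions would be handed out faster to catch up, and the `versionepoch commit` step described above would jump directly to the expected version.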
Bharadwaj V.R 3fbbf415e7 Properly encapsulate SWVersion and create a couple of UTs for sw version testing 2022-04-04 18:42:52 -07:00
Bharadwaj V.R d175599713 resolve merge conflict from upstream 2022-04-01 15:50:38 -07:00
Bharadwaj V.R be70a57cae Merge branch 'main' of https://github.com/apple/foundationdb into apple-main 2022-04-01 15:50:09 -07:00
Bharadwaj V.R 3410ab3bd2 Revert "Create new system key for tracking latest software version installed on the cluster"
This reverts commit 72d99cc5b1.
2022-03-31 14:10:37 -07:00
Bharadwaj V.R 60c146bd30
Merge branch 'apple:main' into block-down 2022-03-31 00:11:44 -07:00
Bharadwaj V.R c009eba7ed Add last-sw-version tracking to SWVersion structure 2022-03-31 00:09:58 -07:00
Bharadwaj V.R 7926917d5f
Merge branch 'apple:main' into ssupdateb4registration 2022-03-29 13:04:19 -07:00
Josh Slocum 61474d5d54 Future-proof blob granules with full file size 2022-03-29 08:06:07 -05:00
Bharadwaj V.R dd3a453f5b Address suggestions to make new SSI member private, and reduce the number of serialization methods for serverList value 2022-03-28 23:52:26 -07:00
Bharadwaj V.R 726cb3a18f merge commits from main 2022-03-28 22:49:03 -07:00
Bharadwaj V.R 301e64a1b6 Remove unit tests added for SSI upgrade/downgrade 2022-03-25 13:27:55 -07:00
Bharadwaj V.R 961e4ae7fd ratekeeper and ser-des fixes 2022-03-24 17:25:07 -07:00
Josh Slocum f27475e2f4 Merge branch 'main' into blob_integration 2022-03-22 11:41:58 -05:00
sfc-gh-tclinkenbeard a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00