foundationdb/tests
Dennis Zhou b34a54fa7f
blob: allow for alignment of granules to tuple boundaries (#7746)
* blob: read TenantMap during recovery

Future functionality in the blob subsystem will rely on the tenant data
being loaded. Load the tenant data before completing recovery so that
continued actions on existing blob granules have access to it.

Example scenario with failover, where splits are restarted before the
tenant data is loaded:
BM - BlobManager
epoch 3:                        epoch 4:
  BM record intent to split.
  Epoch fails.
                                BM recovery begins.
  BM fails to persist split.
                                BM recovery finishes.
                                BM.checkBlobWorkerList()
                                  maybeSplitRange().
                                BM.monitorClientRanges().
                                  loads tenant data.

bin/fdbserver -r simulation -f tests/slow/BlobGranuleCorrectness.toml \
    -s 223570924 -b on  --crash --trace_format json

* blob: add tuple key truncation for blob granule alignment

FDB has a backup system available using the blob manager and blob
granule subsystem. If we want to audit the data in the blobs, it's a lot
easier if we can align them to something meaningful.

When a blob granule is being split, we ask the storage metrics system
for split points as it holds approximate data distribution metrics.
These keys are then processed to determine if they are a tuple and
should be truncated according to the new knob,
BG_KEY_TUPLE_TRUNCATE_OFFSET.

Here we keep all aligned keys together in the same granule even if it is
larger than the allowed granule size. The following commit will address
this by adding merge boundaries.
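The alignment step described above can be sketched as follows. This is a
simplified illustration, not the actual implementation: keys are modeled
as Python tuples instead of FDB tuple-encoded bytes, and the `keep`
parameter (the number of leading tuple elements to retain) is an
assumption that stands in for BG_KEY_TUPLE_TRUNCATE_OFFSET, whose exact
semantics are not spelled out here. The function names are hypothetical.

```python
def align_split_point(key, keep):
    """Truncate a tuple key to its first `keep` elements.

    Keys that are not tuples (modeled here as raw bytes) or that are
    already short enough are returned unchanged.
    """
    if isinstance(key, tuple) and len(key) > keep:
        return key[:keep]
    return key


def align_split_points(keys, keep):
    """Align every proposed split point and drop adjacent duplicates.

    Several proposed split points may truncate to the same aligned key.
    Collapsing them keeps all keys under that prefix in one granule,
    even if that granule ends up larger than the target size.
    """
    aligned = []
    for key in keys:
        candidate = align_split_point(key, keep)
        if not aligned or aligned[-1] != candidate:
            aligned.append(candidate)
    return aligned


proposed = [("user", 1, "a"), ("user", 1, "z"), ("user", 2, "b"), b"raw"]
result = align_split_points(proposed, keep=2)
```

Here the two split points under ("user", 1) collapse into one, which is
the oversized-granule case that merge boundaries are meant to address.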

* blob: minor clean ups in merging code

1. Rename mergeNow -> seen. This is more in line with clocksweep naming
   and removes the confusion between mergeNow and canMergeNow.
2. Make clearMergeCandidate() reset to MergeCandidateCannotMerge to make
   clear what we're accomplishing.
3. Rename canMergeNow() -> mergeEligible().

* blob: add explicit (hard) boundaries

Blob ranges can be specified either through explicit ranges or at the
tenant level. Right now this is managed implicitly. This commit aims to
make it a little more explicit.

Blobification begins in monitorClientRanges() which parses either the
explicit blob ranges or the tenant map. As we do this and add new
ranges, let's explicitly track what is a hard boundary and what isn't.

When blob merging occurs, we respect this boundary. When a hard boundary
is encountered, we submit the found eligible ranges and start looking
for a new range beginning with this hard boundary.
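A minimal sketch of the merge scan described above, under stated
assumptions: granules are given in key order as (start_key, eligible)
pairs, hard boundaries are a set of start keys, and a run is only worth
submitting if it contains at least two granules. The function name and
data shapes are hypothetical and are not taken from the blob manager code.

```python
def eligible_merge_runs(granules, hard_boundaries):
    """Collect runs of adjacent merge-eligible granules.

    A run never crosses a hard boundary: when a granule starts at a hard
    boundary, the run found so far is submitted and a new run begins at
    that boundary.
    """
    runs = []
    current = []

    def flush():
        nonlocal current
        if len(current) > 1:  # a single granule has nothing to merge with
            runs.append(current)
        current = []

    for start, eligible in granules:
        if start in hard_boundaries:
            flush()
        if eligible:
            current.append(start)
        else:
            flush()
    flush()
    return runs


granules = [("a", True), ("b", True), ("c", True),
            ("d", False), ("e", True), ("f", True)]
runs = eligible_merge_runs(granules, hard_boundaries={"c"})
```

In this example "c" starts a new tenant or explicit range, so "a" and "b"
are submitted together and "c" is not merged into them.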

* blob: create BlobGranuleSplitPoints struct

This is a setup for the following commit. Our goal here is to provide a
structure for split points to be passed around. The need is for us to be
able to carry uncommitted state until it is committed and we can apply
these mutations to the in-memory data structures.
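The idea of carrying uncommitted state alongside the split points can be
sketched roughly as below. The class and field names are illustrative
assumptions, not the layout of the real BlobGranuleSplitPoints struct,
and the `committed` flag stands in for the outcome of a transaction.

```python
from dataclasses import dataclass, field


@dataclass
class SplitPoints:
    """Split keys plus state that must not be visible until commit."""
    keys: list = field(default_factory=list)
    pending: dict = field(default_factory=dict)  # key -> uncommitted metadata


def apply_after_commit(in_memory, split_points, committed):
    """Fold pending state into the in-memory map only after a commit.

    Returns True if the state was applied, False if it was discarded.
    """
    if not committed:
        return False
    in_memory.update(split_points.pending)
    return True


in_memory = {}
points = SplitPoints(keys=[b"a", b"m"], pending={b"m": "boundary"})
applied_early = apply_after_commit(in_memory, points, committed=False)
snapshot_before = dict(in_memory)
applied_late = apply_after_commit(in_memory, points, committed=True)
```

Keeping the pending state in the same object as the keys means a failed
or retried transaction cannot leave the in-memory structures ahead of
what was persisted.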

* blob: implement soft boundaries

An earlier commit establishes the need to create data boundaries within
a tenant. The reality is we may encounter a set of keys that degenerate
to the same key prefix. We'll need to be able to split those across
granules, but we want to ensure we merge the split granules together
before merging with other granules.

This adds new BlobGranuleMergeBoundary items to the
BlobGranuleSplitPoints state. BlobGranuleMergeBoundary records whether
it is a left or right boundary. As with hard boundaries, this
information is used to force merging of like granules first.

We read the BlobGranuleMergeBoundary map into memory at recovery.
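The intent of soft boundaries, merging like granules with each other
before merging with anything else, can be illustrated with a small
sketch. It does not model the left/right flag stored in
BlobGranuleMergeBoundary. Instead it assumes granule start keys are
Python tuples and treats granules as "like" when their first `keep`
elements match. Both the function name and the `keep` parameter are
hypothetical.

```python
from itertools import groupby


def sibling_groups(granule_starts, keep):
    """Group adjacent granules whose start keys share an aligned prefix.

    Groups with more than one member are granules that were split inside
    a single aligned prefix; they should be merged back together before
    any of them is merged with a neighbor from another group.
    """
    return [list(group)
            for _, group in groupby(granule_starts, key=lambda k: k[:keep])]


starts = [("a", 1, 0), ("a", 1, 5), ("a", 2, 0), ("b", 1, 0), ("b", 1, 9)]
groups = sibling_groups(starts, keep=2)
must_merge_first = [g for g in groups if len(g) > 1]
```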
2022-08-02 16:06:25 -05:00
TestRunner Python_EXECUTABLE to Python3_EXECUTABLE 2022-07-29 14:57:29 -07:00
argument_parsing Update is_unknown_knob in test_argument_parsing.py 2022-05-23 13:41:48 -07:00
fast Merge pull request #7753 from sfc-gh-jslocum/cf_op_chaos 2022-08-02 09:04:38 -05:00
loopback_cluster simple replication (#7625) 2022-07-19 18:30:34 -04:00
noSim Use absl::GetStackTrace for slow task profiler (#7374) 2022-06-15 14:53:52 -07:00
python_tests tests/ black code reformat 2022-04-27 17:01:20 +02:00
rare disable movekeys granule test and shift other granule tests around 2022-08-01 14:48:38 -05:00
restarting Disabled unsupported tests. (#7693) 2022-07-25 21:57:47 -07:00
slow blob: allow for alignment of granules to tuple boundaries (#7746) 2022-08-02 16:06:25 -05:00
status Rename more places from proxy to commit proxy 2020-09-15 22:29:49 -07:00
AsyncFileCorrectness.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
AsyncFileMix.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
AsyncFileRead.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
AsyncFileReadRandom.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
AsyncFileWrite.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
BGServerCommonUnit.toml Cleanup 2022-03-24 17:15:11 -05:00
BackupContainers.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
BandwidthThrottle.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
BigInsert.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
BlobGranuleFileUnit.toml BlobFile Encryption and compression support 2022-07-14 17:04:14 -07:00
BlobManagerUnit.toml Cleanup 2022-03-24 17:15:11 -05:00
CMakeLists.txt Merge pull request #7753 from sfc-gh-jslocum/cf_op_chaos 2022-08-02 09:04:38 -05:00
CTestCustom.ctest.cmake Python_EXECUTABLE to Python3_EXECUTABLE 2022-07-29 14:57:29 -07:00
ClusterControllerTests.txt Add updateWorkerHealth interface in cluster controller 2021-06-24 19:42:28 -07:00
ConsistencyCheck.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
DDMetricsExclude.txt Make sure only uppercase characters follow underscore in test titles 2020-11-08 14:30:55 -08:00
DataDistributionMetrics.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
DiskDurability.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
FileSystem.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
Happy.txt Initial repository commit 2017-05-25 13:48:44 -07:00
IThreadPool.txt Added unit test. 2021-08-09 16:21:43 -07:00
IncrementalDelete.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
KVStoreMemTest.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
KVStoreReadMostly.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
KVStoreTest.txt Renamed redwood to redwood-1-experimental and file extension to .redwood-v1. 2021-11-16 02:15:22 -08:00
KVStoreTestRead.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
KVStoreTestWrite.txt Initial repository commit 2017-05-25 13:48:44 -07:00
KVStoreValueSize.txt Initial repository commit 2017-05-25 13:48:44 -07:00
LayerStatusMerge.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
Mako.txt Add a simple workload, ReportConflictingKeysWorkload, to test correctness of the API and performance overhead added to the resolver. 2019-12-06 16:21:03 -08:00
ParallelRestoreApiCorrectnessAtomicRestore.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
ParallelRestoreOldBackupApiCorrectnessAtomicRestore.toml test:Mute ParallelRestoreOldBackupApiCorrectnessAtomicRestore 2020-10-15 15:44:29 -07:00
PerfUnitTests.toml added file to run perf unit tests 2021-11-08 15:54:18 -07:00
PureNetwork.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
RRW2500.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
RandomRangeRead.txt Minor Redwood comparison optimizations 2021-04-23 18:49:43 +00:00
RandomRead.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
RandomReadWrite.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
ReadAbsent.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
ReadAfterWrite.txt Add a ReadAfterWrite workload, to measure TLog->SS propagation delay. 2020-07-01 02:17:43 -07:00
ReadHalfAbsent.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
RedwoodCorrectness.txt Some unit tests names had a prefixed "!" in order to be excluded from random selection, this has been changed to a ":" as it is less problematic on the command line. Some Redwood unit tests have been enabled for random selection. 2021-04-05 00:03:15 -07:00
RedwoodCorrectnessBTree.txt Some unit tests names had a prefixed "!" in order to be excluded from random selection, this has been changed to a ":" as it is less problematic on the command line. Some Redwood unit tests have been enabled for random selection. 2021-04-05 00:03:15 -07:00
RedwoodCorrectnessPager.txt Some unit tests names had a prefixed "!" in order to be excluded from random selection, this has been changed to a ":" as it is less problematic on the command line. Some Redwood unit tests have been enabled for random selection. 2021-04-05 00:03:15 -07:00
RedwoodCorrectnessUnits.txt Some unit tests names had a prefixed "!" in order to be excluded from random selection, this has been changed to a ":" as it is less problematic on the command line. Some Redwood unit tests have been enabled for random selection. 2021-04-05 00:03:15 -07:00
RedwoodPerfPrefixCompression.txt Some unit tests names had a prefixed "!" in order to be excluded from random selection, this has been changed to a ":" as it is less problematic on the command line. Some Redwood unit tests have been enabled for random selection. 2021-04-05 00:03:15 -07:00
RedwoodPerfRandomRangeScans.txt Minor Redwood comparison optimizations 2021-04-23 18:49:43 +00:00
RedwoodPerfSequentialInsert.txt Some unit tests names had a prefixed "!" in order to be excluded from random selection, this has been changed to a ":" as it is less problematic on the command line. Some Redwood unit tests have been enabled for random selection. 2021-04-05 00:03:15 -07:00
RedwoodPerfSet.txt Some unit tests names had a prefixed "!" in order to be excluded from random selection, this has been changed to a ":" as it is less problematic on the command line. Some Redwood unit tests have been enabled for random selection. 2021-04-05 00:03:15 -07:00
RedwoodPerfTests.txt Some unit tests names had a prefixed "!" in order to be excluded from random selection, this has been changed to a ":" as it is less problematic on the command line. Some Redwood unit tests have been enabled for random selection. 2021-04-05 00:03:15 -07:00
RocksDBTest.txt rename ssd-rocksdb-experimental as ssd-rocksdb-v1. 2022-03-29 10:53:38 -07:00
S3BlobStore.txt Disambiguate between S3BlobStore and other blob stores 2020-10-29 20:42:23 -07:00
SampleNoSimAttrition.txt fixed indentation issues 2019-10-24 13:21:28 -07:00
SimpleExternalTest.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
SpecificUnitTest.txt fix test changes 2022-03-15 17:28:36 +01:00
StorageMetricsSampleTests.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
StorageServerInterface.txt Testing Storage Server implementation 2021-05-25 20:28:50 +00:00
StreamingWrite.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
SystemData.txt Testing Storage Server implementation 2021-05-25 20:28:50 +00:00
ThreadSafety.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
TraceEventMetrics.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
WorkerTests.txt revert unintended changes 2021-08-09 16:29:57 -07:00
default.txt Initial repository commit 2017-05-25 13:48:44 -07:00
errors.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
fail.txt Initial repository commit 2017-05-25 13:48:44 -07:00
killall.txt Make the testspec more restrictive in terms of what can be set where. 2020-07-06 02:03:30 -07:00
latency.txt Initial repository commit 2017-05-25 13:48:44 -07:00
performance-fs.txt Initial repository commit 2017-05-25 13:48:44 -07:00
performance.txt Initial repository commit 2017-05-25 13:48:44 -07:00
ping.TXT Initial repository commit 2017-05-25 13:48:44 -07:00
pingServers.TXT Initial repository commit 2017-05-25 13:48:44 -07:00
pt.TXT Initial repository commit 2017-05-25 13:48:44 -07:00
randomSelector.txt Make sure only uppercase characters follow underscore in test titles 2020-11-08 14:30:55 -08:00
s3VersionHeaders.txt Support AWS v4 header for s3 backup and restore 2022-02-07 17:53:05 -08:00
selectorCorrectness.txt Make sure only uppercase characters follow underscore in test titles 2020-11-08 14:30:55 -08:00