21820 Commits

Author SHA1 Message Date
John Brownlee 4f4d32de8e
Merge pull request #7697 from brownleej/kubernetes-monitor-ip-family
Add IP family argument to fdb-kubernetes-monitor
2022-08-03 13:26:34 -07:00
Nim Wijetunga b922470ceb fix pr issues 2022-08-03 13:26:14 -07:00
Xiaoxi Wang 99bfc2406a update onCheck() and unit test; format code 2022-08-03 11:31:59 -07:00
Bala Namasivayam 996484191b Fix protocol version 2022-08-03 11:16:15 -07:00
A.J. Beamon 8e777a6330 Detect and handle inverted ranges in the get cluster list test. Remove some unused code. 2022-08-03 09:09:36 -07:00
Xiaoxi Wang 1243a1d8a3 add /StorageWiggler/MinAge test 2022-08-02 18:55:40 -07:00
Bala Namasivayam bf3009d6c9 Add missing CommitInfo fields to the transaction profiling analyzer 2022-08-02 17:13:03 -07:00
Xiaoxi Wang 2cd15073d5 make storage wiggler support SS_MIN_AGE 2022-08-02 16:21:41 -07:00
Trevor Clinkenbeard edf4e60fa9
Merge pull request #7631 from sfc-gh-tclinkenbeard/global-tag-throttling5
Improvements to `GlobalTagThrottler`
2022-08-02 16:04:20 -07:00
Dennis Zhou b34a54fa7f
blob: allow for alignment of granules to tuple boundaries (#7746)
* blob: read TenantMap during recovery

Future functionality in the blob subsystem will rely on the tenant data
being loaded. This fixes this issue by loading the tenant data before
completing recovery such that continued actions on existing blob
granules will have access to the tenant data.

Example scenario with failover, in which splits are restarted before the
tenant data is loaded:
BM - BlobManager
epoch 3:                        epoch 4:
  BM record intent to split.
  Epoch fails.
                                BM recovery begins.
  BM fails to persist split.
                                BM recovery finishes.
                                BM.checkBlobWorkerList()
                                  maybeSplitRange().
                                BM.monitorClientRanges().
                                  loads tenant data.

bin/fdbserver -r simulation -f tests/slow/BlobGranuleCorrectness.toml \
    -s 223570924 -b on  --crash --trace_format json

* blob: add tuple key truncation for blob granule alignment

FDB has a backup system available using the blob manager and blob
granule subsystem. If we want to audit the data in the blobs, it's a lot
easier if we can align them to something meaningful.

When a blob granule is being split, we ask the storage metrics system
for split points as it holds approximate data distribution metrics.
These keys are then processed to determine if they are a tuple and
should be truncated according to the new knob,
BG_KEY_TUPLE_TRUNCATE_OFFSET.

Here we keep all aligned keys together in the same granule even if it is
larger than the allowed granule size. The following commit will address
this by adding merge boundaries.
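
The truncation step described above can be sketched roughly as follows.
This is an illustration only, not the code from this commit: keys are
modeled as already-decoded Python tuples, and TRUNCATE_OFFSET is a
hypothetical stand-in that assumes the knob means "number of leading
tuple elements to keep". The real semantics of
BG_KEY_TUPLE_TRUNCATE_OFFSET may differ.

```python
from typing import List, Tuple

# Hypothetical stand-in for BG_KEY_TUPLE_TRUNCATE_OFFSET (assumed meaning:
# how many leading tuple elements a split key keeps).
TRUNCATE_OFFSET = 1


def align_split_point(key: Tuple, offset: int = TRUNCATE_OFFSET) -> Tuple:
    """Keep only the first `offset` elements of a candidate split key."""
    if len(key) <= offset:
        return key
    return key[:offset]


def align_split_points(candidates: List[Tuple],
                       offset: int = TRUNCATE_OFFSET) -> List[Tuple]:
    """Truncate each sorted candidate and drop adjacent duplicates.

    Candidates that share a truncated prefix collapse into one split
    point, so all keys under that prefix stay in the same granule.
    """
    aligned: List[Tuple] = []
    for key in candidates:
        truncated = align_split_point(key, offset)
        if not aligned or aligned[-1] != truncated:
            aligned.append(truncated)
    return aligned


candidates = [("user1", 5), ("user1", 9), ("user2", 3)]
aligned = align_split_points(candidates)
```

In this sketch the two candidates under "user1" collapse into a single
split point, which is why an aligned granule can exceed the allowed
granule size.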

* blob: minor clean ups in merging code

1. Rename mergeNow -> seen. This is more in line with clock-sweep naming
   and removes the confusion between mergeNow and canMergeNow.
2. Make clearMergeCandidate() reset to MergeCandidateCannotMerge to make
   clear what we're accomplishing.
3. Rename canMergeNow() -> mergeEligble().

* blob: add explicit (hard) boundaries

Blob ranges can be specified either through explicit ranges or at the
tenant level. Right now this is managed implicitly. This commit aims to
make it a little more explicit.

Blobification begins in monitorClientRanges() which parses either the
explicit blob ranges or the tenant map. As we do this and add new
ranges, let's explicitly track what is a hard boundary and what isn't.

When blob merging occurs, we respect this boundary. When a hard boundary
is encountered, we submit the found eligible ranges and start looking
for a new range beginning with this hard boundary.
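
A minimal sketch of the grouping behavior described above, under
simplifying assumptions: every range is already merge-eligible, ranges
are sorted and contiguous, and boundaries are plain strings. The names
group_merge_candidates and hard_boundaries are hypothetical and do not
come from this commit.

```python
from typing import List, Set, Tuple

Range = Tuple[str, str]  # (begin, end), end exclusive


def group_merge_candidates(ranges: List[Range],
                           hard_boundaries: Set[str]) -> List[List[Range]]:
    """Group adjacent eligible ranges without crossing a hard boundary."""
    groups: List[List[Range]] = []
    current: List[Range] = []
    for begin, end in ranges:
        # A range that begins at a hard boundary ends the current group:
        # submit what was found so far and start over from this boundary.
        if begin in hard_boundaries and current:
            groups.append(current)
            current = []
        current.append((begin, end))
    if current:
        groups.append(current)
    return groups


ranges = [("a", "c"), ("c", "f"), ("f", "k"), ("k", "z")]
groups = group_merge_candidates(ranges, hard_boundaries={"f"})
```

Here the hard boundary at "f" yields two merge groups, [a, f) and
[f, z), instead of one group spanning the boundary.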

* blob: create BlobGranuleSplitPoints struct

This is a setup for the following commit. Our goal here is to provide a
structure for split points to be passed around. The need is for us to be
able to carry uncommitted state until it is committed and we can apply
these mutations to the in-memory data structures.

* blob: implement soft boundaries

An earlier commit establishes the need to create data boundaries within
a tenant. The reality is we may encounter a set of keys that degenerate
to the same key prefix. We'll need to be able to split those across
granules, but we want to ensure we merge the split granules together
before merging with other granules.

This adds new BlobGranuleMergeBoundary items to the
BlobGranuleSplitPoints state. Each BlobGranuleMergeBoundary records
whether it is a left or right boundary. As with hard boundaries, this
information is used to force like granules to merge first.

We read the BlobGranuleMergeBoundary map into memory at recovery.
2022-08-02 16:06:25 -05:00
He Liu 013b9e3baa
Fixed ChangeServerKeysContext name issue. (#7761)
* Fixed ChangeServerKeysContext name issue.

* Update fdbserver/storageserver.actor.cpp

Co-authored-by: Andrew Noyes <andrew.noyes@snowflake.com>

Co-authored-by: He Liu <heliu@apple.com>
Co-authored-by: Andrew Noyes <andrew.noyes@snowflake.com>
2022-08-02 13:54:27 -07:00
sfc-gh-tclinkenbeard 025085ccb5 Fix GlobalTagThrottler::addRequests 2022-08-02 13:47:04 -07:00
Xiaoxi Wang 07eafcec93
Merge pull request #7763 from sfc-gh-xwang/feature/main/unittest
move waitForMost into generic actors
2022-08-02 13:30:30 -07:00
Andrew Noyes 5dbb6f1dd3
Make Tuple::pack return a Standalone<StringRef> (#7764)
This makes it less error-prone and more like other similar functions
like BinaryWriter::toValue

Closes #7748
2022-08-02 12:45:56 -07:00
Chaoguang Lin 48e46cbc81
Add test coverage for SpecialKeyRangeAsyncImpl::getRange (#7671)
* Add getRange test coverage for SpecialKeyRangeAsyncImpl

* Fix the bug in SpecialKeyRangeAsyncImpl found by the test

* Refactor ConflictingKeysImpl::getRange to use containedRanges to simplify the code

* Fix file format

* Initialize SpecialKeyRangeAsyncImpl cache with correct end key

* Add release notes

* Revert "Refactor ConflictingKeysImpl::getRange to use containedRanges to simplify the code"

This reverts commit fdd298f469.
2022-08-02 12:04:40 -07:00
Nim Wijetunga 8d591fc5e7 address pr comments 2022-08-02 10:51:41 -07:00
Xiaoxi Wang 3c76ad9e72 move waitForMost into generic actors 2022-08-02 10:38:11 -07:00
Nim Wijetunga 9a4fb8ad4e merge 2022-08-02 09:49:16 -07:00
Josh Slocum 4b66645d80
Granule file performance benchmark and improvements (#7742)
* added cpu microbenchmark for blob granule files

* Added edge case read benchmarks, and sorting memory deltas

* Sorted merge for granule files

* key block comparison optimization in granule files

* More performance improvements to granule file read

* fixing zlib not supported build

* fixing formatting

* Added debug macro for new debugging prints

* review comments

* more strict compression size validation assert
2022-08-02 11:36:44 -05:00
Josh Slocum b1ff8b8340
Merge pull request #7753 from sfc-gh-jslocum/cf_op_chaos
Add Chaos to Change Feed Operations test
2022-08-02 09:04:38 -05:00
Josh Slocum 7b9a9c0176
Merge pull request #7752 from sfc-gh-jslocum/rearrange_granule_tests
disable movekeys granule test and shift other granule tests around
2022-08-02 09:03:36 -05:00
A.J. Beamon daa7d3ae72 Remove unused variable 2022-08-02 05:50:58 -07:00
Nim Wijetunga c36320b696 fix issues 2022-08-01 21:59:15 -07:00
Hao Fu 9db19b9a7c
Retain debug id in prefetch server-server call (#7754) 2022-08-01 19:46:45 -07:00
Xiaoge Su fd3c3f0774 fixup! Reformat source 2022-08-01 18:56:50 -07:00
Xiaoge Su 195890dd7b Add ratekeeper ID for storage server busiest write tag report 2022-08-01 18:56:50 -07:00
Xiaoge Su ca482784d4 Empty commit, retrigger tests 2022-08-01 18:56:50 -07:00
Xiaoge Su aa69f5f36e fixup! Update per code review 2022-08-01 18:56:50 -07:00
Xiaoge Su 75b69d7774 fixup! gcc build error 2022-08-01 18:56:50 -07:00
Xiaoge Su ead1aa5bc1 fixup! Remove C++20 attribute 2022-08-01 18:56:50 -07:00
Xiaoge Su 90b887f394 fixup! Update per comments 2022-08-01 18:56:50 -07:00
Xiaoge Su ec40c6bfec fixup! Add a wrapper of ResourceWeakRef for better support of self pointer 2022-08-01 18:56:50 -07:00
Xiaoge Su 1ab8d388af fixup! Reformat source 2022-08-01 18:56:50 -07:00
Xiaoge Su 2ce456bd3f fixup! Fix minor issues
* Reintroduce the anonymous namespace deleted by accident
* Rename the guardian definition in OwningResource.h
2022-08-01 18:56:50 -07:00
Xiaoge Su bc9d2e8cbe fixup! Fix per upstream change 2022-08-01 18:56:50 -07:00
Xiaoge Su cf04afe925 fixup! Non-owning reference to an object
See documents in flow/OwningResource.h
2022-08-01 18:56:50 -07:00
Xiaoge Su 26877a8924 fixup! Remove the boost/bind.hpp in TLSTest.cpp
Not used, and causes gcc to emit a warning
2022-08-01 18:56:50 -07:00
Xiaoge Su 542b5e61cf Let the storage server report the busiest write tag
Issue #7258

The ratekeeper records the busiest write tag for *all* storage
servers, which causes the trace event to be throttled. Distributing the
busiest write tag reports to the corresponding storage servers should
reduce this throttling.
2022-08-01 18:56:50 -07:00
Nim Wijetunga f3e7fd142b fix formatting 2022-08-01 18:12:34 -07:00
Nim Wijetunga 6004eaf1ba fix 2022-08-01 18:08:29 -07:00
Ata E Husain Bohra ef6012c1d1
Encrypt BlobGranule delta files (#7735)
* Encrypt BlobGranule delta files

Description

 diff-1: Address review comments

Major changes proposed by the patch are:
1. Refactor code to allow caching of 'encryption key ctx' as part of
BlobFilePointerRef. The refactoring allows snapshot and/or delta files
to store their own file encryption context.
2. Enable BlobGranule delta file encryption/decryption semantics.

Testing

BlobGranuleCorrectness
BlobGranuleCorrectnessClean
BlobGranuleFileUnitTestToml

2022-08-01 16:34:44 -07:00
Josh Slocum c5c423f2b7 Disabling update/clear post-stop for now in change feed ops 2022-08-01 17:42:32 -05:00
Nim Wijetunga af6db42b1b temp 2022-08-01 15:19:21 -07:00
A.J. Beamon 1b693a588a Merge branch 'main' into feature-metacluster 2022-08-01 14:43:14 -07:00
A.J. Beamon 953bc4252d Add a metacluster consistency check class and use it in various tenant/metacluster workloads 2022-08-01 14:41:46 -07:00
A.J. Beamon d9df813ce4 Fix a few metacluster accounting bugs 2022-08-01 14:34:21 -07:00
Josh Slocum 83806b4a31 disable movekeys granule test and shift other granule tests around 2022-08-01 14:48:38 -05:00
Renxuan Wang 51b92d59b9
Merge pull request #7733 from RenxuanW/hostname-backup
Prefer IPv6 in hostname resolving.
2022-08-01 12:34:02 -07:00
Renxuan Wang 4a0bea2230 Document for pickOneAddress(). 2022-08-01 10:54:04 -07:00
Markus Pilman 9caece6fc2
Merge pull request #7743 from sfc-gh-ajbeamon/fix-dummy-transaction
Fix: Transactions could change tenant ID part way through
2022-08-01 11:28:32 -06:00