Commit Graph

9148 Commits

Author SHA1 Message Date
Bharadwaj V.R 89af5561f1
Merge branch 'apple:main' into block-down 2022-04-20 06:13:01 -07:00
Evan Tschannen 442d2b34c7
fix: pops which were ignored during a snapshot would not be replayed on the proper tlogs within a shared tlog (#6892) 2022-04-19 16:57:41 -07:00
Bharadwaj V.R 51ef860612
Merge branch 'apple:main' into block-down 2022-04-19 10:16:56 -07:00
Ata E Husain Bohra a38318a6ac
Update 'salt' details for EncryptHeader AuthToken details (#6881)
* Update 'salt' details for EncryptHeader AuthToken details

Description

Major changes:
1. Add 'salt' to BlobCipherEncryptHeader::cipherHeaderDetails.
2. During decryption it is possible that BlobKeyCacheId doesn't
    contain the required baseCipherDetails. Add an API to KeyCache to
    allow re-populating CipherDetails with a given 'salt'.
3. Update BaseCipherKeyIdCache indexing to use the {BaseCipherKeyId, salt}
    tuple. FDB processes leverage BlobCipherKeyCache to implement
    in-memory caching of cipherKeys. Given that EncryptKeyProxy supplies
    BaseCipher details, each encryption participant service generates
    its derived key using a different 'salt'. Further, it is
    possible to cache multiple {baseCipherKeyId, salt} tuples,
    for instance CP-encrypted mutations being deciphered by
    StorageServer.

Testing

1. Update EncryptionOps simulation test to simulate a KeyCache miss
2. Update BlobCipher unit tests to validate the above-mentioned changes
2022-04-18 22:01:56 -07:00
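
The commit above describes indexing a key cache by a {baseCipherKeyId, salt} tuple. A minimal sketch of that idea follows. It assumes 64-bit integer ids and string-valued keys, and `CipherKeyCacheSketch`, `put`, and `get` are hypothetical names chosen for illustration; this is not the FoundationDB BlobCipher code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical types for illustration only; not the FoundationDB definitions.
using BaseCipherKeyId = uint64_t;
using Salt = uint64_t;

class CipherKeyCacheSketch {
public:
    // Insert or overwrite the derived key stored for a (baseCipherKeyId, salt) pair.
    void put(BaseCipherKeyId id, Salt salt, std::string derivedKey) {
        cache_[std::make_pair(id, salt)] = std::move(derivedKey);
    }

    // Return the cached derived key, or std::nullopt on a cache miss.
    std::optional<std::string> get(BaseCipherKeyId id, Salt salt) const {
        auto it = cache_.find(std::make_pair(id, salt));
        if (it == cache_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    // The same base cipher id may appear under several salts.
    std::map<std::pair<BaseCipherKeyId, Salt>, std::string> cache_;
};
```

Keying on the pair lets one base cipher id hold several derived keys at once, and a lookup with an unseen salt is reported as a miss that the caller could use to trigger re-population.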
Sreenath Bodagala abd3d5a3d7 Merge remote-tracking branch 'apple-upstream/main' 2022-04-18 20:55:16 +00:00
Lukas Joswiak 07d11ec2e1 Fix failing upgrades due to non-persisted initial cluster version 2022-04-18 10:59:17 -07:00
Markus Pilman 3cbba4bea4
Don't test requests that don't initialize properly (#6880)
* Don't test requests that don't initialize properly

Some request objects don't initialize their members
properly when being constructed using the default
constructor. This makes valgrind unhappy. Don't test
these endpoints for now.

* fixed code formatting
2022-04-18 10:44:56 -07:00
Jingyu Zhou 17dc1a61f3 ClientDBInfo may be unintentionally not set
The ClientDBInfo's comparison is through an internal UID and shrinkProxyList()
can change proxies inside ClientDBInfo. Since the UID is not changed by that
function, subsequent set can be unintentionally skipped.

This was not a big issue before. However, VV introduces a change in which the
client side compares the returned proxy ID with its known set of GRV proxies
and retries the GRV if the returned proxy ID is not in the set. Due to the above
bug, a GRV can be returned by a proxy that is not in the client's set, which
results in GRVs being retried indefinitely.
2022-04-18 09:09:14 -07:00
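
The failure mode in the commit above (equality decided by an id alone, so a changed value with an unchanged id is never stored) can be illustrated with a small sketch. `InfoSketch` and `HolderSketch` are hypothetical stand-ins, and the skip-if-equal behavior of `set()` is an assumption made for the example; none of this is FoundationDB code.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for ClientDBInfo: equality looks only at the id.
struct InfoSketch {
    uint64_t id;
    std::vector<int> proxies;
    bool operator==(const InfoSketch& other) const { return id == other.id; }
};

// Hypothetical holder whose set() skips the update when the new value compares equal.
class HolderSketch {
public:
    explicit HolderSketch(InfoSketch initial) : value_(std::move(initial)) {}

    // Returns true if the stored value was replaced.
    bool set(const InfoSketch& next) {
        if (next == value_) {
            return false; // same id: update skipped, even if the proxy list differs
        }
        value_ = next;
        return true;
    }

    const InfoSketch& get() const { return value_; }

private:
    InfoSketch value_;
};
```

In this sketch, setting a value with the same id but a shorter proxy list leaves the old list in place, which mirrors the skipped set described in the commit message.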
Bharadwaj V.R b10ba334de
Merge branch 'apple:main' into block-down 2022-04-18 08:56:04 -07:00
Markus Pilman 1f26943099
Merge pull request #6859 from sfc-gh-ajbeamon/check-tenant-clear-range
When clearing the database between tests, check that the normal key-space is empty
2022-04-16 11:24:41 -06:00
A.J. Beamon 6151f9c858
Merge pull request #6873 from sfc-gh-ajbeamon/tenant-test-fix
The tenant deletion test now deletes multiple tenants concurrently rather than serially
2022-04-15 14:29:47 -07:00
Jingyu Zhou 0a03b190da Fix multiple PeekStream requests to log routers
There is a bug in how a log router handles streaming reads:
* Log router has a `logRouterPeekStream` actor A running.
* Remote tlog detects some problem and starts another streaming connection (maybe just reuse the connection?)
* Log router now has a new `logRouterPeekStream` actor B running.
* B runs and finds that popped version > reqBegin, resulting in `LogRouterPeekPopped`. This is because A is still running and changed the popped version.
* A ends with `TLogPeekStreamEnd operation_obsolete`
* B becomes stuck at `wait(req.reply.onReady() && store(reply.rep, future))`, because the future was sent `Never()`.

As a result, the remote tlog can no longer retrieve data from this log router.

Fix by killing the `logRouterPeekStream` B.
2022-04-15 14:11:52 -07:00
A.J. Beamon e2222355dc The tenant deletion test now deletes multiple tenants concurrently rather than serially. Fix some variable shadowing in the delete test. 2022-04-15 13:17:19 -07:00
Neethu Haneesha Bingi 6543bce8ae RocksDb using aggr property metrics for pendingCompactionBytes. 2022-04-14 18:08:42 -07:00
A.J. Beamon cf5d3c83a1 Fix formatting issues. 2022-04-14 12:03:39 -07:00
A.J. Beamon 19d78cf2a3 When clearing the database between tests, check that clearing the tenant left the entire normal key-space empty. Update the configuration of some tests. Disable a special key-space test that is invoking broken behavior. 2022-04-14 11:39:02 -07:00
Bharadwaj V.R 85904ec739
Merge branch 'apple:main' into block-down 2022-04-14 07:23:16 -07:00
Markus Pilman 3598c6b56b
Merge pull request #6675 from sfc-gh-jshim/tenant-token-sign
Sign and verify auth tokens for multi-tenant FDB
2022-04-13 16:55:20 -06:00
Zhe Wang 2f75a4bd78
Use actor collection for rocksdb histogram actors. (#6805)
Co-authored-by: Zhe Wang <zhewang@Zhes-MacBook-Pro.local>
2022-04-13 14:41:54 -07:00
Junhyun Shim b6a0c0f942 Merge remote-tracking branch 'upstream/main' into tenant-token-sign 2022-04-13 19:55:37 +02:00
Jingyu Zhou 71acfd5a7e Fix provisional GRV Proxy ID in GetReadVersionReply
This was not set, which can cause an infinite loop in simulation: the client
calls getConsistentReadVersion(), where we "continue" on a stale GRV reply
and retry, and this repeats forever.
2022-04-13 10:35:10 -07:00
Bharadwaj V.R 3e50dda79f
Merge branch 'apple:main' into block-down 2022-04-13 06:58:59 -07:00
Sreenath Bodagala f038f37513 - Do not invoke version vector related code on the sequencer and
GRVs when the version vector feature is disabled.
2022-04-12 20:05:32 +00:00
Sreenath Bodagala e902ac543a
Merge pull request #6829 from sbodagala/main
Version vector encoding
2022-04-12 14:19:31 -04:00
Bharadwaj V.R e4f22e5394
Merge branch 'apple:main' into block-down 2022-04-12 10:50:02 -07:00
Bharadwaj V.R 46b22b79ea Revert "Add SW versions to DBCore state"
This reverts commit 50b45f1bf9.
2022-04-12 10:47:49 -07:00
Bharadwaj V.R 775fd4a027 Merge branch 'block-down' of github.com:sfc-gh-bvr/foundationdb into block-down 2022-04-12 10:35:53 -07:00
Bharadwaj V.R 95754fe650 Remove unnecessary wait statements in test cases 2022-04-12 10:34:18 -07:00
Bharadwaj V.R 5d103d5f7b
Update fdbserver/worker.actor.cpp
Remove unnecessary errorOr actor when testing SW version compatibility

Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2022-04-12 10:22:35 -07:00
Sreenath Bodagala cb3add17b8 - Encode version vector before sending it over the wire.
Encoding methods used:

  - Tag localities: run-length encoding
  - Tag ids: compact representation
  - Commit versions: delta encoding

  If "n" is the number of entries in the version vector, with the tags
  spread over "m" data centers, these techniques will reduce the number
  of bytes to represent the version vector from "(11 * n)" bytes to
  "(3 * m + 2 * n)" / "(3 * m + 3 * n)" bytes (depending on the max tag
  id value, and ignoring some constants) in the best case.
2022-04-11 21:03:09 +00:00
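
Two of the general techniques named in the commit above, run-length encoding and delta encoding, can be sketched as follows. This is an illustrative sketch only: the function names, the `int8_t` locality type, and the `int64_t` version type are assumptions made for the example, and the code does not reproduce the wire format that FoundationDB uses.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: collapse consecutive equal values into (value, count) runs.
std::vector<std::pair<int8_t, uint32_t>> runLengthEncode(const std::vector<int8_t>& values) {
    std::vector<std::pair<int8_t, uint32_t>> runs;
    for (int8_t v : values) {
        if (!runs.empty() && runs.back().first == v) {
            ++runs.back().second; // extend the current run
        } else {
            runs.emplace_back(v, 1u); // start a new run
        }
    }
    return runs;
}

// Hypothetical helper: keep the first value as-is, then store differences between neighbors.
std::vector<int64_t> deltaEncode(const std::vector<int64_t>& versions) {
    std::vector<int64_t> deltas;
    int64_t prev = 0;
    for (int64_t v : versions) {
        deltas.push_back(v - prev);
        prev = v;
    }
    return deltas;
}

// Inverse of deltaEncode: a running sum restores the original values.
std::vector<int64_t> deltaDecode(const std::vector<int64_t>& deltas) {
    std::vector<int64_t> versions;
    int64_t running = 0;
    for (int64_t d : deltas) {
        running += d;
        versions.push_back(running);
    }
    return versions;
}
```

Plugging sample numbers into the commit's own formulas: with n = 100 entries spread over m = 2 data centers, 11 * n is 1100 bytes, while 3 * m + 2 * n is 206 bytes and 3 * m + 3 * n is 306 bytes (best case, ignoring constants, as the commit notes).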
Bharadwaj V.R 129a7b5daf Use boolean-param for GetTeamRequest params 2022-04-11 13:27:08 -07:00
Bharadwaj V.R 50b45f1bf9 Add SW versions to DBCore state 2022-04-11 12:01:12 -07:00
Xiaoxi Wang 7960f77040
Merge pull request #6811 from sfc-gh-xwang/fix-conf-restart
fix configure workload typo
2022-04-11 10:19:47 -07:00
Vaidas Gasiunas ca563466a6
Merge pull request #6401 from sfc-gh-mpilman/features/private-request-streams
Features/private request streams
2022-04-11 18:29:06 +02:00
Ata E Husain Bohra 933e5bbd2e
EncryptKeyProxy server APIs for simulation runs. (#6727)
* EncryptKeyProxy server APIs for simulation runs.

Description

  diff-2: FlowSingleton util class
              Bug fixes
  diff-1: Expected errors returned to the caller

Major changes proposed are:
1. EncryptKeyProxy server APIs:
 1.1. Lookup Cipher details via BaseCipherId
 1.2. Lookup latest Cipher details via encryption domainId.
2. EncryptKeyProxy implements caches indexed by: baseCipherId &
   encryptDomainId
3. Periodic task to refresh domainId indexed cache to support
   'limiting cipher lifetime' abilities if supported by
   external KMS solutions.

Testing

EncryptKeyProxyTest workload to validate the newly added code.
2022-04-11 09:08:42 -07:00
Markus Pilman 099385928c Address review comments 2022-04-11 09:17:10 -06:00
Bharadwaj V.R 9d00eacfb2
Merge branch 'apple:main' into block-down 2022-04-10 21:50:53 -07:00
Markus Pilman 64ac66c1d0 fix merge conflict 2022-04-10 14:16:21 -06:00
Markus Pilman 16467262f0 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-04-10 14:12:37 -06:00
Markus Pilman d8a0b57b6c clients have to listen on a port in simulation 2022-04-10 14:09:15 -06:00
Dan Lambright 9d433c1bef
Merge pull request #6764 from apple/vv
version-vector-prototype to main branch
2022-04-08 18:50:12 -04:00
Bharadwaj V.R fc666469af unit test fixes 2022-04-08 15:03:39 -07:00
Dan Lambright e43fde16ec formatting 2022-04-08 17:28:16 -04:00
Renxuan Wang 938e8ed996 Do not throw lookup_failed when resolving fails.
Instead, return an empty Optional<NetworkAddress>. For resolveWithRetry(), still return NetworkAddress because it retries until it succeeds.
2022-04-08 14:21:49 -07:00
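
The pattern in the commit above (report a failed lookup as an empty optional rather than an exception, while a retrying wrapper still returns a plain value) can be sketched roughly as below. The sketch uses `std::optional` and strings in place of Flow's `Optional<NetworkAddress>`, and `resolveSketch` and `resolveWithRetrySketch` are hypothetical names; it is not the FoundationDB implementation.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>

// Hypothetical resolver: an unknown host yields an empty optional instead of throwing.
std::optional<std::string> resolveSketch(const std::map<std::string, std::string>& table,
                                         const std::string& host) {
    auto it = table.find(host);
    if (it == table.end()) {
        return std::nullopt; // lookup failed: the caller decides what to do
    }
    return it->second;
}

// Hypothetical retrying wrapper: loops until an attempt succeeds, so it can
// return a plain value rather than an optional.
std::string resolveWithRetrySketch(const std::function<std::optional<std::string>()>& attempt) {
    for (;;) {
        if (auto address = attempt()) {
            return *address;
        }
    }
}
```

A caller of the non-retrying form checks `has_value()` before using the result, whereas the retrying form never returns an empty result and simply does not return until an attempt succeeds.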
Renxuan Wang 0f894509d9 Simplify the isCoordinator check in registerWorker. 2022-04-08 14:21:49 -07:00
Renxuan Wang bd6d765b83 Fix ConfigFollowerInterface constructor. 2022-04-08 14:21:49 -07:00
neethuhaneesha b7096c410f
Merge pull request #6795 from neethuhaneesha/rocksdb-blocksize
Adding rocksdb block size option.
2022-04-08 14:20:54 -07:00
Dan Lambright 1b3b4166c6
Merge branch 'main' into vv 2022-04-08 17:18:13 -04:00
Josh Slocum 6276cebad9
Blob integration (#6808)
* Fixing leaked stream with explicit notify failed before destructor

* better logic to prevent races in change feed fetching

* Found new race that makes assert incorrect

* handle server overloaded in initial read from fdb

* Handling more blob error types in granule retry

* Fixing rollback metadata problem, added better debugging

* Fixing version race when fetching change feed metadata

* Better racing split request handling

* fixing assert

* Handle change feed popped check in the blob worker

* fix: do not use a RYW transaction for a versionstamp because of randomize API version (#6768)

* more merge conflict issues

* Change feed destroy fixes

* Fixing change feed destroy and move race

* Check error condition in BG file req

* Using relative endpoints for blob worker interface

* Fixing bug in previous fix

* More destroy and move race fixes

* Don't update empty version on destroy in case it gets rolled back. moved() and removing will take care of ensuring it is not read

* Bug fix (#6796)

* fix: do not use a RYW transaction for a versionstamp because of randomize API version

* fix: if the initialSnapshotVersion was pruned, granule history was incorrect

* added a way to compress null bytes in printable()

* Fixing durability issue with moving and destroying change feeds

* Adding fix for not fully deleting files for a granule that child granules need to re-snapshot

* More destroy and move races

* Fixing change feed destroy and pop races

* Renaming bg prune to purge, and adding a C api and unit test for it

* more cleanup

* review comments

* Observability for granule purging

* better handling for change feed not registered

* Fixed purging bugs (#6815)

* fix: do not use a RYW transaction for a versionstamp because of randomize API version

* fix: if the initialSnapshotVersion was pruned, granule history was incorrect

* added a way to compress null bytes in printable()

* fixed a few purging bugs

Co-authored-by: Evan Tschannen <evan.tschannen@snowflake.com>
2022-04-08 14:15:25 -07:00
Zhe Wang 37054af7e2
Fix RocksDB Metrics (#6803)
* fix-metrics-in-rocksdb

* remain-GetIntProperty-for-checkRocksdbState

Co-authored-by: Zhe Wang <zhewang@Zhes-MacBook-Pro.local>
2022-04-08 17:07:22 -04:00