Commit Graph

3020 Commits

Author SHA1 Message Date
Junhyun Shim e4d204daf0 Rephrase function description 2022-05-10 17:00:57 +02:00
Vishesh Yadav 9173e2e19b Move GlobalConfig to DatabaseContext 2022-05-09 14:54:51 -07:00
Xiaoxi Wang 389e289fe4 merge upstream/main 2022-05-09 12:10:53 -07:00
Junhyun Shim 0339802014 Remove debug messages 2022-05-09 14:38:57 +02:00
Junhyun Shim 637044fd54 Add testcase for when client chain is invalid
Optionally allow the leaf certificates to have already expired
2022-05-09 14:01:39 +02:00
sfc-gh-tclinkenbeard 8553c28d74 Add invalid_throttle_quota_value error 2022-05-07 15:58:37 -07:00
Ata E Husain Bohra 33ae398268
REST KmsConnector implementation (#6994)
* REST KmsConnector implementation

Description
  diff-1: Address review comments.
          Add utility interface to Platform namespace to
          create and operate on tmpfile
  diff-2: Address review comments
          Link Boost::filesystem to CMake build process

Major changes include:
1. Implement REST based KmsConnector implementation.
2. Salient features of the connector:
 2.1. Required configurations are:
   a. Discovery KMS URLs - enable KMS discovery on bootstrap
   b. Endpoint path configuration to construct URI to fetch/refresh
      encryption keys
   c. Configuration to provide "validationTokens" to connect with
      external KMS. Patch implements file-based token validation scheme.
 2.2. On startup, RESTKmsConnector discovers KMS Urls and caches
      them in-memory. Extracts "validationTokens" based on input config.
 2.3. Expose endpoints to allow fetch/refresh of encryption keys.
 2.4. Defines JSON format to interact with external KMS - request &
      response payload format.
3. Extend Platform namespace with an interface to create and operate on
   tmp files.
4. Update Platform 'readFileBytes' and 'writeFileBytes' to leverage
   fstream supported implementation.

NOTE: KMS URLs fetched after initial discovery will be persisted using
      DynamicKnobs. It is TODO at the moment and shall be completed
      once DynamicKnobs is feature complete

Testing

Unit tests to validate the following:
1. Parsing logic for "validation tokens".
2. Construction and parsing of REST JSON request and response strings.
2022-05-07 13:18:35 -07:00
Xiaoxi Wang 5bb70dda91 add remainedBytes method 2022-05-06 10:39:45 -07:00
Junhyun Shim e5f039acf8 Apply clang format 2022-05-06 19:10:42 +02:00
Junhyun Shim b6a200bd2c Set up a unit test to find the right setup for selective mTLS
To be modified or removed once implementation is complete
2022-05-06 19:10:02 +02:00
Junhyun Shim 767a37f7d2 Helper functions to generate certs and keys for TLS testing 2022-05-06 12:56:35 +02:00
Yi Wu 66f1c5c85a
Small BlobCipher and SimKmsConnector fixes and changes (#6936)
* SimKmsConnector fix domain id being unsigned
* SimKmsConnector fix returning cipher id 0 as latest key, which is invalid
* SimKmsConnector fix keys initialized as c-style strings with incorrect length and uninitialized bytes
* SimKmsConnector fix returning different keys for the same id after restart
* BlobCipher change APIs to return null reference when key not found
* BlobCipher insertCipherKey to return the inserted key
2022-05-04 14:09:31 -07:00
Dan Lambright e8adad38b0
Merge pull request #7057 from sbodagala/main
Address GRV cache and version vector incompatibility
2022-05-04 10:06:14 -04:00
Sreenath Bodagala 2102ed1eaa - Remove "stale_version_vector" error code. 2022-05-03 21:56:11 +00:00
sfc-gh-tclinkenbeard 225146176d Apply clang-format to fdbcli.actor.cpp and Net2.actor.cpp 2022-05-03 12:13:09 -07:00
sfc-gh-tclinkenbeard 258ba462e1 Remove !defined(_WIN32) guards for encryption code 2022-05-03 09:48:24 -07:00
sfc-gh-tclinkenbeard 06825775db Fix formatting of lines with TLS_OPTION_FLAGS 2022-05-02 22:56:06 -07:00
sfc-gh-tclinkenbeard 8ea68154bf Remove WITH_TLS CMake variable 2022-05-02 22:45:00 -07:00
sfc-gh-tclinkenbeard 475d66084d Remove ENCRYPTION_ENABLED macro 2022-05-02 22:26:31 -07:00
sfc-gh-tclinkenbeard 7f05221cfe Removed TLS_DISABLED macro 2022-05-02 22:15:27 -07:00
Andrew Noyes 7ed82c1ac5
Mac m1 has 16k pages (#7038)
Previously the page guard implementation assumed that the page size was
4k. Also check for mmap and mprotect returning errors.
2022-05-02 14:24:43 -07:00
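
The page-size fix in 7ed82c1ac5 above can be illustrated with a minimal POSIX sketch. It is not FoundationDB's implementation: the helper names `osPageSize`, `allocateWithTrailingGuard`, and `releaseWithTrailingGuard` are hypothetical, and the layout (usable pages followed by one guard page) is an assumption chosen for brevity. The point it shows is the one the commit message makes: query the page size at runtime instead of assuming 4k, and check `mmap` and `mprotect` for errors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper (not FoundationDB's API): returns the OS page size,
// or 0 if sysconf reports an error. Commonly 4096 on x86-64 Linux and
// 16384 on Apple M1.
inline std::size_t osPageSize() {
    long size = sysconf(_SC_PAGESIZE);
    return size > 0 ? static_cast<std::size_t>(size) : 0;
}

// Maps `pages` readable/writable pages followed by one inaccessible guard
// page. Returns nullptr if sysconf, mmap, or mprotect reports an error.
inline void* allocateWithTrailingGuard(std::size_t pages) {
    const std::size_t pageSize = osPageSize();
    if (pageSize == 0 || pages == 0) {
        return nullptr;
    }
    const std::size_t total = (pages + 1) * pageSize;
    void* base = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        perror("mmap");
        return nullptr;
    }
    // The guard page starts right after the usable region.
    char* guard = static_cast<char*>(base) + pages * pageSize;
    if (mprotect(guard, pageSize, PROT_NONE) != 0) {
        perror("mprotect");
        munmap(base, total);
        return nullptr;
    }
    return base;
}

// Unmaps the usable pages and the guard page together.
inline bool releaseWithTrailingGuard(void* base, std::size_t pages) {
    return munmap(base, (pages + 1) * osPageSize()) == 0;
}
```

With this layout, a write one byte past the usable region faults immediately regardless of whether the platform uses 4k or 16k pages.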
Ray Jenkins dc9e782ccc
OpenTelemetry Tracing Perf Fixes (#6990) 2022-05-02 14:56:51 -05:00
Jingyu Zhou 0ca9761088 Fix IDE build warnings and errors 2022-05-01 16:20:57 -07:00
Sam Gwydir 5403a29ecb
add WolfSSL support (#6682)
remove extraneous include
2022-04-28 16:53:38 -07:00
A.J. Beamon cc72d541e4
Merge pull request #6971 from sfc-gh-svemuri/verify-tenant-prefixes-on-commit-proxy
Validate commit request tenant prefixes on commit proxy
2022-04-28 12:40:56 -07:00
pranavPandit1 a192264e7e missing file added 2022-04-28 08:35:42 -07:00
pranavPandit1 5794fd4e91 clang format corrected for file 2022-04-28 08:35:42 -07:00
pranavPandit1 195a196392 crc32 support added for ppc64le 2022-04-28 08:35:42 -07:00
Ata E Husain Bohra 333aadb903
Interface to enable clients to send/receive REST requests/responses (#6866)
* Interface to enable clients to send/receive REST requests/responses

Description

Major changes:
1. Add RESTClient interface enabling client to send/receive REST HTTP
   requests. Supported REST APIs are: get, head, put, post, delete, trace
2. Add RESTUtil file introducing below interfaces:
 2.1. RESTUrl - Extract URI information: host, service, request-parameters.
 2.2. RESTConnectionPool-
      Connection establishment, life-cycle management, connection-pool (TTL)
 2.3. RESTClientKnobs - supports REST Knob parameter management and updates

Testing

Unit test - fdbrpc/RESTClient, fdbrpc/RESTUtils
2022-04-27 12:17:52 -07:00
Sagar Vemuri ed60afc964 Handle versionstamped keys, and include additional trace information 2022-04-27 11:12:01 -07:00
Sagar Vemuri 35baf4d745 Validate commit request tenant prefixes on commit proxy 2022-04-26 13:28:29 -07:00
Markus Pilman cbe4a873d2 Merge remote-tracking branch 'origin/main' into features/validate-trace-events-in-simulation 2022-04-25 17:39:29 -06:00
Ray Jenkins 1c5bf135d5
Revert "Migrate to OpenTelemetry tracing. (#6855)" (#6941)
This reverts commit 5df3bac110.
2022-04-25 09:29:56 -05:00
Bharadwaj V.R 08323de905 fix formatting 2022-04-22 15:10:24 -07:00
Bharadwaj V.R 588b2fa509
Merge branch 'main' into block-down 2022-04-22 14:53:09 -07:00
Bharadwaj V.R 988a70f064
Merge pull request #6858 from sfc-gh-bvr/dbcorever
Track newest and lowest compatible protocol versions in DBCoreState
2022-04-22 14:46:21 -07:00
Bharadwaj V.R 4a5c2268da
Merge branch 'apple:main' into block-down 2022-04-22 14:45:54 -07:00
Markus Pilman 9e65e15b45 Merge remote-tracking branch 'origin/main' into features/validate-trace-events-in-simulation 2022-04-22 15:39:55 -06:00
Ata E Husain Bohra 6c9030408e Fix Build: use boost::hash to compute hash for std::pair
Description

Fix Build: use boost::hash to compute hash for std::pair

Testing

1. Build - gcc/clang
2. Simulation test: EncryptKeyProxyTest, EncryptionOps
3. Unit test: flow/BlobCipher
4. Running 10k correctness Joshua run
2022-04-22 13:16:30 -07:00
Ata E Husain Bohra 670d40ef79
FDB native KMS Connector Framework (#6846)
* FDB native KMS Connector Framework

Description

Major changes include:
1. Framework code to enable FDB native KMS connector implementation.
2. SERVER_KNOBS->KMS_CONNECTOR_TYPE controls the connector type selection.
3. KmsConnectorInterface endpoint definitions, every KMSConnector
   implementation needs to support defined endpoints.
4. Update EncryptKeyProxy to leverage KmsConnectorInterface endpoints
   to fetch encryption keys on-demand and/or periodic refreshes.
   Integrate SimKmsConnector implementation.
5. Implement SimKmsConnector by leveraging existing SimKeyProxy
   implementation.

Testing

Unit test: fdbserver/SimKmsConnector
Simulation: EncryptKeyProxy
2022-04-22 08:53:39 -07:00
Bharadwaj V.R 822eb9ec26
Merge branch 'apple:main' into dbcorever 2022-04-22 08:08:34 -07:00
Bharadwaj V.R ed08cfbf52
Merge branch 'apple:main' into block-down 2022-04-22 06:19:38 -07:00
Ata E Husain Bohra 04ecd8e08f
Revert "Revert "Update 'salt' details for EncryptHeader AuthToken details (#6881)" (#6902)" (#6922)
Description

Major changes proposed:
1. This reverts commit f38b2e8209.
2. Also add fix for Valgrind failure due to uninitialized variables.
3. Improve checks to catch cases where cipherKey details cached in
   BlobCipherKeyCache aren't as expected

Testing

Overall correctness: 10K (20220421-193911-ahusain-foundationdb-a730e5cb38541e20)
EncryptionOps correctness: 100K (20220421-194315-ahusain-foundationdb-29c598a8b9420430)
EncryptionOps Valgrind: 100 (20220421-194434-ahusain-foundationdb-7fc5f98eddc0921a)
2022-04-21 18:57:56 -07:00
Bharadwaj V.R 449a315c06
Merge branch 'apple:main' into block-down 2022-04-21 09:37:42 -07:00
Bharadwaj V.R c20fb6ef6d
Merge branch 'apple:main' into dbcorever 2022-04-21 09:37:29 -07:00
Markus Pilman f38b2e8209
Revert "Update 'salt' details for EncryptHeader AuthToken details (#6881)" (#6902)
This reverts commit a38318a6ac.
2022-04-21 09:04:40 -07:00
Markus Pilman bbb1392aad Merge remote-tracking branch 'origin/main' into features/validate-trace-events-in-simulation 2022-04-21 08:24:18 -06:00
Markus Pilman 85757eb47c
Update flow/Trace.cpp 2022-04-20 15:30:42 -06:00
Renxuan Wang e40cc8722c
A few hostname improvements. (#6825)
* Add tryResolveHostnames() in connection string.

* Add missing hostname to related interfaces.

* Do not pass RequestStream into *GetReplyFromHostname() functions.

Because we are using new RequestStream for each request anyways. Also, the passed in pointer could be nullptr, which results in seg faults.

* Add dynamic hostname resolve and reconnect intervals.

* Address comments.
2022-04-20 13:42:46 -07:00
Markus Pilman 3335b2686e
Merge branch 'main' into features/validate-trace-events-in-simulation 2022-04-20 12:03:33 -06:00
Markus Pilman f7a8ebf818
Update flow/Trace.cpp
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2022-04-20 11:55:01 -06:00
Bharadwaj V.R 8a0ce5bfc7 Rename isInvalidMagic and fix formatting 2022-04-20 09:17:28 -07:00
Bharadwaj V.R a2449041ea Fix formatting of ProtocolVersion.h 2022-04-20 08:45:53 -07:00
Bharadwaj V.R 4d6f4ecd9c
Merge branch 'main' into dbcorever 2022-04-20 08:23:34 -07:00
Ray Jenkins 5df3bac110
Migrate to OpenTelemetry tracing. (#6855) 2022-04-20 09:26:37 -05:00
Bharadwaj V.R a711c55061
Merge branch 'apple:main' into dbcorever 2022-04-20 06:16:27 -07:00
Bharadwaj V.R 89af5561f1
Merge branch 'apple:main' into block-down 2022-04-20 06:13:01 -07:00
Markus Pilman d4ee7be1d7 Reduce excessive tracing and fail after 1M traces 2022-04-19 21:11:51 -06:00
Andrew Noyes 297d831192
Put guard pages next to fast alloc memory (#6885)
* Put guard pages next to fast alloc memory

I verified that we can now detect #6753 without creating tons of
threads.

* Use pageSize instead of 4096

* Don't include mmapInternal for windows
2022-04-19 11:22:35 -07:00
Bharadwaj V.R 51ef860612
Merge branch 'apple:main' into block-down 2022-04-19 10:16:56 -07:00
Ata E Husain Bohra a38318a6ac
Update 'salt' details for EncryptHeader AuthToken details (#6881)
* Update 'salt' details for EncryptHeader AuthToken details

Description

Major changes:
1. Add 'salt' to BlobCipherEncryptHeader::cipherHeaderDetails.
2. During decryption it is possible that BlobKeyCacheId doesn't
    contain required baseCipherDetails. Add API to KeyCache to
    allow re-populating of CipherDetails with a given 'salt'
3. Update BaseCipherKeyIdCache indexing using {BaseCipherKeyId, salt}
    tuple. FDB processes leverage BlobCipherKeyCache to implement
    in-memory caching of cipherKeys, given EncryptKeyProxy supplies
    BaseCipher details, each encryption participant service would
    generate its derived key by using different 'salt'. Further,
    it is possible to cache multiple {baseCipherKeyId, salt} tuples;
    for instance: CP encrypted mutations being deciphered by
    StorageServer etc.

Testing

1. Update EncryptionOps simulation test to simulate KeyCache miss
2. Update BlobCipher unit tests to validate above mentioned changes
2022-04-18 22:01:56 -07:00
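
The {BaseCipherKeyId, salt} indexing described in a38318a6ac above can be sketched as follows. This is an illustration only, not the BlobCipherKeyCache code: `DerivedKeyCache`, `PairHash`, and the string-valued keys are hypothetical, and the hand-written hash combiner stands in for the `boost::hash` approach that 6c9030408e mentions. It shows why one base cipher id can map to several cached entries, one per salt, and returns a null pointer on a miss.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

using BaseCipherId = std::uint64_t;
using Salt = std::uint64_t;
using CacheKey = std::pair<BaseCipherId, Salt>;

// std::pair has no std::hash specialization, so a hash functor is needed.
// This combiner is an assumption for the sketch, not the project's choice.
struct PairHash {
    std::size_t operator()(const CacheKey& k) const noexcept {
        std::size_t h1 = std::hash<std::uint64_t>{}(k.first);
        std::size_t h2 = std::hash<std::uint64_t>{}(k.second);
        return h1 ^ (h2 + 0x9e3779b97f4a7c15ULL + (h1 << 6) + (h1 >> 2));
    }
};

// Hypothetical in-memory cache of derived keys, indexed by (id, salt).
class DerivedKeyCache {
public:
    // Inserts or overwrites the derived key for (id, salt).
    void insert(BaseCipherId id, Salt salt, std::string derivedKey) {
        cache_[CacheKey{ id, salt }] = std::move(derivedKey);
    }

    // Returns a pointer to the cached key, or nullptr on a cache miss.
    const std::string* lookup(BaseCipherId id, Salt salt) const {
        auto it = cache_.find(CacheKey{ id, salt });
        return it == cache_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<CacheKey, std::string, PairHash> cache_;
};
```

A participant that decrypts data written by another service would insert the writer's (id, salt) pair alongside its own, which is the multiple-tuples case the commit message describes.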
Bharadwaj V.R b10ba334de
Merge branch 'apple:main' into block-down 2022-04-18 08:56:04 -07:00
Bharadwaj V.R 47a1a282e9
Merge branch 'apple:main' into dbcorever 2022-04-15 15:47:14 -07:00
Andrew Noyes 29cf5f1fbf
Fix an ASSERT when an fdbcli command times out (#6857)
* Re-throw operation_cancelled

There are a few places in fdbcli where we don't rethrow operation
cancelled but wait on a future. It's very unusual that you don't want to
rethrow operation_cancelled.

* Update ASSERT

It's possible to get error_code_broken_promise here if the network has
already shutdown.
2022-04-14 12:09:25 -07:00
Bharadwaj V.R 78c4771f9d Fix ProtocolVersion.h formatting diffs 2022-04-14 11:10:39 -07:00
Bharadwaj V.R 3c080c6382
Merge branch 'apple:main' into dbcorever 2022-04-14 10:39:32 -07:00
Bharadwaj V.R 85904ec739
Merge branch 'apple:main' into block-down 2022-04-14 07:23:16 -07:00
Junhyun Shim edc659d339 Use camelCase & move error code to 6xxx 2022-04-13 21:11:52 +02:00
Junhyun Shim b6a0c0f942 Merge remote-tracking branch 'upstream/main' into tenant-token-sign 2022-04-13 19:55:37 +02:00
Bharadwaj V.R 3cbe7f7d63 Update min compatible version to 7.1 2022-04-13 09:47:12 -07:00
Bharadwaj V.R 2f2ece073c Add sw version tracking to DBCoreState 2022-04-13 08:05:22 -07:00
Bharadwaj V.R 3e50dda79f
Merge branch 'apple:main' into block-down 2022-04-13 06:58:59 -07:00
Lukas Joswiak 0783b044fe Add protocol feature 2022-04-12 14:35:09 -07:00
Bharadwaj V.R e4f22e5394
Merge branch 'apple:main' into block-down 2022-04-12 10:50:02 -07:00
Bharadwaj V.R 46b22b79ea Revert "Add SW versions to DBCore state"
This reverts commit 50b45f1bf9.
2022-04-12 10:47:49 -07:00
Aaron Molitor 6b2d0e5d6b update version to 7.2.0 -- protocol version 2022-04-11 23:23:27 -05:00
Aaron Molitor c440365779 update version to 7.2.0 -- pr comment protocol version 2022-04-11 23:23:27 -05:00
Aaron Molitor 8db8545db2 update version to 7.2.0 -- protocol version changes 2022-04-11 23:23:27 -05:00
Bharadwaj V.R 50b45f1bf9 Add SW versions to DBCore state 2022-04-11 12:01:12 -07:00
Vaidas Gasiunas ca563466a6
Merge pull request #6401 from sfc-gh-mpilman/features/private-request-streams
Features/private request streams
2022-04-11 18:29:06 +02:00
Ata E Husain Bohra 933e5bbd2e
EncryptKeyProxy server APIs for simulation runs. (#6727)
* EncryptKeyProxy server APIs for simulation runs.

Description

  diff-2: FlowSingleton util class
              Bug fixes
  diff-1: Expected errors returned to the caller

Major changes proposed are:
1. EncryptKeyProxy server APIs:
 1.1. Lookup Cipher details via BaseCipherId
 1.2. Lookup latest Cipher details via encryption domainId.
2. EncryptKeyProxy implements caches indexed by: baseCipherId &
   encryptDomainId
3. Periodic task to refresh domainId indexed cache to support
   'limiting cipher lifetime' abilities if supported by
   external KMS solutions.

Testing

EncryptKeyProxyTest workload to validate the newly added code.
2022-04-11 09:08:42 -07:00
Bharadwaj V.R 9d00eacfb2
Merge branch 'apple:main' into block-down 2022-04-10 21:50:53 -07:00
Markus Pilman 16467262f0 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-04-10 14:12:37 -06:00
Dan Lambright 9d433c1bef
Merge pull request #6764 from apple/vv
version-vector-prototype to main branch
2022-04-08 18:50:12 -04:00
Renxuan Wang 938e8ed996 Do not throw lookup_failed when resolving fails.
Instead, return an empty Optional<NetworkAddress>. resolveWithRetry() still returns NetworkAddress because it retries until it succeeds.
2022-04-08 14:21:49 -07:00
Renxuan Wang f3a8ac21be Change resolve functions in hostname to return network address. 2022-04-08 14:21:49 -07:00
Renxuan Wang a752faf122 resolveFinish should be triggered when resolve succeeds. 2022-04-08 14:21:49 -07:00
Dan Lambright 1b3b4166c6
Merge branch 'main' into vv 2022-04-08 17:18:13 -04:00
Josh Slocum 6276cebad9
Blob integration (#6808)
* Fixing leaked stream with explicit notify failed before destructor

* better logic to prevent races in change feed fetching

* Found new race that makes assert incorrect

* handle server overloaded in initial read from fdb

* Handling more blob error types in granule retry

* Fixing rollback metadata problem, added better debugging

* Fixing version race when fetching change feed metadata

* Better racing split request handling

* fixing assert

* Handle change feed popped check in the blob worker

* fix: do not use a RYW transaction for a versionstamp because of randomize API version (#6768)

* more merge conflict issues

* Change feed destroy fixes

* Fixing change feed destroy and move race

* Check error condition in BG file req

* Using relative endpoints for blob worker interface

* Fixing bug in previous fix

* More destroy and move race fixes

* Don't update empty version on destroy in case it gets rolled back. moved() and removing will take care of ensuring it is not read

* Bug fix (#6796)

* fix: do not use a RYW transaction for a versionstamp because of randomize API version

* fix: if the initialSnapshotVersion was pruned, granule history was incorrect

* added a way to compress null bytes in printable()

* Fixing durability issue with moving and destroying change feeds

* Adding fix for not fully deleting files for a granule that child granules need to re-snapshot

* More destroy and move races

* Fixing change feed destroy and pop races

* Renaming bg prune to purge, and adding a C api and unit test for it

* more cleanup

* review comments

* Observability for granule purging

* better handling for change feed not registered

* Fixed purging bugs (#6815)

* fix: do not use a RYW transaction for a versionstamp because of randomize API version

* fix: if the initialSnapshotVersion was pruned, granule history was incorrect

* added a way to compress null bytes in printable()

* fixed a few purging bugs

Co-authored-by: Evan Tschannen <evan.tschannen@snowflake.com>
2022-04-08 14:15:25 -07:00
sfc-gh-ngoyal b742149869
Merge pull request #6810 from sfc-gh-ngoyal/misc-fixes
Convert the ScopeEventFieldTypeMismatch event to a Sev40 in simulation.
2022-04-08 12:14:48 -07:00
Dan Lambright c106847e3e
Merge branch 'main' into vv 2022-04-08 15:05:51 -04:00
Ata E Husain Bohra 81c7834d06
Encryption header authentication tokens (#6750)
* Encryption header authentication tokens

Description

  diff-1: Allow NONE AuthTokenMode operations
          Address review comments

Major changes proposed are:
1. Encryption header supports two modes of generating 'authentication tokens':
  a) SingleAuthTokenMode: the scheme generates single crypto-secure auth
     token to protect {cipherText + header} payload. Scheme is geared towards
     optimizing cost due to crypto-secure auth-token generation, however,
     on decryption the client needs to read 'header' + 'encrypted-buffer'
     to validate the 'auth-token'. The scheme is ideal for usecases where
     payload represented by the encryptionHeader is not large and it is
     desirable to minimize CPU/latency penalty due to crypto-secure ops,
     such as: CommitProxies encrypted inline transactions,
     StorageServer encrypting pages etc.
  b) MultiAuthTokenMode: Scheme generates separate authTokens for
     'encrypted buffer' & 'encryption-header'. The scheme is ideal where
     payload represented by encryptionHeader is large enough such that it
     is desirable to optimize cost of upfront reading full 'encrypted buffer',
     compared to reading only encryptionHeader and ensuring its sanity;
     for instance: backup-files
2. Leverage full crypto-secure digest as 'authentication token'

Testing

Update EncryptionOps simulation test
Update BlobCipher unit test
20220408-182229-ahusain-foundationdb-7fd2e4b19328cd44
20220408-175754-ahusain-foundationdb-5352e37e1dcabfc8
2022-04-08 11:32:05 -07:00
Steve Atherton 11a5d14a11
Merge pull request #6108 from sfc-gh-satherton/redwood-header-changes
Redwood page format refactor to support format evolution, forensic analysis and future encryption scheme
2022-04-08 10:59:12 -07:00
negoyal 4b77074e21 devFormat 2022-04-08 10:45:29 -07:00
negoyal 9e5c0a782c Convert the ScopeEventFieldTypeMismatch event to a Sev40 in simulation. 2022-04-08 10:30:53 -07:00
Dan Lambright 5bdc525353
Merge branch 'main' into vv 2022-04-08 13:16:04 -04:00
Markus Pilman 7631d299bf Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-04-08 09:58:56 -06:00
Ray Jenkins 25c5fabc4e
Merge pull request #6798 from sfc-gh-rjenkins/tracing_test_improvements
Simplify MessagePack string decoding logic.
2022-04-08 09:28:53 -05:00
Bharadwaj V.R 597db0b3fd Merge branch 'block-down' of github.com:sfc-gh-bvr/foundationdb into block-down 2022-04-07 14:46:56 -07:00
Bharadwaj V.R 718833f73c
Merge branch 'apple:main' into block-down 2022-04-07 14:46:44 -07:00
Bharadwaj V.R 1e2c4a0844 Rename some variables for clarity; reorganize compatibility test into a new actor 2022-04-07 14:46:27 -07:00
Ray Jenkins 6567409a39 Simplify MessagePack string decoding logic.
Remove use of strcmp, handle reading size in readMPString.
2022-04-07 14:55:35 -05:00
Markus Pilman bf956f5630 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-04-07 13:29:27 -06:00
Lukas Joswiak 32d4e29e72 Double existing FastAlloc limit 2022-04-07 10:25:10 -07:00
Bharadwaj V.R 58a6442d1f Add tracing and fix the version write and update logic 2022-04-06 21:06:01 -07:00
Yi Wu 994b8c92f8
Add option to limit resident memory and remove default memory limit (#6719)
Change the `memory` option to limit resident memory instead of virtual memory, in the config file and in the fdbserver/fdbbackup/fdbcli command-line arguments. Since `rlimit` doesn't support limiting resident memory, the current implementation has both fdbmonitor and the fdbserver/fdbbackup process check the process RSS periodically, and kills and restarts the process if the limit is exceeded.

Adding a new `memory_vsize` option to limit virtual memory, if backward-compatible behavior is desired.

closes #6671, closes #6672
2022-04-06 20:06:24 -07:00
Renxuan Wang 267c4deaee
Add tryGetReplyFromHostname() and retryGetReplyFromHostname(). (#6761)
* Add hostname to coordination interfaces.

* Add tryGetReplyFromHostname() and retryGetReplyFromHostname().

* Change tryGetReplyFromHostname() to call hostname.resolve().

* Add throw for actor_cancelled.
2022-04-06 10:47:00 -07:00
Andrew Noyes 966402a7a2
Allocate at least sizeof(ArenaBlock) for an ArenaBlock (#6770)
* Allocate at least sizeof(ArenaBlock) for an ArenaBlock

* Fix message pack unit test

Previously we were using only the 4 least significant bits as the length
of a message pack string, but it should be 5 according to https://github.com/msgpack/msgpack/blob/master/spec.md#str-format-family
2022-04-05 18:14:10 -07:00
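
The fixstr length fix in 966402a7a2 above comes down to a bit mask. In the MessagePack str format family, a fixstr header byte has the bit pattern 101xxxxx, where the low five bits hold a length from 0 to 31. The helper names below are hypothetical and this is a sketch of the encoding rule, not the unit test that the commit changed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if `b` is a MessagePack fixstr header byte (bit pattern 101xxxxx,
// i.e. 0xa0 through 0xbf).
inline bool isFixstr(std::uint8_t b) {
    return (b & 0xe0) == 0xa0;
}

// Extracts the string length from a fixstr header: the low FIVE bits.
// Masking with 0x0f (four bits) would truncate any length above 15.
inline std::size_t fixstrLength(std::uint8_t b) {
    return b & 0x1f;
}

// Builds the header byte for a string of `len` bytes.
// The caller must ensure len <= 31; longer strings need str 8/16/32.
inline std::uint8_t fixstrHeader(std::size_t len) {
    return static_cast<std::uint8_t>(0xa0 | (len & 0x1f));
}
```

For a 20-byte string the header is 0xb4; a four-bit mask would read its length as 4, while the five-bit mask recovers 20.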
Dan Lambright 60c55e0785 Merge remote-tracking branch 'origin/version-vector-prototype' into vv 2022-04-05 11:17:39 -04:00
Bharadwaj V.R 1d23d92e40 Fix typo in the name of new error code, and add a few UTs 2022-04-04 19:31:09 -07:00
Renxuan Wang 465ff712b6
Move Hostname to its own files. (#6759)
* Change DNS cache to use std::map.

Revert commit 90c259d84e, because if we use unordered_map, toString() can be inconsistent.

* Move ClientKnob::COORDINATOR_HOSTNAME_RESOLVE_DELAY to FlowKnob::HOSTNAME_RESOLVE_DELAY.

* Move Hostname to its own files.

Also, add resolve-related variables and functions in Hostname.
2022-04-04 19:04:51 -07:00
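
The reasoning behind reverting to `std::map` in 465ff712b6 above can be sketched as follows. `DnsCacheSketch` and its string-valued entries are hypothetical and do not reflect the real DNS cache types; the sketch only shows the general property the commit message relies on: `std::map` iterates in ascending key order, so a `toString()` built by iteration does not depend on insertion order, whereas `std::unordered_map` gives no such ordering guarantee.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical cache mapping a hostname to a resolved address string.
class DnsCacheSketch {
public:
    void add(const std::string& host, const std::string& address) {
        entries_[host] = address;
    }

    // Serializes entries as "host=address" joined by ';'.
    // std::map iterates in ascending key order, so the output is the same
    // for any insertion order.
    std::string toString() const {
        std::string out;
        for (const auto& entry : entries_) {
            if (!out.empty()) {
                out += ';';
            }
            out += entry.first + "=" + entry.second;
        }
        return out;
    }

private:
    std::map<std::string, std::string> entries_;
};
```

Two caches filled with the same entries in different orders therefore serialize identically, which matters when the string is compared or logged in deterministic simulation.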
Bharadwaj V.R 3fbbf415e7 Properly encapsulate SWVersion and create a couple of UTs for sw version testing 2022-04-04 18:42:52 -07:00
Ray Jenkins bb9b9d2471
OpenTelemetry API Tracing. (#6478)
* OTEL Span Implementation.

* Add trace logging, refactor constructors, unit tests.

* Unit tests for creating OTELSpans

* refactor flag names

* Additional comments.

* Formatting.

* Add back Arena.h include

* cleanup header includes

* Remove include cstddef.

* Remove memory include.

* Remove trailing commas on enums.

* Enum formatting.

* Changing SpanStatus enum from ERROR to ERR to see if it is clashing with Windows.h.

* Move OTELEvents to SmallVectorRef<KeyValueRef>.

* Clean up unused includes.

* Unit tests

* Const reference arguments for OTEL constructors and additional addAttribute
unit tests. Adding return of OTELSpan reference on addAttribute.

* Formatting.

* Begin messagepack encoding tests.

* Formatting.

* MessagePack encoding unit tests.

* Formatting.

* Remove swapBinary.

* remove ambiguous helper methods

* Formatting fixes

* Fix ambiguous calls in AddEvents unit tests.

* Include AddAttributes unit test.

* descope windows for UDP encoding test

* Move ifndef WIN32 around MPEncoding unit test.

* Fix AddEvents Attributes size assertion.

* Formatting.

* Enable AddLinks unit test.

* Full MP encoding testing.

* Fix for encoding longer strings with MessagePack and unit test.

* Remove unnecessary header includes and serialize_string_ref function.

* Fix typos

* Update flow/Tracing.actor.cpp

Co-authored-by: Lukas Joswiak <lukas.joswiak@snowflake.com>

* Update flow/Tracing.actor.cpp

Co-authored-by: Lukas Joswiak <lukas.joswiak@snowflake.com>

* Use ASSERT_WE_THINK and add logging.

We don't want people creating incredibly large traces, so we are only
supporting a subset of MessagePack collection and string sizes. Assert
and log when we hit these unsupported sizes.

* Remove TODOs no longer applicable.

* Refactor OTELEvent to OTELEventRef.

* Remove unnecessary public declaration in struct.

* fix OTELEventRef attribute size assertion

* Formatting

Co-authored-by: Lukas Joswiak <lukas.joswiak@snowflake.com>
2022-04-04 17:55:38 -07:00
Renxuan Wang 5a336655f1 Use unordered_map in DNS cache. 2022-04-04 15:08:17 -07:00
Renxuan Wang 7da31857b7 Address comments. 2022-04-04 15:08:17 -07:00
Renxuan Wang e548c0d604 Add DNS cache. 2022-04-04 15:08:17 -07:00
Renxuan Wang ff934ca2ad Change MockDNS to DNSCache. 2022-04-04 15:08:17 -07:00
Renxuan Wang ebe928e7e1 Throw lookup_failed() when hostname resolving fails. 2022-04-04 15:08:17 -07:00
Jingyu Zhou 5861ff2dc6
Merge pull request #6717 from sfc-gh-ajbeamon/thread-future-safety-check
Disallow anonymous standalone thread futures in safeThreadFutureToFuture
2022-04-04 13:39:37 -07:00
Steve Atherton 47d1a7b373 Merge commit '38190ad7e787d759f88687e83af0ebabdbc600e8' into redwood-header-changes
# Conflicts:
#	flow/error_definitions.h
2022-04-03 00:39:53 -07:00
Steve Atherton d0152d8442 Fix error description. 2022-04-03 00:37:37 -07:00
Steve Atherton 39fb0a44d7 Merge commit 'f09bdc840c00d712487500b9e752d87cedb1964a' into redwood-header-changes
# Conflicts:
#	fdbserver/VersionedBTree.actor.cpp
2022-04-03 00:37:01 -07:00
Steve Atherton 38190ad7e7
Merge pull request #6737 from sfc-gh-satherton/fix-storage-timestamps
Change storage metadata and perpetual wiggle timestamps to double epoch seconds
2022-04-02 09:47:23 -07:00
Jingyu Zhou 64d4658034 Merge branch 'main' into vv
Fix Conflicts:
	flow/error_definitions.h
2022-04-01 21:49:24 -07:00
Steve Atherton 6eb1c2ae48
Merge pull request #6574 from sfc-gh-satherton/redwood-rare-bugs
Rare correctness bug fixes in Redwood
2022-04-01 16:40:22 -07:00
Bharadwaj V.R a9a84dcafd Merge branch 'apple-main' into block-down 2022-04-01 15:52:18 -07:00
Bharadwaj V.R be70a57cae Merge branch 'main' of https://github.com/apple/foundationdb into apple-main 2022-04-01 15:50:09 -07:00
Bharadwaj V.R 008bd93cce Change the default construction for SWVersion and add error code for sw version incompatibility 2022-04-01 15:45:24 -07:00
Bharadwaj V.R f749aac223
Merge branch 'apple:main' into ssupdateb4registration 2022-03-31 18:59:44 -07:00
Chaoguang Lin 7d365bd1bb
Remote ikvs debugging (#6465)
* initial structure for remote IKVS server

* moved struct to .h file, added new files to CMakeList

* happy path implementation, connection error when testing

* saved minor local change

* changed tracing to debug

* fixed onClosed and getError being called before init is finished

* fix spawn process bug, now use absolute path

* added server knob to set ikvs process port number

* added server knob for remote/local kv store

* implement simulator remote process spawning

* fixed bug for simulator timeout

* commit all changes

* removed print lines in trace

* added FlowProcess implementation by Markus

* initial debug of FlowProcess, stuck at parent sending OpenKVStoreRequest to child

* temporary fix for process factory throwing segfault on create

* specify public address in command

* change remote kv store knob to false for jenkins build

* made port 0 open random unused port

* change remote store knob to true for benchmark

* set listening port to randomly opened port

* added print lines for jenkins run open kv store timeout debug

* removed most tracing and print lines

* removed tutorial changes

* update handleIOErrors error handling to handle remote-ikvs cases

* Push all debugging changes

* A version where worker bug exists

* A version where restarting tests fail

* Use both the name and the port to determine the child process

* Remove unnecessary update on local address

* Disable remote-kvs for DiskFailureCycle test

* A version where restarting stuck

* A version where most restarting tests green

* Reset connection with child process explicitly

* Remove change on unnecessary files

* Unify flags from _ to -

* fix merging unexpected changes

* fix trac.error to .errorUnsuppressed

* Add license header

* Remove unnecessary header in FlowProcess.actor.cpp

* Fix Windows build

* Fix Windows build, add missing ;

* Fix a stupid bug caused by code dropped by code merging

* Disable remote kvs by default

* Pass the conn_file path to the flow process, though not needed, but the buildNetwork is difficult to tune

* serialization change on readrange

* Update traces

* Refactor the RemoteIKVS interface

* Format files

* Update sim2 interface to not clog connections between parent and child processes in simulation

* Update comments; remove debugging symbols; Add error handling for remote_kvs_cancelled

* Add comments, format files

* Change method name from isBuggifyDisabled to isStableConnection; Decrease(0.1x) latency for stable connections

* Commit the IConnection interface change, forgot in previous commit

* Fix the issue that onClosed request is cancelled by ActorCollection

* Enable the remote kv store knob

* Remove FlowProcess.actor.cpp and move functions to RemoteIKeyValueStore.actor.cpp; Add remote kv store delay to avoid race; Bind the child process to die with parent process

* Fix the bug where one process starts storage server more than once

* Add a please_reboot_remote_kv_store error to restart the storage server worker if remote kvs died abnormally

* Remove unreachable code path and add comments

* Clang format the code

* Fix a simple wait error

* Clang format after merging the main branch

* Testing mixed mode in simulation if remote_kvs knob is enabled, setting the default to false

* Disable remote kvs for PhysicalShardMove which is for RocksDB

* Cleanup #include orders, remove debugging traces

* Revert the reorder in fdbserver.actor.cpp, which fails the gcc build

Co-authored-by: Lincoln <lincoln.xiao@snowflake.com>
2022-03-31 17:08:59 -07:00
Bharadwaj V.R 8ff3b7d8a2
Merge branch 'apple:main' into ssupdateb4registration 2022-03-31 16:12:06 -07:00
Tao Lin 001909be08
Fixes for when getMappedRange cannot parse as tuple (#6665) 2022-03-31 14:06:45 -07:00
Jingyu Zhou 4fd414a8ed Merge branch 'main' into vv
Fix Conflicts:
	fdbclient/NativeAPI.actor.cpp
2022-03-31 09:38:36 -07:00
Bharadwaj V.R 60c146bd30
Merge branch 'apple:main' into block-down 2022-03-31 00:11:44 -07:00
Bharadwaj V.R c009eba7ed Add last-sw-version tracking to SWVersion structure 2022-03-31 00:09:58 -07:00
Jingyu Zhou cfcf0f152c Merge branch 'main-4a085fc84' into vv
Fix Conflicts:
	fdbclient/NativeAPI.actor.cpp
	fdbserver/ClusterRecovery.actor.cpp
	fdbserver/MasterInterface.h
	fdbserver/masterserver.actor.cpp
	flow/error_definitions.h
2022-03-30 22:28:06 -07:00
Bharadwaj V.R 88c8b5f2a9
Merge branch 'apple:main' into ssupdateb4registration 2022-03-30 21:16:55 -07:00
Jingyu Zhou b34f4052cd Merge branch 'main-f28dfc12b' into vv
Fix Conflicts:
	fdbclient/MultiVersionTransaction.actor.cpp
	fdbclient/NativeAPI.actor.cpp
	fdbclient/NativeAPI.actor.h
	fdbserver/storageserver.actor.cpp
2022-03-30 21:01:25 -07:00
Jingyu Zhou 00b57d4cce Merge branch 'main-67eba5ec7' into vv
Fix Conflicts:
	fdbclient/CommitProxyInterface.h
	fdbclient/NativeAPI.actor.cpp
	fdbclient/StorageServerInterface.h
	fdbserver/CommitProxyServer.actor.cpp
	fdbserver/storageserver.actor.cpp
2022-03-30 20:05:55 -07:00
Steve Atherton 6744e9e4f9 Change timestamps used in storage server metadata and perpetual wiggle metrics to epoch seconds, stored as doubles, and stringified as either floating point epoch seconds or timestamp strings of the form "2013-04-28 20:57:01.000 +0000". 2022-03-30 18:57:06 -07:00
Steve Atherton 2c9b2dd005 Merge commit '1b919f52e928e8a72d5acba9175eae32ed4b0c90' into redwood-rare-bugs
# Conflicts:
#	flow/ThreadHelper.actor.h
2022-03-30 18:21:03 -07:00
Steve Atherton 84f9e00258 Remove duplicative generic actor repeatEvery() since recurring() exists. 2022-03-30 18:10:29 -07:00
Andrew Noyes 1b919f52e9
Combine vector_like_traits::{insert,reserve} (#6689)
* Combine vector_like_traits::{insert,reserve}

and explain semantics better. This should make it more clear what
implementers need to do when implementing the vector_like_traits
concept.

* Update std::unordered_set vector_like_traits impl
2022-03-30 16:29:35 -07:00
Jingyu Zhou e9659b5dd4 Merge branch 'master-PR-6500' into vv
Fix Conflicts:
	fdbclient/CommitProxyInterface.h
	fdbclient/NativeAPI.actor.cpp
	fdbserver/masterserver.actor.cpp
2022-03-30 14:53:49 -07:00
Steve Atherton 2a52c76b7a Added INetwork::timer_int() for convenience. Clarified what timer_int() actually returns in header comments. 2022-03-30 14:47:24 -07:00
Steve Atherton 75247affa3 Renamed member for better readability. 2022-03-30 13:39:33 -07:00
Steve Atherton 5d74e4d091 Added comments to explain some invariants with ThreadMultiCallback and ThreadCallback and how they are enforced. 2022-03-30 12:38:57 -07:00
Steve Atherton e6457b1656 A few changes for clarity / readability. 2022-03-30 11:31:46 -07:00
Steve Atherton 01facc8dfa Refactored callback tracking in ThreadCallback and ThreadMultiCallback to not use an unordered_map of pointers to prevent it from falsely triggering the DEBUG_DETERMINISM check, plus it is lower overhead, saving about 6% CPU in the AbortableSingleAssignmentVar unit test. 2022-03-29 21:13:13 -07:00
Steve Atherton 971aa2dc4e Refactored callback tracking in ThreadCallback and ThreadMultiCallback to not use an unordered_map of pointers to prevent it from falsely triggering the DEBUG_DETERMINISM check, plus it is lower overhead, saving about 6% CPU in the AbortableSingleAssignmentVar unit test. 2022-03-29 21:09:26 -07:00