Commit Graph

25868 Commits

Author SHA1 Message Date
Junhyun Shim eb0bc70ea8
Fix flaky ArenaBlock wipe test (#9960) 2023-04-14 20:40:56 +02:00
Yanqin Jin 2959d07797
Add test coverage for metacluster operations via fdbcli (#9802)

Test plan:

```bash
mkdir build && cd build && cmake -G Ninja ..
ninja fdbcli fdbserver fdbmonitor
ctest -R metacluster_fdbcli_tests
```
2023-04-14 07:42:55 -07:00
Josh Slocum 370feaa3c9
refactoring and adding future compatibility to blob range metadata (#9955)
* refactoring and adding future compatibility to blob range metadata

* formatting
2023-04-13 15:06:50 -05:00
Jon Fu 30132ebac6
Expose list_blobbified_ranges for tenants in python API (#9940)
* add tenant getId api to fdbcli

* expose list_blobbified_ranges for tenants in python API
2023-04-13 13:05:37 -04:00
Ata E Husain Bohra fe0a4df06a
EaR: Implement Key Check Value semantics (#9936)
* EaR: Implement Key Check Value semantics

Description

Key Check Value (KCV) is a checksum of a cryptographic encryption key,
used to validate the key's integrity. FDB Encryption at-rest
relies on an external KMS to supply encryption keys.

The patch proposes the following major changes:
1. Implement a Sha256 based KCV to protect against
'baseCipher' corruption in two possible scenarios:
 a) potential corruption external to FDB
 b) potential corruption within FDB processes.
2. The scheme persists the computed KCV token in the block encryption header,
which then gets validated as part of header validation during
decryption.
3. FDB encryption key derivation uses the HMAC_SHA256 digest generation
scheme, which allows a maximum of 64 bytes of 'cipher buffer'; the patch adds
the required check to ensure the 'baseCipher' length is within bounds.
The underlying OpenSSL HMAC call ignores extra length if supplied; however,
that weakens the security guarantees, so it is disallowed.
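
The KCV idea described above can be illustrated with a minimal Python sketch. This is not the FDB implementation: the function names, the use of the full SHA-256 digest as the token, and the constant name are all assumptions made for illustration. Only the SHA-256 basis and the 64-byte bound come from the description.

```python
import hashlib
import hmac

# Assumption: the 64-byte limit from the description, expressed as a constant.
MAX_BASE_CIPHER_LEN = 64


def compute_kcv(base_cipher: bytes) -> bytes:
    """Return a SHA-256 based check value for the given key bytes.

    Hypothetical helper: uses the full digest as the token; the token
    format actually stored in the header is not stated in the description.
    """
    if not 0 < len(base_cipher) <= MAX_BASE_CIPHER_LEN:
        # Reject out-of-bounds keys instead of silently accepting them.
        raise ValueError("baseCipher length out of bounds")
    return hashlib.sha256(base_cipher).digest()


def validate_kcv(base_cipher: bytes, persisted_kcv: bytes) -> bool:
    """Recompute the check value and compare it with the persisted one."""
    return hmac.compare_digest(compute_kcv(base_cipher), persisted_kcv)
```

In this sketch, a writer would store `compute_kcv(key)` alongside the encrypted block, and a reader would call `validate_kcv` before decrypting, so a key corrupted at any point in between is detected as a mismatch.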

Testing

devRunCorrectness - multiple 500K runs
Valgrind & Asan - BlobCipherUnit, RESTKMSUnit, BlobGranuleCorrectness*,
EncryptionOps, EncryptKeyProxyTest
2023-04-12 14:29:31 -07:00
Josh Slocum e4fbf2fd59
properly handling BM targeted restart injection during BM recovery and if BM has already been restarted (#9945) 2023-04-12 12:22:31 -05:00
Vaidas Gasiunas 963630626c
Avoid loop profiler deadlocking in sanitizer builds by not calling backtrace (#9953) 2023-04-12 19:19:23 +02:00
A.J. Beamon bcb8d01bc4
Merge pull request #9956 from sfc-gh-ajbeamon/fix-can-kill-processes-in-multi-region-config
Fix canKillProcesses to ignore region config when usableRegions is 1
2023-04-12 10:07:17 -07:00
Xiaoxi Wang b50459dd12 change the initial value of CPU pivot 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 737dd09572 set better knob for cpu cutoff unit test 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 7ce7e3c99e add cpu cutoff unit test 2023-04-12 09:33:05 -07:00
Xiaoxi Wang f7061debde remove unused CPU knob; add comments for EligibilityCounter 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 16a8ea0867 move EligibilityCounter to DataDistributionTeam.h; fix ParallelTCInfo compilation error; add MockDD definition of getStorageStats 2023-04-12 09:33:05 -07:00
Xiaoxi Wang bfae1f31e4 fix getCount() implementation 2023-04-12 09:33:05 -07:00
Xiaoxi Wang a0ee303601 remove smoothedCPU 2023-04-12 09:33:05 -07:00
Xiaoxi Wang b0fe14aed5 getTeam based on EligiblityCount 2023-04-12 09:33:05 -07:00
Xiaoxi Wang e427e21350 use smoothed cpu for team selection 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 7ca44124d4 explain what pivot ratio means; fix the knob assertion 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 5648f827a0 adjust CPU pivot knobs to hack simulation test 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 990ad26d8b fix segmentation fault caused by wrong iterator and index type 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 31fd4bb272 consider consistent low CPU status for 5min 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 490a7b534a add getAverageCPU method; delete default value of GetTeamRequest
arguments (solve conflicts)
2023-04-12 09:33:05 -07:00
Xiaoxi Wang 2419edd874 rename getLoadReadBandwidth to getReadLoad; change the implementation of getReadLoad 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 27c45f5f45 solve merge conflicts 2023-04-12 09:33:05 -07:00
Xiaoxi Wang e7b5b3cbe3 pull health metrics for each storage server in storageTracker (solve
conflicts)
2023-04-12 09:33:05 -07:00
Xiaoxi Wang 67b737b44d add getStorageStats method to DatabaseContext 2023-04-12 09:33:05 -07:00
Xiaoxi Wang 1181b1dfab use max(ops*empty_penalty, readbytes) as new read load (solve conflicts) 2023-04-12 09:33:05 -07:00
A.J. Beamon 120933da5f When usableRegions is 1 but we have multiple regions in our configuration, canKillProcesses needs to perform the check as if there is only one region 2023-04-12 08:49:25 -07:00
Xiaoxi Wang 79d154317a fix windows build error: C:/ci/fdb_pr_builder/run/foundationdb/flow/Platform.actor.cpp(3769,18): error : variable has incomplete type 'struct sigaction' 2023-04-11 14:54:52 -07:00
Xiaoxi Wang 9e963909c2 add return value 2023-04-11 13:10:45 -07:00
Xiaoxi Wang 65345dbdfe fix DataDistributionTeam.h: In member function 'std::string TeamSelect::toString() const':
/codebuild/output/src428445704/src/github.com/apple/foundationdb/fdbserver/include/fdbserver/DataDistributionTeam.h:120:2: error: control reaches end of non-void function [-Werror=return-type]
2023-04-11 13:10:45 -07:00
A.J. Beamon 4ba8ecc5eb
Merge pull request #9943 from sfc-gh-ahusain/ahusain-fdbcore-4473-1
Update TestMultiValidationTokeFile token generation with no trailing newline char
2023-04-11 11:59:05 -07:00
A.J. Beamon 377f1f692d
Merge pull request #9941 from sfc-gh-ajbeamon/fix-reboot-process-and-switch
Don't downgrade RebootProcessAndSwitch to Reboot
2023-04-11 08:44:28 -07:00
Ata E Husain Bohra 8e52082475 Update TestMultiValidationTokeFile token generation with no trailing newline char
Description

The test does the following:
1. Randomly appends a newline character to the token value buffer.
2. If #1 is done, it generates a temp token file with the buffer containing
   the newline character.
3. It also remembers the original token for future validation.
4. The code parses and reads the token-validation files and removes the newline
   character as desired.

The failure in this case occurred because the random buffer used to populate the
token value contained a newline character, which was used for validation; however,
the file parse/read code removed the newline character as expected,
causing the mismatch.

The patch addresses the concern by ensuring the test-generated random
token value has no trailing newline chars.
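
The failure mode and the fix can be sketched in Python. This is an illustration only, not the FDB test: `make_token`, `read_token_file`, and `round_trip` are hypothetical names, and drawing the token from a newline-free alphabet is one assumed way of guaranteeing no trailing newline.

```python
import os
import random
import tempfile


def make_token(rng: random.Random, length: int = 16) -> bytes:
    """Generate a random token from printable, non-whitespace ASCII bytes.

    Because the alphabet excludes b"\n", stripping a trailing newline
    on read can never change the token itself.
    """
    alphabet = bytes(range(0x21, 0x7F))
    return bytes(rng.choice(alphabet) for _ in range(length))


def read_token_file(path: str) -> bytes:
    """Read a token file and drop trailing newline characters."""
    with open(path, "rb") as f:
        return f.read().rstrip(b"\n")


def round_trip(rng: random.Random) -> bool:
    """Write a token (sometimes followed by a newline) and read it back."""
    token = make_token(rng)
    payload = token + (b"\n" if rng.random() < 0.5 else b"")
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(payload)
        path = f.name
    try:
        # The remembered token must equal what the reader returns.
        return read_token_file(path) == token
    finally:
        os.remove(path)
```

If `make_token` were allowed to end in a newline, `read_token_file` would strip it and the comparison in `round_trip` would fail intermittently, which matches the mismatch described above.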

Testing

tests/fast/RandomUnitTests.toml -s 1355028229
2023-04-10 17:33:13 -07:00
A.J. Beamon 4142762981 The RebootProcessAndSwitch kill type is meant to be used on an entire cluster, but it was not rebooting protected processes in this way. 2023-04-10 15:58:19 -07:00
Josh Slocum 982ccffafc
adding simple unit tests to validate actor collection cancel behavior (#9929) 2023-04-10 15:42:47 -05:00
Josh Slocum 25252775c7
Increase timeout for fast triggered watches to handle cases where meaningful progress can't be made until speedUpSim is set (#9910) 2023-04-10 15:42:11 -05:00
Josh Slocum b5580f0ea6
fix for bw read driven compaction rare race (#9784) 2023-04-10 11:27:59 -05:00
Josh Slocum 316658a28a solving strange recruitment deadlock in three_data_hall 2023-04-08 11:04:41 -07:00
Hui Liu 403cf29b3e
Use separated actors to dump manifest and truncate mutation logs (#9931) 2023-04-07 16:38:37 -07:00
Zhe Wu 10a6f3d2d0
Merge pull request #9890 from halfprice/zhewu/log-router-gray-failure
Gray failure detects disconnected remote log router and recover high DC lag
2023-04-07 16:25:11 -07:00
A.J. Beamon 9372e6a8c8
Merge pull request #9930 from sfc-gh-ahusain/ahusain-fdbcore-4473
Fix Asan reporting heap overflow
2023-04-07 11:36:54 -07:00
Ata E Husain Bohra 9d8e8d2f9e
Update fdbclient/BlobCipher.cpp
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2023-04-07 10:34:11 -07:00
Ata E Husain Bohra e10259f461 Fix Asan reporting heap overflow
Description

Fix Asan reporting heap overflow

Testing

BlobCipherUnitTest with failing seed
2023-04-07 10:24:22 -07:00
Jay Zhuang 1ec7fa06b5
Merge pull request #9788 from sfc-gh-jazhuang/replaceRange
Add bulkload API replaceRange() in IKeyValueStore
2023-04-06 18:16:37 -07:00
Josh Slocum cfd64014e3
also clear the promise stream in actor collection, so that if actor collection already exited we still clean up those futures (#9926) 2023-04-06 18:32:35 -05:00
Zhe Wang a78d800e8a
Clean up data move tombstone when DD init (#9901)
* init

* move cleanup out of getInitDD

* fmt

* address comments

* fixes
2023-04-06 13:48:18 -07:00
Steve Atherton 9daacb0c2e Merge remote-tracking branch 'origin/main' into replaceRange 2023-04-06 13:18:30 -07:00
Josh Slocum d37b2b0a76
Adding BlobFailureInjection workload (#9833)
* Adding BlobFailureInjection workload

* fixing formatting
2023-04-06 15:10:36 -05:00
Josh Slocum aef5130da2
adding system priority option to getDatabaseConfiguration, and several debugging improvements (#9864) 2023-04-06 15:08:40 -05:00