Commit Graph

770 Commits

Author SHA1 Message Date
Jingyu Zhou fa7f15e46a
Merge pull request #9353 from sfc-gh-yiwu/redwood_restart
Redwood: fix restart test failure with XOR encoding
2023-02-13 09:24:57 -08:00
Steve Atherton 41fa3eada9
Merge branch 'main' into add-redwood-slack-knob 2023-02-12 19:31:20 -08:00
Yi Wu a37d8f757c Redwood: fix restart test failure with xor encoding 2023-02-10 21:01:52 -08:00
Yi Wu f17024e615
Redwood: fix btree unit test reopen memory only pager (#9334)
The Redwood btree tests are not supposed to reopen the pager if it is memory-only, since that would open an empty pager.
2023-02-09 17:12:50 -08:00
Yi Wu d3bc2afc8e
EaR: storage server uses encryption DB config (#9115)
This PR updates the storage server and Redwood to enable encryption based on the encryption mode in the DB config, which was previously controlled by a knob. High-level changes:
1. Pass the encryption mode in the DB config to the storage server.
    1.1 For a new storage server, the encryption mode is passed through `InitializeStorageRequest` and then on to Redwood for initialization.
    1.2 For an existing storage server, on restart the storage server sends `GetStorageServerRejoinInfoRequest` to the commit proxy, and the commit proxy returns the current encryption mode, which it gets from the DB config during its own initialization. The storage server compares the DB config encryption mode to the local storage encryption mode and fails if they don't match.
2. Add a new `encryptionMode()` method to `IKeyValueStore`, which returns a future of the local encryption mode of the KV store instance. A KV store that supports encryption needs to persist its own encryption mode and return it via this API.
3. Redwood accepts the encryption mode in its constructor. For a new Redwood instance, the caller has to specify the encryption mode, which is stored in Redwood's per-instance file header. For an existing instance, the caller should not pass the encryption mode and should let Redwood read it from its own header.
4. Refactoring in Redwood to accommodate the above changes.
2023-02-06 14:02:31 -08:00
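The restart check described in #9115 can be sketched roughly as follows. This is a minimal, synchronous illustration, not FoundationDB code: the names `EncryptionMode`, `KVStore`, `check_on_restart`, and the header dictionary are hypothetical, and the real `encryptionMode()` returns a future rather than a plain value.

```python
from enum import Enum
from typing import Optional


class EncryptionMode(Enum):
    # Hypothetical mode names, used only for this sketch.
    DISABLED = "disabled"
    CLUSTER_AWARE = "cluster_aware"
    DOMAIN_AWARE = "domain_aware"


class EncryptionModeMismatch(Exception):
    """Raised when the DB config mode differs from the locally persisted mode."""


class KVStore:
    """Toy KV store that persists its encryption mode in a header dict."""

    def __init__(self, header: dict, mode: Optional[EncryptionMode] = None):
        self.header = header
        if "encryption_mode" not in header:
            # New instance: the caller must specify the mode, which is persisted.
            if mode is None:
                raise ValueError("a new store needs an explicit encryption mode")
            header["encryption_mode"] = mode.value
        # Existing instance: the header is authoritative. In this sketch any
        # mode passed by the caller is simply ignored (an assumption).

    def encryption_mode(self) -> EncryptionMode:
        # Stand-in for an encryptionMode()-style accessor.
        return EncryptionMode(self.header["encryption_mode"])


def check_on_restart(store: KVStore, db_config_mode: EncryptionMode) -> EncryptionMode:
    """Compare the DB config mode with the local mode and fail on mismatch."""
    local = store.encryption_mode()
    if local != db_config_mode:
        raise EncryptionModeMismatch(
            f"local={local.value}, db config={db_config_mode.value}"
        )
    return local
```

Reopening a store with the same header and calling `check_on_restart` with a different mode raises `EncryptionModeMismatch`, which mirrors the "fail if they don't match" behavior in step 1.2.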
Yi Wu 845cc62a39
Redwood: fix tree height overgrowth with per-tenant encryption (#9020)
* Fix Redwood tree height overgrowth when EaR and tenant page split are enabled, by removing the buildNewSubtree() logic.
* Fix the incorrect page upper bound for the last page created by writePages() without the buildNewSubtree() logic.
* Enable tenant page split if encryption mode is domain-aware encryption.
* Related test fixes:
  - In simulation, pass encryption mode to storage/Redwood via knobs. This is a workaround to enable testing with Redwood encryption before we correctly pass the encryption mode via db config. Also temporarily disable tenant page split for restart tests.
  - Disable raw access in FuzzApiCorrectness test if domain-aware encryption is enabled, to avoid test timeout
  - Disable encryption for DrUpgradeRestart test, which is likely to fail due to a rare EKP deadlock issue blocking recovery. Will re-enable after the deadlock issue is fixed.
2023-01-06 15:56:37 -08:00
Yi Wu 5d6ba48da0
Fix /redwood/correctness/EnforceEncodingType unit test (#9095)
The unit test intentionally causes unexpected_encoding_type to be thrown, which would hit a simulation-only assert failure. This change disables that assert in this case.
2023-01-06 14:13:59 -08:00
Nim Wijetunga 21611761bd
Backup uses DB Config (#8941)
* add encryption db config

* address pr comments

* address pr comments

* add comments

* remove knobs from backup

* remove import

* cp uses db config

* modify simulated cluster

* remove includes

* fix tests

* fix tests

* modify comment

* add encryption enabled method

* change error to warn

* Trigger Build

* Trigger Build

* Trigger Build
2023-01-04 22:43:51 -05:00
Yi Wu 17fdbc46a5
EaR: Add page checksum to Redwood pages in no-auth mode (#8965)
Previously, with EaR, we always enabled authentication (e.g. when encrypting Redwood pages). Authentication is a form of checksum, so a dedicated page checksum was not needed. This PR adds back the xxhash page checksum when authentication is disabled. It also changes the knob so that authentication is disabled by default.
2023-01-03 10:30:07 -08:00
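The idea in #8965, that an authentication tag already acts as a checksum and a plain checksum is only needed when authentication is off, can be illustrated with a small sketch. It is an assumption-laden stand-in, not Redwood's page format: HMAC-SHA256 replaces the real authentication scheme, CRC32 from the standard library replaces xxhash, encryption itself is omitted, and `seal_page` / `open_page` are hypothetical names.

```python
import hashlib
import hmac
import zlib
from typing import Optional


def _tag(payload: bytes, auth_key: Optional[bytes]) -> bytes:
    if auth_key is not None:
        # Authenticated mode: the auth tag doubles as an integrity check.
        return hmac.new(auth_key, payload, hashlib.sha256).digest()
    # No-auth mode: fall back to a plain checksum (CRC32 stands in for xxhash).
    return zlib.crc32(payload).to_bytes(4, "big")


def seal_page(payload: bytes, auth_key: Optional[bytes] = None) -> bytes:
    """Prefix the payload with either an auth tag or a checksum."""
    return _tag(payload, auth_key) + payload


def open_page(page: bytes, auth_key: Optional[bytes] = None) -> bytes:
    """Verify the tag or checksum and return the payload, or raise ValueError."""
    tag_len = 32 if auth_key is not None else 4
    tag, payload = page[:tag_len], page[tag_len:]
    if not hmac.compare_digest(tag, _tag(payload, auth_key)):
        raise ValueError("page integrity check failed")
    return payload
```

In both modes a corrupted payload is rejected by `open_page`; only the authenticated mode also protects against deliberate tampering.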
imperatorx 81e8afd3a2 Introduce new knob for Redwood slack balancing 2022-12-20 15:22:54 +01:00
A.J. Beamon 0a686719d8 Make mapAsync usable without specifying template parameters 2022-12-14 14:33:59 -08:00
sfc-gh-tclinkenbeard 68f14f017c Fix clang 15 compiler warnings 2022-12-08 13:59:37 -08:00
FoundationDB CI 86d6106dc1
format source code after switch to clang 15 2022-12-08 17:26:45 +00:00
Steve Atherton fb45c2ec29 Added explicit shutdown of FIFOQueue operation futures to avoid possible page cache additions after page cache cleanup. 2022-11-28 00:03:21 -08:00
Yi Wu 551fd0b9bb
EAR: Cleanup Redwood tenant map usage (#8902)
A recent redesign no longer requires passing the tenant name to get an encryption key, and no longer allows optional tenant mode for tenant-aware encryption. This PR cleans up the Redwood code to remove tenant map usage and updates various checks accordingly.

Changes:
* Cleanup TenantPrefixIndex in TenantAwareEncryptionKeyProvider and related logic in storage server and Redwood for passing the map around.
* Cleanup and update DecodeBoundaryVerifier to reflect the new design.
* A minor fix to writePages() that avoids a default-domain-encrypted page having a lower bound key that belongs to a non-default domain.
* Fix TenantAwareEncryptionKeyProvider::getEncryptionDomain() returning the wrong prefix length for the system domain.
* A minor change to add a context string to IoTimeoutError.
2022-11-23 09:41:40 -08:00
Steve Atherton 8e8c4b4489
Merge pull request #8170 from sfc-gh-sgwydir/ddsketch
Use DDSketch for sample data
2022-11-17 10:38:12 -08:00
Markus Pilman 503769ef05
Merge pull request #8496 from sfc-gh-mpilman/bugfixes/machines-attrition-debugging
Enable machine attrition injection
2022-11-15 16:32:33 -07:00
sfc-gh-tclinkenbeard 82db9415fb Add const qualifier to more methods
This commit mainly targets get* and is* methods
2022-11-15 14:57:32 -08:00
Sam Gwydir 99d4bacf5d Merge remote-tracking branch 'origin/main' into ddsketch 2022-11-15 13:19:42 -08:00
Sam Gwydir 23706c957b Use DDSketch for Sample Data. 2022-11-12 13:45:46 -08:00
Steve Atherton d61b88e6b3 Bug fix, Redwood's ioLock was not a Reference<>. Renamed several knobs, functions, and Redwood metrics for clarity. 2022-11-11 20:07:48 -08:00
Markus Pilman f1fea14255 Merge remote-tracking branch 'origin/main' into bugfixes/machines-attrition-debugging 2022-11-01 13:51:35 -06:00
Steve Atherton 27dc180b68 Merge commit '0ae568a872e474c8c755e648efbbe4524e63e445' into storageserver-pml
# Conflicts:
#	fdbserver/VersionedBTree.actor.cpp
2022-10-24 22:31:36 -07:00
Markus Pilman e7b5b870a3 Merge remote-tracking branch 'origin/main' into bugfixes/machines-attrition-debugging 2022-10-24 15:24:36 -06:00
Markus Pilman 43cafb0bc2 Track disk corruptions and mark resulting failures as injected 2022-10-24 14:54:43 -06:00
Steve Atherton fdc8e2118e Changed Redwood shutdown order so that users get the error signal first and have a chance to cancel pending operations before they are canceled by force which causes broken_promise errors. 2022-10-23 01:52:25 -07:00
Steve Atherton 970eb4a043 Add option in ObjectCache clear for whether or not to wait for safe eviction for each item. 2022-10-23 01:50:18 -07:00
Steve Atherton cd83595a0f Merge remote-tracking branch 'sfc-gh-tclinkenbeard/clear-key-provider' into redwood-page-lifetimes 2022-10-22 17:55:09 -07:00
Steve Atherton fa0ea1368d Now that IO operation cancellation is safe, explicitly cancel all pending operations during shutdown in the operations vector or in the caches. 2022-10-22 17:55:02 -07:00
Steve Atherton 94f3eee3a8 Update comment for recent changes. 2022-10-22 16:12:57 -07:00
Steve Atherton 46dd71b008 Change extent reads to use readPhysicalBlock to avoid output buffer lifetime issues while allowing the extent read actor to be cancelled. 2022-10-22 02:55:45 -07:00
Steve Atherton 76316339ac Merge commit '7dacaed98368ec0790c9e18a63dfa0035a31fcff' into redwood-page-lifetimes 2022-10-22 02:30:43 -07:00
Steve Atherton d7f9de53f5 Reworked Redwood's file read/write buffer lifetime safety to be simpler and have lower overhead. The lowest level file read/write ops are uncancellable and hold references to their in/out buffers, so now read/write futures in the operations vector or the page cache can be cancelled, which allows shutdown to directly cancel active operations. 2022-10-22 02:30:15 -07:00
neethuhaneesha a1eb1d4a48
Rocksdb storage using single_key_deletes instead of deleterange on clearrange operation. (#8452) 2022-10-21 15:47:19 -07:00
sfc-gh-tclinkenbeard e168950a93 Clear DecodeBoundaryVerifier::keyProvider in ~VersionedBTree 2022-10-21 11:26:55 -07:00
Steve Atherton 0332aa0320 Add comments explaining details for page memory and read/write future lifetimes. 2022-10-21 01:16:29 -07:00
Steve Atherton b361c848f2 Bug fix in page memory lifetimes: Page read futures that are not stored in the page cache must be wrapped in uncancellable(). 2022-10-21 00:14:33 -07:00
Steve Atherton 1847f801b7 Merge commit '8733614f632590f67cd52eeb73c800e4a25ab143' into storageserver-pml 2022-10-07 22:58:37 -07:00
Steve Atherton 3228afefd3 Unrevert #7578 - storage server PriorityMultiLock and PML rewrite. 2022-10-06 23:41:28 -07:00
Steve Atherton f2dbbcce40 Allow overlapping commits in Redwood in which the caller drops the commit futures. Call IKVS::init() in TLogServer. 2022-10-06 12:48:06 -07:00
Yi Wu 5c549601d2
Split Redwood page by tenant boundary (#7979)
Redwood encrypts at page granularity. To do per-tenant encryption (i.e. each tenant is encrypted with a different set of cipher keys), we need to split Redwood pages by tenant boundary. It also needs to handle the different tenant modes:
* tenantMode = disabled: do not split pages by tenant; all data is encrypted using the default encryption domain.
* tenantMode = required: look at the prefix of each key, split by tenant accordingly, and encrypt in the tenant-specific encryption domain.
* tenantMode = optional: some key ranges may not map to a tenant. In addition to looking at the key prefix, the key provider also queries the tenant prefix index. For prefixes not found in the tenant prefix index, the corresponding keys are encrypted using the default encryption domain.

The change also enforces that the data for each tenant forms a subtree, and that the key of the link to the subtree is exactly the tenant prefix.

This PR builds on #8172 and uses the IPageEncryptionKeyProvider interface added there. Changes:
* In `writePages` and `splitPages`, query the key provider to split pages accordingly.
* In `commitSubtree`, when doing an in-place update (to either a leaf or an internal page), check whether the entry being inserted belongs to the same encryption domain as the existing data of the page. If not, fall back to a full page rebuild, where `writePages` will handle the page split.
* When updating the root, check whether it is encrypted using a non-default (i.e. tenant-specific) domain. If so, add a redundant root node, which will be encrypted with the default encryption domain.

Tested with 100K runs of the `Lredwood/correctness/btree` unit test, which uses `RandomEncryptionKeyProvider`; the provider is updated to support and generate random encryption domains with 4-byte domain prefixes.
2022-10-04 12:53:55 -07:00
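The tenant-mode rules and the page split in #7979 can be sketched as below. This is a simplified illustration under stated assumptions, not the Redwood implementation: the 4-byte prefix length is borrowed from the test description, a page is just a list of keys capped by a key count, keys shorter than the prefix fall into the default domain, and `encryption_domain` / `split_pages` are hypothetical names.

```python
from typing import Callable, List, Optional, Set, Tuple

PREFIX_LEN = 4          # assumed tenant prefix length for this sketch
DEFAULT_DOMAIN = None   # None stands for the default encryption domain


def encryption_domain(key: bytes, tenant_mode: str,
                      tenant_prefixes: Set[bytes]) -> Optional[bytes]:
    """Map a key to its encryption domain according to the tenant mode."""
    if tenant_mode == "disabled":
        return DEFAULT_DOMAIN
    if len(key) < PREFIX_LEN:
        # Assumption: keys too short to carry a prefix use the default domain.
        return DEFAULT_DOMAIN
    prefix = key[:PREFIX_LEN]
    if tenant_mode == "required":
        return prefix
    if tenant_mode == "optional":
        # Only prefixes present in the tenant prefix index map to a tenant.
        return prefix if prefix in tenant_prefixes else DEFAULT_DOMAIN
    raise ValueError(f"unknown tenant mode: {tenant_mode}")


def split_pages(sorted_keys: List[bytes], max_keys_per_page: int,
                domain_of: Callable[[bytes], Optional[bytes]]
                ) -> List[Tuple[Optional[bytes], List[bytes]]]:
    """Group sorted keys into pages so that no page mixes encryption domains."""
    pages: List[Tuple[Optional[bytes], List[bytes]]] = []
    current: List[bytes] = []
    current_domain: Optional[bytes] = DEFAULT_DOMAIN
    for key in sorted_keys:
        domain = domain_of(key)
        # Start a new page on a domain change or when the page is full.
        if current and (domain != current_domain or len(current) >= max_keys_per_page):
            pages.append((current_domain, current))
            current = []
        if not current:
            current_domain = domain
        current.append(key)
    if current:
        pages.append((current_domain, current))
    return pages
```

For example, with `tenant_mode="optional"` and only `b"aaaa"` in the prefix index, the keys `b"aaaa1"`, `b"aaaa2"`, `b"bbbb1"` end up in two pages: one in the `b"aaaa"` domain and one in the default domain.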
A.J. Beamon e1fe28b78b Switch some usages of LiteralStringRef to use the _sr suffix 2022-09-30 16:04:16 -07:00
Markus Pilman 5774249e5b
Revert "[DRAFT] Redwood PriorityMultiLock enable different launch limits to be specified based on different priority level." 2022-09-23 12:22:47 -06:00
Steve Atherton 04b4960786 Merge branch 'main' into fzhao/RedwoodIOLaunchLimit
# Conflicts:
#	fdbserver/VersionedBTree.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/ReadWrite.actor.cpp
2022-09-22 00:39:51 -07:00
Steve Atherton ab41da174c Completely rewrote PriorityMultiLock scheduling and added a unit test for it. 2022-09-20 00:45:29 -07:00
A.J. Beamon 4fd64630e8 Convert literal string ref instances to use _sr suffix 2022-09-19 11:35:58 -07:00
Steve Atherton 74b152e550 Removed two obsolete things: explicit maxPriority argument from PriorityMultiLock as it is redundant after the launch limit refactor, and the redwood read concurrency lock which is no longer needed after the StorageServer priority refactor as it will control the concurrency of requests sent to the KVS. 2022-09-18 18:23:15 -07:00
Steve Atherton 6b0eeb8efc Move PriorityMultiLock to its own header file so that changing it does not require recompiling everything. 2022-09-18 17:17:56 -07:00
Steve Atherton c0c9ac3bf5 Simplification, only runnerCounts workers (formerly workerCounts) and available counts in addRunner(). Renamed all uses of "worker" to "runner" for more clarity. 2022-09-18 01:13:27 -07:00
Yi Wu e66942ada4
Update Redwood encryption interface (#8172)
Update the Redwood encryption interface to make it better suited to per-tenant encryption, where we will need to do tenant page split.
2022-09-16 15:56:05 -07:00