127 Commits

Author SHA1 Message Date
Yi Wu 27df6df950 Redwood: config remap cleanup by size instead of versions 2022-03-18 14:56:04 -07:00
Trevor Clinkenbeard 10c536c700
Merge pull request #6435 from sfc-gh-ljoswiak/fixes/dynamic-knobs-release-readiness
Dynamic knobs improvements
2022-03-16 10:26:56 -07:00
He Liu c3a68d661e
Physical Shard Move (#6264)
Physical Shard Move part I: Checkpoint creation, transfer and restore.
2022-03-15 13:03:23 -07:00
Josh Slocum 67eba5ec7c Limiting DD Moves by destination SS. 2022-03-15 13:52:19 -05:00
Ata E Husain Bohra 944ec48415
Introduce a simulated EncryptKeyVaultProxy interface (#6576)
Description

Major changes proposed are:
1. Rename ServerKnob->ENABLE_ENCRYPT_KEY_PROXY to
   ServerKnob->ENABLE_ENCRYPTION. This approach simplifies enabling and
   controlling encryption code changes using a single knob (desirable).
2. Implement an EncryptKeyVaultProxy simulated interface to assist in
   validating encryption workflows in simulation runs. The interface
   is leveraged to satisfy "encryption keys" lookups, which would otherwise
   be satisfied by integrating an organization's preferred Encryption
   Key Management solution.

Testing

Unit test to validate the newly added code
2022-03-10 12:06:49 -08:00
Tao Lin e2c7c30faf
GetMappedRange support serializable & check RYW & continuation (#6181) 2022-03-10 10:05:44 -08:00
neethuhaneesha 212deb05e9
Merge pull request #6501 from neethuhaneesha/rocksdb-CompBytesLimiter
Rocksdb knobs for compaction, storageserver canCommit() waiting if rocksdb overloaded
2022-03-07 14:59:34 -08:00
Steve Atherton 8f8f95931b In the SQLite storage engine, destroy and create new cursors after SQLITE_CURSOR_MAX_LIFETIME_BYTES of KV read usage because cursor usage increases page cache memory usage in SQLite (internal page cache, not AsyncFileCached) by pinning pages which are not freed until the cursor is destroyed. 2022-03-07 11:20:59 -08:00
Neethu Haneesha Bingi 8796a763a5 Rocksdb knobs for compaction, storageserver canCommit() waiting if rocksdb overloaded. 2022-03-04 12:41:17 -08:00
Neethu Haneesha Bingi 83e0368eaa RocksDB increasing background threads to speed up compaction. 2022-03-03 15:14:37 -08:00
Trevor Clinkenbeard fe957deef8
Merge pull request #6399 from sfc-gh-bvr/fdb#4271
Introduce a new server knob and use it to test if storage servers are…
2022-02-28 13:02:23 -08:00
Zhe Wang f14e08a991 addRocksDBPerfContextMetrics 2022-02-23 22:29:07 -05:00
Lukas Joswiak a8828db58e Load balance dynamic knob requests
This commit also removes an attempt to read the latest configuration
snapshot when a rollforward timeout occurs. The normal retry loop will
eventually fetch an up to date snapshot and the rollforward will be
retried.
2022-02-22 10:53:48 -08:00
Bharadwaj V.R a54acb3720 Temporarily lower safety buffer knob. AtomicBackupCorrectness needs fixing 2022-02-16 19:26:40 -08:00
Bharadwaj V.R 27855bc5b5
Merge branch 'apple:main' into fdb#4271 2022-02-16 15:38:36 -08:00
Zhe Wu 9da735c38e Batch empty peek reply 2022-02-16 15:28:56 -08:00
Bharadwaj V.R 949f1f1c3e Switch to testing MIN_AVAILABLE_SPACE 2022-02-16 11:33:07 -08:00
Bharadwaj V.R 3fe6a952f1 Merge with upstream tcinfo refactor and move the server knob init to be adjacent to related knobs 2022-02-16 10:28:55 -08:00
Bharadwaj V.R fe03e6f822 Introduce a new server knob and use it to test if storage servers are near the min bar for available space 2022-02-15 22:43:06 -08:00
Trevor Clinkenbeard ef68e6fe0d
Merge pull request #6353 from sfc-gh-ljoswiak/fixes/dynamic-knobs
Fix dynamic knobs correctness issues
2022-02-10 22:13:02 -08:00
Zhe Wang d684508540 Add RatekeeperLimitReasonDetails traceevent for RK 2022-02-10 13:59:47 -08:00
Lukas Joswiak 990e215a8d Fix formatting
Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2022-02-09 13:43:32 -08:00
Lukas Joswiak d5a562e6b8 Fix dynamic knobs correctness issues 2022-02-09 13:43:32 -08:00
Ata E Husain Bohra 87ee4cf958 Add new FDB EncryptKeyProxy role
Major changes include:

1. Add a new FDB role, EncryptKeyProxy. The role is
   responsible for exposing APIs to fetch encryption keys by interacting
   with an external Encryption KeyManager interface.
2. The process is a FDB singleton process following similar recruitment
   rules as other singleton processes in the system.
3. Code to recruit the worker process; given the encryption keys are
   needed during recovery (decode TLog records), for now the process
   is co-located in same datacenter as ClusterController.
4. Skeleton process actor code; more functionality will be added in
   subsequent PRs.

NOTE: The code is protected under a SERVER_KNOB with the default
      value as 'false' for now.
2022-01-25 17:38:27 -08:00
Josh Slocum cf45354833 switched buggified and expected shard size for simulation 2022-01-20 20:37:03 -08:00
Josh Slocum 4bfef29e4c Changed small shards in simulation logic 2022-01-20 20:37:03 -08:00
Josh Slocum 6a8e9d71d2 Raising default minimum shard size, as it causes unnecessary merging on growing clusters. 2022-01-20 20:37:03 -08:00
Steve Atherton 2384c5aeb9 Change Redwood default page size knob to 8192. 2022-01-20 20:31:26 -08:00
Neethu Haneesha Bingi 162bce7a58 Rocksdb write rate limiter. 2022-01-18 13:23:00 -08:00
Neethu Haneesha Bingi ef4038fe8d Rocksdb read range iterator pool to reuse iterators. 2022-01-18 02:05:21 -08:00
Ata E Husain Bohra 936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9.
1.b. This reverts commit d174bb2e06.
1.c. This reverts commit 30b05b469c.

2. Update Status.actor to track ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define "cluster recovery trace event"
   prefix; for now keeping it as "Master", however, it should allow
   smooth transition to "Cluster" prefix as it seems more appropriate.
2022-01-06 12:15:51 -08:00
Neethu Haneesha Bingi 1f30368e71 KeyValueStoreRocksDB histograms to track latencies 2021-12-21 23:09:46 -08:00
Tao Lin 9b0a9c4503
Return error when getRangeAndFlatMap has more & Improve simulation tests (#6029) 2021-12-03 12:50:07 -08:00
Steve Atherton bed25f9571 Delay prioritized eviction of updated pages until after commit completes. 2021-11-28 21:03:44 -08:00
Evan Tschannen 8fa7085c78 added a comment 2021-11-24 11:40:41 -08:00
Evan Tschannen c9ee83e1b1 fix: do not buggify PEEK_TRACKER_EXPIRATION_TIME to a value of 20 2021-11-24 11:28:57 -08:00
Steve Atherton 508429f30d
Redwood chunked file growth and low priority IO starvation prevention (#5936)
* Redwood files now grow in large page chunks controlled by a knob to reduce truncate() calls for expansion. PriorityMultiLock has a limit on consecutive same-priority lock releases. Increased Redwood max priority level to 3 for more separation at higher BTree levels.

* Simulation fix, don't mark certain IO timeout errors as injected unless the simulated process has been set to have an unreliable disk.

* Pager writes now truncate gradually upward, one chunk at a time, in response to writes, which wait on only the necessary truncate operations.   Increased buggified chunk size because truncate can be very slow in simulation.

* In simulation, ioTimeoutError() and ioDegradedOrTimeoutError() will wait until at least the target timeout interval past the point when simulation is sped up.

* PriorityMultiLock::toString() prints more info and is now public.

* Added queued time to PriorityMultiLock.

* Bug fix to handle when speedUpSimulation changes later than the configured time.

* Refactored mutation application in leaf nodes to do fewer comparisons and do in place value updates if the new value is the same size as the old value.

* Renamed updatingInPlace to updatingDeltaTree for clarity.  Inlined switchToLinearMerge() since it is only used in one place.

* Updated extendToCover to be more clear by passing in the old extension future as a parameter.  Fixed initialization warning.
2021-11-12 13:47:07 -08:00
Daniel Smith 394b9dc619 Code review changes 2021-11-10 11:53:27 -05:00
Daniel Smith f6342b0a8d Update defaults 2021-11-10 11:51:05 -05:00
Daniel Smith 66520eb1c1 Utilize read types to do selective throttling 2021-11-10 11:51:04 -05:00
Tao Lin fdb3b72e35 Introduce GetRangeAndFlatMap to push computations down to FDB
Re-introduce #5609
2021-11-09 13:52:28 -08:00
Tao Lin 586cc3b102
Revert "Introduce GetRangeAndFlatMap to push computations down to FDB" 2021-11-04 08:46:56 -07:00
Tao Lin 0853661d13 Introduce getRangeAndHop to push computations down to FDB 2021-11-03 13:21:16 -07:00
Xiaoxi Wang 1a2a838df3 add knob 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 69190ed04e format 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 0053b4793e change knob and delete redundant doBuildTeam 2021-10-27 09:08:37 -07:00
Evan Tschannen 2208b04174
Merge pull request #5855 from sfc-gh-etschannen/blob_full_clean
Blob Granules V0
2021-10-26 09:57:35 -07:00
Lukas Joswiak c96f560cbe Verify rollback of a single version in simulation, other small fixes 2021-10-25 12:03:22 -07:00
Josh Slocum 0ff8ddc2b6 Merge branch 'master' into blob_full_clean 2021-10-25 13:38:48 -05:00
Steve Atherton d153519188
Merge pull request #5813 from sfc-gh-jslocum/ss_ebrake_streaming_fix
Fixes to ss e-brake, tlog streaming, and their interaction
2021-10-22 10:46:17 -07:00