Commit Graph

8159 Commits

Author SHA1 Message Date
sfc-gh-ngoyal e43ddb09ac
Merge pull request #6034 from sfc-gh-satherton/remap-window-fix
Bug fix:  incorrect remap cleanup window size in KeyValueStoreRedwood
2021-11-22 09:12:59 -08:00
Steve Atherton d2149631d5 Bug fix, the remap cleanup window was being initialized incorrectly in the Redwood KVS wrapper. 2021-11-19 21:40:09 -08:00
Steve Atherton 3901d60548 Test-only bug fixes in Redwood along with debug logging detail improvements. Added clearRemapQueue() to Pager to more cleanly and reliably expire all old data and process the remap queue, fixing a bug where with certain configuration parameters and a lot of data the DestructiveSanityCheck would fail because it would not run cleanup long enough. Added more parameters to performance/set unit test. 2021-11-19 01:00:14 -08:00
sfc-gh-tclinkenbeard c745002607 Merge remote-tracking branch 'origin/master' into add-format-warning 2021-11-18 11:13:03 -08:00
Evan Tschannen 5879d611f5
Merge pull request #5899 from sfc-gh-jfu/jfu-fix-ss-segfault
Fix rare segfault that can occur when SS terminates while an actor that uses the StorageServer is still on the stack
2021-11-18 09:57:11 -08:00
sfc-gh-tclinkenbeard 2613ec7561 Expand use of fmt to get rid of %ld usage 2021-11-17 17:03:32 -08:00
sfc-gh-tclinkenbeard 07349869d9 Use fmt to address -Wformat warnings 2021-11-17 14:45:48 -08:00
sfc-gh-tclinkenbeard 766a05d33c Merge remote-tracking branch 'origin/master' into add-format-warning 2021-11-17 12:14:01 -08:00
Steve Atherton 3caca74ac2 Merge commit 'fd707c6d7ee80de6d9fda5796da2d0add10abd79' into bit-flipping-workload 2021-11-16 21:54:27 -08:00
negoyal 867ad8be46 Merge branch 'bit-flipping-workload' of github.com:sfc-gh-ngoyal/foundationdb into bit-flipping-workload 2021-11-16 18:12:18 -08:00
Steve Atherton 6e43dde613 Fixed bad merge resolution. 2021-11-16 18:04:37 -08:00
Renxuan Wang 725d31e264 Fix parameters passed to simulatedMachine(). 2021-11-16 17:50:12 -08:00
negoyal ce112e1f23 Enhance trace event. 2021-11-16 17:39:59 -08:00
Jon Fu 8f6934c4d0 Merge branch 'master' of github.com:apple/foundationdb into jfu-fix-ss-segfault 2021-11-16 18:03:32 -05:00
Evan Tschannen 557186ed17
Merge pull request #5909 from sfc-gh-jfu/jfu-cc-request-dbinfo
Change dbinfo broadcast to be explicitly requested by the worker registration message
2021-11-16 15:01:42 -08:00
Steve Atherton 035e0d6e52
Merge branch 'master' into bit-flipping-workload 2021-11-16 14:42:22 -08:00
Jingyu Zhou 8d6cfcb630
Merge pull request #6003 from sfc-gh-etschannen/fix-forwarding
fix: coordinators would process forwarding requests before making them durable
2021-11-16 14:24:21 -08:00
Evan Tschannen 35bce4cd36 added a comment 2021-11-16 13:07:35 -08:00
Evan Tschannen fd635432c4 fix: coordinators would process forwarding requests before making them durable 2021-11-16 12:21:26 -08:00
Steve Atherton 21c3c585ca Make file name parameter more user friendly in unit tests. 2021-11-16 04:03:11 -08:00
Steve Atherton 867999a41a Rename wrong_format_version to unsupported_format_version. 2021-11-16 03:25:54 -08:00
Steve Atherton 7b29804a5e Fix typo. 2021-11-16 02:30:57 -08:00
Steve Atherton c53f5aa110 Renamed redwood to redwood-1-experimental and file extension to .redwood-v1. 2021-11-16 02:15:22 -08:00
Evan Tschannen 964d0209ca
Merge pull request #5637 from sfc-gh-ljoswiak/features/data-loss-prevention
Data loss protection when joining new cluster
2021-11-15 15:26:32 -08:00
Jingyu Zhou 3d26d2372b
Merge pull request #5932 from sfc-gh-ahusain/ahusain-improveTesterLogging
Improve tester actor logging to track workload run & check status
2021-11-15 13:28:55 -08:00
Jingyu Zhou 02d0c43bc2
Merge pull request #5982 from sfc-gh-tclinkenbeard/improve-error-descriptions
Make snapshot errors more descriptive
2021-11-15 13:18:19 -08:00
Evan Tschannen 6e81b83924 fix: cleanup change feeds which have been completely removed from a storage server 2021-11-15 11:47:42 -08:00
Evan Tschannen 94a51e57a5 Merge branch 'master' into feature-changefeed-empty-versions
# Conflicts:
#	fdbclient/StorageServerInterface.h
2021-11-14 19:13:05 -08:00
Evan Tschannen 6909754b21 changefeeds now have a whenAtLeast function for efficiently learning when the version has updated but no mutations have been committed 2021-11-14 19:08:46 -08:00
sfc-gh-tclinkenbeard 0ba77ea79b Fix proxySnapCreate trace typo 2021-11-14 16:12:28 -08:00
Tao Lin 9422b8e5f2 Restricted getRangeAndFlatMap to snapshot 2021-11-12 15:12:37 -08:00
Steve Atherton 508429f30d
Redwood chunked file growth and low priority IO starvation prevention (#5936)
* Redwood files now grow in large page chunks controlled by a knob to reduce truncate() calls for expansion. PriorityMultiLock has a limit on consecutive same-priority lock releases. Increased Redwood max priority level to 3 for more separation at higher BTree levels.

* Simulation fix, don't mark certain IO timeout errors as injected unless the simulated process has been set to have an unreliable disk.

* Pager writes now truncate gradually upward, one chunk at a time, in response to writes, which wait on only the necessary truncate operations.   Increased buggified chunk size because truncate can be very slow in simulation.

* In simulation, ioTimeoutError() and ioDegradedOrTimeoutError() will wait until at least the target timeout interval past the point when simulation is sped up.

* PriorityMultiLock::toString() prints more info and is now public.

* Added queued time to PriorityMultiLock.

* Bug fix to handle when speedUpSimulation changes later than the configured time.

* Refactored mutation application in leaf nodes to do fewer comparisons and do in place value updates if the new value is the same size as the old value.

* Renamed updatingInPlace to updatingDeltaTree for clarity.  Inlined switchToLinearMerge() since it is only used in one place.

* Updated extendToCover to be more clear by passing in the old extension future as a parameter.  Fixed initialization warning.
2021-11-12 13:47:07 -08:00
sfc-gh-tclinkenbeard 62efeb6812 Merge remote-tracking branch 'origin/master' into add-format-warning 2021-11-12 11:50:36 -08:00
Tao Lin 08fd69e787
Merge pull request #5967 from nblintao/flatmap-exception 2021-11-12 10:19:44 -08:00
Daniel Smith 019fd50f46
Merge pull request #5971 from Daniel-B-Smith/rocksdb-error-logging
Clean up RocksDB error logging
2021-11-12 13:17:10 -05:00
Ata E Husain Bohra 82c3e8bf79
Trigger buildTeam operation if server transitions from unhealthy -> healthy (#5930)
* Trigger buildTeam operation if server transitions from unhealthy -> healthy

The DataDistribution actor helps build teams as the server count changes
(add/removal); however, it is possible that the total_healthy_server count
is insufficient to allow team formation. If that happens, the buildTeam
operation will not be triggered even after the healthy server count recovers.

The patch triggers the `checkBuildTeam` operation when a server
transitions from the unhealthy -> healthy state. In case the system has
already created enough teams (desiredTeamCount/maxTeamCount), the operation
incurs minimal cost.
2021-11-12 09:41:01 -08:00
Tao Lin 2d5f924278 GetKeyValuesAndFlatMap should return error if not retriable 2021-11-12 09:35:28 -08:00
Daniel Smith 9dccb0131e Clean up RocksDB error logging 2021-11-12 12:14:12 -05:00
He Liu d73d2144fd Adjust distributorSplitRange order. 2021-11-11 20:28:55 -08:00
He Liu 984bc0fbea Added Endpoints. 2021-11-11 16:56:04 -08:00
Josh Slocum 329091e14f Merge branch 'master' into bg_bindings 2021-11-11 10:13:37 -06:00
Josh Slocum b8ac4213a1 Switched BG APIs to transaction instead of database 2021-11-11 08:59:06 -06:00
Markus Pilman 0dfb72176e
Merge pull request #5857 from sfc-gh-vgasiunas/notify-client-lib-changes
A mechanism to notify MVC about relevant client library status changes on the cluster
2021-11-11 07:43:20 -07:00
Lukas Joswiak e4c3f886da Fix recovery issue 2021-11-10 16:15:13 -08:00
Lukas Joswiak 28b72550f3 Remove additional unused tracing 2021-11-10 13:33:49 -08:00
Lukas Joswiak c93052121f Fix issue where transaction spans would not be recorded 2021-11-10 13:33:49 -08:00
Daniel Smith 481cf9bb55
Merge pull request #5788 from Daniel-B-Smith/rocks-throttle
Throttle the number of concurrent reads to RocksDB
2021-11-10 15:20:30 -05:00
Daniel Smith 499dbcdb18 Don't fail fetchKeys when server overloaded is returned 2021-11-10 14:15:42 -05:00
Andrew Noyes db3c08c7cd
Merge pull request #5928 from sfc-gh-anoyes/anoyes/fix-heap-use-after-free
Fix a heap use after free
2021-11-10 10:21:05 -08:00
Vaidas Gasiunas 51b8ccf7d3 Merge remote-tracking branch 'apple/master' into notify-client-lib-changes 2021-11-10 18:40:34 +01:00
Daniel Smith 394b9dc619 Code review changes 2021-11-10 11:53:27 -05:00
Daniel Smith b50b3de5d0 Allow SS to respond with server overloaded 2021-11-10 11:52:02 -05:00
Daniel Smith 66520eb1c1 Utilize read types to do selective throttling 2021-11-10 11:51:04 -05:00
Steve Atherton 470896bdc4
Redwood inline same-size value updates (#5925)
* Refactored mutation application in leaf nodes to do fewer comparisons and do in place value updates if the new value is the same size as the old value.

* Renamed updatingInPlace to updatingDeltaTree for clarity.  Inlined switchToLinearMerge() since it is only used in one place.
2021-11-10 08:22:57 -08:00
Daniel Smith ec99d1d888
Merge pull request #5946 from Daniel-B-Smith/cfoptions
Move table factory options to CFOptions
2021-11-10 11:20:41 -05:00
Markus Pilman a246c1555d
Merge pull request #5931 from sfc-gh-mpilman/features/all-unit-tests-in-ci
Make sure unit tests are run often enough
2021-11-10 08:41:04 -07:00
Daniel Smith 8822c589de Move table factory options to CFOptions 2021-11-09 17:29:06 -05:00
Tao Lin 4b2757bf99 Fix memory bug in IndexPrefetchDemo 2021-11-09 13:52:28 -08:00
Tao Lin fdb3b72e35 Introduce GetRangeAndFlatMap to push computations down to FDB
Re-introduce #5609
2021-11-09 13:52:28 -08:00
Lukas Joswiak 15e0d5b29f Add explicit transaction options when reading cluster ID 2021-11-09 12:29:49 -08:00
Lukas Joswiak 74cf64fe0f Sync cluster ID through ServerDBInfo 2021-11-09 12:29:48 -08:00
Lukas Joswiak 4640045243 Fix rare simulation failures
When partitions appear before a cluster has fully recovered, it was
possible to have different tlogs persist different cluster IDs because
they were involved in different partitions. This would affect recovery
when a quorum was eventually reached. The solution to this is to avoid
persisting the cluster ID before a cluster has fully recovered, to make
sure all nodes agree on the cluster ID.
2021-11-09 12:29:48 -08:00
Lukas Joswiak 1fa726ca73 Fix compilation issue 2021-11-09 12:29:48 -08:00
Lukas Joswiak 3988b11fd6 Cleanup 2021-11-09 12:29:48 -08:00
Lukas Joswiak aa3383f0e3 Exclude when joining new cluster 2021-11-09 12:29:48 -08:00
Lukas Joswiak 3e2c65bb11 Allow tlog to join another cluster but retain its data 2021-11-09 12:29:48 -08:00
Lukas Joswiak 30867750b5 Add protection against storage and tlog data deletion when joining a new cluster 2021-11-09 12:29:47 -08:00
Jon Fu 2887e1c30a set flag to true when doing first registration 2021-11-09 12:44:07 -05:00
Ata E Husain Bohra 5e4f25f96d Improve tester actor logging to track workload run & check status
Patch improves logging in "tester.actor" to assist better tracking
of workload run & check status
2021-11-08 16:18:17 -08:00
Markus Pilman d6fad2e489 readded old tlog tests 2021-11-08 15:52:08 -07:00
Markus Pilman 7df059570a Make sure unit tests are run often enough 2021-11-08 15:43:32 -07:00
Steve Atherton d97d968176
Added KeyBackedObjectMap and KeyBackedObjectProperty classes for storing serializable objects in FDB (#5896)
* Cleaned up some lambda capture workaround since x=y captures weren't available when these classes were originally written.

* Added KeyBackedObjectMap and KeyBackedObjectProperty, which work like KeyBackedMap and KeyBackedProperty but use ObjectWriter/Reader for Value serialization so that the type can evolve over time.

* Disabled unit tests which shouldn't run as part of random selection.
2021-11-08 13:04:53 -08:00
Andrew Noyes b7e393587c Fix a heap use after free
If we accept arena arguments by value, then the lifetime of any memory
allocated by that arena ends when the function returns. Given that we
seem to be appending to VectorRef's passed by pointer this is unlikely
to be what we want.
2021-11-08 12:51:32 -08:00
Jon Fu 00f4bd8536 Check ccInterface against serverDbInfo's cc and make broadcast unconditional for first registration 2021-11-08 12:43:02 -05:00
Vaidas Gasiunas 9408c11c3d MVC2.0: Remove client library status "available" 2021-11-08 18:10:36 +01:00
Vaidas Gasiunas 418cc4dc50 MVC2.0: Notifying clients about deleting or disabling client libraries that have upload or active status;
Declare library status access and change transactions as lock aware
2021-11-08 18:01:58 +01:00
Jon Fu 396cd58b21 cancel ss core and ss actor collection after termination and before context switch 2021-11-04 16:05:23 -04:00
sfc-gh-tclinkenbeard 30cef51746 Improve tracing in ddSnapCreateCore 2021-11-04 12:59:50 -07:00
Tao Lin 586cc3b102
Revert "Introduce GetRangeAndFlatMap to push computations down to FDB" 2021-11-04 08:46:56 -07:00
Tao Lin 679023ac51
Merge pull request #5609 from nblintao/index-prefetch-demo 2021-11-03 23:07:34 -07:00
Jon Fu 4e8625ccc0 retain old behaviour along with explicit request 2021-11-03 17:23:07 -04:00
Tao Lin 6c98e35893 Rename Hop to FlatMap 2021-11-03 13:32:01 -07:00
Tao Lin 0853661d13 Introduce getRangeAndHop to push computations down to FDB 2021-11-03 13:21:16 -07:00
Jon Fu 59f0a2c3e5 Change dbinfo broadcast to be explicitly requested by the worker registration message 2021-11-03 15:51:21 -04:00
Josh Slocum 5b2617a524 Added local granule file reading to mako 2021-11-03 09:33:30 -05:00
Steve Atherton b4d69610ee Remove unused variable and more clearly explain out of range annotation in Redwood debug output. 2021-11-02 21:48:37 -07:00
Steve Atherton 84854761cb Change Redwood to use xxhash for checksums. 2021-11-02 21:47:31 -07:00
Jon Fu 67bd4ddea0 Add a wait(delay(0)) to storage server termination to avoid a rare segfault 2021-11-02 16:24:40 -04:00
Josh Slocum 382882f1c1 mako successfully calls read_blob_granules and gets stuff back 2021-11-02 13:43:42 -05:00
Josh Slocum d6a31078fe C API for blob granules 2021-11-02 10:01:23 -05:00
sfc-gh-tclinkenbeard 45cff017c2 Remove Downgrade workload 2021-11-01 14:54:24 -07:00
sfc-gh-tclinkenbeard d0c9cf4fb0 Enable mismatched-tags clang warning 2021-11-01 14:18:31 -07:00
sfc-gh-tclinkenbeard ebcc023b6f Enable missing-field-initializers clang warning 2021-11-01 14:18:31 -07:00
sfc-gh-tclinkenbeard 25257f6f87 Enable unused-function warning for clang 2021-11-01 14:18:31 -07:00
Evan Tschannen ee00135a6b skip good recruitment errors when doing simulation only validation 2021-11-01 13:24:15 -07:00
Evan Tschannen 78e36e7590 fix: simulation only validation could throw errors which would impact the behavior of the cluster controller 2021-11-01 13:24:15 -07:00
sfc-gh-tclinkenbeard 13bb7838aa Enable clang -Wformat warning 2021-10-30 21:07:38 -07:00
Evan Tschannen ddf235713e strengthen assert 2021-10-28 16:40:30 -07:00
Evan Tschannen 4d8ee2ed33 fix: simple recruitment could succeed with less than the required replication factor 2021-10-28 16:38:04 -07:00
negoyal 1e7338b6c3 Merge branch 'master' into bit-flipping-workload 2021-10-28 14:24:49 -07:00
negoyal 88e66533ad devFormat 2021-10-28 11:13:12 -07:00
Vaidas Gasiunas 875824b186 MVC2.0: Notify clients about relevant changes of client libraries 2021-10-27 23:43:40 +02:00
Vaidas Gasiunas 4f0991eb67 MVC2.0: Introducing client library status values for instructing clients to download and activate a library; Operations to read and change client library status 2021-10-27 23:43:40 +02:00
A.J. Beamon 6174229a1b
Merge pull request #5694 from sfc-gh-vgasiunas/multi-version-client-2
Operations to upload and manage client binaries in the system keyspace
2021-10-27 14:28:10 -07:00
Vaidas Gasiunas c8794ae993 MVC2.0: Adding a comment explaining buffer alignment in download & upload operations; checking additional details in testExpectedError 2021-10-27 19:40:22 +02:00
Josh Slocum 962a1aaf74 Fix race in configure database storage migration test 2021-10-27 11:38:15 -05:00
Xiaoxi Wang e4fd0023b7 don't disable machine team remover 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 75ef854563 format 2021-10-27 09:08:37 -07:00
Xiaoxi Wang db7ee9d389 disable team remover 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 14fa32f208 change boolean 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 1a2a838df3 add knob 2021-10-27 09:08:37 -07:00
Xiaoxi Wang c320391c4c restartRecruiting 2021-10-27 09:08:37 -07:00
Xiaoxi Wang dc630d63bd add asyncvar 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 654c0a1f14 format 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 8a10966126 wait extra time 2021-10-27 09:08:37 -07:00
Xiaoxi Wang d1959122af consider wiggling when waitUntilHealthy 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 69190ed04e format 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 0053b4793e change knob and delete redundant doBuildTeam 2021-10-27 09:08:37 -07:00
Xiaoxi Wang db7b48b71c wiggling teams calculation replace 2021-10-27 09:08:37 -07:00
Xiaoxi Wang 3a6359e202 minus wiggling teams when build team 2021-10-27 09:08:37 -07:00
Josh Slocum 6b51e149ed Fixing bug where SS would get removed when someone else got its tag, but its TSS pair wouldn't 2021-10-27 10:11:49 -05:00
Steve Atherton 61b8ee6eeb
Merge pull request #5327 from FuhengZhao/RedwoodSuperpage
Redwood Superpage Refactor from BTree to Pager concept
2021-10-27 01:40:27 -07:00
Steve Atherton 0794ebb7e8 Bump format versions for Pager and BTree as the multi-page BTree nodes are now stored in an incompatible way. 2021-10-26 22:17:55 -07:00
Vaidas Gasiunas 40da5a80f9 Merge remote-tracking branch 'apple/master' into multi-version-client-2 2021-10-26 19:29:10 +02:00
Evan Tschannen 2208b04174
Merge pull request #5855 from sfc-gh-etschannen/blob_full_clean
Blob Granules V0
2021-10-26 09:57:35 -07:00
Vaidas Gasiunas 37bc41abbb Merge remote-tracking branch 'apple/master' into multi-version-client-2 2021-10-26 18:51:43 +02:00
Evan Tschannen c615279807
Merge pull request #5720 from sfc-gh-ljoswiak/fixes/recovery-failure-fix
Fix possible recovery hang
2021-10-25 12:35:31 -07:00
Lukas Joswiak 30b525a607 Add assertions to check rollback 2021-10-25 12:03:22 -07:00
Lukas Joswiak c96f560cbe Verify rollback of a single version in simulation, other small fixes 2021-10-25 12:03:22 -07:00
Lukas Joswiak 6078664792 clang-format 2021-10-25 12:03:22 -07:00
Lukas Joswiak 57c2cf4a24 Retry messages to well known endpoints, add notes for future work 2021-10-25 12:03:22 -07:00
Lukas Joswiak 92998fd20b Merge rollback message into rollforward message 2021-10-25 12:03:22 -07:00
Lukas Joswiak 7357d7714c Retry with well known endpoints, move last committed check to consumer 2021-10-25 12:03:22 -07:00
Lukas Joswiak 1631a1b352 Update fdbserver/PaxosConfigConsumer.actor.cpp
Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2021-10-25 12:03:22 -07:00
Lukas Joswiak e79c6c7456 Fix issue where previous commit messages were reused
Fixes an issue where commit versions from previous requests sent to
ConfigNodes were being reused when a new quorum of commit versions was
requested. This was occurring due to a failure to reset the state of
GetCommittedVersionQuorum after a full snapshot request.
2021-10-25 12:03:22 -07:00
Lukas Joswiak 9d78604c5b Add rollback and rollforward logic to ConfigBroadcaster 2021-10-25 12:03:22 -07:00
Lukas Joswiak 9a39da85b1 Fix issue where previous commit messages were reused
Fixes an issue where commit versions from previous requests sent to
ConfigNodes were being reused when a new quorum of commit versions was
requested. This was occurring due to a failure to reset the state of
GetCommittedVersionQuorum after a full snapshot request.
2021-10-25 12:03:22 -07:00
Lukas Joswiak 48dc91dd7f Add rollback and rollforward logic to ConfigBroadcaster 2021-10-25 12:03:22 -07:00
Josh Slocum 0ff8ddc2b6 Merge branch 'master' into blob_full_clean 2021-10-25 13:38:48 -05:00
Suraj Gupta e57e2bec5f Improve documentation. 2021-10-25 12:19:28 -04:00
Steve Atherton b75edbda31 Merge branch 'master' of github.com:apple/foundationdb into RedwoodSuperpage
# Conflicts:
#	fdbserver/VersionedBTree.actor.cpp
2021-10-25 00:55:52 -07:00
Evan Tschannen 118c307b57 fixed formatting 2021-10-24 22:26:11 -07:00
Evan Tschannen 9a6384fc26 fixed merge conflicts 2021-10-24 21:18:49 -07:00
Evan Tschannen 6f7558b8ea Merge branch 'master' of https://github.com/apple/foundationdb into feature-range-feed
# Conflicts:
#	tests/CMakeLists.txt
2021-10-24 21:06:33 -07:00
A.J. Beamon e882eb33fc Abstract the cluster file into a cluster connection record that can be backed by something other than the filesystem. 2021-10-22 11:05:18 -07:00
Steve Atherton d153519188
Merge pull request #5813 from sfc-gh-jslocum/ss_ebrake_streaming_fix
Fixes to ss e-brake, tlog streaming, and their interaction
2021-10-22 10:46:17 -07:00
Josh Slocum 773886515e Merge branch 'feature-range-feed' into blob_full_clean 2021-10-22 11:07:51 -05:00
Evan Tschannen f03d32f3d4 fix: handle the case where a fetch happens at an earlier read version than the commit version of the change feed registration 2021-10-21 23:04:51 -07:00
He Liu 16ae2b76e5 Merge branch 'master' of https://github.com/apple/foundationdb into clean-sim-test-data-loss 2021-10-21 09:16:53 -07:00
Trevor Clinkenbeard c69364d5aa
Verify that cluster is fully recovered in quietDatabase check (#5807)
* Verify that cluster is fully recovered in quietDatabase check

* Add trace event to waitForQuietDatabase
2021-10-21 09:01:52 -07:00
Vaidas Gasiunas 39b2cb8125 Merge remote-tracking branch 'apple/master' into multi-version-client-2 2021-10-21 17:45:32 +02:00
Evan Tschannen f1158371a7 Merge branch 'master' of https://github.com/apple/foundationdb into feature-range-feed
# Conflicts:
#	flow/error_definitions.h
2021-10-21 00:55:12 -07:00
Evan Tschannen 4e79296a9f fixed a few bugs with fetching change feeds 2021-10-21 00:44:51 -07:00
Evan Tschannen 3ebabb6edc fixed incorrect use of change feed errors 2021-10-20 22:37:31 -07:00
Evan Tschannen e34d242581 fix: do not throw wrong_shard_server for local fetch keys 2021-10-20 22:04:39 -07:00
Zhe Wu 0cf829ef91 Reduce restore error message 2021-10-20 14:02:48 -07:00
Josh Slocum 8dd7f8f447 Fixes to ss e-brake, tlog streaming, and their interaction 2021-10-20 10:48:29 -05:00
Vaidas Gasiunas ec307b3f2c MVC2.0: Addressing code review comments for client lib management operations 2021-10-20 17:19:12 +02:00
Evan Tschannen ad3dcd6a74 fix: memory replies were not being set 2021-10-19 21:07:36 -07:00
Lukas Joswiak 120d99e941 Fix a recovery hang that could occur when a new recovery was started during the existing recovery 2021-10-19 17:37:14 -07:00
negoyal 518065c3ed TargetedKill fixes. 2021-10-19 17:22:27 -07:00
Evan Tschannen 3f7df58a77 fixed a number of issues 2021-10-19 13:56:52 -07:00
Josh Slocum 85f64bf42c more cleanup before merge 2021-10-18 17:11:14 -05:00
Josh Slocum 912ef76f1c cleanup before merge 2021-10-18 17:11:14 -05:00
Suraj Gupta 41e96b98b2 Rethrow on actor cancelled. 2021-10-18 18:06:15 -04:00
Trevor Clinkenbeard 504d0b71b2
Fix invalid memory access when dataDistribution actor is cancelled (#5791)
* Fix valgrind error when dataDistribution actor is cancelled

* Trace Sev30 when dataDistribution actor is cancelled outside of simulation

* Rethrow actor_cancelled error in dataDistribution catch block
2021-10-18 14:21:29 -07:00
Suraj Gupta b15e4f6b4e Fix reference cycle between BWData and GranuleRangeMetadata. 2021-10-18 17:03:27 -04:00
sfc-gh-tclinkenbeard 421dee532c Add const qualifiers in KeyValueStoreRocksDB.actor.cpp 2021-10-18 13:40:47 -07:00
sfc-gh-tclinkenbeard 9e06b6e6e3 Make IClosable interface const-correct 2021-10-18 13:40:47 -07:00
Suraj Gupta 5466bdb569 Gate more entry points to BM recruitment. 2021-10-18 15:04:22 -04:00
Daniel Smith faf16fb29e
Merge pull request #5785 from Daniel-B-Smith/ikvs-read-type
Add an enum to IKeyValueStore to indicate the source/priority of the read
2021-10-18 13:21:20 -04:00
Daniel Smith 9713a14ef1 Reverse order of read type and debug ID args 2021-10-18 12:23:09 -04:00
Suraj Gupta e2e852e515 Mitigate transitive includes. 2021-10-18 10:49:25 -04:00
Suraj Gupta a9f23773ad Default all debug flags to false. 2021-10-18 10:10:05 -04:00
Suraj Gupta 90ce8bbe5b Refactor retry loop to splitRange. 2021-10-18 09:56:47 -04:00
Suraj Gupta 28c0cd3fc6 Remove old comments from BW. 2021-10-18 09:56:01 -04:00
A.J. Beamon 507a09893c
Add ClientCount to ClusterControllerMetrics (#5748) 2021-10-17 20:47:11 -07:00
He Liu a0f62e873e Use ErrorOr to indicate an error. 2021-10-15 14:58:26 -07:00
He Liu 8e7a4c587e Merge branch 'master' of https://github.com/apple/foundationdb into clean-sim-test-data-loss 2021-10-15 14:22:08 -07:00
Daniel Smith df53cc9580 Add an enum to IKeyValueStore to indicate the source/priority of the read 2021-10-15 14:35:59 -04:00
He Liu 66166e09be Clear range before setting the moved-in empty range as available. 2021-10-15 09:30:06 -07:00
Suraj Gupta d1fe9d4c50 Remove old comments from BM. 2021-10-14 19:25:34 -04:00
Suraj Gupta 1c8d96aea3 Ensure non-zero BW processes when gate is enabled. 2021-10-14 19:06:02 -04:00
Suraj Gupta d46951ccb7 Collapse if's into one. 2021-10-14 18:47:05 -04:00
Suraj Gupta 3ccc24136c Gate BM/BW in Status and timeout after waiting. 2021-10-14 18:39:55 -04:00
He Liu dbfeb06c97 Reproduced user data loss incident, and tested that the improved exclude tool
can fix the system metadata.
2021-10-14 14:08:39 -07:00
Josh Slocum 8dc5ae9d24 fixed version boundaries 2021-10-14 10:45:09 -05:00
Josh Slocum b5074fd597 Reworked all of the system data to encode granule data more efficiently for persistence 2021-10-13 16:28:04 -05:00
Josh Slocum f3c44c568f fixing merge conflicts 2021-10-13 16:26:44 -05:00
Josh Slocum 5f0ec0612a Merge branch 'feature-range-feed' into blob_full 2021-10-13 15:44:35 -05:00
Suraj Gupta cfb8368da6 Address PR comments. 2021-10-13 14:56:17 -04:00
Suraj Gupta d002df3b21 Clean up blob worker changes. 2021-10-13 14:40:26 -04:00
Suraj Gupta ef67feed67 Clean up blob manager changes. 2021-10-13 14:40:26 -04:00
Suraj Gupta 266a5b06fa Fix infinite loop. 2021-10-13 14:40:26 -04:00
Suraj Gupta 423a67f448 trying to fix infinite loop 2021-10-13 14:40:26 -04:00
Suraj Gupta dfb9655c57 Handle blob work failure 2021-10-13 14:40:26 -04:00
Suraj Gupta 2ec8781224 Merge knobs into one. 2021-10-13 14:00:37 -04:00
Suraj Gupta 9d4b55c7fe Gate the blob verifier as well. 2021-10-13 11:45:51 -04:00
Suraj Gupta 5a6a052c55 Add a knob to gate blob-related work. 2021-10-13 09:48:02 -04:00
He Liu 8174c57714
Merge pull request #5722 from liquid-helium/add-logs
Added logs for worker_removed() errors in SS.
2021-10-12 15:16:11 -07:00
Evan Tschannen 282f84c807 added handling for broken promise 2021-10-12 10:36:10 -07:00
He Liu 9f974ef21f Added logs for worker_removed() errors in SS. 2021-10-12 10:12:20 -07:00
Josh Slocum 0dafb95bbf Fixing tss private mutations ranges 2021-10-11 18:14:29 -07:00
negoyal f913dfed97 Merge branch 'master' into bit-flipping-workload 2021-10-11 16:34:57 -07:00
Evan Tschannen 5b6f8b2abb added the known committed version to change feeds 2021-10-11 13:53:36 -07:00
Vaidas Gasiunas 4bc798cd57 MVC2.0: Extracting reusable error testing pattern and using it for client lib operations tests 2021-10-11 13:57:11 +02:00
Evan Tschannen 26b6f9a3f1 fix: resize changes to source vector even though m is a copy 2021-10-10 11:32:49 -07:00
Steve Atherton f339b603a5 Bug fix: printSnapshotTeamsInfo() could crash when looking up status for a storage server that was very recently added because its entry in server_status was not yet created.
Bug fix:  printSnapshotTeamsInfo()'s local server_status map would not see status updates for server UIDs that already existed in the map.
2021-10-10 01:48:31 -07:00
Steve Atherton 0cce774325
Merge pull request #5732 from sfc-gh-satherton/kvs-write-version
Refactored how Redwood handles commit version
2021-10-09 20:26:52 -07:00
Evan Tschannen d51edf18dc fixed merge conflicts 2021-10-09 19:47:24 -07:00
Evan Tschannen 5c642f706e Merge branch 'master' of https://github.com/apple/foundationdb into feature-range-feed
# Conflicts:
#	fdbcli/fdbcli.actor.cpp
2021-10-09 19:34:16 -07:00
Evan Tschannen efc4cec53f fixed a variety of bugs with change feeds 2021-10-09 19:24:01 -07:00
Steve Atherton 13e6ac7c53
Merge pull request #5740 from sfc-gh-satherton/queue-read-during-shutdown-fix
Fix rare Redwood crash after shutdown
2021-10-09 17:43:01 -07:00
Markus Pilman 424b35de63 verify FLAG_USE_PROVISIONAL_PROXIES on the server 2021-10-09 16:40:24 -06:00
Steve Atherton abc14c4921 Redwood crash after shutdown which occurs if a queue page disk read was in progress during pager shutdown but the callback was fired after shutdown completed. 2021-10-09 02:04:23 -07:00
Steve Atherton 4722412e51 TEST macro logic fix, and moved wait on seek future to the conditional block where it isn't ready. 2021-10-08 08:47:49 -07:00
Steve Atherton fd02cbc08a Bug fix, atomic update case condition for BTree node builds resulting in 1 output node of the same 1-page size as the original was incorrect so atomic updates were never being done. 2021-10-08 00:41:09 -07:00
Steve Atherton 6981f6f5f2 Merge branch 'master' of github.com:apple/foundationdb into RedwoodSuperpage 2021-10-08 00:28:08 -07:00
Steve Atherton c1f05b82ba Added test coverage with comments for range read seek locking behavior. 2021-10-08 00:06:14 -07:00
Steve Atherton 8e149f5aa2 Merge branch 'master' of github.com:apple/foundationdb into RedwoodIO 2021-10-07 23:56:45 -07:00
Steve Atherton e358a4314d Added comment about not locking point reads. 2021-10-07 23:55:59 -07:00
Zhe Wu 645cfc85a0 fix remote health variables declaration order 2021-10-07 21:54:25 -07:00
Steve Atherton 45e19c8072 Merge branch 'master' of https://github.com/apple/foundationdb into kvs-write-version
# Conflicts:
#	fdbserver/VersionedBTree.actor.cpp
2021-10-07 20:48:21 -07:00
Zhe Wu 6540b6eec5 Some improvements for grey failure failover 2021-10-07 20:42:55 -07:00
Zhe Wu 784d899afb fix assignment error in addressInDbAndRemoteDc unittest 2021-10-07 16:17:09 -07:00
Zhe Wu c07a07dbbe Take uptime into account when making failover decision 2021-10-07 11:19:34 -07:00
Zhe Wu 62197faa46 Add more comments to the code 2021-10-07 11:19:34 -07:00
Zhe Wu c0fbe5471f Implement the core logic of grey failure triggered failover 2021-10-07 11:19:34 -07:00
Vaidas Gasiunas 61fb967484 MVC2.0: Add a clientlib metadata attribute for checksum algorithm 2021-10-07 18:06:22 +02:00
Vaidas Gasiunas bd838217ba MVC2.0: Testing client lib operations with random file sizes
- Adding test parameters to control file size range
- Disabling AIO as it does not support non-page-aligned reads and writes
- Fixing bugs for the cases of an empty file and an incomplete last chunk
- Use hexadecimal representation for checksum in JSON and document keys
2021-10-07 17:23:17 +02:00
Vaidas Gasiunas ed72067b24 MVC2.0: Refactoring - declare state variables at the beginning of actors 2021-10-07 13:34:46 +02:00
Vaidas Gasiunas 114d8438fa MVC2.0: Refactoring client lib management
- Move all declarations into ClientLibManagement namespace
- Rename source files for more consistent naming
- Use constant declarations instead of defines for client lib attribute names
2021-10-07 10:30:37 +02:00
Steve Atherton 93f4e01258 Restored Redwood KVS internal auto-incremented version. 2021-10-06 13:15:50 -07:00
Steve Atherton 657f14f7be Remove IKeyValueStore commit version API because usage of IKVS doesn't align well with the concept. 2021-10-06 11:55:49 -07:00
Josh Slocum bbe7c7ca9f found change feed liveness bug 2021-10-06 12:32:16 -05:00
Vaidas Gasiunas 436ed7e497 MVC2.0: Check byte sum on client lib uploads and downloads, rollback upload in case of an error 2021-10-06 18:01:46 +02:00
Vaidas Gasiunas 5360abb238 MVC2.0: Operation to list uploaded libraries with various filters;
Introducing constants for attribute names and platform values
2021-10-06 18:01:46 +02:00
Vaidas Gasiunas abcef299e1 MVC2.0: Operation to delete a client library; enum for client lib status 2021-10-06 18:01:46 +02:00
Vaidas Gasiunas 816e8703d6 MVC2.0: Operation to download a client library from the system keyspace to a file 2021-10-06 18:01:46 +02:00
Vaidas Gasiunas 6ad6874f73 MVC2.0: Test client lib upload error paths 2021-10-06 18:01:46 +02:00
Vaidas Gasiunas cda0a5f931 Operation to upload client library binary into the system keyspace 2021-10-06 18:01:46 +02:00
Josh Slocum 4e4a2534da review comments and moving priority yield to correct spot 2021-10-06 09:24:48 -05:00
Josh Slocum 6a24ef9258 adding priorities to blob worker and fixing monitoring in blob manager 2021-10-05 16:51:19 -05:00
Steve Atherton b8508e40e9 Added random same-version zero-change commits to unit test. 2021-10-05 02:43:10 -07:00
Steve Atherton 5be84f1918 Initial Redwood post-create recovery version changed to -1, which means no commit has ever completed. Commits at the same version as the previous commit must have no changes and are no-ops. KeyValueStoreRedwood no longer advances the commit version automatically. 2021-10-05 02:20:42 -07:00
negoyal 8d1e97b329 Minor changes. 2021-10-04 22:43:48 -07:00
Suraj Gupta 95166796cd Address PR comments. 2021-10-04 20:16:22 -04:00
Suraj Gupta 282f9d35cd Cleanup comments and debugging code. 2021-10-04 11:07:08 -04:00
Suraj Gupta 4d54669ccd Recruit the blob workers via blob manager.
In this PR, the blob manager now recruits blob workers
(via communication with the cluster controller). Blob workers
are onboarded as blob worker processes enter the cluster.
2021-10-04 11:07:08 -04:00
Josh Slocum 04682c9091
Merge pull request #14 from sfc-gh-sgupta/blob-worker-process-class
Add exclusive process class for blob worker.
2021-10-04 10:03:02 -05:00