Commit Graph

7911 Commits

Author SHA1 Message Date
Evan Tschannen 557186ed17
Merge pull request #5909 from sfc-gh-jfu/jfu-cc-request-dbinfo
Change dbinfo broadcast to be explicitly requested by the worker registration message
2021-11-16 15:01:42 -08:00
Jingyu Zhou 8d6cfcb630
Merge pull request #6003 from sfc-gh-etschannen/fix-forwarding
fix: coordinators would process forwarding requests before making them durable
2021-11-16 14:24:21 -08:00
Evan Tschannen 35bce4cd36 added a comment 2021-11-16 13:07:35 -08:00
Evan Tschannen fd635432c4 fix: coordinators would process forwarding requests before making them durable 2021-11-16 12:21:26 -08:00
Steve Atherton 21c3c585ca Make file name parameter more user friendly in unit tests. 2021-11-16 04:03:11 -08:00
Steve Atherton 867999a41a Rename wrong_format_version to unsupported_format_version. 2021-11-16 03:25:54 -08:00
Steve Atherton 7b29804a5e Fix typo. 2021-11-16 02:30:57 -08:00
Steve Atherton c53f5aa110 Renamed redwood to redwood-1-experimental and file extension to .redwood-v1. 2021-11-16 02:15:22 -08:00
Evan Tschannen 964d0209ca
Merge pull request #5637 from sfc-gh-ljoswiak/features/data-loss-prevention
Data loss protection when joining new cluster
2021-11-15 15:26:32 -08:00
Jingyu Zhou 3d26d2372b
Merge pull request #5932 from sfc-gh-ahusain/ahusain-improveTesterLogging
Improve tester actor logging to track workload run & check status
2021-11-15 13:28:55 -08:00
Jingyu Zhou 02d0c43bc2
Merge pull request #5982 from sfc-gh-tclinkenbeard/improve-error-descriptions
Make snapshot errors more descriptive
2021-11-15 13:18:19 -08:00
Evan Tschannen 6e81b83924 fix: cleanup change feeds which have been completely removed from a storage server 2021-11-15 11:47:42 -08:00
Evan Tschannen 94a51e57a5 Merge branch 'master' into feature-changefeed-empty-versions
# Conflicts:
#	fdbclient/StorageServerInterface.h
2021-11-14 19:13:05 -08:00
Evan Tschannen 6909754b21 changefeeds now have a whenAtLeast function for efficiently learning when the version has updated but no mutations have been committed 2021-11-14 19:08:46 -08:00
sfc-gh-tclinkenbeard 0ba77ea79b Fix proxySnapCreate trace typo 2021-11-14 16:12:28 -08:00
Tao Lin 9422b8e5f2 Restricted getRangeAndFlatMap to snapshot 2021-11-12 15:12:37 -08:00
Steve Atherton 508429f30d
Redwood chunked file growth and low priority IO starvation prevention (#5936)
* Redwood files now grow in large page chunks controlled by a knob to reduce truncate() calls for expansion. PriorityMultiLock has a limit on consecutive same-priority lock releases. Increased Redwood max priority level to 3 for more separation at higher BTree levels.

* Simulation fix, don't mark certain IO timeout errors as injected unless the simulated process has been set to have an unreliable disk.

* Pager writes now truncate gradually upward, one chunk at a time, in response to writes, which wait on only the necessary truncate operations.   Increased buggified chunk size because truncate can be very slow in simulation.

* In simulation, ioTimeoutError() and ioDegradedOrTimeoutError() will wait until at least the target timeout interval past the point when simulation is sped up.

* PriorityMultiLock::toString() prints more info and is now public.

* Added queued time to PriorityMultiLock.

* Bug fix to handle when speedUpSimulation changes later than the configured time.

* Refactored mutation application in leaf nodes to do fewer comparisons and do in place value updates if the new value is the same size as the old value.

* Renamed updatingInPlace to updatingDeltaTree for clarity.  Inlined switchToLinearMerge() since it is only used in one place.

* Updated extendToCover to be clearer by passing in the old extension future as a parameter. Fixed initialization warning.
2021-11-12 13:47:07 -08:00
Tao Lin 08fd69e787
Merge pull request #5967 from nblintao/flatmap-exception 2021-11-12 10:19:44 -08:00
Daniel Smith 019fd50f46
Merge pull request #5971 from Daniel-B-Smith/rocksdb-error-logging
Clean up RocksDB error logging
2021-11-12 13:17:10 -05:00
Ata E Husain Bohra 82c3e8bf79
Trigger buildTeam operation if server transitions from unhealthy -> healthy (#5930)
* Trigger buildTeam operation if server transitions from unhealthy -> healthy

The DataDistribution actor builds teams as the server count changes
(addition/removal). However, the total_healthy_server count may be
insufficient to allow team formation. If that happens, the buildTeam
operation is not triggered even after the healthy server count recovers.

The proposed patch triggers the `checkBuildTeam` operation when a server
transitions from the unhealthy to the healthy state. In case the system has
already created enough teams (desiredTeamCount/maxTeamCount), the operation
incurs minimal cost.
2021-11-12 09:41:01 -08:00
Tao Lin 2d5f924278 GetKeyValuesAndFlatMap should return error if not retriable 2021-11-12 09:35:28 -08:00
Daniel Smith 9dccb0131e Clean up RocksDB error logging 2021-11-12 12:14:12 -05:00
He Liu d73d2144fd Adjust distributorSplitRange order. 2021-11-11 20:28:55 -08:00
He Liu 984bc0fbea Added Endpoints. 2021-11-11 16:56:04 -08:00
Markus Pilman 0dfb72176e
Merge pull request #5857 from sfc-gh-vgasiunas/notify-client-lib-changes
A mechanism to notify MVC about relevant client library status changes on the cluster
2021-11-11 07:43:20 -07:00
Lukas Joswiak e4c3f886da Fix recovery issue 2021-11-10 16:15:13 -08:00
Lukas Joswiak 28b72550f3 Remove additional unused tracing 2021-11-10 13:33:49 -08:00
Lukas Joswiak c93052121f Fix issue where transaction spans would not be recorded 2021-11-10 13:33:49 -08:00
Daniel Smith 481cf9bb55
Merge pull request #5788 from Daniel-B-Smith/rocks-throttle
Throttle the number of concurrent reads to RocksDB
2021-11-10 15:20:30 -05:00
Daniel Smith 499dbcdb18 Don't fail fetchKeys when server overloaded is returned 2021-11-10 14:15:42 -05:00
Andrew Noyes db3c08c7cd
Merge pull request #5928 from sfc-gh-anoyes/anoyes/fix-heap-use-after-free
Fix a heap use after free
2021-11-10 10:21:05 -08:00
Vaidas Gasiunas 51b8ccf7d3 Merge remote-tracking branch 'apple/master' into notify-client-lib-changes 2021-11-10 18:40:34 +01:00
Daniel Smith 394b9dc619 Code review changes 2021-11-10 11:53:27 -05:00
Daniel Smith b50b3de5d0 Allow SS to respond with server overloaded 2021-11-10 11:52:02 -05:00
Daniel Smith 66520eb1c1 Utilize read types to do selective throttling 2021-11-10 11:51:04 -05:00
Steve Atherton 470896bdc4
Redwood inline same-size value updates (#5925)
* Refactored mutation application in leaf nodes to do fewer comparisons and do in place value updates if the new value is the same size as the old value.

* Renamed updatingInPlace to updatingDeltaTree for clarity.  Inlined switchToLinearMerge() since it is only used in one place.
2021-11-10 08:22:57 -08:00
Daniel Smith ec99d1d888
Merge pull request #5946 from Daniel-B-Smith/cfoptions
Move table factory options to CFOptions
2021-11-10 11:20:41 -05:00
Markus Pilman a246c1555d
Merge pull request #5931 from sfc-gh-mpilman/features/all-unit-tests-in-ci
Make sure unit tests are run often enough
2021-11-10 08:41:04 -07:00
Daniel Smith 8822c589de Move table factory options to CFOptions 2021-11-09 17:29:06 -05:00
Tao Lin 4b2757bf99 Fix memory bug in IndexPrefetchDemo 2021-11-09 13:52:28 -08:00
Tao Lin fdb3b72e35 Introduce GetRangeAndFlatMap to push computations down to FDB
Re-introduce #5609
2021-11-09 13:52:28 -08:00
Lukas Joswiak 15e0d5b29f Add explicit transaction options when reading cluster ID 2021-11-09 12:29:49 -08:00
Lukas Joswiak 74cf64fe0f Sync cluster ID through ServerDBInfo 2021-11-09 12:29:48 -08:00
Lukas Joswiak 4640045243 Fix rare simulation failures
When partitions appear before a cluster has fully recovered, it was
possible to have different tlogs persist different cluster IDs because
they were involved in different partitions. This would affect recovery
when a quorum was eventually reached. The solution to this is to avoid
persisting the cluster ID before a cluster has fully recovered, to make
sure all nodes agree on the cluster ID.
2021-11-09 12:29:48 -08:00
Lukas Joswiak 1fa726ca73 Fix compilation issue 2021-11-09 12:29:48 -08:00
Lukas Joswiak 3988b11fd6 Cleanup 2021-11-09 12:29:48 -08:00
Lukas Joswiak aa3383f0e3 Exclude when joining new cluster 2021-11-09 12:29:48 -08:00
Lukas Joswiak 3e2c65bb11 Allow tlog to join another cluster but retain its data 2021-11-09 12:29:48 -08:00
Lukas Joswiak 30867750b5 Add protection against storage and tlog data deletion when joining a new cluster 2021-11-09 12:29:47 -08:00
Jon Fu 2887e1c30a set flag to true when doing first registration 2021-11-09 12:44:07 -05:00