Commit Graph

156 Commits

Author SHA1 Message Date
Josh Slocum 951b28498b Fixed a couple issues with manager recovery and granule locks 2022-01-20 19:27:25 -06:00
Josh Slocum 0f6bced510 Fixing assert in blob manager failure 2022-01-20 16:28:08 -06:00
Josh Slocum 62acbcfe19 Added explicit error for old blob manager instead of attaching it to response objects 2022-01-20 14:43:08 -06:00
Josh Slocum 3be1bcd588 Fix move race in change feeds and added extra debugging to track down similar problems 2022-01-20 14:42:24 -06:00
Josh Slocum 05c3aeb93f Fixed range read bug in blob manager recovery 2022-01-20 14:42:24 -06:00
Josh Slocum 07f09f1118 Changed BlobManagerData from pointer to reference to fix ASAN issues 2022-01-20 12:33:15 -06:00
Josh Slocum 9bce1dcb0f Made blob granule split transaction idempotent on retry 2022-01-19 17:03:57 -06:00
Josh Slocum 08f8700636 Fixed a couple bugs in blob manager recovery 2022-01-18 18:47:42 -06:00
Josh Slocum 7ad085ce73 Fixed more manager assignment issues 2022-01-18 14:23:46 -06:00
Josh Slocum 9d9cb961a1 reworked blob manager recovery to be more efficient 2022-01-18 14:22:58 -06:00
Josh Slocum e4e7b638c8 reworked blob manager recovery to correctly handle splits 2022-01-18 14:22:58 -06:00
Josh Slocum 58bc3a78ea Fixed range assignment bug and fixed printf compiler warnings 2022-01-14 17:46:32 -06:00
Evan Tschannen e939cff1d7
Merge pull request #6245 from sfc-gh-etschannen/blob_integration
fix: do not submit assignments if the worker has already been removed
2022-01-14 12:56:15 -08:00
Evan Tschannen 680bc53b8f fix: do not submit assignments if the worker has already been removed 2022-01-14 12:55:35 -08:00
Josh Slocum abfb3a7e82 Fixed bug and removed unnecessary check in blob manager 2022-01-14 09:07:42 -06:00
Josh Slocum 661c50d29f Handled server_overloaded in storageFeedVersionUpdater 2022-01-13 16:29:25 -06:00
Josh Slocum 48cad574d4 Enforce max fanout for blob granule splits 2022-01-12 09:52:55 -06:00
Josh Slocum 3b2e58ada8 Fixed rollback of snapshot files bug 2022-01-11 15:35:54 -06:00
Josh Slocum 6b4b22229b Fixed rollback and granule history issue 2022-01-11 15:35:54 -06:00
Evan Tschannen 01fa42522a fix: changed loadHistoryFiles to take a database object 2022-01-10 11:33:18 -08:00
Josh Slocum 4d2650f4dc Fixed a couple issues with failures and the final availability check 2022-01-07 11:21:05 -06:00
Josh Slocum 2438fee519 Made client range watching handle very long/large transactions 2022-01-05 17:38:49 -06:00
Josh Slocum bc69521a91 Several fixes with restarting BW/BM 2022-01-05 12:48:53 -06:00
A.J. Beamon d8e161f89e Refactor NativeAPI transactions to create and pass around a reference counted state object. Watches no longer use the transaction info object but instead use their own state. 2021-12-17 11:57:39 -08:00
Suraj Gupta 1e36069c8f fix error. 2021-12-10 16:43:58 -06:00
Suraj Gupta e102738ca9 Comment out debugging code. 2021-12-10 16:43:58 -06:00
Suraj Gupta a674edaa62 Address PR comments. 2021-12-10 16:43:58 -06:00
Suraj Gupta 6ff7ad3c6a Clarify error handling. 2021-12-10 16:43:58 -06:00
Suraj Gupta dab13bd614 Fix range iteration for krms in BM. 2021-12-10 16:43:58 -06:00
Suraj Gupta e9d1f36cae More cleanup. 2021-12-10 16:43:58 -06:00
Suraj Gupta 1d32da5ac6 Remove local debugging. 2021-12-10 16:43:58 -06:00
Suraj Gupta 310d990b12 Add debugging. 2021-12-10 16:43:58 -06:00
Suraj Gupta 2ccb3a4740 Fix range boundaries and clearing intents. 2021-12-10 16:43:58 -06:00
Suraj Gupta 63b7666f49 Some more small fixes for compilation 2021-12-10 16:43:58 -06:00
Suraj Gupta 23de6fa39b Get it compiled 2021-12-10 16:43:58 -06:00
Suraj Gupta 3fe7a9f553 More fixes 2021-12-10 16:43:58 -06:00
Suraj Gupta 932f68e1b3 Refactor bstore and add watch key. 2021-12-10 16:43:58 -06:00
Suraj Gupta 22d72ec9dd init 2021-12-10 16:43:58 -06:00
Suraj Gupta d3fbad74a2 cleanup debugging 2021-12-10 14:00:34 -06:00
Suraj Gupta cb568bbd55 Add watch on config key. 2021-12-10 14:00:34 -06:00
Josh Slocum d31cb07647 ASAN fix 2021-12-10 12:25:42 -06:00
Josh Slocum c5b2b384da Fixing ASAN issues 2021-12-08 08:42:35 -06:00
Evan Tschannen 13ef5afb9c reject blob workers joining from the wrong data center
we must run the checkblobworkers actors even on epoch 1
check for an already killed worker even right after it is recruited
2021-12-05 15:02:25 -08:00
Evan Tschannen 935ec25ae3 fix: do not re-add dead blob workers 2021-12-03 14:12:08 -08:00
Evan Tschannen 243927c964 added a knob 2021-12-03 10:31:51 -08:00
Evan Tschannen f2838740f1 fix: do not allow more than one blob worker per address 2021-12-03 10:29:22 -08:00
Josh Slocum 1f71538752 Merge branch 'blob_integration' of github.com:apple/foundationdb into stuff 2021-11-24 13:20:37 -06:00
Josh Slocum efb21ca6a1 Merge branch 'master' into blob_integration 2021-11-24 13:17:54 -06:00
sfc-gh-tclinkenbeard 2613ec7561 Expand use of fmt to get rid of %ld usage 2021-11-17 17:03:32 -08:00
Suraj Gupta dba0a2d729 Add comment to clarify empty UID usage. 2021-11-16 08:49:16 -06:00
Suraj Gupta 1817b135ac Use a local keyRangeMap to avoid assigning ranges not part of client ranges. 2021-11-16 08:49:16 -06:00
Suraj Gupta ac0a5750e7 Clamp end of system keys down to normal keys. 2021-11-16 08:49:16 -06:00
Josh Slocum f140028086 Merge branch 'master' into blob_integration 2021-11-15 15:09:40 -06:00
sfc-gh-tclinkenbeard 62efeb6812 Merge remote-tracking branch 'origin/master' into add-format-warning 2021-11-12 11:50:36 -08:00
Josh Slocum bab8756e17 fixed bugs in small splits 2021-11-11 13:50:19 -06:00
Andrew Noyes db3c08c7cd
Merge pull request #5928 from sfc-gh-anoyes/anoyes/fix-heap-use-after-free
Fix a heap use after free
2021-11-10 10:21:05 -08:00
Steve Atherton d97d968176
Added KeyBackedObjectMap and KeyBackedObjectProperty classes for storing serializable objects in FDB (#5896)
* Cleaned up some lambda capture workaround since x=y captures weren't available when these classes were originally written.

* Added KeyBackedObjectMap and KeyBackedObjectProperty, which work like KeyBackedMap and KeyBackedProperty but use ObjectWriter/Reader for Value serialization so that the type can evolve over time.

* Disabled unit tests which shouldn't run as part of random selection.
2021-11-08 13:04:53 -08:00
Andrew Noyes b7e393587c Fix a heap use after free
If we accept arena arguments by value, then the lifetime of any memory
allocated by that arena ends when the function returns. Given that we
seem to be appending to VectorRef's passed by pointer this is unlikely
to be what we want.
2021-11-08 12:51:32 -08:00
sfc-gh-tclinkenbeard 13bb7838aa Enable clang -Wformat warning 2021-10-30 21:07:38 -07:00
Josh Slocum 0d1d1d7f9e fix uninitialized memory and granule bounds on manager recovery 2021-10-28 18:23:43 -05:00
Josh Slocum 19076ad4d2 Changed blob manager checkin to be async and to have blob worker speculatively consume change feed ahead of re-snapshot 2021-10-27 16:23:51 -05:00
Suraj Gupta 134aef6011 Fix compiler warning. 2021-10-27 10:40:03 -04:00
Suraj Gupta fd50a011b1 Address PR comments. 2021-10-26 21:39:41 -04:00
Suraj Gupta 5a9d9921d0 Fixes and final cleanup for BM failure handling 2021-10-26 16:16:00 -04:00
Suraj Gupta 17b30f188a Working impl 2021-10-26 16:16:00 -04:00
Suraj Gupta 99606482ea initial thoughts 2021-10-26 16:16:00 -04:00
Suraj Gupta e57e2bec5f Improve documentation. 2021-10-25 12:19:28 -04:00
Josh Slocum 912ef76f1c cleanup before merge 2021-10-18 17:11:14 -05:00
Suraj Gupta e2e852e515 Mitigate transitive includes. 2021-10-18 10:49:25 -04:00
Suraj Gupta a9f23773ad Default all debug flags to false. 2021-10-18 10:10:05 -04:00
Suraj Gupta 90ce8bbe5b Refactor retry loop to splitRange. 2021-10-18 09:56:47 -04:00
Suraj Gupta d1fe9d4c50 Remove old comments from BM. 2021-10-14 19:25:34 -04:00
Josh Slocum b5074fd597 Reworked all of the system data to encode granule data more efficiently for persistence 2021-10-13 16:28:04 -05:00
Josh Slocum f3c44c568f fixing merge conflicts 2021-10-13 16:26:44 -05:00
Suraj Gupta cfb8368da6 Address PR comments. 2021-10-13 14:56:17 -04:00
Suraj Gupta ef67feed67 Clean up blob manager changes. 2021-10-13 14:40:26 -04:00
Suraj Gupta 266a5b06fa Fix infinite loop. 2021-10-13 14:40:26 -04:00
Suraj Gupta 423a67f448 trying to fix infinite loop 2021-10-13 14:40:26 -04:00
Suraj Gupta dfb9655c57 Handle blob worker failure 2021-10-13 14:40:26 -04:00
Josh Slocum 4e4a2534da review comments and moving priority yield to correct spot 2021-10-06 09:24:48 -05:00
Josh Slocum 6a24ef9258 adding priorities to blob worker and fixing monitoring in blob manager 2021-10-05 16:51:19 -05:00
Suraj Gupta 282f9d35cd Cleanup comments and debugging code. 2021-10-04 11:07:08 -04:00
Suraj Gupta 4d54669ccd Recruit the blob workers via blob manager.
In this PR, the blob manager now recruits blob workers
(via communication with the cluster controller). Blob workers
are onboarded as blob worker processes enter the cluster.
2021-10-04 11:07:08 -04:00
Josh Slocum 8fb7c45e65 even more bug fixes 2021-09-25 10:30:27 -05:00
Josh Slocum fa1fe5f08b added blob worker rollbacks that handle (most) cases 2021-09-24 17:52:36 -05:00
Josh Slocum a986679b7f More bug fixes 2021-09-24 10:02:02 -05:00
Suraj Gupta 889ae3f255 Address PR comments.
Remove nukeBlobWorkerData, simplify checkManagerLock call and make
final choose...when consistent.
2021-09-23 14:09:56 -04:00
Suraj Gupta d675e6b143 Kill BM by returning, instead of throwing. 2021-09-23 11:41:57 -04:00
Suraj Gupta 8c49f1f238 Improve actor management and handling of failure in BM. 2021-09-23 10:58:38 -04:00
Suraj Gupta 5fa6c687d6 Add blob manager as a singleton. 2021-09-23 10:45:37 -04:00
Josh Slocum 5ddf08dfe5 Got basic range reassignment working 2021-09-22 16:48:44 -05:00
Josh Slocum c9b8bdffbe Passes a single correctness test! 2021-09-15 17:18:04 -05:00
Josh Slocum e2a51a4fe7 Fixing up after change feed updates 2021-09-10 11:49:41 -05:00
Josh Slocum eb76343dfb Added blob granule reassignment and splitting 2021-09-08 14:09:14 -05:00
Suraj Gupta fccfe3af78 Address PR comments. 2021-09-02 12:09:37 -04:00
Suraj Gupta c27d8f3336 Improve assignment of range granules.
Previously, we randomly picked a worker to assign a range to.
Now, we pick a worker that has the least number of range granules
already assigned, ultimately distributing the workload in a more
efficient manner.

Future iterations should also consider the number of read/writes
that a worker is already handling when picking a worker to assign
a range to. This could prevent us from assigning a range to a worker
that is already hot.
2021-09-01 23:36:52 -04:00
Josh Slocum 46adada5ff Cleaned up debugging and fixed a couple bugs 2021-08-31 12:30:43 -05:00
Josh Slocum bbeec49533 added range mover 2021-08-30 13:59:53 -05:00
Josh Slocum b4bfd58bcb multiple blob workers appears to work 2021-08-30 13:07:25 -05:00
Josh Slocum 3b011408f8 Added sequence numbers and locks to blob worker and manager 2021-08-27 16:33:07 -05:00