236 Commits

Author SHA1 Message Date
Josh Slocum a42c80faa9 Tightening up memory management in the blob worker 2022-02-01 14:52:28 -06:00
Josh Slocum d0113a6776 Added mechanism for blob manager to poll blob workers for their granule assignments, and used that to improve manager recovery 2022-01-31 19:59:41 -06:00
Josh Slocum f2b9eb1d4b Fixing potential bug of mixed-version FDB result in granule verifier 2022-01-31 10:44:09 -06:00
Josh Slocum ac1fd056dd Added change feed popped read guard for blob workers 2022-01-28 10:45:33 -06:00
Josh Slocum cfbb3f5b2b Adding random prefix to blob worker generated files 2022-01-28 07:43:34 -06:00
Josh Slocum df1a21564b Fixed Blob Worker Rollback issue 2022-01-28 06:21:07 -06:00
Josh Slocum 4262241c92 Removed incorrect assert 2022-01-24 12:51:07 -06:00
Josh Slocum 4d7d1f0e8e Rollback tracking fix for blob worker 2022-01-24 12:50:41 -06:00
Josh Slocum 42a36dc756 Fixed Blob Manager recruitment error and Blob Worker monitoring error 2022-01-24 09:46:37 -06:00
Josh Slocum 1180eb6e44 Fixed uncaught error in blob worker requests 2022-01-21 18:02:30 -06:00
Josh Slocum 558779d782 Fix open granule races 2022-01-21 13:20:15 -06:00
Josh Slocum 951b28498b Fixed a couple issues with manager recovery and granule locks 2022-01-20 19:27:25 -06:00
Josh Slocum 62acbcfe19 Added explicit error for old blob manager instead of attaching it to response objects 2022-01-20 14:43:08 -06:00
Josh Slocum 6a8e73891f Bug fixes for blob worker rollbacks 2022-01-20 11:44:53 -06:00
Josh Slocum 215f5fae93 Reworked change feed initialization to handle more types of races 2022-01-19 15:20:23 -06:00
Josh Slocum f914b0860b Fixed race in change feed initialization 2022-01-19 09:11:55 -06:00
Josh Slocum 9d9cb961a1 reworked blob manager recovery to be more efficient 2022-01-18 14:22:58 -06:00
Josh Slocum 04e8839656 fixed incorrect assert 2022-01-14 18:43:16 -06:00
Josh Slocum 58bc3a78ea Fixed range assignment bug and fixed printf compiler warnings 2022-01-14 17:46:32 -06:00
Josh Slocum 661c50d29f Handled server_overloaded in storageFeedVersionUpdater 2022-01-13 16:29:25 -06:00
Josh Slocum 3b2e58ada8 Fixed rollback of snapshot files bug 2022-01-11 15:35:54 -06:00
Josh Slocum 6b4b22229b Fixed rollback and granule history issue 2022-01-11 15:35:54 -06:00
Josh Slocum 59e6793c6b Fixed waitForVersion sanity check for rollbacks 2022-01-10 13:44:56 -06:00
Evan Tschannen 01fa42522a fix: changed loadHistoryFiles to take a database object 2022-01-10 11:33:18 -08:00
Josh Slocum f0b434d9ce Don't log operation_cancelled 2022-01-10 12:27:52 -06:00
Josh Slocum 17ba3e796d Fixed some races in file requests and wait committed 2022-01-10 12:27:52 -06:00
Josh Slocum 21309fb55b Change feed merge cursor whenAtLeast fix 2022-01-07 16:16:29 -06:00
Josh Slocum 2c62dee5ba Fixed more issues in blob granule requests 2022-01-07 13:49:02 -06:00
Josh Slocum 4d2650f4dc Fixed a couple issues with failures and the final availability check 2022-01-07 11:21:05 -06:00
Josh Slocum 0f66cca8e0 Fixing change feed race with empty mutation and error 2022-01-05 16:40:07 -06:00
Josh Slocum a96163d9d3 Fixed ASAN issues 2022-01-05 13:12:49 -06:00
Josh Slocum bc69521a91 Several fixes with restarting BW/BM 2022-01-05 12:48:53 -06:00
Josh Slocum 738b72918a send merge cursor's buffered data if all sub-streams have something at this version 2021-12-21 16:32:56 -06:00
Josh Slocum b0aea91895 Broadening explicit disconnect handling to explicit error handling of all types 2021-12-21 14:12:09 -06:00
Josh Slocum d337e8fbe8 Handled disconnects explicitly in CF streams 2021-12-21 10:38:04 -06:00
Josh Slocum 9f69715fec Fixing blob worker committed tracking and ReplyPromiseStream::onEmpty 2021-12-20 11:33:44 -06:00
Josh Slocum ea7a5114f5 Merge branch 'bw_rollback_grv' into blob_integration 2021-12-10 16:59:59 -06:00
Suraj Gupta e9d1f36cae More cleanup. 2021-12-10 16:43:58 -06:00
Suraj Gupta 1d32da5ac6 Remove local debugging. 2021-12-10 16:43:58 -06:00
Suraj Gupta 310d990b12 Add debugging. 2021-12-10 16:43:58 -06:00
Suraj Gupta 2ccb3a4740 Fix range boundaries and clearing intents. 2021-12-10 16:43:58 -06:00
Suraj Gupta 23de6fa39b Get it compiled 2021-12-10 16:43:58 -06:00
Suraj Gupta 932f68e1b3 Refactor bstore and add watch key. 2021-12-10 16:43:58 -06:00
Suraj Gupta 22d72ec9dd init 2021-12-10 16:43:58 -06:00
Josh Slocum 307d049c9d Cleaning up some memory lifetime issues 2021-12-10 16:12:06 -06:00
Josh Slocum ff2cd691cd Switching back to GRV for committed version checking, with proper rollback checking 2021-12-10 15:27:25 -06:00
Suraj Gupta d3fbad74a2 cleanup debugging 2021-12-10 14:00:34 -06:00
Suraj Gupta cb568bbd55 Add watch on config key. 2021-12-10 14:00:34 -06:00
Josh Slocum c95c93b527 removed incorrect assert for now 2021-12-09 17:06:20 -06:00
Josh Slocum 8dc5f79dc7 Made blob worker assume no empty mutations, and made merge cursor not send them 2021-12-09 15:00:46 -06:00
Josh Slocum 1ee0b16bfa Fixed bug in merge cursor whenAtLeast 2021-12-09 14:19:00 -06:00
Josh Slocum 46ac726700 Added assert to more easily detect WFV errors 2021-12-09 11:01:13 -06:00
Josh Slocum 86f6f73518 Merge cursor debugging and fix in BW::waitForVersion 2021-12-09 10:51:31 -06:00
Josh Slocum 845f1ade42 Another incorrect assert 2021-12-08 17:19:28 -06:00
Josh Slocum 5f2640f592 Fixed an assert I added during refactor that was incorrect 2021-12-08 16:40:40 -06:00
Josh Slocum e9e5a80086 Refactored file writing to guarantee consuming all mutations from the change feed promise stream 2021-12-08 16:10:29 -06:00
Josh Slocum b438e98220 Fixed multiple availability bugs in blob worker 2021-12-08 15:47:16 -06:00
Josh Slocum 6c79750412 Added a bunch of debugging for tracking whenAtLatest problems, and fixed a bug there 2021-12-08 12:41:45 -06:00
Josh Slocum c5b2b384da Fixing ASAN issues 2021-12-08 08:42:35 -06:00
Josh Slocum bc16e23d2c Merge branch 'master' into blob_integration 2021-12-02 15:51:18 -06:00
Josh Slocum d85eb330e0 retooling some waitForVersion stuff and adding asserts 2021-12-02 14:52:16 -06:00
Evan Tschannen b11ae4dae8 Merge pull request #5910 from sfc-gh-jslocum/bg_bindings 2021-12-02 11:40:26 -08:00
    Blob Granule C bindings
Josh Slocum f43169cb7b bug fixes 2021-12-02 12:40:37 -06:00
Josh Slocum 632f701c8a First pass at using ChangeFeedData in blob worker 2021-12-02 11:42:16 -06:00
sfc-gh-tclinkenbeard 90ced244eb Fix -Wunused-but-set-variable warnings 2021-12-01 18:15:53 -08:00
Josh Slocum a82845af43 Merge branch 'master' into bg_bindings 2021-12-01 16:55:28 -06:00
Josh Slocum 2c82a27f09 fixed typo in fmt::printf 2021-11-24 14:35:23 -06:00
Josh Slocum efb21ca6a1 Merge branch 'master' into blob_integration 2021-11-24 13:17:54 -06:00
sfc-gh-tclinkenbeard 2613ec7561 Expand use of fmt to get rid of %ld usage 2021-11-17 17:03:32 -08:00
sfc-gh-tclinkenbeard 07349869d9 Use fmt to address -Wformat warnings 2021-11-17 14:45:48 -08:00
sfc-gh-tclinkenbeard 766a05d33c Merge remote-tracking branch 'origin/master' into add-format-warning 2021-11-17 12:14:01 -08:00
Josh Slocum f140028086 Merge branch 'master' into blob_integration 2021-11-15 15:09:40 -06:00
Josh Slocum fd4f13fba1 Added fetch version to change feeds to avoid races and overwrites between updateStorage and fetchChangeFeed 2021-11-15 09:49:51 -06:00
Evan Tschannen 6909754b21 changefeeds now have a whenAtLeast function for efficiently learning when the version has updated but no mutations have been committed 2021-11-14 19:08:46 -08:00
Josh Slocum bab8756e17 fixed bugs in small splits 2021-11-11 13:50:19 -06:00
Josh Slocum ba53045fcd Fixing BW count for multi-process machines and having BW monitor its own removal since it doesn't get private mutations 2021-11-11 12:33:08 -06:00
Josh Slocum 5b2617a524 Added local granule file reading to mako 2021-11-03 09:33:30 -05:00
Josh Slocum 382882f1c1 mako successfully calls read_blob_granules and gets stuff back 2021-11-02 13:43:42 -05:00
sfc-gh-tclinkenbeard 13bb7838aa Enable clang -Wformat warning 2021-10-30 21:07:38 -07:00
Josh Slocum 3b711af061 Fixed a couple rollback issues and endianness of versionstamp version 2021-10-29 12:07:16 -05:00
Josh Slocum 0d1d1d7f9e fix uninitialized memory and granule bounds on manager recovery 2021-10-28 18:23:43 -05:00
Josh Slocum dbf46c200f disabling long keys and adding more debugging stuff 2021-10-28 13:53:54 -05:00
Josh Slocum f1a4363fe6 fixing a couple bugs in blob worker 2021-10-28 13:18:39 -05:00
Josh Slocum d0c6bbc56a wait on previous snapshots if we know they can be committed now 2021-10-28 08:13:52 -05:00
Josh Slocum 19076ad4d2 Changed blob manager checkin to be async and to have blob worker speculatively consume change feed ahead of re-snapshot 2021-10-27 16:23:51 -05:00
Josh Slocum 60166ecd55 Merge branch 'new_cf_blob' into blob_integration 2021-10-27 10:39:07 -05:00
Suraj Gupta fd50a011b1 Address PR comments. 2021-10-26 21:39:41 -04:00
Suraj Gupta 5a9d9921d0 Fixes and final cleanup for BM failure handling 2021-10-26 16:16:00 -04:00
Suraj Gupta 17b30f188a Working impl 2021-10-26 16:16:00 -04:00
Suraj Gupta 99606482ea initial thoughts 2021-10-26 16:16:00 -04:00
Josh Slocum c39c631d50 Using change feed stops and known committed version from change feeds 2021-10-26 12:48:24 -05:00
Josh Slocum 773886515e Merge branch 'feature-range-feed' into blob_full_clean 2021-10-22 11:07:51 -05:00
Josh Slocum 912ef76f1c cleanup before merge 2021-10-18 17:11:14 -05:00
Suraj Gupta 41e96b98b2 Rethrow on actor cancelled. 2021-10-18 18:06:15 -04:00
Suraj Gupta b15e4f6b4e Fix reference cycle between BWData and GranuleRangeMetadata. 2021-10-18 17:03:27 -04:00
Suraj Gupta e2e852e515 Mitigate transitive includes. 2021-10-18 10:49:25 -04:00
Suraj Gupta 28c0cd3fc6 Remove old comments from BW. 2021-10-18 09:56:01 -04:00
Josh Slocum 8dc5ae9d24 fixed version boundaries 2021-10-14 10:45:09 -05:00
Josh Slocum b5074fd597 Reworked all of the system data to encode granule data more efficiently for persistence 2021-10-13 16:28:04 -05:00
Josh Slocum f3c44c568f fixing merge conflicts 2021-10-13 16:26:44 -05:00
Suraj Gupta cfb8368da6 Address PR comments. 2021-10-13 14:56:17 -04:00
Suraj Gupta d002df3b21 Clean up blob worker changes. 2021-10-13 14:40:26 -04:00
Suraj Gupta 266a5b06fa Fix infinite loop. 2021-10-13 14:40:26 -04:00
Suraj Gupta 423a67f448 trying to fix infinite loop 2021-10-13 14:40:26 -04:00
Suraj Gupta dfb9655c57 Handle blob worker failure 2021-10-13 14:40:26 -04:00
Josh Slocum 4e4a2534da review comments and moving priority yield to correct spot 2021-10-06 09:24:48 -05:00
Josh Slocum 6a24ef9258 adding priorities to blob worker and fixing monitoring in blob manager 2021-10-05 16:51:19 -05:00
Suraj Gupta 282f9d35cd Cleanup comments and debugging code. 2021-10-04 11:07:08 -04:00
Suraj Gupta 4d54669ccd Recruit the blob workers via blob manager. 2021-10-04 11:07:08 -04:00
    In this PR, the blob manager now recruits blob workers
    (via communication with the cluster controller). Blob workers
    are onboarded as blob worker processes enter the cluster.
Josh Slocum 8fb7c45e65 even more bug fixes 2021-09-25 10:30:27 -05:00
Josh Slocum 3b23d6aba5 rollback fixes 2021-09-25 09:19:00 -05:00
Josh Slocum fa1fe5f08b added blob worker rollbacks that handle (most) cases 2021-09-24 17:52:36 -05:00
Josh Slocum a986679b7f More bug fixes 2021-09-24 10:02:02 -05:00
Josh Slocum 5ddf08dfe5 Got basic range reassignment working 2021-09-22 16:48:44 -05:00
Josh Slocum 8e4582673c more bug fixes 2021-09-20 10:21:52 -05:00
Josh Slocum 9a92b763ae Fixing issues in known committed version tracking 2021-09-17 12:26:10 -05:00
Josh Slocum c9b8bdffbe Passes a single correctness test! 2021-09-15 17:18:04 -05:00
Josh Slocum cefb66d64c fixing a couple bugs 2021-09-10 12:52:33 -05:00
Josh Slocum e2a51a4fe7 Fixing up after change feed updates 2021-09-10 11:49:41 -05:00
Josh Slocum 44a64005cf also pipelining snapshot from blob in blob worker for maximum throughput 2021-09-09 10:43:28 -05:00
Josh Slocum 99e5ecdf9d Doing GRVs for committed version checking in blob worker 2021-09-08 17:24:49 -05:00
Josh Slocum 2491287f98 Pipelined delta file writing to improve performance 2021-09-08 16:57:57 -05:00
Josh Slocum eb76343dfb Added blob granule reassignment and splitting 2021-09-08 14:09:14 -05:00
Suraj Gupta 5aebd77f0a PR changes 2021-09-03 13:58:39 -04:00
Suraj Gupta a0b3446572 Add metrics for blob worker. 2021-09-03 12:09:01 -04:00
    We want to add metrics for the blob worker to evaluate its
    performance more concretely. We decided to track the following
    information:
    - S3 put requests
    - S3 get requests
    - S3 delete requests
    - Delta files written
    - Snapshot files written
    - Delta bytes written
    - Snapshot bytes written
    - Number of current ranges assigned
    - Bytes read from FDB (initial snapshot)
    - Bytes read from S3 (compaction)
    - Read requests count
    - Read files returned
    - Read deltas returned
    - Read delta bytes returned
    - Ranges assigned
    - Ranges revoked
    - Total mutation bytes buffered across all ranges // current or accumulated
    - Range feed bytes input
    - Range feed mutations input
    - Wrong Shard Server count
Suraj Gupta bdbb0303f3 Add missing wait around deleteFile invocations. 2021-09-03 11:12:03 -04:00
Josh Slocum ff96843c58 review comments 2021-09-01 10:12:53 -05:00
Josh Slocum 46adada5ff Cleaned up debugging and fixed a couple bugs 2021-08-31 12:30:43 -05:00
Josh Slocum b4bfd58bcb multiple blob workers appears to work 2021-08-30 13:07:25 -05:00
Josh Slocum 3b011408f8 Added sequence numbers and locks to blob worker and manager 2021-08-27 16:33:07 -05:00
Josh Slocum 8d49c98a41 Added simulation workload for blob granules and fixed some bugs 2021-08-26 13:48:05 -05:00
Josh Slocum ad899565e9 Using local files for blob granules in simulation 2021-08-24 14:15:14 -05:00
Josh Slocum 5259af787d Switched blob implementation to use backup container 2021-08-24 13:47:47 -05:00
Josh Slocum fb2eef38fc Changing API and file format to full V1 specification 2021-08-24 10:05:46 -05:00
Josh Slocum 42a781016e Made subsequent snapshots read from blob files instead of FDB 2021-08-23 16:46:59 -05:00
Josh Slocum 2ae447eaaa Refactored blob worker/manager to be in separate files 2021-08-23 14:16:09 -05:00