Commit Graph

12309 Commits

Author SHA1 Message Date
Evan Tschannen 28cb5f242c another fix 2020-08-26 11:01:40 -07:00
Evan Tschannen e81ccd2dc9 another compiler fix 2020-08-26 10:59:06 -07:00
Evan Tschannen e531046b53 fix compiler errors 2020-08-26 10:56:21 -07:00
Evan Tschannen fd1a4304fa fix: made ConnectionResetInfo reference counted 2020-08-26 10:53:17 -07:00
Meng Xu f1bd2a18ed Resolve review comments: No functional change 2020-08-26 10:30:31 -07:00
Meng Xu 3d2b18b663 FastRestore:AtomicRestore:Add resetDBTimeout option 2020-08-26 10:21:17 -07:00
Meng Xu 3ca1359e89 FastRestore:Simulate cpu busy and adjust simulation of overused memory usage
Increase the chance of cpu busy and memory overused
2020-08-26 10:07:49 -07:00
Meng Xu bc766ab7ef FastRestoreKnob:Adjust knobs 2020-08-26 09:54:33 -07:00
Meng Xu a8bd628216 FastRestoreAtomicRestoreTest:Increase timeout from 2100 to 36000 2020-08-26 09:48:52 -07:00
Meng Xu 1b7a0d9b1d FastRestoreTest:Increase cycle test duration from 40 to 60s 2020-08-25 22:47:39 -07:00
Vishesh Yadav 5d5dab1040 java-bindings: Minor fixes and refactor 2020-08-25 17:01:37 -07:00
Vishesh Yadav 738cd82a85 java-bindings: Add function to disable/enable/resize DirectBuffer 2020-08-25 16:19:42 -07:00
Vishesh Yadav 5cefb27fe2 java-bindings: Addressed review comments 2020-08-25 16:03:27 -07:00
Meng Xu d8e73fddb6 FastRestore:Cancel actors when restore request finishes 2020-08-25 14:46:26 -07:00
Vishesh Yadav 9123ffb1bf java-bindings: Use DirectBuffer with standard Async call 2020-08-25 14:29:50 -07:00
Meng Xu 6256bedf8d BackupContainer:Use processId as the process filename
instead of using a randomly generated string, which changes every time
a file is opened.

Having too many files will trigger a TOO_MANY_FILES error
2020-08-25 12:25:09 -07:00
Xiaoxi Wang b1c206b62a change rate calculation 2020-08-25 18:47:13 +00:00
Andrew Noyes 93f1d1a07d Add PR number 2020-08-25 17:48:12 +00:00
Meng Xu 57584f41a0 FastRestoreOldBackup:Change backup latency from 60 to 40s 2020-08-25 10:42:28 -07:00
Meng Xu bd7c07436b FastRestore:Add batchIndex to RestoreAsset for better performance tracking 2020-08-25 09:34:18 -07:00
Meng Xu 23c8d0154b FastRestoreOldBackupTest:Increase test duration
Add comment on Rollback test as well
2020-08-25 09:14:37 -07:00
Vishesh Yadav be184a9dc2 java-bindings: Use DirectBuffer for `getRange` requests #3682
This patch keeps a batch of Java's DirectBuffers, which can be shared with JNI C
world. This means:

1. No need for the JNI wrapper to make several JNI calls to allocate and convert
   Java objects to bytes.
2. We already made PR #3582 to reduce the 3 JNI calls made for each getRange(), i.e. to
   fetch the summary and then the results. As mentioned in that PR, this patch
   makes a similar decision: the `getDirectRange()` call is synchronous and is
   instead scheduled asynchronously in a Java executor.
3. No need for JNI to dynamically allocate buffers to store KVs.
4. Use one big DirectBuffer to make the request and get the response. `DirectBuffers` give
   direct access to the memory, and are much faster than the regular non-direct
   buffers we use.
5. We keep a pool of reasonably big DirectBuffers, which are borrowed and
   returned by `getRange()` requests.

The downsides are:

1. We have to manually and "carefully" serialize and deserialize the
   request/response in both the C and Java worlds. They are no longer high-level Java
   objects.
2. Because `DirectBuffers` are expensive, we can only keep a few of them, so the
   number of outstanding `getRange()` requests is limited by the number of buffers.
3. Each request currently uses 2 buffers, one for the current chunk and one for
   the outstanding request.
4. The performance bump seems to be excellent for bigger key-values. We didn't
   observe a significant difference for smaller KV sizes (I can't say it's better
   or worse, as from a quick glance it didn't look statistically significant to me).

Performance is currently measured using `PerformanceTester.java`, which measures
throughput for several operations. The results are:

```
=========================================
1. Using Key = 16bytes, Value = 100bytes
=========================================

** Without this PR ==>
                                                           Count     Avg    Min      Q1    Median      Q3     Max
-------------------------------------------------------  -------  ------  -----  ------  --------  ------  ------
get_range throughput (local client) [keys/s]                  30  349363  73590  316218    342523  406445  540731
get_single_key_range throughput (local client) [keys/s]       30    7685   6455    6981      7744    8129    9773

** With this PR ==>
                                                           Count     Avg    Min      Q1    Median      Q3     Max
-------------------------------------------------------  -------  ------  -----  ------  --------  ------  ------
get_range throughput (local client) [keys/s]                  30  383404  70181  338810    396950  437335  502886
get_single_key_range throughput (local client) [keys/s]       30    7029   5515    6635      7090    7353    8219

=======================================
2. Using Key = 256bytes, Value = 512bytes
========================================

** Without this PR ==>
                                                           Count     Avg     Min      Q1    Median      Q3     Max
-------------------------------------------------------  -------  ------  ------  ------  --------  ------  ------
get_range throughput (local client) [keys/s]                  90  132787  102036  122650    130204  138269  202790
get_single_key_range throughput (local client) [keys/s]       90    5833    4894    5396      5690    6061    8986

** With this PR ==>
                                                           Count     Avg     Min      Q1    Median      Q3     Max
-------------------------------------------------------  -------  ------  ------  ------  --------  ------  ------
get_range throughput (local client) [keys/s]                  90  359302  196676  310931    344029  407232  494259
get_single_key_range throughput (local client) [keys/s]       90    7227    5573    6771      7177    7477   10108
====================================================================================================================

=======================================
3. Using Key = 128bytes, Value = 512bytes
========================================

** Without this PR ==>
                                                           Count     Avg     Min      Q1    Median      Q3     Max
-------------------------------------------------------  -------  ------  ------  ------  --------  ------  ------
get_range throughput (local client) [keys/s]                  30  235661  148963  213670    229090  256825  317050
get_single_key_range throughput (local client) [keys/s]       30   10441    6302   10586     10873   10960   11065
====================================================================================================================

** With this PR ==>
                                                           Count     Avg     Min      Q1    Median      Q3     Max
-------------------------------------------------------  -------  ------  ------  ------  --------  ------  ------
get_range throughput (local client) [keys/s]                  30  350612  185698  320868    348998  406750  459101
get_single_key_range throughput (local client) [keys/s]       30   10338    6570   10664     10847   10901   11040
====================================================================================================================
```

NOTE: These tests were run on a shared VM. The benchmarks in each group were run
serially, and the groups themselves were run at different times. Therefore there
might be some skew based on load, but the difference is compelling enough to
show that there is a performance benefit for larger KVs.
2020-08-24 23:34:54 -07:00
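
The borrow-and-return pooling described in the message of be184a9dc2 above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the class name `DirectBufferPoolSketch`, its methods, and the pool and buffer sizes are hypothetical and are not taken from the FoundationDB Java bindings. It uses only the standard `java.nio.ByteBuffer` and `java.util.concurrent` APIs.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; not the class used by the Java bindings.
final class DirectBufferPoolSketch {
    private final BlockingQueue<ByteBuffer> free;
    private final int bufferSize;

    // Allocates poolSize direct buffers up front, each of bufferSize bytes.
    DirectBufferPoolSketch(int poolSize, int bufferSize) {
        this.bufferSize = bufferSize;
        this.free = new ArrayBlockingQueue<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            free.add(ByteBuffer.allocateDirect(bufferSize));
        }
    }

    // Borrows a cleared buffer, or returns null when every buffer is in use.
    // A caller could fall back to a non-direct code path in that case.
    ByteBuffer poll() {
        ByteBuffer buffer = free.poll();
        if (buffer != null) {
            buffer.clear();
        }
        return buffer;
    }

    // Returns a previously borrowed buffer to the pool.
    void add(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect() || buffer.capacity() != bufferSize) {
            throw new IllegalArgumentException("buffer does not belong to this pool");
        }
        // offer() silently drops the buffer if the pool is already full.
        free.offer(buffer);
    }

    // Number of buffers currently available to borrow.
    int available() {
        return free.size();
    }
}
```

With a pool of this shape, the number of requests that can hold a direct buffer at once is capped by the pool size, which matches the second downside listed in the commit message.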
Vishesh Yadav 9c7b502b68 java-bindings: Combine getSummary() and getResult() into single JNI
RangeQuery makes getSummary() and getResult() JNI calls, which are redundant in
nature. This patch combines them into single call.

This reduces 3 JNI calls to 2. The next logical step is to remove the 2nd JNI
call, i.e. getResults() after getRange(), which is slightly more convoluted
because the C API doesn't provide primitives to compose new Futures.
2020-08-24 23:34:47 -07:00
Young Liu 9564171463 Merge branch 'master' into grv-proxy 2020-08-24 22:45:01 -07:00
Meng Xu 778daf20c0 FastRestore:Fix incorrect assert 2020-08-24 19:59:56 -07:00
Xiaoxi Wang 9011ee6f57 Merge branch 'master' of https://github.com/apple/foundationdb 2020-08-25 00:42:04 +00:00
Meng Xu 996ba2374c FastRestore:Fix:Incorrect condition in printing out FastRestoreLoaderSendMutationToApplierDoneTooLate 2020-08-24 17:27:40 -07:00
Trevor Clinkenbeard 3e39f01496
Merge pull request #3689 from sfc-gh-xwang/tag-report
add throttle objects into Schemas.cpp
2020-08-24 17:23:00 -07:00
Young Liu 63b3612ad5 Merge master branch and resolve conflicts 2020-08-24 16:42:31 -07:00
XiaoxiWang 4e627691a9 add throttle objects into Schemas.cpp 2020-08-24 23:37:58 +00:00
Meng Xu e94261efe5 FastRestore:LoaderScheduler:Add validation on too old requests 2020-08-24 16:32:16 -07:00
Xiaoxi Wang 4e4fa0fded merge with master 2020-08-24 21:04:53 +00:00
Trevor Clinkenbeard e7662eecda
Merge pull request #3669 from sfc-gh-xwang/tag-report
Report recommended tx tag to be throttled to status json
2020-08-24 12:06:25 -07:00
Evan Tschannen 947a625968
Update flow/Knobs.cpp
Co-authored-by: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-08-24 11:33:49 -07:00
Meng Xu 6e3e36c8fc FastRestore:RequestScheduler:Minor code style improvement 2020-08-24 10:45:46 -07:00
XiaoxiWang 0d65e1e0e0 update ProtocolVersion 2020-08-23 21:03:26 +00:00
XiaoxiWang 86b943f17a update documentation 2020-08-23 18:24:45 +00:00
Meng Xu 43853d25e9
Merge pull request #3679 from monsij/verbose_clearup
Verbose printf cleanup on ReplicationPolicy.cpp
2020-08-23 10:44:37 -07:00
Xiaoxi Wang 3afdb44c7a merge master 2020-08-23 17:09:04 +00:00
Monsij Biswal 0239d612d1 Verbose printf cleanup 2020-08-23 13:35:00 +05:30
Meng Xu dee016f96b
Merge pull request #3685 from apple/dyoungworth/fixMerge1
Dyoungworth/fix merge1 6.3 release branch into master
2020-08-22 15:33:01 -07:00
David Youngworth b4cec6577a Fix merge bugs 2020-08-22 15:04:42 -07:00
David Youngworth e1b7dd0c7d Merge remote-tracking branch 'upstream/release-6.3' into dyoungworth/fixMerge1 2020-08-22 12:25:19 -07:00
Meng Xu 7e094f217b
Merge pull request #3675 from ajbeamon/fix-proxy-latency-bands
Fix commit and GRV latency band and statistics publishing
2020-08-22 10:53:33 -07:00
Meng Xu a5776963fc
Merge pull request #3680 from bowlofstew/issue/1647
Issue 1647, Cleanup the verbose printf in ReplicationPolicy, Resolution
2020-08-22 10:48:57 -07:00
Steve Atherton a552181c48
Merge pull request #3652 from satherton/feature-redwood
Optimization to reduce page writes when multiple siblings of the same parent are updated with the page remap window
2020-08-21 17:14:31 -07:00
Xiaoxi Wang 3b63d8b01b remove FIXME; remove tagSet.reset(); trivial changes 2020-08-21 19:17:16 +00:00
A.J. Beamon f864606d8d Don't block the data distributor when getting a GetDataDistributorMetricsRequest. 2020-08-21 18:16:07 +00:00
Lukas Joswiak 5cebba8e74 Add call to set up tracing 2020-08-21 11:06:31 -07:00
XiaoxiWang d8a508ce7e fix command line parse 2020-08-21 17:49:21 +00:00