142 Commits

Author SHA1 Message Date
Josh Slocum d1d2ca9285 Don't inject TSS faults if speedUpSimulation is set 2021-06-18 12:41:48 -05:00
Markus Pilman 05aea49d16
Merge pull request #4986 from sfc-gh-mpilman/bugfixes/double-ss
Bugfixes/double ss
2021-06-16 14:43:32 -06:00
Markus Pilman 56eaf1bc83 added comments 2021-06-15 16:49:27 -06:00
Markus Pilman b2271f2176 additional tracing for quietDatabase 2021-06-15 16:00:28 -06:00
Trevor Clinkenbeard 866f536983
Merge pull request #4888 from sfc-gh-tclinkenbeard/remove-fdbserver-includes
Remove fdbserver includes from fdbclient
2021-06-07 10:22:13 -07:00
Xiaoxi Wang 838d847d4e
Merge pull request #4860 from sfc-gh-xwang/ppwtest
implement perpetual storage wiggling feature
2021-06-04 16:18:39 -07:00
Xiaoxi Wang e0981d6732 add code coverage mark 2021-06-03 19:58:28 +00:00
Xiaoxi Wang 351325b3af comment modification; wait perpetual wiggling close 2021-06-03 05:13:20 +00:00
Josh Slocum b3e4f182ef TSS Mapping Change 2021-06-02 17:30:09 +00:00
Xiaoxi Wang 8b9c8b33fc manually merge with master 2021-06-01 17:51:42 +00:00
sfc-gh-tclinkenbeard 594e8944ae Move RestoreWorkerInterface into fdbserver 2021-05-30 11:51:47 -07:00
Xiaoxi Wang 24c0c3361a change quietdatabase 2021-05-25 21:20:24 +00:00
Josh Slocum 4257ac2b4d More TSS Changes/Fixes 2021-05-25 20:37:48 +00:00
Josh Slocum ce82c9653e Testing Storage Server implementation 2021-05-25 20:28:50 +00:00
Xiaoxi Wang ca0bacec07 close perpetual wiggling in QuietDatabase 2021-05-24 19:30:25 +00:00
sfc-gh-tclinkenbeard 5c2d7b6080 Create RangeResult type alias 2021-05-03 13:14:16 -07:00
FDB Formatster df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
Vishesh Yadav 2bb4f2e59f Merge branch 'release-6.3-pre-format' into master-format
This merges release-6.3 branch right before it was fully formatted.
There were quite a few conflicts that are resolved here. CoroFlow had
a check for OOM errors introduced in 6.3, but didn't seem applicable in
the new implementation which seems to use boost.
2021-03-10 09:37:41 -08:00
Chaoguang Lin 9645f489e6 Fix base trace event name inconsistency 2021-03-08 15:20:50 -08:00
Andrew Noyes 79cec09255 Apply clang-tidy's performance-inefficient-vector-operation fix
I ran this command in my build directory after compiling with
OPEN_FOR_IDE. It took a few small tweaks to get it to compile, which is
outside the scope of this commit.

    $ python run-clang-tidy.py -j $(nproc) -checks='-*,performance-inefficient-vector-operation' -fix
2021-03-04 03:58:25 +00:00
Markus Pilman f967209ea0 Fix compiler warnings with clang 2020-04-22 14:14:56 -07:00
Balachandar Namasivayam 747434a13d Increase QuietDatabase time to 90 seconds for real world cases. 2020-03-17 14:36:07 -07:00
Jingyu Zhou 8b67a89eed More review comments fixed. 2020-01-22 19:42:13 -08:00
Jingyu Zhou 85c4a4e422 Address review comments for PR #1625 2020-01-22 19:38:45 -08:00
Jingyu Zhou 6c6a553dcc Fix hang due to distributor death in QuietDatabase
It's possible that after obtaining data distributor, the distributor then dies
and a new one is recruited. Because the tester is still contacting the old one,
it becomes stuck.
2020-01-22 19:38:45 -08:00
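The failure mode described above (a tester that keeps contacting a distributor that has since been replaced) can be illustrated with a small retry loop. This is a minimal sketch in Python, not the code from the commit, and the commit message does not say how the hang was actually fixed. `DistributorDied`, `query_with_refetch`, `get_distributor`, and `max_attempts` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Optional


class DistributorDied(Exception):
    """Hypothetical error raised when the contacted distributor is gone."""


def query_with_refetch(
    get_distributor: Callable[[], Callable[[], int]],
    max_attempts: int = 5,
) -> Optional[int]:
    # Look the distributor up again on every attempt, so that a newly
    # recruited one is contacted instead of waiting on the dead one.
    for _ in range(max_attempts):
        distributor = get_distributor()
        try:
            return distributor()
        except DistributorDied:
            continue
    # Give up after max_attempts rather than hanging forever.
    return None
```

The point of the sketch is that the lookup happens inside the loop: caching the first result outside the loop would reproduce the stuck behaviour described in the message.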
Jon Fu 471e283128 Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-09-18 11:49:07 -07:00
Evan Tschannen 8fbd90e2f6
Merge pull request #1985 from xumengpanda/mengxu/storage-engine-switch-PR-v2
Graceful storage engine migration
2019-09-09 13:51:53 -07:00
Meng Xu 879dec1a5d ConsistencyCheck:Check teamCollectionValid for data_hall mode 2019-09-05 10:34:57 -07:00
Jon Fu 00c2025d4b fixed removeKeys impl, adjusted test workload, and introduced extra safety checks to NativeAPI and proxy 2019-08-27 14:39:44 -07:00
Meng Xu a588710376 StorageEngineSwitch:Graceful switch
When fdbcli changes the storeType for storage engines,
we switch the store type of storage servers one by one, gracefully.
This avoids recruiting multiple storage servers on the same process,
which can cause an OOM error.
2019-08-12 17:37:52 -07:00
Evan Tschannen 9f11f2ec53 Merge branch 'master' of github.com:apple/foundationdb 2019-07-30 16:55:56 -07:00
Evan Tschannen 2d7ec54d3e fix: some exclude workloads would cause both the primary and remote datacenter to be considered dead 2019-07-30 16:35:52 -07:00
sramamoorthy 5a56f6b456 minor snap create client improvement and bug fixes 2019-07-29 20:28:22 -07:00
Balachandar Namasivayam bf87d906f6 Fix a crash. 2019-07-25 16:15:28 -07:00
sramamoorthy 31a1e6858b remove un-necessary state variables in getCoord 2019-07-24 15:36:28 -07:00
sramamoorthy a65c9f92ed get rid of all timeouts and other changes 2019-07-24 15:36:28 -07:00
sramamoorthy a2f2ad96ff code review comments and merge to master changes 2019-07-24 15:36:28 -07:00
sramamoorthy d90b678f6f storage worker to throw in case of failures 2019-07-24 15:36:28 -07:00
sramamoorthy 7ec8fe6e74 snap v2: implement get only local storage workers 2019-07-24 15:36:28 -07:00
sramamoorthy 8f1f0c0435 snap v2: worker and other helper related changes 2019-07-24 15:36:28 -07:00
Meng Xu 64bee63dbc Resolve two review comments
1) No need to check server with only one team when teamRemover finds
a server team or machine team to remove

2) Fix optimalTeamCount counting in teamTracker
2019-07-18 18:46:31 -07:00
Meng Xu 80ed39c189 QuietDB:Disable check for too many teams
The team remover does not remove a team if doing so would leave a server with 0 teams,
so we disable the check for now, until we have a better strategy to enforce the
desired number of teams.

This should not cause much of a problem in real deployments, whereas having 0 teams on a server
would make the server unable to host data, which is bad.
2019-07-16 12:38:55 -07:00
Meng Xu 20f067e794 Merge with master:Resolve conflict with PR#1797 2019-07-16 10:52:28 -07:00
Meng Xu 243504b125 DD:Clang format changes 2019-07-15 18:40:14 -07:00
Meng Xu 94e9b8a3b4 Do not remove a team whose min team number is less than target
If the minimum number of teams of servers in a team is less than the
target value (desired_team_number_per_server * (teamSize + 1) / 2),
the team remover should not remove it. Otherwise, DD will oscillate in
building more teams and removing redundant teams.

Do not do consistency check for three_data_hall mode because when
machines are not evenly distributed across data halls, we will
need to build more teams than the total desired number to make sure
the number of teams per server is no less than the target value.
2019-07-15 18:30:13 -07:00
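The target value quoted in the message above can be written out as a small calculation. This is a minimal sketch in Python, not the data distributor's code: the function names, the use of integer division, and the sample numbers are assumptions made for illustration.

```python
from typing import Sequence


def target_teams_per_server(desired_team_number_per_server: int, team_size: int) -> int:
    # Formula quoted in the commit message; integer division is an assumption.
    return desired_team_number_per_server * (team_size + 1) // 2


def may_remove_team(team_counts_of_members: Sequence[int], target: int) -> bool:
    # team_counts_of_members[i] is the number of teams that the i-th member
    # server of the candidate team currently belongs to.
    # The team is left alone if its least-covered member is below the target.
    return min(team_counts_of_members) >= target
```

For example, with a desired number of 5 teams per server and a team size of 3, the target under these assumptions is 5 * (3 + 1) // 2 = 10, so a team containing a server that belongs to only 9 teams would not be removed.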
Meng Xu cafe9b9412 TC:Target team num per server is desired number
Do not overbuild teams because we may oscillate between building more teams and
removing the redundant teams. The oscillation happens when the machines are not
evenly distributed across availability zones.
For example, in three_data_hall mode, suppose two data halls have 1 machine each
and the third data hall has 3 machines. To build enough teams for the servers
in the third data hall, we will overbuild teams. However,
the teamRemover will then remove those newly built teams.
2019-07-15 17:32:51 -07:00
Meng Xu cf935ff9e6 Remove debug message and format code 2019-07-11 22:05:20 -07:00
Meng Xu cd28a0b604 Reenable check each server must have at least 1 team 2019-07-11 17:58:14 -07:00
Meng Xu 221e6945db TeamTracker:Fix bug in counting optimalTeamCount
When a teamTracker is cancelled, e.g., by the redundant teamRemover or badTeamRemover,
we should decrease the optimalTeamCount if the team is considered as an
optimal team, i.e., all members' machine fitness is no worse than unset, and
the team is healthy.
2019-07-11 17:22:41 -07:00
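The bookkeeping rule in the message above (decrement the counter when a tracker for an optimal team is cancelled) can be sketched as follows. This is an illustrative Python model under stated assumptions, not the teamTracker implementation: `Team`, `TeamCounter`, and their fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Team:
    healthy: bool
    optimal_fitness: bool  # every member's machine fitness is no worse than unset
    tracked: bool = False

    def is_optimal(self) -> bool:
        # A team counts as optimal only if it is healthy and all members fit.
        return self.healthy and self.optimal_fitness


class TeamCounter:
    def __init__(self) -> None:
        self.optimal_team_count = 0

    def start_tracking(self, team: Team) -> None:
        team.tracked = True
        if team.is_optimal():
            self.optimal_team_count += 1

    def cancel_tracking(self, team: Team) -> None:
        # On cancellation (for example, when the team is removed), undo the
        # team's contribution so the counter does not drift upwards.
        if team.tracked and team.is_optimal():
            self.optimal_team_count -= 1
        team.tracked = False
```

Cancelling the same tracker twice is a no-op in this sketch, because the `tracked` flag guards the decrement.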
Meng Xu 4c32593f59 QuietDB:Do not check when machineId is not zoneID 2019-07-11 10:37:16 -07:00