Commit Graph

63 Commits

Author SHA1 Message Date
sfc-gh-tclinkenbeard a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Ata E Husain Bohra 87ee4cf958 Add new FDB EncryptKeyProxy role
Major changes includes:

1. Add a new FDB role responsible- EncyrptKeyProxy. The role is
   responsible to expose APIs to fetch encyrption keys interacting
   with external Encryption KeyManager interface.
2. The process is a FDB singleton process following similar recruitment
   rules as other singleton processes in the system.
3. Code to recruit the worker process; given the encryption keys are
   needed during recovery (decode TLog records), for now the process
   is co-located in same datacenter as ClusterController.
4. Skeleton process actor code; more functionality will be added in
   subsequent PRs.

NOTE: The code is protected under a SERVER_KNOB with the default
      value as 'false' for now.
2022-01-25 17:38:27 -08:00
Suraj Gupta a4bcd3919d Add exclusive process class for Blob Worker.
Also introduces a specific machine in the simulated cluster
to test blob worker (similar to what's done for storage cache).
2021-09-23 16:54:44 -04:00
Suraj Gupta 5fa6c687d6 Add blob manager as a singleton. 2021-09-23 10:45:37 -04:00
Lukas Joswiak 59d535149e Merge branch 'master' into fixes/alp6 2021-07-27 10:07:18 -07:00
Neethu Haneesha Bingi decfb610ff Minor review comments. 2021-06-25 15:04:49 -07:00
Neethu Haneesha Bingi 66f2518405 exclude to work with any locality data match. 2021-06-23 18:03:27 -07:00
Neethu Haneesha Bingi 73752f441b exclude locality:clang-format, ranged loops, documentation, tracking addStoragesever for exclusion. 2021-06-23 18:03:27 -07:00
Neethu Haneesha Bingi 62355571d0 exclude servers based on locality match 2021-06-23 18:03:27 -07:00
Lukas Joswiak 153de33f57 Revert "Merge pull request #4802 from sfc-gh-ljoswiak/revert/actor-lineage"
This reverts commit 6499fa178e, reversing
changes made to 1512631957.
2021-06-04 13:31:55 -07:00
Lukas Joswiak 4ea760b2a9 Revert "Merge pull request #4136 from sfc-gh-mpilman/features/actor-lineage"
This reverts commit da41534618, reversing
changes made to e6300905d6.
2021-05-10 20:26:12 -07:00
Markus Pilman 4fab2ecd30 Merge remote-tracking branch 'origin/master' into features/actor-lineage 2021-04-28 09:20:54 -06:00
Evan Tschannen a02da36e85 fixed the problem with the GrvProxyClass the proper way my keeping the enum the same between versions 2021-04-26 18:45:44 -07:00
Markus Pilman eb036b7b02 Merge remote-tracking branch 'origin/master' into features/actor-lineage 2021-03-17 11:59:54 -06:00
FDB Formatster df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
Markus Pilman 0d324cee80 Annotation framework and role lineage 2020-12-09 10:19:59 -07:00
Young Liu c34385ef7b Support proxy process class 2020-09-30 01:23:10 -07:00
Young Liu cc5bc16bd8 Rename more places from proxy to commit proxy 2020-09-15 22:29:49 -07:00
Young Liu 35bef73a1c Rename proxy to commit proxy 2020-09-10 17:44:15 -07:00
Young Liu d6a23a4d6b Resolve comments to make GRV proxy a separate process class 2020-08-06 00:01:57 -07:00
Young Liu 229ab0d5f1 Fix some conflicts and remote debugging trace events 2020-07-22 23:35:46 -07:00
Young Liu 525f10e30c Merge master branch 2020-07-22 16:08:49 -07:00
Young Liu 302cf5c45f Remove debug trace events 2020-07-22 12:20:22 -07:00
Young Liu 5b06d69d25 Pass watches test 2020-07-15 00:37:41 -07:00
Andrew Noyes f470ba8316 Remove using namespace std::rel_ops
This causes the following to not compile anymore

\#include <utility>
\#include <vector>

using namespace std::rel_ops;

int main() {
    std::vector<int> xs;
    return xs.rbegin() != xs.rend();
}

See https://godbolt.org/z/s1977n
2020-07-10 22:58:15 +00:00
Evan Tschannen 72ce997d22 explicitly versioned every key in systemData, so we only will update the associated protocolVersion when the serialization actually changes 2020-05-22 16:35:01 -07:00
Meng Xu 31a6ec34b7 Merge branch 'master' into mengxu/fast-restore-agent-PR 2020-02-18 16:17:59 -08:00
mpilman d09e07f1f5 Merge remote-tracking branch 'upstream/master' into features/icc 2020-02-04 10:26:18 -08:00
Meng Xu ff92401ed5 FastRestore:Add FastRestoreClass and blob option
To simplify test in circus framework, we need a fastrestore class;
To get data from blob in real mode, restore worker should set up
blob credentials in order to call BackupContainer interface to
get all backup files.
2020-01-28 20:25:05 -08:00
Jingyu Zhou de8d953865 Add backup role, class, and worker skeleton 2020-01-22 19:35:30 -08:00
negoyal a4a0bf18f9 Merging with Master. 2019-11-12 13:01:29 -08:00
Meng Xu d160810662 FastRestore:Resolve review comments 2019-09-04 16:48:43 -07:00
Meng Xu 3b54363780 FastRestore:Apply Clang-format 2019-08-01 18:09:12 -07:00
Meng Xu 45083edf74 Merge branch 'master' into mengxu/performant-restore-PR
Fix conflicts as well.
2019-07-25 10:46:11 -07:00
mpilman 844dd60202 FDB compiling with intel compiler 2019-06-20 09:29:01 -07:00
mpilman 68ce9a5e75 ProtocolVersion type - second try 2019-06-18 17:55:27 -07:00
mpilman 6afce01744 Implementation complete (not yet working) 2019-05-13 14:15:22 -07:00
mpilman ba83c458a6 types implemented 2019-05-13 14:15:22 -07:00
Meng Xu 70d7c289f4 Merge branch 'master' into mengxu/restore/parallel-v7 2019-03-30 22:13:10 -07:00
A.J. Beamon 71e2fdafb8 Changes to ratekeeper camel case 2019-03-27 08:24:25 -07:00
Jingyu Zhou 3c86643822 Separate Ratekeeper from data distribution.
Add a new role for ratekeeper.

Remove StorageServerChanges from data distribution.
Ratekeeper monitors storage servers, which borrows the idea from
DataDistribution.
2019-03-07 13:16:20 -08:00
Trevor Clinkenbeard 1bb384db4d Merge branch 'master' of https://github.com/apple/foundationdb into add-no-assign-class 2019-02-20 13:13:12 -08:00
Jingyu Zhou 886e7ab2ba Add a new DataDistributor role.
Let cluster controller to start a new data distributor role by sending a
message to a chosen worker.
Change MasterInterface usage in DataDistribution to masterId

Add DataDistributor rejoin handling.

This allows the data distributor to tell the new cluster controller of its
existence so that the controller doesn't spawn a new one. I.e., there should
be only ONE data distributor in the cluster.

If DataDistributor (DD) doesn't join in a while, then ClusterController (CC) tries
to recruit one as DD. CC also monitors DD and restarts one if it failed.

The Proxy is also monitoring the DD. If DD failed, the Proxy will ask CC for
the new DD.

Add GetRecoveryInfo RPC to master server, which is called by data distributor
to obtain the recovery Transaction version from the master server.
2019-02-14 16:30:13 -08:00
Meng Xu 550f2e2682 Merge with master to use the latest backup container
In fdb 6.0.15, backup container is changed on how to organize the backup data.
The backup made by fdb >6.0.15 has to be restored with fdb > 6.0.15.
Merge with master so that the fast restore uses fdb > 6.0.15
2019-01-30 12:05:15 -08:00
Meng Xu 76e1ba2934 add blob_credential_file option 2019-01-29 16:00:52 -08:00
Trevor Clinkenbeard 2e0b3a7f1d Added ProcessClass::CoordinatorClass, which can be used by coordinators, so that coordinators do not have to take on other roles if desired 2019-01-25 11:03:13 -08:00
Balachandar Namasivayam a8e2e75cd5 Re-enable CheckDesiredClasses after making necessary changes for multi-region setup.
Fixed a couple of bugs
1) A rare race condition where a worker is being roles even after it died.
2) Fix how RoleFitness is calculated for TLog and LogRouter. Only worst fitness is compared to see if a better fit is available.
2019-01-10 10:28:32 -08:00
anoyes 6a4d87802b Replace & operator with variadic function 2018-12-28 11:33:42 -08:00
Meng Xu 8de031f9a6 TeamCollection: clang-format
Format the changes with git clang-format.
No functional changes.

Signed-off-by: Meng Xu <meng_xu@apple.com>
2018-11-21 11:18:26 -08:00
Meng Xu 5051b35c61 TeamCollection: Use machine team to create server team
Current server team collection logic does not consider
the fact that multipe storage servers can run on the same machine.
When multiple machines fail, all servers on the machines will fail, and
the possibility of having one process team fail and lose data is very high.

To reduce the possibility of losing data when multiple machine fails,
we first create machine teams which span across different fault zones;
we then create server teams based on machine teams by
first picking 1 machine team, and then
picking 1 server from each machine in the machine team.

Signed-off-by: Meng Xu <meng_xu@apple.com>
2018-11-16 15:53:22 -08:00