If a team is removed from DD, it will be marked as failed and eventually removed from the
global teams data structure.
Team health is likely to be a temporary state that can change quickly.
In some cases traceTeamCollectionInfo was called twice within the same execution block, i.e.,
with no wait between the two traceTeamCollectionInfo calls.
Because simulation uses the same time for all instructions in the same execution block,
logging more than one traceTeamCollectionInfo event at the same time breaks the trackLatest semantics.
When the simulator always chooses the same one of them, the simulation test reports a false positive error.
Changing this function to an actor and adding a small delay inside it solves this problem.
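The timestamp collision can be illustrated with a toy model. This is only a sketch: LatestEventStore and its tie-breaking rule (an event replaces the stored one only when its time is strictly newer) are assumptions made for illustration, not the actual trackLatest implementation.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a "latest event per key" store.
struct LatestEvent {
    double time;
    std::string detail;
};

class LatestEventStore {
public:
    // Assumption: a new event wins only if its timestamp is strictly newer,
    // so two events logged at the same simulated time are ambiguous and
    // the first one logged is always kept.
    void log(const std::string& key, double time, const std::string& detail) {
        auto it = latest_.find(key);
        if (it == latest_.end() || time > it->second.time) {
            latest_[key] = LatestEvent{ time, detail };
        }
    }

    const std::string& get(const std::string& key) const {
        return latest_.at(key).detail;
    }

private:
    std::map<std::string, LatestEvent> latest_;
};
```

In this model, two events logged at time 10.0 leave the first one in place, while logging the second at 10.001 (the effect of a small delay) makes it the latest.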
Be careful when using the selectReplicas function: it may have bugs.
One such bug is that it always returns false (not able to find candidates)
when the storage team size is 1. This is wrong because when the storage team size
is 1, selectReplicas should return an empty result.
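The expected behavior for team size 1 can be sketched with a simplified helper. pickAdditionalReplicas is a hypothetical function, not the real selectReplicas signature, and the sketch reads "an empty result" as success with nothing selected, since no additional replica is needed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: one server is already chosen, so pick the remaining
// teamSize - 1 servers from `candidates`. Returns true on success.
bool pickAdditionalReplicas(const std::vector<std::string>& candidates,
                            std::size_t teamSize,
                            std::vector<std::string>& results) {
    results.clear();
    if (teamSize == 0) {
        return false; // a team must contain at least one server
    }
    const std::size_t needed = teamSize - 1;
    if (needed == 0) {
        return true; // team size 1: nothing more to pick, empty result
    }
    if (candidates.size() < needed) {
        return false; // genuinely not enough candidates
    }
    // Simplification: take the first `needed` candidates; a real policy
    // would also check locality constraints.
    results.assign(candidates.begin(),
                   candidates.begin() + static_cast<std::ptrdiff_t>(needed));
    return true;
}
```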
When team collection adds new server teams, it picks the server with
the fewest teams. We should only count healthy teams
because unhealthy ones are not useful.
Team collection should prioritize building machine teams for the machine
that has the fewest healthy machine teams, rather than the fewest
machine teams overall, because an unhealthy machine team cannot
produce more server teams.
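A minimal sketch of choosing the least used member by healthy team count follows. Team and leastUsedServer are hypothetical names, and the same idea applies to machines and machine teams.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical team record: member ids plus a health flag.
struct Team {
    std::vector<std::string> servers;
    bool healthy;
};

// Returns the server that belongs to the fewest healthy teams.
// Ties go to the server listed first; returns "" if `servers` is empty.
std::string leastUsedServer(const std::vector<std::string>& servers,
                            const std::vector<Team>& teams) {
    std::map<std::string, int> healthyCount;
    for (const auto& s : servers) {
        healthyCount[s] = 0;
    }
    for (const auto& t : teams) {
        if (!t.healthy) {
            continue; // unhealthy teams do not count toward usage
        }
        for (const auto& s : t.servers) {
            auto it = healthyCount.find(s);
            if (it != healthyCount.end()) {
                ++it->second;
            }
        }
    }
    std::string best;
    bool found = false;
    int bestCount = 0;
    for (const auto& s : servers) {
        const int count = healthyCount[s];
        if (!found || count < bestCount) {
            best = s;
            bestCount = count;
            found = true;
        }
    }
    return best;
}
```

With teams {a,b} healthy, {a,c} unhealthy, and {b,c} unhealthy, every server is in two teams overall, but only c has no healthy team, so c is picked.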
When team collection (TC) builds server teams and machine teams,
it needs to build enough teams that each server and each machine has
DESIRED_TEAMS_PER_SERVER server teams and machine teams, respectively.
This change calculates the number of teams (server teams and machine teams)
needed for each server and machine to reach that number.
For example, with 3 servers and a replication factor of 3, only 1 team is possible,
but the desired team number is 3 times 5, which equals 15.
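The arithmetic in the example can be sketched as follows. teamTarget and combinationsCapped are hypothetical helpers, the 5 is taken to be the desired teams per server as the example implies, and capping the target at the number of distinct teams is an assumption about how the gap between 1 and 15 would be handled.

```cpp
#include <algorithm>
#include <cstdint>

// n choose k, saturating at `cap` so the running product stays small.
// Intended for modest inputs; not hardened against overflow for huge caps.
std::uint64_t combinationsCapped(std::uint64_t n, std::uint64_t k, std::uint64_t cap) {
    if (k > n) {
        return 0;
    }
    k = std::min(k, n - k);
    std::uint64_t result = 1;
    for (std::uint64_t i = 1; i <= k; ++i) {
        result = result * (n - k + i) / i; // always divides exactly
        if (result >= cap) {
            return cap;
        }
    }
    return result;
}

struct TeamTarget {
    std::uint64_t desired;   // serverCount * desiredPerServer
    std::uint64_t buildable; // desired, capped by the number of distinct teams
};

TeamTarget teamTarget(std::uint64_t serverCount, std::uint64_t teamSize,
                      std::uint64_t desiredPerServer) {
    const std::uint64_t desired = serverCount * desiredPerServer;
    const std::uint64_t possible = combinationsCapped(serverCount, teamSize, desired);
    return TeamTarget{ desired, std::min(desired, possible) };
}
```

For 3 servers, team size 3, and 5 desired teams per server, this gives a desired count of 15 but only 1 buildable team.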
Instead of sanity checking the absolute team number per server, we check
the difference between the minServerTeamOnServer and maxServerTeamOnServer.
Add a simulation test that makes sure the number of server teams
per server is no less than the desired_teams_per_server defined
in knobs and no larger than the max_teams_per_server.
Add a similar test for the number of machine teams per machine.
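Both checks described above (the spread between the minimum and maximum team count, and the per-server bounds) can be sketched as small predicates. The function names and the plain integer parameters are illustrative assumptions, not the actual test code or knob accessors.

```cpp
#include <algorithm>
#include <vector>

// Difference between the largest and smallest team count on any server.
// Returns 0 for an empty input.
int teamCountSpread(const std::vector<int>& teamsPerServer) {
    if (teamsPerServer.empty()) {
        return 0;
    }
    auto mm = std::minmax_element(teamsPerServer.begin(), teamsPerServer.end());
    return *mm.second - *mm.first;
}

// True if every server has at least `desiredPerServer` teams and
// at most `maxPerServer` teams. An empty input passes trivially.
bool teamCountsWithinBounds(const std::vector<int>& teamsPerServer,
                            int desiredPerServer, int maxPerServer) {
    if (teamsPerServer.empty()) {
        return true;
    }
    auto mm = std::minmax_element(teamsPerServer.begin(), teamsPerServer.end());
    return *mm.first >= desiredPerServer && *mm.second <= maxPerServer;
}
```

The same predicates would apply to machine team counts per machine by passing the per-machine counts instead.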