I ran this command in my build directory after compiling with
OPEN_FOR_IDE. It took a few small tweaks to get it to compile, which is
outside the scope of this commit.
$ python run-clang-tidy.py -j $(nproc) -checks='-*,performance-inefficient-vector-operation' -fix
Use `clang-tidy -p . $file -checks='-*,modernize-use-override' -header-filter='.*' -fix`
to fix missing overrides, and then use git clang-format to reformat just
those changes. This went pretty well for most files.
Formatting the following files went off the rails, so I'm going to
follow up with a commit that's just clang-tidy and no clang-format.
- fdbclient/DatabaseBackupAgent.actor.cpp
- fdbclient/FileBackupAgent.actor.cpp
- fdbserver/OldTLogServer_4_6.actor.cpp
- fdbmonitor/SimpleIni.h
- fdbserver/workloads/ClientTransactionProfileCorrectness.actor.cpp
This separates the file format and how to read it from how to apply a key/value
to a TestSpec. This will allow us to reuse the same code when implementing a
TOML parser later.
Testspec is currently very permissive in very misleading ways. In particular,
the tester parser itself will swallow K=V settings and apply them at the test
level, which breaks how a person would expect the scoping to work. Other
settings apply to the entire simulation run globally, but appear to be workload
specific. Even further, others affect simulation cluster creation or test
harness behavior, but can again be set anywhere in a testspec.
This changes testspec parsing to error if a setting that applies globally is
anywhere but the top of the file, or if a setting that applies test-wide is
applied to a workload instead of a test.
We are currently emitting Role transition traces when a role starts and
when it ends. While this is useful for debugging, it doesn't work well
with tools that inject data and might potentially miss some trace lines.
We do decorate each trace lines with the roles assigned to that
particular process, however, this is not sufficient for tools that can
make use of the UID -> Role mapping
- Since the first flush failure, if the accumulated consecutive failure count exceeds the value defined in knobs, it will trigger the current worker process to report this issue via the 'GetServerDBInfo' interface of the cluster controler
- A successful flush will reset the accumulated counter.
Notice that the current solution does not take the time into consideration. The assumption is that flush failures tend to only happen in a clustered manner. The intermittent, but short, periods of flush failures are not considered as a problem since the memory pressure built by them should be negligible.
Due to randomness, when unhealthy teams are majority while there still
exists healthy teams, getTeam function may be unlucky to find
any feasible (ok) team, which leads to BestTeamStuck situation.
This commit increases the tries from 10 to 20.
A long-term solution may first find all feasible teams and choose a random
one from them. Since This can affect the statistics of which team is picked.
So it is not included in this commit.
Non-functional change: This commit removes unneeded printf introduced by
fast restore PR 1404.