Disable machine attrition in DiskFailure workload.

The machine attrition logic doesn't take into account the possibility that a disk corruption could cause an unrecoverable failure in the cluster. Before attrition was disabled during the DiskFailure workload, the DiskFailureCycle test failed in more than 10 of 100,000 runs. Afterwards, there were no failures in 100,000 runs.
2023-02-13 08:53:58 -08:00 · 2023-02-13 08:53:58 -08:00 · 5ede2d439c
parent 844890bf93
commit 5ede2d439c
1 changed file with 3 additions and 0 deletions
--- a/fdbserver/workloads/DiskFailureInjection.actor.cpp
+++ b/fdbserver/workloads/DiskFailureInjection.actor.cpp
@@ -65,6 +65,9 @@ struct DiskFailureInjectionWorkload : FailureInjectionWorkload {
 		periodicBroadcastInterval = getOption(options, "periodicBroadcastInterval"_sr, periodicBroadcastInterval);
 	}

+	// TODO: Currently this workload doesn't play well with MachineAttrition.
+	void disableFailureInjectionWorkloads(std::set<std::string>& out) const override { out.insert("Attrition"); }
+
 	void initFailureInjectionMode(DeterministicRandom& random) override { enabled = clientId == 0; }

 	Future<Void> setup(Database const& cx) override { return Void(); }
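
For readers unfamiliar with the opt-out hook, here is a minimal, self-contained sketch of how a test harness might consume disableFailureInjectionWorkloads. Only that method's signature and the "Attrition" name come from the diff above. SketchWorkload, SketchDiskFailure, SketchCycle, selectInjectors, and the "ClogSketch" candidate are hypothetical names invented for illustration, and the filtering step is an assumption about how such a set could be used, not FoundationDB's actual harness code.

```cpp
#include <memory>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a workload base class; not FoundationDB's real interface.
struct SketchWorkload {
	virtual ~SketchWorkload() = default;
	virtual std::string name() const = 0;
	// Same signature as the override in the diff: a workload adds the names of
	// failure injectors it cannot run alongside. The default adds nothing.
	virtual void disableFailureInjectionWorkloads(std::set<std::string>& out) const {}
};

// Mirrors the change in this commit: opt out of "Attrition".
struct SketchDiskFailure : SketchWorkload {
	std::string name() const override { return "DiskFailureInjection"; }
	void disableFailureInjectionWorkloads(std::set<std::string>& out) const override { out.insert("Attrition"); }
};

// A workload with no opt-outs, for contrast.
struct SketchCycle : SketchWorkload {
	std::string name() const override { return "Cycle"; }
};

// Assumed harness step: take the union of every workload's opt-outs,
// then keep only the candidate injectors that nobody disabled.
std::vector<std::string> selectInjectors(const std::vector<std::unique_ptr<SketchWorkload>>& workloads,
                                         const std::vector<std::string>& candidates) {
	std::set<std::string> disabled;
	for (const auto& w : workloads)
		w->disableFailureInjectionWorkloads(disabled);

	std::vector<std::string> selected;
	for (const auto& c : candidates)
		if (disabled.count(c) == 0)
			selected.push_back(c);
	return selected;
}
```

Under these assumptions, a test that includes SketchDiskFailure and is offered the candidates "Attrition" and "ClogSketch" would run only "ClogSketch". Collecting opt-outs into a shared set means one incompatible workload is enough to exclude an injector for the whole test.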