* Replace KeyRange with std::vector<KeyRange> in DataMoveMetaData and
CheckpointMetaData.
* Checked if ranges.empty().
* fmt.
* Resolved some comments.
Co-authored-by: He Liu <heliu@apple.com>
* Proactively clean up idempotency ids for successful commits
This change also includes some minor changes from my branch working on
an idempotency ids cleaner, that I'd like to get merged sooner rather
than later.
- Adding a timestamp to idempotency values
- Making IdempotencyId an actor file
- Adding commit_unknown_result_fatal
- Checking idempotencyIdsExpiredVersion in determineCommitStatus
- Some testing QOL changes
* Factor out decodeIdempotencyKey logic
* Fix formatting
* Update flow/include/flow/error_definitions.h
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
* Use KeyBackedObjectProperty for idempotencyIdsExpiredVersion
* Add IDEMPOTENCY_ID_IN_MEMORY_LIFETIME knob
* Rename ExpireIdempotencyKeyValuePairRequest
Also add a code probe for the case where an ExpireIdempotencyIdRequest is
received before the count is known, and add an assert
* Fix formatting and add TODO for nwijetunga
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
The simulator tracks only active processes. Rebooted or killed processes
are removed from the list of processes, and only get added back when the
process is rebooted and starts up again. This causes a problem for the
`RebootProcessAndSwitch` kill type, which wants to simultaneously reboot
all machines in a cluster and change their cluster file. If a machine is
currently being rebooted, it will miss the `RebootProcessAndSwitch`
command.
The fix is to add a check when a process is being started in simulation.
If the process has had its cluster file changed and the cluster is in a
state where all processes should have had their cluster files reverted
to the original value, the simulator will now send a
`RebootProcessAndSwitch` signal right when the process is started. This
will cause an extra reboot, but should correctly switch the process back
to its original cluster file, allowing all clusters to fully recover.
Note that the above issue should only affect simulation, due to how the
simulator tracks processes and handles kill signals.
This commit also adds a field to each process struct to determine
whether the process is being run in a DR cluster in the simulation run.
This is needed because simulation does not differentiate between
processes in different clusters (other than by the IP), and some
processes needed to switch clusters and some simply needed to be
rebooted.
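
The catch-up check described above can be sketched roughly as follows.
This is a simplified illustration, not the simulator's actual code: the
`Simulator`, `ProcessInfo`, `broadcastSwitch`, `onProcessStart`, and
`expectOriginalClusterFile` names are hypothetical stand-ins, and
delivering a signal is modeled as an immediate toggle of the cluster
file rather than a real reboot.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the simulator's types.
enum class KillType { Reboot, RebootProcessAndSwitch };

struct ProcessInfo {
    std::string address;
    bool clusterFileSwitched = false; // true while using the non-original cluster file
    std::vector<KillType> signals;    // every signal delivered to this process
    explicit ProcessInfo(std::string addr) : address(std::move(addr)) {}
};

struct Simulator {
    std::vector<ProcessInfo*> active;      // only active processes are tracked
    bool expectOriginalClusterFile = true; // assumed cluster-wide state flag

    // Modeled as an immediate toggle; a real reboot is asynchronous.
    static void deliverSwitch(ProcessInfo& p) {
        p.signals.push_back(KillType::RebootProcessAndSwitch);
        p.clusterFileSwitched = !p.clusterFileSwitched;
    }

    // Reboot-and-switch every process the simulator currently knows about.
    // Processes that are mid-reboot are not in `active` and miss the signal.
    void broadcastSwitch() {
        expectOriginalClusterFile = !expectOriginalClusterFile;
        for (ProcessInfo* p : active)
            deliverSwitch(*p);
    }

    // A killed or rebooting process drops out of the active list.
    void removeProcess(ProcessInfo& p) {
        active.erase(std::remove(active.begin(), active.end(), &p), active.end());
    }

    // The fix: when a process starts, check whether it missed a switch back
    // to the original cluster file and, if so, send the signal right away.
    void onProcessStart(ProcessInfo& p) {
        assert(std::find(active.begin(), active.end(), &p) == active.end());
        active.push_back(&p);
        if (p.clusterFileSwitched && expectOriginalClusterFile)
            deliverSwitch(p); // costs one extra reboot
    }
};
```

In this sketch, a process removed between two `broadcastSwitch()` calls
still holds the switched cluster file when it comes back, and
`onProcessStart` is the only place that can notice and correct it.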
* Enable EncryptKeyProxy test and add code probes
Description
diff-1: Address review comments
Major changes include:
1. Enable EncryptKeyProxyTest. The test was modified to leverage the
GetEncryptCipherKey actor, as the existing code wouldn't work if the EKP
process was restarted while the test was running.
2. Add code probes to EKP
3. Minor refactoring.
Testing
EncryptKeyProxyTest - 100K