This is a rewrite of the BUGGIFY function/macros. Simulation performance
appears to improve noticeably, e.g.
fdbserver -r simulation -b on -f ../CycleTest.toml -s 99438
Without this patch:
Unseed: 54646
Elapsed: 494.091327 simsec, 14.586831 real seconds
With this patch:
Unseed: 54646
Elapsed: 494.091327 simsec, 12.580612 real seconds
I expected an improvement, but not one as large as ~13%.
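
As a quick check of that figure, the relative saving can be recomputed from the two wall-clock times reported above. This is only arithmetic on the quoted numbers; the helper name `relative_improvement` is made up for illustration.

```python
# Wall-clock times quoted above, in real seconds.
without_patch = 14.586831
with_patch = 12.580612


def relative_improvement(before: float, after: float) -> float:
    """Fraction of wall-clock time saved, relative to the baseline run."""
    return (before - after) / before


improvement = relative_improvement(without_patch, with_patch)
print(f"saved {without_patch - with_patch:.3f} s, {improvement:.1%} of the baseline")
```

This is a single run with one seed, so the number is indicative rather than a benchmark; the simulated time and unseed are identical in both runs, which suggests the rewrite did not change simulation behavior for this seed.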
The CC sets a version to int_max in ClientDBInfo to indicate a refresh; however,
the proxy server rejects this version with a future_version error.
This change fixes the issue by not sending int_max, and instead maintaining a
lastKnown in memory and sending it to the GRV proxy to get the latest globalconfig.
This change also fixes some Java tests that were used to test the fix.
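
The idea can be sketched roughly as follows. This is a toy model, not the actual FoundationDB code: the classes `ToyGrvProxy` and `ToyClient`, the method names, and the use of a plain exception for future_version are all assumptions made for illustration.

```python
INT_MAX = 2**31 - 1  # stand-in for the sentinel; the real constant's width is an assumption


class FutureVersionError(Exception):
    """Hypothetical stand-in for the future_version error."""


class ToyGrvProxy:
    """Toy proxy that rejects any requested version ahead of its own."""

    def __init__(self, current_version: int) -> None:
        self.current_version = current_version

    def get_global_config(self, requested_version: int) -> dict:
        if requested_version > self.current_version:
            raise FutureVersionError(requested_version)
        return {"version": self.current_version}


class ToyClient:
    """Keeps the last version it saw in memory instead of sending a sentinel."""

    def __init__(self) -> None:
        self.last_known = 0

    def refresh(self, proxy: ToyGrvProxy) -> dict:
        # Send the last known version, which the proxy can always serve,
        # rather than INT_MAX, which it would reject as a future version.
        config = proxy.get_global_config(self.last_known)
        self.last_known = config["version"]
        return config
```

In this model, `ToyClient().refresh(ToyGrvProxy(100))` succeeds and leaves `last_known` at 100, while calling `get_global_config(INT_MAX)` directly raises the toy future-version error.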
CC's checkBetterSingletons() calls getUsedIds(), which asserts that proxy
interfaces are present. However, when a GRV/commit proxy fails, the proxy's
processId becomes empty before CC starts a new recovery, triggering the failure.
The fix is to cancel the caller while waiting for recovery.
To reproduce (7.1 commit 725a08a3ff, clang build):
./fdbserver.6.0.15 -r simulation -f ./tests/restarting/from_5.2.0_until_6.3.0/ClientTransactionProfilingCorrectness-1.txt -s 900000399 -b on
-f ./tests/restarting/from_5.2.0_until_6.3.0/ClientTransactionProfilingCorrectness-2.txt --restarting -s 900000400 -b on
* ConsistencyCheckerUrgent repeated run
* address comments
* avoid tracing SevError for TesterRecruitmentTimeout unless it keeps failing for over 1 day
* address comments
* address comments
* acs framework
* code refactor and fix bugs
* add ss crash loop protector
* use sharedptr instead of raw pointer
* fix critical bugs and add private mutation acs to the framework
* enable ACS for all mutations except for clear serverTag mutation and fix bugs
* fix restarting tests
* refactor code and fix bugs
* fix AccumulativeChecksumState toString
* fix bugs
* allow all mutations in acs and fixed bugs
* fix bugs and code cleanup
* code clean up for adding recovery support
* simplify code and support recovery
* clear acs state at ss
* fix bug
* terminate validator if ss will be removed in the current batch
* simplify code
* add trace
* address comments
* optimize code
* deep copy when adding mutation to acs validator
* wrap encode and decode of persisted acs key
* make acstable private
* remove useless func
* remove useless func
* remove epoch in ACS validator
* add acs mutation counter in SS metrics
* code cleanup and make knob check better
* make mutation buffer global
* simplify code
* add comments
* make knob randomly set
* address comments
* ss reboot after acs mismatch found
* Fix detection of private mutations in version vector
* add assertion that all tlogs receive changes to txn state in version vector
* Re-suppress version_vector upgrade tests
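
The TesterRecruitmentTimeout item in the list above describes escalating to SevError only when the failure persists for more than a day. A minimal sketch of that policy follows, assuming the caller supplies timestamps in seconds; the class name, method names, and numeric severity values are placeholders for illustration and are not taken from the patch.

```python
SEV_WARN = 20   # placeholder severity values, for illustration only
SEV_ERROR = 40
ONE_DAY_SECONDS = 24 * 60 * 60


class RecruitmentTimeoutTracker:
    """Reports a warning until timeouts have persisted past a threshold."""

    def __init__(self, escalate_after: float = ONE_DAY_SECONDS) -> None:
        self.escalate_after = escalate_after
        self.first_failure_at = None  # start of the current failure streak

    def on_success(self) -> None:
        # Any success ends the streak, so the escalation clock restarts.
        self.first_failure_at = None

    def on_timeout(self, now: float) -> int:
        # Remember when the streak began, then escalate only once it has
        # lasted strictly longer than the threshold.
        if self.first_failure_at is None:
            self.first_failure_at = now
        if now - self.first_failure_at > self.escalate_after:
            return SEV_ERROR
        return SEV_WARN
```

Tracking the start of the failure streak, rather than counting failures, keeps the behavior independent of how often recruitment is retried.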
---------
Co-authored-by: Dan Lambright <hlambright@apple.com>