25680 Commits

Author SHA1 Message Date
Steve Atherton 216d0be2cf
Add processID, networkAddress, and locality to layer status JSON for Backup Agents. (#9736)
* Add processID, networkAddress, and locality to layer status JSON for Backup Agents.

* Backup/DR agent determines the network address to report in Layer Status only once, when the status updater loop begins, since doing so is a blocking call that connects to the cluster. Also includes a lot of code cleanup.
2023-03-17 18:07:03 -07:00
Evan Tschannen 4d757fbcd3 format tester 2023-03-17 15:09:31 -07:00
Evan Tschannen a7e1616de8 code cleanup 2023-03-17 14:57:07 -07:00
A.J. Beamon a685a01c07
Merge pull request #9744 from sfc-gh-ajbeamon/fix-consistency-check-retry-on-get-key-servers
The consistency check should retry if it couldn't find all the commit proxies when getting key server locations
2023-03-17 14:05:33 -07:00
A.J. Beamon fe5d0928f3 Remove doEmptyCommit function 2023-03-17 12:58:41 -07:00
A.J. Beamon dc2bd78aa7 The consistency check should retry if it couldn't find all the commit proxies when getting key server locations 2023-03-17 12:00:47 -07:00
Evan Tschannen 6882f21c10 fix: only consider large team creation successful if the team has the correct size 2023-03-17 10:52:53 -07:00
Evan Tschannen 73767501d4 Merge branch 'main' into feature-custom-dd
# Conflicts:
#	fdbserver/tester.actor.cpp
2023-03-17 10:33:38 -07:00
Evan Tschannen de7f40d2f4 made a variety of improvements that came from code review 2023-03-17 10:30:34 -07:00
Ata E Husain Bohra c492f83bf4
EaR: Avoid appending `tls` to the URL (#9734)
Description

Patch proposes two changes:

1. Avoid appending tls as part of URI for secure connections
2. RefreshEKs recurring task can be skipped if there are no keys to be refreshed

Testing

EncryptionOps.toml
EncryptKeyProxyTest.toml
devRunCorrectness 
devRunCorrectnessFiltered 'Encrypt*'
2023-03-16 22:52:51 -07:00
He Liu 0f5e75b34b
Added newDataMoveId(). (#9647)
* Added newDataMoveId().

* Added `ENABLE_DD_PHYSICAL_SHARD_MOVE`

* fmt.

* Replace `teamId` with `shardId`.
2023-03-16 18:06:06 -07:00
A.J. Beamon aeaedb147f
Merge pull request #9727 from sfc-gh-ajbeamon/fix-shared-remote-region-kills
Avoid killing too many machines if one region is being shared between the remote primary and a satellite
2023-03-16 17:46:12 -07:00
Josh Slocum 3c1ac344f1
buggify blob granule compression per-file (#9670) 2023-03-16 17:46:18 -05:00
A.J. Beamon 6818ce950c Fix check that excludes satellites from consideration to consider satelliteTLogReplicationFactor and satelliteTLogUsableDcs. Update trace event with more info about the updated policy. 2023-03-16 14:27:30 -07:00
Jingyu Zhou 84f5a92936
Merge pull request #9728 from sfc-gh-satherton/corrupt-block-check-clarity
Rewrite corrupt block number calculations to be more clear.
2023-03-16 14:14:58 -07:00
Steve Atherton 5c795c3abe Rewrite corrupt block number calculations to be more clear. 2023-03-16 13:02:15 -07:00
Markus Pilman df5b15e56c
Merge pull request #9634 from sfc-gh-mpilman/features/negative-simulation
Framework to write negative tests
2023-03-16 12:47:02 -07:00
A.J. Beamon 735327f1cf
Merge pull request #9718 from sfc-gh-ajbeamon/decrease-duration-of-automatic-idempotency-workload
Decrease number of transactions in automatic idempotency workload
2023-03-16 12:31:24 -07:00
A.J. Beamon 484a414117 Increase the buggified tag measurement interval to reduce trace spam 2023-03-16 11:53:45 -07:00
A.J. Beamon 75b8148e91 If one region is being shared between the remote primary and a satellite, the simulator could kill too many machines 2023-03-16 11:24:12 -07:00
A.J. Beamon f8255fe7a1
Merge pull request #9724 from sfc-gh-ajbeamon/fix-disk-corruption-check
Fix possible off-by-one in the simulation upper bound check for page corruption
2023-03-16 11:05:25 -07:00
Josh Slocum c7c41bc9db
adding implementation and check for blob worker exclusion (#9700) 2023-03-16 12:09:43 -05:00
Evan Tschannen ac54962533 code cleanup 2023-03-16 09:47:21 -07:00
A.J. Beamon 99f75a9bb1 Fix possible off-by-one in the simulation upper bound check for page corruption 2023-03-16 09:31:11 -07:00
Jingyu Zhou adda32db46
Merge pull request #9691 from sfc-gh-dadkins/sfc-gh-dadkins/commit-proxy-unavailable
Replace 10-second delay with explicit wait for cluster recovery in checkExtraDataStores
2023-03-16 09:21:59 -07:00
A.J. Beamon 91ebf2ffc8
Merge pull request #9714 from sfc-gh-ajbeamon/fix-tenant-id-increment
More carefully validate tenant increments to avoid overflows
2023-03-15 18:53:44 -07:00
A.J. Beamon 8bc75eb313
Merge pull request #9716 from sfc-gh-ajbeamon/fix-storage-quota-enables-tenant-aware-dd
The `DD_TENANT_AWARENESS_ENABLED` knob was accidentally bypassed by the `STORAGE_QUOTA_ENABLED` knob
2023-03-15 18:52:56 -07:00
A.J. Beamon 4b8311d932 The automatic idempotency workload has a long runtime and can occasionally log too many events, etc. This decreases the number of transactions it runs significantly to avoid that issue. 2023-03-15 18:45:26 -07:00
A.J. Beamon 436a187171 Merge branch 'main' into fix-storage-quota-enables-tenant-aware-dd 2023-03-15 17:59:01 -07:00
A.J. Beamon 6d5ffa11f9 Merge branch 'main' into fix-tenant-id-increment 2023-03-15 17:56:42 -07:00
A.J. Beamon a6202253a4
When a storage server fails to register (e.g. due to worker_removed), we need to throw that error to terminate the SS. (#9712) 2023-03-15 17:46:21 -07:00
A.J. Beamon 3f9d51db4e The DD_TENANT_AWARENESS_ENABLED knob was indirectly disabling the feature by not initializing a dd tenant cache, but this could be bypassed by enabling storage quotas. This makes the knob more explicitly control the feature. 2023-03-15 15:56:24 -07:00
Josh Slocum b4eb665f1d
fixing copy constructor error and adding test for it (#9711) 2023-03-15 15:33:16 -07:00
A.J. Beamon 3881f1ccc6 More carefully validate tenant increments to avoid overflows 2023-03-15 14:56:12 -07:00
Markus Pilman b4bc24ae2c Fix includes (these broke the windows build) 2023-03-15 14:06:33 -07:00
Ata E Husain Bohra dbcab0b1bd
Revert "Refactor GetEncryptCipherKeys (#9600)" (#9708)
This reverts commit 2702665e35.
2023-03-15 12:10:08 -07:00
Evan Tschannen aaf7b9b32b Added the ability to manually create a shard and also increase its replication factor 2023-03-15 11:26:15 -07:00
Markus Pilman 3da6582f3f Merge remote-tracking branch 'sfc/features/negative-simulation' into features/negative-simulation 2023-03-15 11:23:14 -07:00
Markus Pilman 303b833d7b Adding data corruption test to verify consistency check 2023-03-15 11:22:25 -07:00
Markus Pilman e5444dd4e1 Adding additional tests 2023-03-15 11:22:25 -07:00
Markus Pilman 3050aa611f Make process events reentrant safe 2023-03-15 11:22:25 -07:00
Markus Pilman 79447c6e06 First successful negative run 2023-03-15 11:22:25 -07:00
Markus Pilman 3894d5069e fix compiler error 2023-03-15 11:22:25 -07:00
Markus Pilman 7a108a2768 Add framework for writing negative simulation tests 2023-03-15 11:22:25 -07:00
Markus Pilman aa09baadab
Merge pull request #9635 from sfc-gh-etschannen/fix-consistency-check
Fix: the consistency check did not properly report failed tests
2023-03-15 11:21:44 -07:00
Evan Tschannen 6c1d02a14f
Merge pull request #9703 from sfc-gh-jslocum/bg_file_logical_size
adding blob granule logical size
2023-03-15 09:59:57 -07:00
Evan Tschannen 2f96627d43 merge in main 2023-03-15 09:26:22 -07:00
Jingyu Zhou bc380c9a5d
Merge pull request #9699 from sfc-gh-xwang/fix/main/tcTest
fix unit test failure because of implicit uint16_t conversion to int
2023-03-15 09:18:10 -07:00
Evan Tschannen 0a8435b742
Merge pull request #9702 from sfc-gh-jslocum/dbg_bg_ctest_timeout
fixing 2 bugs related to high delta file waitCommitted latency
2023-03-15 08:52:35 -07:00
Josh Slocum 895b418384 undo json traces 2023-03-15 09:39:26 -05:00