Steve Atherton
216d0be2cf
Add processID, networkAddress, and locality to layer status JSON for Backup Agents. (#9736)
...
* Add processID, networkAddress, and locality to layer status JSON for Backup Agents.
* Backup/dr agent determines network address to report in Layer Status only once, when the status updater loop begins, since it is a blocking call which connects to the cluster. And lots of code cleanup.
2023-03-17 18:07:03 -07:00
Evan Tschannen
4d757fbcd3
format tester
2023-03-17 15:09:31 -07:00
Evan Tschannen
a7e1616de8
code cleanup
2023-03-17 14:57:07 -07:00
A.J. Beamon
a685a01c07
Merge pull request #9744 from sfc-gh-ajbeamon/fix-consistency-check-retry-on-get-key-servers
...
The consistency check should retry if it couldn't find all the commit proxies when getting key server locations
2023-03-17 14:05:33 -07:00
A.J. Beamon
fe5d0928f3
Remove doEmptyCommit function
2023-03-17 12:58:41 -07:00
A.J. Beamon
dc2bd78aa7
The consistency check should retry if it couldn't find all the commit proxies when getting key server locations
2023-03-17 12:00:47 -07:00
Evan Tschannen
6882f21c10
fix: only consider large team creation successful if the team has the correct size
2023-03-17 10:52:53 -07:00
Evan Tschannen
73767501d4
Merge branch 'main' into feature-custom-dd
...
# Conflicts:
# fdbserver/tester.actor.cpp
2023-03-17 10:33:38 -07:00
Evan Tschannen
de7f40d2f4
made a variety of improvements that came from code review
2023-03-17 10:30:34 -07:00
Ata E Husain Bohra
c492f83bf4
EaR: Avoid appending `tls` to the URL (#9734)
...
Description
Patch proposes two changes:
1. Avoid appending tls as part of URI for secure connections
2. RefreshEKs recurring task can be skipped if there are no keys to be refreshed
Testing
EncryptionOps.toml
EncryptKeyProxyTest.toml
devRunCorrectness
devRunCorrectnessFiltered 'Encrypt*'
2023-03-16 22:52:51 -07:00
He Liu
0f5e75b34b
Added newDataMoveId(). (#9647)
...
* Added newDataMoveId().
* Added `ENABLE_DD_PHYSICAL_SHARD_MOVE`
* fmt.
* Replace `teamId` with `shardId`.
2023-03-16 18:06:06 -07:00
A.J. Beamon
aeaedb147f
Merge pull request #9727 from sfc-gh-ajbeamon/fix-shared-remote-region-kills
...
Avoid killing too many machines if one region is being shared between the remote primary and a satellite
2023-03-16 17:46:12 -07:00
Josh Slocum
3c1ac344f1
buggify blob granule compression per-file (#9670)
2023-03-16 17:46:18 -05:00
A.J. Beamon
6818ce950c
Fix check that excludes satellites from consideration to consider satelliteTLogReplicationFactor and satelliteTLogUsableDcs. Update trace event with more info about the updated policy.
2023-03-16 14:27:30 -07:00
Jingyu Zhou
84f5a92936
Merge pull request #9728 from sfc-gh-satherton/corrupt-block-check-clarity
...
Rewrite corrupt block number calculations to be more clear.
2023-03-16 14:14:58 -07:00
Steve Atherton
5c795c3abe
Rewrite corrupt block number calculations to be more clear.
2023-03-16 13:02:15 -07:00
Markus Pilman
df5b15e56c
Merge pull request #9634 from sfc-gh-mpilman/features/negative-simulation
...
Framework to write negative tests
2023-03-16 12:47:02 -07:00
A.J. Beamon
735327f1cf
Merge pull request #9718 from sfc-gh-ajbeamon/decrease-duration-of-automatic-idempotency-workload
...
Decrease number of transactions in automatic idempotency workload
2023-03-16 12:31:24 -07:00
A.J. Beamon
484a414117
Increase the buggified tag measurement interval to reduce trace spam
2023-03-16 11:53:45 -07:00
A.J. Beamon
75b8148e91
If one region is being shared between the remote primary and a satellite, the simulator could kill too many machines
2023-03-16 11:24:12 -07:00
A.J. Beamon
f8255fe7a1
Merge pull request #9724 from sfc-gh-ajbeamon/fix-disk-corruption-check
...
Fix possible off-by-one in the simulation upper bound check for page corruption
2023-03-16 11:05:25 -07:00
Josh Slocum
c7c41bc9db
adding implementation and check for blob worker exclusion (#9700)
2023-03-16 12:09:43 -05:00
Evan Tschannen
ac54962533
code cleanup
2023-03-16 09:47:21 -07:00
A.J. Beamon
99f75a9bb1
Fix possible off-by-one in the simulation upper bound check for page corruption
2023-03-16 09:31:11 -07:00
Jingyu Zhou
adda32db46
Merge pull request #9691 from sfc-gh-dadkins/sfc-gh-dadkins/commit-proxy-unavailable
...
Replace 10-second delay with explicit wait for cluster recovery in checkExtraDataStores
2023-03-16 09:21:59 -07:00
A.J. Beamon
91ebf2ffc8
Merge pull request #9714 from sfc-gh-ajbeamon/fix-tenant-id-increment
...
More carefully validate tenant increments to avoid overflows
2023-03-15 18:53:44 -07:00
A.J. Beamon
8bc75eb313
Merge pull request #9716 from sfc-gh-ajbeamon/fix-storage-quota-enables-tenant-aware-dd
...
The `DD_TENANT_AWARENESS_ENABLED` knob was accidentally bypassed by the `STORAGE_QUOTA_ENABLED` knob
2023-03-15 18:52:56 -07:00
A.J. Beamon
4b8311d932
The automatic idempotency workload has a long runtime and can occasionally log too many events, etc. This decreases the number of transactions it runs significantly to avoid that issue.
2023-03-15 18:45:26 -07:00
A.J. Beamon
436a187171
Merge branch 'main' into fix-storage-quota-enables-tenant-aware-dd
2023-03-15 17:59:01 -07:00
A.J. Beamon
6d5ffa11f9
Merge branch 'main' into fix-tenant-id-increment
2023-03-15 17:56:42 -07:00
A.J. Beamon
a6202253a4
When a storage server fails to register (e.g. due to worker_removed), we need to throw that error to terminate the SS. (#9712)
2023-03-15 17:46:21 -07:00
A.J. Beamon
3f9d51db4e
The DD_TENANT_AWARENESS_ENABLED knob was indirectly disabling the feature by not initializing a dd tenant cache, but this could be bypassed by enabling storage quotas. This makes the knob more explicitly control the feature.
2023-03-15 15:56:24 -07:00
Josh Slocum
b4eb665f1d
fixing copy constructor error and adding test for it (#9711)
2023-03-15 15:33:16 -07:00
A.J. Beamon
3881f1ccc6
More carefully validate tenant increments to avoid overflows
2023-03-15 14:56:12 -07:00
Markus Pilman
b4bc24ae2c
Fix includes (these broke the windows build)
2023-03-15 14:06:33 -07:00
Ata E Husain Bohra
dbcab0b1bd
Revert "Refactor GetEncryptCipherKeys (#9600)" (#9708)
...
This reverts commit 2702665e35.
2023-03-15 12:10:08 -07:00
Evan Tschannen
aaf7b9b32b
Added the ability to manually create a shard and also increase its replication factor
2023-03-15 11:26:15 -07:00
Markus Pilman
3da6582f3f
Merge remote-tracking branch 'sfc/features/negative-simulation' into features/negative-simulation
2023-03-15 11:23:14 -07:00
Markus Pilman
303b833d7b
Adding data corruption test to verify consistency check
2023-03-15 11:22:25 -07:00
Markus Pilman
e5444dd4e1
Adding additional tests
2023-03-15 11:22:25 -07:00
Markus Pilman
3050aa611f
Make process events reentrant safe
2023-03-15 11:22:25 -07:00
Markus Pilman
79447c6e06
First successful negative run
2023-03-15 11:22:25 -07:00
Markus Pilman
3894d5069e
fix compiler error
2023-03-15 11:22:25 -07:00
Markus Pilman
7a108a2768
Add framework for writing negative simulation tests
2023-03-15 11:22:25 -07:00
Markus Pilman
aa09baadab
Merge pull request #9635 from sfc-gh-etschannen/fix-consistency-check
...
Fix: the consistency check did not properly report failed tests
2023-03-15 11:21:44 -07:00
Evan Tschannen
6c1d02a14f
Merge pull request #9703 from sfc-gh-jslocum/bg_file_logical_size
...
adding blob granule logical size
2023-03-15 09:59:57 -07:00
Evan Tschannen
2f96627d43
merge in main
2023-03-15 09:26:22 -07:00
Jingyu Zhou
bc380c9a5d
Merge pull request #9699 from sfc-gh-xwang/fix/main/tcTest
...
fix unit test failure because of implicit uint16_t conversion to int
2023-03-15 09:18:10 -07:00
Evan Tschannen
0a8435b742
Merge pull request #9702 from sfc-gh-jslocum/dbg_bg_ctest_timeout
...
fixing 2 bugs related to high delta file waitCommitted latency
2023-03-15 08:52:35 -07:00
Josh Slocum
895b418384
undo json traces
2023-03-15 09:39:26 -05:00