Commit Graph

2626 Commits

Author SHA1 Message Date
sfc-gh-tclinkenbeard 0eb1598afa Merge remote-tracking branch 'origin/main' into expose-txn-cost 2022-10-30 09:36:37 -07:00
Steve Atherton b3017ae330
Merge pull request #8577 from sfc-gh-satherton/storageserver-pml
Unrevert #7578 - new PriorityMultiLock and StorageServer read prioritization.
2022-10-28 16:04:44 -07:00
He Liu 7bb823edbe
Replace KeyRange with std::vector<KeyRange> in DataMoveMetaData and (#8591)
* Replace KeyRange with std::vector<KeyRange> in DataMoveMetaData and
CheckpointMetaData.

* Checked if ranges.empty().

* fmt.

* Resolved some comments.

Co-authored-by: He Liu <heliu@apple.com>
2022-10-28 15:22:55 -07:00
Steve Atherton 326d45819e Merge branch 'main' into storageserver-pml 2022-10-28 14:14:44 -07:00
Andrew Noyes 0a15f081a1
Proactively clean up idempotency ids for successful commits (#8578)
* Proactively clean up idempotency ids for successful commits

This change also includes some minor changes from my branch working on
an idempotency ids cleaner, that I'd like to get merged sooner rather
than later.

- Adding a timestamp to idempotency values
- Making IdempotencyId an actor file
- Adding commit_unknown_result_fatal
- Checking idempotencyIdsExpiredVersion in determineCommitStatus
- Some testing QOL changes

* Factor out decodeIdempotencyKey logic

* Fix formatting

* Update flow/include/flow/error_definitions.h

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>

* Use KeyBackedObjectProperty for idempotencyIdsExpiredVersion

* Add IDEMPOTENCY_ID_IN_MEMORY_LIFETIME knob

* Rename ExpireIdempotencyKeyValuePairRequest

Also add a code probe for the case where an ExpireIdempotencyIdRequest is
received before the count is known, and add an assert

* Fix formatting and add TODO for nwijetunga

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2022-10-28 09:07:54 -07:00
Steve Atherton f9ad7fb35b Merge origin/main into storageserver-pml 2022-10-27 18:00:11 -07:00
Lukas Joswiak 1fca3b7ddc Modify how cluster ID tests are run in simulation 2022-10-27 13:56:13 -07:00
Lukas Joswiak 5ca2b89bdf Fix simulation issue where process switch was ignored
The simulator tracks only active processes. Rebooted or killed processes
are removed from the list of processes, and only get added back when the
process is rebooted and starts up again. This causes a problem for the
`RebootProcessAndSwitch` kill type, which wants to simultaneously reboot
all machines in a cluster and change their cluster file. If a machine is
currently being rebooted, it will miss the reboot process and switch
command.

The fix is to add a check when a process is being started in simulation.
If the process has had its cluster file changed and the cluster is in a
state where all processes should have had their cluster files reverted
to the original value, the simulator will now send a
`RebootProcessAndSwitch` signal right when the process is started. This
will cause an extra reboot, but should correctly switch the process back
to its original, correct cluster file, allowing all clusters to fully
recover.

Note that the above issue should only affect simulation, due to how the
simulator tracks processes and handles kill signals.

This commit also adds a field to each process struct to determine
whether the process is being run in a DR cluster in the simulation run.
This is needed because simulation does not differentiate between
processes in different clusters (other than by the IP), and some
processes needed to switch clusters and some simply needed to be
rebooted.
2022-10-27 13:56:13 -07:00
Lukas Joswiak a72066be33 Add simulation support for changing the cluster file 2022-10-27 13:56:13 -07:00
Nim Wijetunga bf01d9b879
Bulk Setup Workload Improvements (#8573)
* bulk setup workload improvements

* fix workload

* modify
2022-10-27 11:10:14 -07:00
Josh Slocum 4d3553481f
Blob connection provider test (#8478)
* Refactoring test blob metadata creation

* Implementing BlobConnectionProviderTest

* createRandomTestBlobMetadata supports blobstore and works outside simulation
2022-10-27 10:44:06 -05:00
Dennis Zhou deeedfc3f8
Merge pull request #8537 from sfc-gh-dzhou/unblob
blob: allow purge ranges to begin and end in unblobbified regions
2022-10-26 11:11:09 -07:00
Nim Wijetunga 6f37f55917
Restore System Keys First in Backup/Restore Workloads (#8475)
* system key restore ordering

* restore system keys before regular data

* atomic restore backup fix

* change testing

* fix compile error

* fix compile issue

* fix compile issues

* Trigger Build

* only split restore if encryption is enabled

* revert knob changes

* Update fdbserver/workloads/AtomicSwitchover.actor.cpp

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>

* Update fdbserver/workloads/AtomicSwitchover.actor.cpp

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>

* Update fdbserver/workloads/BackupCorrectness.actor.cpp

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>

* Update fdbserver/workloads/AtomicRestore.actor.cpp

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>

* add todo

* strengthen check

* separate system restore for atomic restore

* address pr comments

* address pr comments

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2022-10-26 09:38:27 -07:00
Josh Slocum ab6953be7d
Blob Granule read-driven compaction (#8572) 2022-10-26 09:02:50 -07:00
Xiaoxi Wang bb0236433c
Merge pull request #8540 from sfc-gh-xwang/feature/main/storageMetrics
Make MockStorageServer serve StorageMetrics related request
2022-10-25 17:29:21 -07:00
Trevor Clinkenbeard 0f4fddfa17
Merge pull request #8480 from sfc-gh-tclinkenbeard/reject-tag-throttled-txns
Reject transactions that have been tag throttled too long
2022-10-25 15:34:07 -07:00
Jingyu Zhou 744c391608
Merge pull request #8539 from vishesh/cc-fail-later
Don't fail ConsistencyCheck on first mismatch
2022-10-25 15:33:11 -07:00
sfc-gh-tclinkenbeard 1c119be26d Merge remote-tracking branch 'origin/main' into expose-txn-cost 2022-10-25 13:50:43 -07:00
sfc-gh-tclinkenbeard f339819758 Merge remote-tracking branch 'origin/main' into reject-tag-throttled-txns 2022-10-25 11:59:35 -07:00
Steve Atherton 27dc180b68 Merge commit '0ae568a872e474c8c755e648efbbe4524e63e445' into storageserver-pml
# Conflicts:
#	fdbserver/VersionedBTree.actor.cpp
2022-10-24 22:31:36 -07:00
Dennis Zhou 136a325fdc blob/testing: randomly purge the whole range instead of just active 2022-10-24 11:08:04 -07:00
Dennis Zhou 070e4c133e blob/testing: remove setRange() and call (un)blobbifyRange() directly
This also fixes a few wrong setRange(true/false).
2022-10-24 11:08:04 -07:00
Xiaoxi Wang 3c67b7df39 extract serveStorageMetricsRequests template function 2022-10-24 09:58:41 -07:00
sfc-gh-tclinkenbeard 4fbaa816ea Fix several bugs in TransactionCostWorkload 2022-10-23 13:57:24 -07:00
sfc-gh-tclinkenbeard 32ae7bb529 Merge remote-tracking branch 'origin/main' into expose-txn-cost 2022-10-23 12:59:07 -07:00
sfc-gh-tclinkenbeard b442705dc7 Change units for tag quota enforcement from pages to bytes 2022-10-23 12:57:19 -07:00
Vishesh Yadav aa99b89d53 Don't fail ConsistencyCheck on first mismatch
ConsistencyCheck fails when it sees the first corrupted shard. We may want to
keep it running so that we can see all the corrupted data in logs.
2022-10-21 17:02:02 -07:00
Ankita Kejriwal c34a23152c Change the storage quota type from uint64_t to int64_t
With this change, the storage quota will be of the same type
as the storage bytes used returned by `getEstimatedRangeSizeBytes`.
2022-10-21 16:18:52 -07:00
Jingyu Zhou b839489661
Merge pull request #8518 from liquid-helium/fix-fetch-checkpoint
Fixed file offset not reset issue.
2022-10-21 09:07:55 -07:00
Jon Fu 60e76ef4a7 Merge branch 'main' of github.com:apple/foundationdb into tenant-restarting-tests 2022-10-20 17:57:22 -07:00
Jingyu Zhou a8391caf23 Revert "Data loss protection v2" 2022-10-20 18:09:58 -05:00
He Liu 915ab8c402 Fixed file offset not reset issue. 2022-10-20 15:24:00 -07:00
Nim Wijetunga 2745168d72
Add Testing for LastTenantModification Field (#8509)
* add testing

* address pr comments
2022-10-20 13:38:38 -07:00
Jon Fu 3533cff94d process '_experimental' suffix in fromString helper 2022-10-20 13:38:13 -07:00
Jon Fu c6998a7185 change tenantModes option to accept array of string 2022-10-20 11:44:03 -07:00
Jon Fu 04f709c7cb Merge branch 'main' of github.com:apple/foundationdb into tenant-restarting-tests 2022-10-19 14:30:55 -07:00
Steve Atherton e5a5ec36a4 Merge commit '0872cbfb2f00886817f18584d95af217e28ad51d' into storageserver-pml
# Conflicts:
#	fdbserver/storageserver.actor.cpp
2022-10-19 13:25:31 -07:00
Jingyu Zhou c357a054a7
Merge pull request #8472 from sfc-gh-ljoswiak/fixes/cluster-id-v2
Data loss protection v2
2022-10-19 12:49:05 -07:00
Lukas Joswiak cf6c5c3a47 Modify how cluster ID tests are run in simulation 2022-10-18 21:37:42 -07:00
Lukas Joswiak 7f889c87e3 Fix simulation issue where process switch was ignored
The simulator tracks only active processes. Rebooted or killed processes
are removed from the list of processes, and only get added back when the
process is rebooted and starts up again. This causes a problem for the
`RebootProcessAndSwitch` kill type, which wants to simultaneously reboot
all machines in a cluster and change their cluster file. If a machine is
currently being rebooted, it will miss the reboot process and switch
command.

The fix is to add a check when a process is being started in simulation.
If the process has had its cluster file changed and the cluster is in a
state where all processes should have had their cluster files reverted
to the original value, the simulator will now send a
`RebootProcessAndSwitch` signal right when the process is started. This
will cause an extra reboot, but should correctly switch the process back
to its original, correct cluster file, allowing all clusters to fully
recover.

Note that the above issue should only affect simulation, due to how the
simulator tracks processes and handles kill signals.

This commit also adds a field to each process struct to determine
whether the process is being run in a DR cluster in the simulation run.
This is needed because simulation does not differentiate between
processes in different clusters (other than by the IP), and some
processes needed to switch clusters and some simply needed to be
rebooted.
2022-10-18 21:37:42 -07:00
Lukas Joswiak 43854c5ac8 Add simulation support for changing the cluster file 2022-10-18 21:37:16 -07:00
Nim Wijetunga d439bc1e6e
TenantEntryCache Watch Based Refresh (#8399)
* tenant modification changekey

* address pr comments

* change backup to use watch based tenant cache

* format

* address pr comments

* trigger build

* add todo

* check tenants disabled

* trigger build

* trigger build

* address pr comments

* address pr comments

* trigger build
2022-10-18 19:05:07 -07:00
sfc-gh-tclinkenbeard 300840ea2e Enable GLOBAL_TAG_THROTTLING by default 2022-10-18 15:16:24 -07:00
sfc-gh-tclinkenbeard 73884bbfc7 Add more tests to TransactionCostWorkload 2022-10-18 15:06:05 -07:00
sfc-gh-tclinkenbeard 92bbcebed9 Increase trState->totalCost by one with each clear 2022-10-18 14:05:46 -07:00
Ata E Husain Bohra 765aeabb70
Enable EncryptKeyProxy test and add code probes (#8490)
* Enable EncryptKeyProxy test and add code probes

Description

  diff-1: Address review comments

Major changes include:
1. Enable EncryptKeyProxyTest. The test was modified to leverage the
GetEncryptCipherKey actor, as the existing code wouldn't work if the EKP
process was restarted while the test was running.
2. Add code probes to EKP
3. Minor refactoring.

Testing

EncryptKeyProxyTest - 100K
2022-10-18 13:58:16 -07:00
Xiaoxi Wang a6a53b40fd
Merge pull request #8477 from sfc-gh-xwang/feature/main/moveKey
Implement mock mode moveKeys in MockDDTxnProcessor and simulation test
2022-10-18 11:02:27 -07:00
Jingyu Zhou 1b2fcdd4f6
Merge pull request #8053 from sfc-gh-akejriwal/getsizetenant
Make the storage metrics functions tenant aware
2022-10-17 20:02:52 -07:00
Xiaoxi Wang 6630ec8c1d remove virtual on raw* Methods 2022-10-17 16:43:32 -07:00
Xiaoxi Wang 5d57429949 modify boolean definition; add comments 2022-10-17 10:12:00 -07:00