Commit Graph

6102 Commits

Author SHA1 Message Date
Xiaoxi Wang b8baf19634 Merge branch 'main' of https://github.com/apple/foundationdb into feature/main/dataApi 2022-11-02 13:46:58 -07:00
A.J. Beamon fc8929cde7 During a restore, the tenant map may not be self-consistent. For example, it is possible for a tenant to exist with two names if it was renamed during a backup. This updates the tenant maps in SS and CP to allow there to be multiple tenants with the same ID, but it expects there to only be one such tenant once the restore is complete and the data is accessed. 2022-11-02 09:05:31 -07:00
Jingyu Zhou 89857c4be0 Merge branch 'main' of https://github.com/apple/foundationdb into fix-head 2022-11-01 20:25:41 -07:00
Jingyu Zhou c127bb1c30 Fix some clang warnings on unused variables 2022-11-01 15:38:47 -07:00
Xiaoxi Wang 7442cfa2cb format code 2022-11-01 14:56:55 -07:00
Xiaoxi Wang 55a3db82b5 update the name, comment and description of write byte sampling; update the calculation of write bandwidth metrics 2022-11-01 14:56:55 -07:00
Xiaoxi Wang 11b2c035c0 add unit test for randomKeyBetween 2022-11-01 14:56:55 -07:00
Xiaoxi Wang 334fced572 add data api implementations; add more realistic fetchKey implementation; finish randomKeyBetween implementation 2022-11-01 14:56:55 -07:00
sfc-gh-tclinkenbeard 80ee79e39b Merge remote-tracking branch 'origin/main' into debug 2022-11-01 12:37:27 -07:00
sfc-gh-tclinkenbeard 047578b3d9 Merge remote-tracking branch 'origin/main' into debug 2022-11-01 12:05:45 -07:00
sfc-gh-tclinkenbeard 5fd8d05810 Make PROXY_MAX_TAG_THROTTLE_DURATION a server knob 2022-11-01 11:00:45 -07:00
Xiaoge Su 520a14cd72 fixup! update code per comments 2022-11-01 10:36:25 -07:00
Trevor Clinkenbeard 39abc712b0
Merge pull request #8549 from sfc-gh-tclinkenbeard/expose-txn-cost
Create `fdb_transaction_get_total_cost` function
2022-11-01 08:14:57 -07:00
Nim Wijetunga 24ce8c0fd0
Commit Proxy Encryption Code Probes (#8618)
* add commit proxy encryption code probes

* fix comment

* address pr comments

* address pr comments

* address pr comments

* address pr comments

* Trigger Build
2022-10-31 20:04:42 -07:00
Trevor Clinkenbeard fdd2f24174
Fix comment in NativeAPI.actor.h
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2022-10-31 14:48:13 -07:00
Dennis Zhou f7b608e53f blob: refactor blob get tenant code 2022-10-31 10:37:36 -07:00
Dennis Zhou 1ab432e49d blob: fix error propagation in getBlobRanges()
Fixes: 48d6e725c2 ("blob: convert listBlobbifiedRangesActor() to take a Transaction")
2022-10-31 10:37:36 -07:00
sfc-gh-tclinkenbeard 0eb1598afa Merge remote-tracking branch 'origin/main' into expose-txn-cost 2022-10-30 09:36:37 -07:00
sfc-gh-tclinkenbeard 1d6bd0057b Merge remote-tracking branch 'origin/main' into add-tag-throttling-latency-bands 2022-10-29 09:49:37 -07:00
Jingyu Zhou 22293ebac5
Merge pull request #8614 from neethuhaneesha/clearRanges
Enable clear range eager reads knob for rocksdb.
2022-10-28 17:03:39 -07:00
Steve Atherton b3017ae330
Merge pull request #8577 from sfc-gh-satherton/storageserver-pml
Unrevert #7578 - new PriorityMultiLock and StorageServer read prioritization.
2022-10-28 16:04:44 -07:00
He Liu 7bb823edbe
Replace KeyRange with std::vector<KeyRange> in DataMoveMetaData and (#8591)
* Replace KeyRange with std::vector<KeyRange> in DataMoveMetaData and
CheckpointMetaData.

* Checked if ranges.empty().

* fmt.

* Resolved some comments.

Co-authored-by: He Liu <heliu@apple.com>
2022-10-28 15:22:55 -07:00
neethuhaneesha 55fc0c1a0b Enable clear range eager reads knob for rocksdb. 2022-10-28 14:30:05 -07:00
Steve Atherton 326d45819e Merge branch 'main' into storageserver-pml 2022-10-28 14:14:44 -07:00
Andrew Noyes 0a15f081a1
Proactively clean up idempotency ids for successful commits (#8578)
* Proactively clean up idempotency ids for successful commits

This change also includes some minor changes from my branch working on
an idempotency ids cleaner, that I'd like to get merged sooner rather
than later.

- Adding a timestamp to idempotency values
- Making IdempotencyId an actor file
- Adding commit_unknown_result_fatal
- Checking idempotencyIdsExpiredVersion in determineCommitStatus
- Some testing QOL changes

* Factor out decodeIdempotencyKey logic

* Fix formatting

* Update flow/include/flow/error_definitions.h

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>

* Use KeyBackedObjectProperty for idempotencyIdsExpiredVersion

* Add IDEMPOTENCY_ID_IN_MEMORY_LIFETIME knob

* Rename ExpireIdempotencyKeyValuePairRequest

Also add a code probe for the case where an ExpireIdempotencyIdRequest is
received before the count is known, and add an assert

* Fix formatting and add TODO for nwijetunga

Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2022-10-28 09:07:54 -07:00
Steve Atherton d53ed6acae Merge branch 'main' into storageserver-pml 2022-10-28 00:14:01 -07:00
Jingyu Zhou dc60f63f9b Revert "Cancel watch when the key is not being waited"
This reverts commit 639afbe62c.
2022-10-27 19:46:05 -07:00
Jingyu Zhou 634bd529e7 Revert "Record the version of each watch"
This reverts commit 4bd24e4d64.
2022-10-27 19:46:05 -07:00
Jingyu Zhou 19ae4e7eb7 Revert "Reformat source"
This reverts commit ec47c261bf.
2022-10-27 19:46:05 -07:00
Jingyu Zhou e460933b52 Revert "Remove debugging output"
This reverts commit 41d1d6404d.
2022-10-27 19:46:05 -07:00
Jingyu Zhou e7fd3eda00 Revert "Update fdbclient/NativeAPI.actor.cpp"
This reverts commit 812243bafa.
2022-10-27 19:46:05 -07:00
Steve Atherton 1dad43cb06 Remove unnecessary change feed disk read lock as its functionality is obsoleted by the storage server read priority lock. 2022-10-27 18:03:14 -07:00
Steve Atherton f9ad7fb35b Merge origin/main into storageserver-pml 2022-10-27 18:00:11 -07:00
Steve Atherton 848e80c08b Add comment. 2022-10-27 17:52:56 -07:00
sfc-gh-tclinkenbeard 6ae0aac153 Merge remote-tracking branch 'origin/main' into add-tag-throttling-latency-bands 2022-10-27 14:07:51 -07:00
Lukas Joswiak 9625efd5b9 Add comment about configuration database 2022-10-27 13:56:13 -07:00
Lukas Joswiak 8e76621653 Disable shared state updates on configuration database 2022-10-27 13:56:13 -07:00
Lukas Joswiak a8f8757f77 Rename cluster ID key
In FDB 7.1, this key was stored in the txnStateStore. In 7.2, it has
been moved to the database. This was causing protocol compatibility
issues during upgrades, so we need to rename the key.
2022-10-27 13:56:13 -07:00
Lukas Joswiak bba05b7c9b Move cluster ID from txnStateStore to the database
The cluster ID is now stored in the database instead of in the
txnStateStore. The cluster controller will read it on boot and send it
to all processes to persist.
2022-10-27 13:56:13 -07:00
Lukas Joswiak 5ca2b89bdf Fix simulation issue where process switch was ignored
The simulator tracks only active processes. Rebooted or killed processes
are removed from the list of processes, and only get added back when the
process is rebooted and starts up again. This causes a problem for the
`RebootProcessAndSwitch` kill type, which wants to simultaneously reboot
all machines in a cluster and change their cluster file. If a machine is
currently being rebooted, it will miss the reboot process and switch
command.

The fix is to add a check when a process is being started in simulation.
If the process has had its cluster file changed and the cluster is in a
state where all processes should have had their cluster files reverted
to the original value, the simulator will now send a
`RebootProcessAndSwitch` signal right when the process is started. This
will cause an extra reboot, but should correctly switch the process back
to its original, correct cluster file, allowing the cluster to fully
recover all clusters.

Note that the above issue should only affect simulation, due to how the
simulator tracks processes and handles kill signals.

This commit also adds a field to each process struct to determine
whether the process is being run in a DR cluster in the simulation run.
This is needed because simulation does not differentiate between
processes in different clusters (other than by the IP), and some
processes needed to switch clusters and some simply needed to be
rebooted.
2022-10-27 13:56:13 -07:00
Xiaoge Su 812243bafa Update fdbclient/NativeAPI.actor.cpp
Co-authored-by: Jingyu Zhou <jingyuzhou@gmail.com>
2022-10-27 12:42:05 -07:00
Xiaoge Su 41d1d6404d Remove debugging output 2022-10-27 12:42:05 -07:00
Xiaoge Su ec47c261bf Reformat source 2022-10-27 12:42:05 -07:00
Xiaoge Su 4bd24e4d64 Record the version of each watch
In the case
    1. A watch to key A is set, and the watchValueMap ACTOR, noted as X, starts waiting.
    2. All watches are cleared due to a connection string change.
    3. The watch to key A is restarted with watchValueMap ACTOR Y.
    4. X receives the cancel exception and tries to dereference the counter. This causes Y to get cancelled.

the reference count will cause the watch to terminate prematurely. Recording
the version of each watch would help prevent this issue.
2022-10-27 12:42:05 -07:00
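The four-step race in the message above can be sketched in plain C++. This is a minimal illustration, not the FoundationDB implementation: WatchTable, WatchSlot, start, clearAll, release and active are hypothetical names, and "version" is assumed here to be a monotonically increasing token handed to each waiter. A release that carries a stale token is ignored, so the cancelled waiter X cannot tear down the restarted watch Y.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical bookkeeping for one watched key (not a FoundationDB type).
struct WatchSlot {
    std::int64_t version; // token identifying this incarnation of the watch
    int waiters;          // number of waiters on this incarnation
};

class WatchTable {
    std::map<std::string, WatchSlot> slots_;
    std::int64_t nextVersion_ = 1;

public:
    // Start (or restart) a watch on `key`; the caller keeps the returned token.
    std::int64_t start(const std::string& key) {
        std::int64_t v = nextVersion_++;
        slots_[key] = WatchSlot{ v, 1 };
        return v;
    }

    // Models "all watches are cleared due to a connection string change".
    void clearAll() { slots_.clear(); }

    // Called when a waiter is triggered or cancelled. A token that does not
    // match the current incarnation is ignored, so a stale waiter cannot
    // cancel a newer watch. Returns true if the release was applied.
    bool release(const std::string& key, std::int64_t version) {
        auto it = slots_.find(key);
        if (it == slots_.end() || it->second.version != version)
            return false;
        if (--it->second.waiters == 0)
            slots_.erase(it);
        return true;
    }

    bool active(const std::string& key) const { return slots_.count(key) != 0; }
};
```

Replaying the scenario: X = start("A"), clearAll(), Y = start("A"), then release("A", X) is rejected and the watch created for Y stays active until release("A", Y) is called.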
Xiaoge Su 639afbe62c Cancel watch when the key is not being waited
Currently, there is a cyclic reference situation in

    DatabaseContext -> WatchMetadata -> watchStorageServerResp ->
    DatabaseContext

If there is a watch created in the DatabaseContext, then even if the
corresponding wait ACTOR is cancelled, the WatchMetadata will still hold
a reference to the watchStorageServerResp ACTOR, which holds a reference to
DatabaseContext.

In this situation, any DatabaseContext that held a watch will not be
automatically destructed, since its reference count will never drop to
0 until the watched value is changed. Every time the cluster recovers,
several watches are created, and when the cluster restarts, the
DatabaseContext that is no longer being used cannot be destructed due
to these watches.

With this patch, each wait on the watch will be counted. Whether the
watch is triggered or cancelled, the corresponding count will be
reduced. If a watch is not being waited on, the watch will be cancelled,
effectively reducing the reference count of DatabaseContext. This will
hopefully fix the issue mentioned above.

The code is tested by 1) manually changing the number of logs of a local
cluster and seeing the cluster recover and the previous DatabaseContext
being destructed; 2) a 100K joshua run with 1 failure; the same test will
fail on the current git main branch.
2022-10-27 12:42:05 -07:00
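The reference cycle and the wait-counting remedy described in the message above can be reproduced with std::shared_ptr. This is a sketch under stated assumptions, not FoundationDB code: Context and WatchEntry are hypothetical stand-ins for DatabaseContext and WatchMetadata, the intermediate watchStorageServerResp ACTOR is collapsed into a direct back-reference, and releaseWaiter models "the watch is cancelled once nobody waits on it".

```cpp
#include <cassert>
#include <memory>

// Counter of live Context objects, used only to observe the leak.
static int& liveContexts() {
    static int n = 0;
    return n;
}

struct Context;

// Hypothetical stand-in for WatchMetadata.
struct WatchEntry {
    std::shared_ptr<Context> owner; // strong back-reference that closes the cycle
    int waiters = 0;                // number of outstanding waits on this watch
};

// Hypothetical stand-in for DatabaseContext.
struct Context {
    std::shared_ptr<WatchEntry> watch;
    Context() { ++liveContexts(); }
    ~Context() { --liveContexts(); }
};

// Builds Context -> WatchEntry -> Context, with one waiter registered.
std::shared_ptr<Context> makeContextWithWatch() {
    auto ctx = std::make_shared<Context>();
    ctx->watch = std::make_shared<WatchEntry>();
    ctx->watch->owner = ctx;
    ctx->watch->waiters = 1;
    return ctx;
}

// Called when a wait is triggered or cancelled. Once no waiter remains,
// the watch is dropped, which releases its back-reference to the Context.
void releaseWaiter(Context& ctx) {
    if (ctx.watch && --ctx.watch->waiters == 0)
        ctx.watch.reset();
}

// Returns how many Context objects outlive the caller's last handle.
int liveAfterDrop(bool cancelWait) {
    int before = liveContexts();
    {
        auto ctx = makeContextWithWatch();
        if (cancelWait)
            releaseWaiter(*ctx);
    }
    return liveContexts() - before;
}
```

With these definitions, liveAfterDrop(false) leaves one Context alive because the cycle keeps its reference count above zero, while liveAfterDrop(true) leaves none.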
Jingyu Zhou fe66c026b4
Merge pull request #8598 from jzhou77/fix
Fix restarting restore test failure
2022-10-27 10:44:17 -07:00
Josh Slocum 4d3553481f
Blob connection provider test (#8478)
* Refactoring test blob metadata creation

* Implementing BlobConnectionProviderTest

* createRandomTestBlobMetadata supports blobstore and works outside simulation
2022-10-27 10:44:06 -05:00
Jingyu Zhou 6c0f890f78 Fix restarting restore test failure
Old fdbserver may not set the "enableSnapshotBackupEncryption" key, thus we
should allow the key to be not present.
2022-10-27 08:43:55 -07:00
Steve Atherton 56abec32f1 Bug fix: The change feed request UID is actually not just for debugging and can't be shared across requests, so the debugID in ReadOptions should not be used. Restored the original ChangeFeedRequest member but renamed it from debugUID to just id. 2022-10-26 20:45:39 -07:00
Steve Atherton fb44945a89 Use id variable to simplify logic a bit. 2022-10-26 17:35:39 -07:00