Xiaoxi Wang
22eafcf7a2
rename trace event
2022-05-04 14:27:42 -07:00
Xiaoxi Wang
ae66ed6c16
fix DataDistributionQueue time_out ; reset the rebalance poll time
2022-05-04 14:11:20 -07:00
Xiaoxi Wang
a3d0b005dc
reset several method use getShardMetrics
2022-05-04 00:00:03 -07:00
Xiaoxi Wang
1723bee639
add fetchTopKShardMetrics to dd tracker
2022-05-03 23:42:09 -07:00
Xiaoxi Wang
d848441cdd
simulate ReadSkewReadWrite spec
2022-05-03 23:39:05 -07:00
Xiaoxi Wang
7c37d172b9
solve some comments
2022-05-03 17:21:08 -07:00
Xiaoxi Wang
269d85daa8
Merge branch 'main' of https://github.com/apple/foundationdb into readaware
2022-05-03 13:37:56 -07:00
A.J. Beamon
be0c7a8884
Merge pull request #7037 from sfc-gh-ajbeamon/fdbcli-generator-refactor
...
Move fdbcli command and hint generators into the CommandFactory
2022-05-03 12:52:29 -07:00
Hao Fu
97eb12381b
implement equals and hashCode in MappedKeyValue ( #7041 )
2022-05-03 12:24:26 -07:00
Johannes Scheuermann
9665786785
Merge pull request #7011 from johscheuer/add-is-present-method-sidecar
...
Add sidecar method to check if a file is present
2022-05-03 07:22:23 +01:00
Steve Atherton
fa02f17932
Merge pull request #7050 from sfc-gh-satherton/redwood-shutdown-hang-fix
...
Bug fix: Redwood shutdown would wait for pending IO success
2022-05-02 21:34:13 -07:00
Steve Atherton
9a279c24ae
Bug fix: Redwood shutdown would wait for pending IO success so if any of them failed the shutdown would never complete.
2022-05-02 19:26:44 -07:00
Andrew Noyes
90dae38d04
Update RYWIterator test to match #6993 ( #7046 )
...
There's a test which checks behavior against a reference implementation,
and so the reference implementation needs to be updated as well.
2022-05-02 18:22:59 -07:00
Jingyu Zhou
05e63bc703
Fix orphaned storage server due to force recovery ( #6914 )
...
* Fix orphaned storage server due to force recovery
The force recovery can roll back the transaction that adds a storage server.
However, the storage server may now be at version B > A, where A is the recovery version.
As a result, its peek to the buddy TLog won't return TLogPeekReply::popped to
trigger its exit; instead it gets a higher version C > B back. To the
storage server, this means the message is empty, so it does not remove itself and
keeps peeking.
The fix is to use the recovery transaction version, i.e. the version of the first
transaction after the recovery, as the popped version for the SS instead of the
recovery version. Force recovery bumps this version to a much higher version
than the SS's version, so the TLog sets TLogPeekReply::popped to trigger
the storage server exit.
* Fix tlog peek to disallow returning an empty message between recoveredAt and recovery txn version
This contract is not explicitly set today, and its absence can cause the storage server to fail
with assertion "rollbackVersion >= data->storageVersion()". This is because if
such an empty version is returned, the SS may advance its storage version to a
value larger than the rollback version set in the recovery transaction.
The fix is to block the peek reply until the recovery transaction has been received.
* Move recoveryTxnReceived to be per LogData
This is because a shared TLog can have a first-generation TLog which has already
set the promise, so later generations won't wait for the recovery version.
For the current generation, all peeks need to wait, while for older generations,
there is no need to wait (by checking if they are stopped).
* For initial commit, poppedVersion needs to be at least 2
To get rid of the previous unsuccessful recovery's recruited seed
storage servers.
2022-05-02 17:17:37 -07:00
Hao Fu
fa2e85f1d3
Add comment about getMappedRange parameters ( #7044 )
2022-05-02 15:17:14 -07:00
Andrew Noyes
7ed82c1ac5
Mac m1 has 16k pages ( #7038 )
...
Previously the page guard implementation assumed that the page size was
4k. Also check for mmap and mprotect returning errors.
2022-05-02 14:24:43 -07:00
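The page-size point in the commit above can be illustrated with a minimal sketch. This is not the code from the commit: `systemPageSize` and `GuardedBuffer` are hypothetical names invented for illustration. It assumes a POSIX system and shows the two ideas the message describes: querying the page size at runtime rather than hard-coding 4k, and checking the return values of `mmap` and `mprotect`.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: query the page size at runtime instead of assuming 4096.
inline std::size_t systemPageSize() {
    long size = sysconf(_SC_PAGESIZE);
    if (size <= 0) {
        throw std::runtime_error("sysconf(_SC_PAGESIZE) failed");
    }
    return static_cast<std::size_t>(size);
}

// Hypothetical RAII wrapper: a writable region followed by one PROT_NONE guard page.
class GuardedBuffer {
public:
    explicit GuardedBuffer(std::size_t usablePages)
      : pageSize_(systemPageSize()), usable_(usablePages * pageSize_), total_(usable_ + pageSize_) {
        assert(usablePages > 0);
        void* p = mmap(nullptr, total_, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            throw std::runtime_error(std::string("mmap failed: ") + std::strerror(errno));
        }
        base_ = static_cast<char*>(p);
        // The guard page starts on a page boundary because usable_ is a multiple of pageSize_.
        if (mprotect(base_ + usable_, pageSize_, PROT_NONE) != 0) {
            const int err = errno; // save errno before munmap can overwrite it
            munmap(base_, total_);
            throw std::runtime_error(std::string("mprotect failed: ") + std::strerror(err));
        }
    }
    ~GuardedBuffer() { munmap(base_, total_); }
    GuardedBuffer(const GuardedBuffer&) = delete;
    GuardedBuffer& operator=(const GuardedBuffer&) = delete;

    char* data() { return base_; }
    std::size_t size() const { return usable_; }

private:
    std::size_t pageSize_;
    std::size_t usable_;
    std::size_t total_;
    char* base_ = nullptr;
};
```

With a hard-coded 4k size, the `mprotect` address on a 16k-page machine would generally not be page-aligned and the call would fail, which is why both the runtime query and the error checks matter.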
Andrew Noyes
0a4b364379
Fix operation_failed thrown incorrectly from transactions ( #6993 )
...
* Add a test demonstrating the issue
If you write a versionstamped value after a set, then reading throws
operation_failed.
* Treat SetVersionstampedValue as independent in coalesce and mutate
2022-05-02 13:49:42 -07:00
Rajiv Ranganath
cf6e39af79
docs: add `GET_RANGE_SPLIT_POINTS`
...
Add `GET_RANGE_SPLIT_POINTS` instruction documentation.
2022-05-02 13:31:20 -07:00
Ray Jenkins
dc9e782ccc
OpenTelemetry Tracing Perf Fixes ( #6990 )
2022-05-02 14:56:51 -05:00
Josh Slocum
8284ec5712
Fixing memory leak when handling FDBResult in multi version client
2022-05-02 12:56:05 -05:00
Josh Slocum
57e1b487f1
Fixing ASAN alloc-dealloc-mismatch
2022-05-02 12:56:05 -05:00
Xiaoxi Wang
69985ba251
Merge branch 'main' of https://github.com/apple/foundationdb into readaware
2022-05-02 10:53:22 -07:00
Markus Pilman
d9aee5c253
Merge pull request #7012 from sfc-gh-vgasiunas/vgasiunas-upgrade-tests
...
Improve robustness of upgrade tests
2022-05-02 10:30:21 -06:00
Xiaodong Zhang
a7a5b3e273
fix bug in tpcc workload
2022-05-02 09:28:23 -07:00
A.J. Beamon
75fc526697
Merge pull request #7020 from sfc-gh-ajbeamon/fix-dd-team-removal-health
...
Mark a team as unhealthy when it is removed
2022-05-02 08:45:55 -07:00
A.J. Beamon
43c2ca35a5
Move fdbcli command and hint generators into the files implementing the command.
2022-05-02 08:39:59 -07:00
Markus Pilman
f5570ba49e
Merge pull request #7035 from sfc-gh-jshim/fix-token-sign-arena
...
Fix TokenSign copying and using uninitialized arena
2022-05-02 08:52:19 -06:00
Junhyun Shim
41d1c73b9c
Fix TokenSign copying and using uninitialized arena
...
TokenSign was copying the unused Arena held by the Standalone instead of referring to it.
An Arena has to be used at least once before it holds a valid, copyable reference.
Otherwise the lifecycle of the copied Arena would be its own and not be shared with the original.
Thus, when the copied arena went out of scope,
the memory supposed to be held by the returned Standalone also got released.
Fix: instead of copying, refer to the Standalone's arena.
2022-05-02 09:48:43 +02:00
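The lifetime problem described in the commit above can be reproduced with a toy model. The sketch below is not FoundationDB's `Arena`: `ToyArena` and its methods are hypothetical. It only assumes what the message states, namely that an arena holds no shareable block until its first allocation, so a copy taken before first use ends up owning separate storage.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy model only (hypothetical, not FoundationDB's Arena):
// the shared block is created lazily on the first allocation.
class ToyArena {
    using Block = std::vector<std::unique_ptr<std::string>>;

public:
    // Allocate a string owned by this arena's block, creating the block if needed.
    const std::string* alloc(std::string value) {
        if (!block_) {
            block_ = std::make_shared<Block>();
        }
        block_->push_back(std::make_unique<std::string>(std::move(value)));
        return block_->back().get();
    }

    // True once at least one allocation has happened.
    bool used() const { return block_ != nullptr; }

    // True only if both arenas keep the same block alive.
    bool sharesStorageWith(const ToyArena& other) const {
        return block_ != nullptr && block_ == other.block_;
    }

private:
    std::shared_ptr<Block> block_; // null until first use
};
```

In this model, copying an unused `ToyArena` and then allocating through the copy leaves the original empty, so memory allocated via the copy is freed when the copy is destroyed. Copying after the first allocation shares the block, and taking a reference instead of a copy avoids the problem entirely.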
Jingyu Zhou
0ca9761088
Fix IDE build warnings and errors
2022-05-01 16:20:57 -07:00
Steve Atherton
74b82205df
Merge pull request #7024 from sfc-gh-etschannen/fix-cluster-id
...
fix: prevent a storage server from attempting to read the cluster id from itself due to a stale cache entry
2022-04-29 22:33:07 -07:00
Steve Atherton
1ab5c21967
Merge pull request #6979 from sfc-gh-jslocum/speedup_tail_latency
...
Don't do huge tail latencies for network requests when speed up simul…
2022-04-29 22:31:35 -07:00
Steve Atherton
b7bc0a3aff
Merge pull request #6911 from sfc-gh-jslocum/min_shard_bytes_main
...
Decreasing MIN_SHARD_BYTES knob
2022-04-29 22:31:21 -07:00
Evan Tschannen
3bab26c01b
fix: prevent a storage server from attempting to read the cluster id from itself due to a stale cache entry
2022-04-29 14:56:43 -07:00
Josh Slocum
e5840d3a38
Merge branch 'main' into speedup_tail_latency
2022-04-29 16:05:12 -05:00
Steve Atherton
165d9fa6b1
Merge pull request #7013 from sfc-gh-jslocum/writeduringread_keysize_main
...
Fix for WriteDuringRead workload key sizes with useSystemKeys=true bu…
2022-04-29 14:01:44 -07:00
Steve Atherton
338d2304d7
Bug fix: Killing a machine process would not wait for AsyncFileNonDurable close operations to finish, causing a later reopen of the same file in a new process to hang forever. Renamed AsyncFileNonDurable::deleteFile to closeFile for clarity. Renamed Machine deletingFiles to deletingOrClosingFiles for clarity. ( #7007 )
2022-04-29 14:01:18 -07:00
Steve Atherton
504400c1b3
Merge pull request #7017 from sfc-gh-jslocum/tssq_cc_fix
...
Allow TSS failures in consistency check when fault injection is enabled
2022-04-29 14:01:06 -07:00
Josh Slocum
674e6a8fdc
Merge branch 'main' into min_shard_bytes_main
2022-04-29 16:00:27 -05:00
Steve Atherton
2887429f42
Merge pull request #6991 from sfc-gh-etschannen/fix-recovery-version
...
fix: when more tlogs are absent than the replication factor we would access invalid memory
2022-04-29 14:00:15 -07:00
Steve Atherton
2bcbff2809
Merge pull request #6965 from sfc-gh-tclinkenbeard/increase-snapshot-lower-bound
...
Increase lower bound for snapshot restart tests to 7.1.0
2022-04-29 13:59:37 -07:00
Steve Atherton
2678546c00
Merge pull request #6950 from sfc-gh-jslocum/cf_delete_race
...
Fixing change feed deleted from multiple sources race
2022-04-29 13:58:22 -07:00
A.J. Beamon
5eedcafcfb
Mark a team as unhealthy when it is removed
2022-04-29 12:40:41 -07:00
Josh Slocum
7d94b0b442
Allow TSS failures in consistency check when fault injection is enabled
2022-04-29 13:24:54 -05:00
Josh Slocum
aa20eefe7b
Fix for WriteDuringRead workload key sizes with useSystemKeys=true but writing to normal key space
2022-04-29 11:33:54 -05:00
Vaidas Gasiunas
1ef33db1ef
Upgrade Tests: Create the destination directory before copying a client library from the local repository
2022-04-29 16:41:03 +02:00
Vaidas Gasiunas
449d5aec61
Upgrade Tests: Fix paths for accessing binaries from the local repo
2022-04-29 15:54:47 +02:00
Vaidas Gasiunas
e33d7455a5
Upgrade Tests: Retry on download errors
2022-04-29 15:32:47 +02:00
Vaidas Gasiunas
27c3d7a953
Upgrade Tests: Use old binaries from the Docker image if available
2022-04-29 14:53:22 +02:00
Johannes M. Scheuermann
d1c71a7903
Add sidecar method to check if a file is present
2022-04-29 13:10:05 +01:00
Vaidas Gasiunas
27d7b2e409
Upgrade Test: Avoid blocking on opening named pipes in case the tester fails to initialize
2022-04-29 14:04:27 +02:00