Commit Graph

6072 Commits

Author SHA1 Message Date
Lukas Joswiak d2143f184a Add fast UDP tracer 2021-01-20 13:57:36 -08:00
Lukas Joswiak 3132c38bff Rename fluentd -> UDP 2021-01-20 13:57:36 -08:00
Lukas Joswiak 1961c3fc80 Serialize spans to msgpack and send via UDP 2021-01-20 13:57:35 -08:00
Xin Dong 30962551d5 Also log when the LB is re-enabled. Attach the FetchKeysID. 2021-01-19 13:37:36 -08:00
Xin Dong 5bc129a35f Add a SevWarnAlways trace line to help debug a rare failure. 2021-01-19 13:33:15 -08:00
Lukas Joswiak 88f8145dec Add CLI option to get build information 2021-01-19 11:21:21 -08:00
Andrew Noyes 45f064c420 Take keyRange by const reference 2021-01-19 16:49:48 +00:00
Steve Atherton 523a8c08db Merge branch 'add-throttling-on-AsyncfileCached' of github.com:sfc-gh-clin/foundationdb into afcache-write-limit
# Conflicts:
#	flow/Knobs.h
2021-01-17 05:30:55 -08:00
Steve Atherton 11927f6fff Another refactor of SQLite injected error handling to address failures of the previous attempt. Some code cleanup. Added documentation of injected error handling works and why. 2021-01-17 04:17:13 -08:00
Steve Atherton f1adafaf8c Refactored how injected fault state is managed to make debugging injected fault handling easier. Moved injected error tracking for open() into an INetwork global since it should be unique per simulated process. Fixed bug where injected error in WAL file was not recognized as injected because it occurred before SQLiteDB's vfsWAL pointer was initialized. Simplified logic in SQLiteDB. Added comments. 2021-01-16 05:04:30 -08:00
Steve Atherton 8df6874417 Bug fixes: Check for null pager and wal pointers, and initialize vfsWAL after SQLite has opened the WAL. 2021-01-15 21:13:31 -08:00
Steve Atherton 6da78b1b19 Rewrote how injected faults are handled in SQLite to be more reliable and work with an upcoming write throttling feature in AsyncFileCached. 2021-01-15 19:29:14 -08:00
Andrew Noyes a5d6c96875 Fix heap use after free
See
https://github.com/apple/foundationdb/blob/master/flow/README.md#arenas
for an explanation of how *Ref types are supposed to work
2021-01-16 01:04:28 +00:00
Andrew Noyes 58e8a9ac73 Remove MasterProxyServer.actor.cpp (it was renamed) 2021-01-15 23:35:20 +00:00
Xin Dong 83506cda87 Log SevError instead of SevWarnAlways when all replicas of some data are lost. 2021-01-15 15:00:48 -08:00
Andrew Noyes b0f61fb74f Resolve the simple-looking conflicts 2021-01-15 19:35:10 +00:00
Andrew Noyes ff7d306b09 Merge branch 'release-6.3' into anoyes/merge-6.3-to-master
Include conflict markers for now. Will resolve.
2021-01-15 18:04:09 +00:00
sfc-gh-tclinkenbeard 95eaa5e866 Merge remote-tracking branch 'origin/master' into misc-changes 2021-01-13 21:14:36 -08:00
sfc-gh-tclinkenbeard 750d5e5af2 Fix compile error 2021-01-13 17:43:10 -08:00
sfc-gh-tclinkenbeard 8ff14878fe Merge remote-tracking branch 'origin/master' into simplify-global-knobs 2021-01-13 14:39:35 -08:00
sfc-gh-tclinkenbeard e29ed3bf99 Remove createGlobal*Knobs functions 2021-01-13 12:14:04 -08:00
Markus Pilman 2609c3d619
Merge pull request #4072 from sfc-gh-tclinkenbeard/improve-type-safety
Make enums automatically binary serializable
2021-01-12 10:31:34 -07:00
Daniel Smith ecf5c2b591 Add knobs for prefix bloom filters and larger block cache 2021-01-11 21:38:34 +00:00
Steve Atherton 2ce967bd77 Merge branch 'release-6.3' of github.com:apple/foundationdb into feature-redwood 2021-01-05 21:34:26 -08:00
Steve Atherton 4c28341351 Performance bug fix: Every page cache hit was calling forwardError() unnecessarily. 2021-01-05 21:34:16 -08:00
Balachandar Namasivayam 43a79a34ff
Merge pull request #4175 from vishesh/task/document
Docs, fix and hack for Fault Tolerance
2021-01-05 15:46:45 -08:00
Vishesh Yadav 31cc888562 Make fault_tolerance_without_losing data consistent with 6.2 in HA.
Don't consider satellites for now. This is a HACK which needs to be fixed
soon, but for now need this to keep the monitoring sane.
2021-01-05 13:53:48 -08:00
sfc-gh-tclinkenbeard 61a29ecfc8 Merge remote-tracking branch 'origin/master' into run-minio-joshua 2021-01-01 09:38:36 -04:00
sfc-gh-tclinkenbeard 70e62d34ca Merge remote-tracking branch 'origin/master' into misc-changes 2020-12-28 01:58:56 -04:00
sfc-gh-tclinkenbeard 8dc39f4d8f Make ExecCmdValueString const-correct 2020-12-27 14:15:22 -04:00
sfc-gh-tclinkenbeard 5b2e88b187 Use structured bindings in for loops 2020-12-27 01:46:20 -04:00
sfc-gh-tclinkenbeard 0d4e81e6b4 Use unique_ptr in DataDistribution.actor.cpp 2020-12-26 23:40:54 -04:00
sfc-gh-tclinkenbeard e7e53aeb98 Use unique_ptr for adding shards 2020-12-26 22:34:46 -04:00
sfc-gh-tclinkenbeard c2334a4904 Improve SkipList const-correctness 2020-12-26 21:31:45 -04:00
sfc-gh-tclinkenbeard 5bfa6cea98 Merge remote-tracking branch 'origin/master' into misc-changes 2020-12-26 20:47:00 -04:00
sfc-gh-tclinkenbeard 19816ccdbf Improve DataDistribution const-correctness 2020-12-26 20:22:27 -04:00
sfc-gh-tclinkenbeard 26a4884eef Mark TCMachineTeamInfo::size const 2020-12-26 19:23:01 -04:00
sfc-gh-tclinkenbeard 33ec968d5a Mark expectedSize methods const 2020-12-26 18:30:44 -04:00
sfc-gh-tclinkenbeard abb7cd4e16 Use platform::getEnvironmentVar instead of std::getenv 2020-12-26 16:32:02 -04:00
sfc-gh-tclinkenbeard f6ad4559a7 Remove MasterProxyServer.actor.cpp 2020-12-26 13:24:21 -04:00
sfc-gh-tclinkenbeard 2c8ca5d7f9 Added BlobStoreWorkload.h 2020-12-24 22:04:06 -04:00
sfc-gh-tclinkenbeard 2a8f971733 Fix Azure blob storage backup test 2020-12-24 21:20:05 -04:00
sfc-gh-tclinkenbeard 1dc0343092 Update blob backup test files 2020-12-24 21:20:05 -04:00
sfc-gh-tclinkenbeard 555c3d95fc Added fdbrpc/SimExternalClient unit test 2020-12-24 21:20:03 -04:00
Jingyu Zhou a167016378 Resolver merge conflicts 2020-12-23 14:33:30 -08:00
Jingyu Zhou bbb56e4089 Merge branch 'release-6.2' of https://github.com/apple/foundationdb into release-6.3 2020-12-23 14:26:59 -08:00
Vishesh Yadav a1809f7d86 doc: Fault Tolerance and Region Configuration 2020-12-22 16:25:17 -08:00
Xin Dong 8d4cbfbb86
Merge branch 'release-6.2' into feature/allow-manually-trigger-dd-teams-info-logging 2020-12-22 10:00:24 -08:00
Xin Dong 35e8bf3ec7 Change from logging point value to logging histogram. 2020-12-18 16:33:46 -08:00
Xin Dong 15ac27ef94 Added a metric into ProxyMetrics to expose current commit batching window, which can be used as an indicator of the load on the proxy. 2020-12-16 11:16:13 -08:00
Steve Atherton d7ebae07f5 Added yield to remap cleanup loop to prevent a stack overflow if too many queue entries are processed without waiting on IO. 2020-12-12 02:40:06 -08:00
Andrew Noyes 9601769b01
Merge pull request #3858 from sfc-gh-rchen/stable_interfaces
Stable interfaces
2020-12-11 09:34:27 -08:00
Chaoguang Lin 0613a149a6 Add rate control on both db file and log file 2020-12-11 00:22:26 -08:00
Richard Chen 5f57d72a59 remove print statements and format protocol version workload 2020-12-11 04:46:20 +00:00
Russell Sears 4cb821cd63 Merge remote-tracking branch 'upstream/release-6.2' into merge-6.2-to-6.3 2020-12-09 15:47:44 -08:00
Chaoguang Lin 9156d38216 Add knob SQLITE_READER_THREADS to control the number of read threads 2020-12-09 12:09:07 -08:00
Chaoguang Lin 5b55216252 Add rate control to AsyncFileCached 2020-12-09 12:03:32 -08:00
Jingyu Zhou c68b62f89b Update txnCommitOut 2020-12-08 16:57:59 -08:00
sfc-gh-tclinkenbeard b8a55dd097 Use unique_ptr for PImpl 2020-12-08 09:09:32 -08:00
sfc-gh-tclinkenbeard 5020e3faa1 Make ILogSystem::IPeekCursor const-correct 2020-12-08 09:09:31 -08:00
sfc-gh-tclinkenbeard dd3669886e Improve IPeekCursor and ILogSystem method signatures 2020-12-08 09:09:31 -08:00
sfc-gh-tclinkenbeard f3c0d26806 Make ISimulator::BackupAgentType an enum class 2020-12-08 09:09:30 -08:00
Andrew Noyes cc669f399e Merge remote-tracking branch 'upstream/release-6.3' into anoyes/merge-release-6.3-master 2020-12-07 22:26:11 +00:00
Richard Chen df37751f6f Merge branch 'master' of https://github.com/apple/foundationdb into stable_interfaces 2020-12-07 00:40:01 +00:00
Jingyu Zhou bf59ab684e Revert change to forceRecovery 2020-12-04 16:36:34 -08:00
Andrew Noyes 7fbc4d7391 Resolve conflicts 2020-12-04 23:58:42 +00:00
Jingyu Zhou 5ad0878254
Update fdbserver/MasterProxyServer.actor.cpp
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2020-12-04 14:27:49 -08:00
Chaoguang Lin 50dda323f8 Add a test for \xff\xff/worker_interfaces 2020-12-04 13:46:07 -08:00
A.J. Beamon fa4d87f432
Merge pull request #4112 from sfc-gh-tclinkenbeard/6.3-fix-tlog-pop-slow-task
Yield while processing ignored pop requests on tlog
2020-12-04 09:20:36 -08:00
Andrew Noyes 877997632d Merge branch 'release-6.3' into anoyes/merge-release-6.3-master
Include conflict markers for review purposes
2020-12-04 01:38:07 +00:00
Jingyu Zhou e499e3ad70 Remove unneeded knob change for workloads 2020-12-03 15:10:34 -08:00
Xin Dong 78503db523 Reset and retry transaction errors 2020-12-03 14:42:30 -08:00
Xin Dong ac02329d7d Added a command in fdbcli to allow user to manually trigger the detailed teams info loggings in data distributor 2020-12-03 14:42:30 -08:00
sfc-gh-tclinkenbeard e9c31b4200 Move ignore pop logic from tLogPopCore to tLogPop 2020-12-03 14:23:49 -08:00
sfc-gh-tclinkenbeard 6184236c87 Add code coverage macro to processPopRequests 2020-12-03 11:56:55 -08:00
sfc-gh-tclinkenbeard 7e815ebb68 Added TLOG_POP_BATCH_SIZE knob 2020-12-03 11:56:55 -08:00
sfc-gh-tclinkenbeard 1003057a7e Removed TLogData::toBePoppedMutex 2020-12-03 11:56:52 -08:00
Jingyu Zhou beccd6e3c2 Do not reject blind writes, i.e., empty read conflict range 2020-12-03 11:02:41 -08:00
Arjun Arun 788391e5e3 Remove word Ghetto from repo 2020-12-02 21:52:57 -06:00
sfc-gh-tclinkenbeard a8af598307 Addressed review comments 2020-12-02 16:24:55 -08:00
Trevor Clinkenbeard 9d702099cf
Update fdbserver/TLogServer.actor.cpp
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2020-12-02 16:11:23 -08:00
Richard Chen c77d9e4abe merge conflicts 2020-12-02 21:53:19 +00:00
Russell Sears a91bd6338b
Merge pull request #4084 from Daniel-B-Smith/rocksdb-suggest-compact-again
Attempt to suggest compactions after commit finishes
2020-12-02 12:12:06 -08:00
Jingyu Zhou 52d238967f Allow backup's "ApplyMutations" to bypass rejection
These mutations should always commit.
2020-11-30 21:06:56 -08:00
Jingyu Zhou 3d9ac0b050 Address review comments 2020-11-30 09:29:18 -08:00
Jingyu Zhou 6bf1e5f3f9 Update release notes 2020-11-29 14:34:25 -08:00
Jingyu Zhou da55d73885 Fix BackupToDBAbort workload w.r.t. proxy rejection 2020-11-29 14:07:01 -08:00
Jingyu Zhou 589ee01979 Fix backup workload w.r.t. proxy rejection 2020-11-29 09:45:55 -08:00
Jingyu Zhou df5293e2be Add a knob PROXY_REJECT_BATCH_QUEUED_TOO_LONG
Disable the proxy rejection feature for backup workload, because of the
ApplyMutationsError.
2020-11-28 21:32:41 -08:00
Jingyu Zhou 5cb0b138be Don't reject transactions for the recovery transaction
Otherwise, the recovery just keeps repeating.
2020-11-27 09:49:29 -08:00
sfc-gh-tclinkenbeard 5237549c19 Yield while processing ignored pop requests on tlog 2020-11-26 20:36:02 -08:00
Jingyu Zhou 78c82dc891 Skip transactions that queued for a long time 2020-11-25 23:05:26 -08:00
Markus Pilman 18ba83fc3d
Merge pull request #4105 from sfc-gh-anoyes/anoyes/release-6.3-merge
Merge release-6.2 into release-6.3 and fix conflicts
2020-11-25 11:16:05 -07:00
Andrew Noyes b8a9807336 Move trackerCancelled higher in catch block 2020-11-24 20:34:06 +00:00
Andrew Noyes dc2bac5670 Resolve conflicts 2020-11-24 19:09:42 +00:00
Andrew Noyes 1f541f02be Merge branch 'anoyes/merge-6.2-to-6.3' into anoyes/release-6.3-merge
Merge, leaving conflict markers for now
2020-11-24 16:55:34 +00:00
sfc-gh-tclinkenbeard aa07df6a91 Backport to 6.3 the ability to read xxhash3 checksum for sqlite pages 2020-11-24 00:28:49 -08:00
Andrew Noyes 231a7a10d0
Merge pull request #4075 from sfc-gh-tclinkenbeard/sqlite-xxhash3-checksum
Sqlite xxhash3 checksum
2020-11-23 10:43:35 -08:00
Trevor Clinkenbeard 4029217c2f
Update fdbserver/KeyValueStoreSQLite.actor.cpp
Co-authored-by: Andrew Noyes <63815641+sfc-gh-anoyes@users.noreply.github.com>
2020-11-23 10:07:08 -08:00
sfc-gh-tclinkenbeard 156a617ed8 Replace xxhash64 with xxhash3
The goal here is to improve performance
2020-11-21 18:25:35 -08:00
sfc-gh-tclinkenbeard b1e3478267 Simplified global knobs 2020-11-21 13:27:48 -08:00
Markus Pilman 615029a393
Merge pull request #4082 from sfc-gh-dyoungworth/dyoungworth/merge_6_3_master
merge 6 3 master
2020-11-19 20:53:51 -07:00
Andrew Noyes 5c9d7c1d94 Actually use privatized arena 2020-11-19 12:48:34 -08:00
Andrew Noyes b405d86301 Fix heap use after free 2020-11-19 10:40:24 -08:00
Evan Tschannen 2e13aacf1e
Merge pull request #4049 from sfc-gh-tclinkenbeard/fix-shard-merge-too-soon
Fix HasBeenTrueFor::set function
2020-11-18 10:29:58 -08:00
Evan Tschannen 7cabe39aae
Merge pull request #4044 from sfc-gh-tclinkenbeard/fix-slow-sim-recovery
Lower DELAY_CC_WORST_FIT_CANDIDACY_SECONDS for LowLatencyWorkload
2020-11-18 10:28:53 -08:00
Russell Sears fc8b57189e
Merge pull request #3969 from Daniel-B-Smith/rocksdb-unsafe-fsync
RocksDB grab bag
2020-11-18 10:09:24 -08:00
sfc-gh-tclinkenbeard bcda617b80 Benchmark hashing algorithms in flowbench 2020-11-17 23:14:55 -08:00
David Youngworth fc9b78737f Fix some merge bugs 2020-11-17 14:53:02 -08:00
Markus Pilman 659f58d28d
Merge pull request #4001 from sfc-gh-ljoswiak/features/simulation-tracing
Open random tracer in simulation
2020-11-17 14:27:18 -07:00
David Youngworth d64cf8b9e3 Merge branch 6.3 into master 2020-11-17 11:22:45 -08:00
Lukas Joswiak 71d0b1da85 Open random tracer in simulation 2020-11-17 09:51:19 -08:00
sfc-gh-tclinkenbeard e9bceda8ca Add code coverage macros to PageChecksumCodec::checksum 2020-11-16 19:13:18 -08:00
sfc-gh-tclinkenbeard 84632d63ad Remove stale comment 2020-11-16 17:54:47 -08:00
David Youngworth fdf41110e5 Merge branch 'release-6.3' into dyoungworth/merge_6_2_to_6_3 2020-11-16 14:49:51 -08:00
David Youngworth 489ba20641 Fix several merge issues 2020-11-16 14:46:36 -08:00
sfc-gh-tclinkenbeard 898fbac191 Remove special handling of broken_promise errors 2020-11-16 13:28:08 -08:00
sfc-gh-tclinkenbeard 4f369e70ab Add code coverage macro to DataDistributionTracker::SafeAccessor::operator() 2020-11-16 13:25:39 -08:00
sfc-gh-tclinkenbeard 6235d087a6 Prevent shardTracker or trackShardBytes from accidentally unsafely accessing DataDistributionTracker 2020-11-16 12:46:21 -08:00
David Youngworth d0391db862 Merge branch 'release-6.2' into release-6.3 2020-11-16 10:15:23 -08:00
sfc-gh-tclinkenbeard ca8ea3b6ff Fix memory issues caused by cancelling data distribution tracker 2020-11-15 23:52:36 -08:00
sfc-gh-tclinkenbeard 287ab51502 Fix some bugs in PageChecksumCodec::checksum 2020-11-15 20:35:05 -08:00
sfc-gh-tclinkenbeard 145cddbb99 Clear the first 8 bits of xxHash64 checksum of SQLite pages
When upgrading a cluster to 7.0 which has stale data from 6.2, we don't
want to calculate an xxHash64 checksum for every page that's read. By
clearing the first 8 bits of all checksums written using xxHash64, the
server can skip this checksum in 255/256 cases when reading pages
written from version <=6.2.
2020-11-15 17:48:44 -08:00
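The 255/256 figure in the commit message above can be illustrated with a minimal sketch. This is not the FoundationDB implementation. It assumes "first 8 bits" means the most significant byte of a 64-bit checksum, uses a truncated BLAKE2b hash as a stand-in for xxHash64 (which is not in the Python standard library), and every function name is hypothetical. The order of checks in the real reader may differ.

```python
import hashlib

MASK_56 = (1 << 56) - 1  # keeps the low 56 bits, clears the top 8 bits


def new_checksum(page: bytes) -> int:
    """Stand-in for xxHash64 (truncated BLAKE2b), with the top 8 bits cleared."""
    h = int.from_bytes(hashlib.blake2b(page, digest_size=8).digest(), "big")
    return h & MASK_56


def might_be_new_format(stored: int) -> bool:
    """A stored checksum with any of the top 8 bits set cannot be new-format."""
    return (stored >> 56) == 0


def verify(page: bytes, stored: int, legacy_checksum) -> bool:
    """Compute the new hash only when the stored value could have come from it.

    legacy_checksum is a hypothetical callable standing in for the older
    checksum algorithm used by pages written before the change.
    """
    if might_be_new_format(stored) and new_checksum(page) == stored:
        return True
    return legacy_checksum(page) == stored
```

If legacy checksums are roughly uniformly distributed, only 1 in 256 of them has a zero top byte, so the new hash is skipped for the remaining 255/256 of legacy pages.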
sfc-gh-tclinkenbeard c5689c4a72 Use xxhash64 for checksumming SQLite pages 2020-11-15 17:48:44 -08:00
sfc-gh-tclinkenbeard fff8e34b4d Move IKeyValueContainer from flow to fdbserver 2020-11-15 12:23:08 -08:00
sfc-gh-tclinkenbeard 575b36bf53 Move RadixTree from flow to fdbserver 2020-11-15 11:57:54 -08:00
sfc-gh-tclinkenbeard 12a6205d99 Move MetricLogger from fdbclient to fdbserver 2020-11-15 11:41:57 -08:00
sfc-gh-tclinkenbeard fcf92b8477 Improve type-safety of fdbserver.actor.cpp 2020-11-14 23:06:46 -08:00
sfc-gh-tclinkenbeard 82a50ea157 Improve type safety of ClientLogEvents 2020-11-14 19:22:19 -08:00
sfc-gh-tclinkenbeard eab75d4ee1 Make enums automatically binary serializable 2020-11-14 19:22:04 -08:00
Jingyu Zhou 569ab46bf6
Merge pull request #4000 from xumengpanda/mengxu/ha-code-read
Add comments to TLog, SS, and DD related code
2020-11-14 09:06:05 -08:00
Meng Xu 4b0fba6ea8 Explain waitForVersion why wait for version minus MAX_READ_TRANSACTION_LIFE_VERSIONS 2020-11-13 22:14:01 -08:00
sfc-gh-tclinkenbeard 45c9a0abc7 Revert "Revert "Add limiting health metrics""
This reverts commit 209ebcc595.
2020-11-13 17:24:57 -08:00
Trevor Clinkenbeard 209ebcc595
Revert "Add limiting health metrics" 2020-11-13 17:08:46 -08:00
Trevor Clinkenbeard 8c0b4dbe4c
Merge pull request #4067 from sfc-gh-tclinkenbeard/add-limiting-health-metrics
Add limiting health metrics
2020-11-13 16:04:44 -08:00
Jingyu Zhou 9f2399f951
Merge pull request #4061 from xis19/reportHistogramPeriodically
Report histogram periodically
2020-11-13 14:14:25 -08:00
Jingyu Zhou d4a90565e1
Merge pull request #4056 from xis19/release-6.2
Add bytes_per_second unit in histograms
2020-11-13 14:12:20 -08:00
Xiaoge Su 65076db908 Move the histogramReporter() to actors list 2020-11-13 13:15:58 -08:00
sfc-gh-tclinkenbeard 6917da9ce7 Fix HasBeenTrueFor::set 2020-11-13 13:15:51 -08:00
sfc-gh-tclinkenbeard 9bb93dadf1 Reenabled Throttling.toml test (as a rare test) 2020-11-13 11:34:32 -08:00
Jingyu Zhou e92c98d2f2
Merge pull request #4059 from vishesh/task/rdar-70979996-tlog-latencies
tLog: Track tlog commit latencies in histogram
2020-11-12 21:49:31 -08:00
sfc-gh-tclinkenbeard 6c4493166f Add limiting storage queue and durability lag to health metrics 2020-11-12 20:14:41 -08:00
Vishesh Yadav 4df23741b2 tLog: Track tlog commit latencies in histogram 2020-11-12 17:48:16 -08:00
Xiaoge Su 3a6948c199 Report histogram periodically 2020-11-12 17:04:33 -08:00
Xiaoge Su 4a0fa57989 Add bytes_per_second unit in histograms 2020-11-12 15:38:51 -08:00
Xin Dong 8343c78bf0
Merge pull request #3960 from dongxinEric/misc/expose-proxy-local-rate-info
Expose local transaction rate and limit info on commit proxies.
2020-11-12 15:23:50 -08:00
Xiaoge Su a02b721170
Merge pull request #4013 from xis19/ddmetric
Record shard moves metrics
2020-11-12 14:00:43 -08:00
Meng Xu 222da17558 Merge branch 'release-6.2' into mengxu/ha-code-read 2020-11-12 13:39:27 -08:00
Meng Xu 046a6e8427 Add Alex comment on tLog 2020-11-12 13:29:11 -08:00
Jon Fu cc13ef08bd Sort the failed sets before modifying them in attempts to make changes consistent 2020-11-12 16:26:34 -05:00