Evan Tschannen
8449badb3e
Merge pull request #1868 from dongxinEric/fix/1827/error_instead_of_timeout
...
Send error back before putting the GRV request with PRIORITY_BATCH into t…
2020-02-04 14:32:47 -08:00
Evan Tschannen
4524831456
Merge pull request #2518 from vishesh/task/failmon-remove-server
...
FailureMonitoring: Server processes no longer need to talk to ClusterController
2020-02-03 17:22:50 -08:00
Xin Dong
7016f7903b
Fixed another build error. Do not use timeReplyIgnoreError, since we do not want the logging inside that function and it is therefore no longer necessary. Use ready() instead, which basically ignores the error.
2020-01-31 15:48:29 -08:00
Alex Miller
ee6490c9d1
Merge pull request #2314 from mengranwo/memory-engine
...
New Radix-Tree based Memory Storage Engine
2020-01-30 16:20:13 -08:00
Xin Dong
7216961e46
Do not time the error.
2020-01-30 14:13:56 -08:00
Xin Dong
e21426d12a
Send error back to the GRV requests with batch priority when the cluster is saturated, instead of blindly enqueuing the requests and letting the client time out.
2020-01-30 14:13:56 -08:00
Alex Miller
2bc5b2cf8a
Merge pull request #2585 from Ma27/fix-glibc230-build
...
Fix build with glibc 2.30
2020-01-23 20:21:32 -08:00
Evan Tschannen
76e192d490
Merge pull request #2538 from alexmiller-apple/hashlittle2-to-crc32c
...
Convert more hashlittle{,2} uses to crc32c_append
2020-01-23 17:54:38 -08:00
Evan Tschannen
6c0b934dda
Merge pull request #2242 from alexmiller-apple/fix-10min-stall-again
...
Fix the 10min multi-region recovery stall again
2020-01-23 17:53:02 -08:00
Maximilian Bosch
e133cb974b
Fix build with glibc 2.30
...
The `gettid()` function is part of glibc 2.30[1]. I decided to keep the
`gettid` implementation here under a different name to remain compatible
with older glibc versions.
[1] https://sourceware.org/ml/libc-alpha/2019-08/msg00029.html
2020-01-23 09:28:18 +01:00
Jingyu Zhou
17002740bb
Add epoch and backup workers to DBCoreState
...
This enables backup workers to know the end version of the epoch. Additionally,
the master recovery only needs to deal with crashed backup workers by
recruiting new workers to back up the unfinished version range.
2020-01-22 19:38:45 -08:00
Jingyu Zhou
a4d6ebe79e
Recruit backup worker in newEpoch
2020-01-22 19:37:48 -08:00
Evan Tschannen
78adbea834
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# documentation/sphinx/source/release-notes.rst
# flow/Knobs.h
# versions.target
2020-01-21 21:38:19 -08:00
Evan Tschannen
afd3ec13ff
added knobs
2020-01-21 18:58:34 -08:00
Alex Miller
1cb311fcb8
Add an ASSERT_WE_THINK that peek cursors don't get timed_out()
...
This should prevent us from regressing and having multi-region
recoveries hang for 10min again.
2020-01-21 17:07:37 -08:00
Evan Tschannen
7a4b459f07
wait for a tls handshake to complete before returning a connection
...
wait for multiple tls errors before throttling
2020-01-21 16:45:15 -08:00
Vishesh Yadav
daef5f011a
Merge remote-tracking branch 'apple/master' into task/failmon-remove-server
2020-01-21 13:20:15 -08:00
Evan Tschannen
3f9d9d8b84
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# cmake/FlowCommands.cmake
# documentation/sphinx/source/release-notes.rst
# fdbclient/StorageServerInterface.h
# fdbserver/DataDistributionTracker.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/fdbserver.actor.cpp
# flow/Knobs.h
# flow/Platform.cpp
# versions.target
2020-01-16 18:37:47 -08:00
mengranwo
115ff8bf65
change getKey() interface, pass in uint8_t * only
2020-01-15 13:49:45 -08:00
mengranwo
6836e370a7
revert changes inside KVStoreTest, ready for code review
2020-01-15 13:49:45 -08:00
mengranwo
f597aa7e18
WIP : deployable/stable version since Nov 3. Start rebase to master branch
2020-01-15 13:49:45 -08:00
Evan Tschannen
e65760eb46
Merge pull request #2536 from etschannen/feature-commit-latency
...
Improved commit latency in large clusters
2020-01-13 19:12:02 -08:00
Alex Miller
31fbf84ac5
Make FastAlloc use crc32c instead of hashlittle2
2020-01-13 18:23:12 -08:00
Alex Miller
da73164eda
Move crc32c from fdbrpc to flow
...
So that we can use it from a piece of flow code without breaking module
boundaries.
Also rename generated-constants to crc32c-generated-constants so that
it's more apparent that they're related files.
2020-01-13 18:19:30 -08:00
Evan Tschannen
0e916fdbed
throttle client TLS errors longer than server errors so that when both happen simultaneously the server throttling will be disabled when the client makes its next attempt
2020-01-12 22:12:18 -08:00
Evan Tschannen
1f7eb1f738
throttle outgoing tls connections before establishing a network connection
...
store serverTLSConnectionThrottler map inside of g_network, so that it works properly with simulation
2020-01-12 16:44:30 -08:00
Evan Tschannen
ef5dfb87dc
Merge pull request #2529 from bnamasivayam/tls-throtlling
...
Establishing TLS connection through the handshake process is expensiv…
2020-01-12 14:56:21 -08:00
Balachandar Namasivayam
741aa523e6
Establishing a TLS connection through the handshake process is expensive, and the fdbserver process can easily be saturated by repeated TLS handshakes when only a few hundred clients have bad certificates. Hence, throttle the number of handshakes done on the server per client IP if the client has a bad certificate.
2020-01-10 16:19:41 -08:00
Evan Tschannen
2e20c12200
Merge pull request #2475 from ajbeamon/priority-busy-fixes
...
Fix PriorityBusy calculation and add PriorityMaxBusy
2020-01-10 12:47:17 -08:00
Evan Tschannen
176a1b6319
Merge pull request #2515 from ajbeamon/remove-timer-in-slowtask-profiler
...
Fix slow task profiler crash
2020-01-10 12:41:57 -08:00
Evan Tschannen
a5f544818c
Merge pull request #2420 from ajbeamon/trace-clock-source-fix
...
Revert change to make g_trace_clock thread_local, ...
2020-01-10 12:36:38 -08:00
Alvin Moore
7628d04fb9
Merge branch 'release-6.2' of github.com:apple/foundationdb into release_6.2_merge
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2020-01-09 07:21:16 -08:00
Vishesh Yadav
598b2eaeb0
fdbrpc: Add warning when peer is unavailable for long time
2020-01-08 13:55:13 -08:00
A.J. Beamon
de5a591b15
Attempt a minor pointless change to fix the build
2020-01-06 15:17:13 -08:00
A.J. Beamon
6cf38790d6
Reorganize declaration of variable and add release note.
2020-01-06 12:27:56 -08:00
A.J. Beamon
4a52864023
Remove call of timer() from the slow task profiling signal handler, as it can lead to crashes if called at the wrong time.
2020-01-06 12:19:45 -08:00
Evan Tschannen
16b5af067c
changed trace event name
2020-01-03 16:03:29 -08:00
Evan Tschannen
deb032745a
fix: do not set logged until the end of the function
2020-01-03 12:45:23 -08:00
Evan Tschannen
1867d30017
added asserts to protect against future actions on a trace event that has been logged
2020-01-03 12:31:06 -08:00
Evan Tschannen
7152469cc3
log the base trace event before the endpoint messages
2020-01-03 12:15:38 -08:00
Evan Tschannen
6e473c3a83
Merge branch 'release-6.2' into feature-addpeer-fix
2020-01-02 17:37:23 -08:00
Evan Tschannen
032797ca5c
Merge pull request #2430 from etschannen/release-6.2
...
Reduce recovery times caused by saturating the cluster controller
2020-01-02 17:35:59 -08:00
A.J. Beamon
3dd3ac3cfd
Merge branch 'release-6.2' into trace-clock-source-fix
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2020-01-02 15:14:12 -08:00
A.J. Beamon
ca01593067
Cap busyness at 1.0 at logging time to cover all cases where it could be measured above 1.0.
2020-01-02 15:10:42 -08:00
Evan Tschannen
9e137d3b49
fix: addPeerReference only marks a connection as healthy if it is the first peerReference
...
added additional logging to long LoadBalance calls, and when the failure monitor state changes for an address
2019-12-19 18:26:29 -08:00
A.J. Beamon
3b28c7f103
Throw the correct error in deleteFile
2019-12-19 14:13:09 -08:00
A.J. Beamon
414be7a0e4
Merge pull request #2479 from AlvinMooreSr/release_6.2_merge
...
Release 6.2 merge
2019-12-19 09:36:05 -08:00
Evan Tschannen
d8c3c2fda4
Improved prioritization of commit path on the proxies
2019-12-18 16:56:35 -08:00
Vishesh Yadav
902029bbec
Merge pull request #2448 from mpilman/issues/2446
...
Changed failure monitor ping delay to 1 second
2019-12-18 13:26:34 -08:00
Alvin Moore
21390c493a
Merge branch 'release-6.2' of github.com:apple/foundationdb into release_6.2_merge
...
Resolved merge by keeping the new test file from the master branch (SampleNoSimAttrition.txt) and adding the new constraint from the release branch on the existing test file (SimpleExternalTest.txt).
# Conflicts:
# tests/CMakeLists.txt
2019-12-18 09:05:08 -08:00