Evan Tschannen
5d3e69b6fc
Merge pull request #1820 from fzhjon/load-balance-locality
...
Introduced a knob that can turn locality on/off
2019-07-16 16:40:43 -07:00
mpilman
d5caf0c1b4
Merge branch 'flatbuffers-fixes2' of github.com:mpilman/foundationdb into flatbuffers-fixes2
2019-07-16 14:47:40 -07:00
Jon Fu
7d37040725
split the single knob into two for finer-grained control
2019-07-16 12:46:02 -07:00
Meng Xu
20f067e794
Merge with master:Resolve conflict with PR#1797
2019-07-16 10:52:28 -07:00
mpilman
6c6a1ca8f4
Expose serialization context to all traits
2019-07-15 12:58:31 -07:00
Meng Xu
1c0daa7f2c
Resolve review comments:Remove unneeded code
2019-07-12 18:10:04 -07:00
mpilman
75d4b612cf
Make object serializer versioned
2019-07-12 11:53:14 -07:00
Meng Xu
cf935ff9e6
Remove debug message and format code
2019-07-11 22:05:20 -07:00
Andrew Noyes
c96854fce2
Simplify IReplicationPolicy serialization
2019-07-11 17:35:37 -07:00
mpilman
be3a07826d
fixed serialization bug with ReplicationPolicy
2019-07-11 17:35:37 -07:00
Andrew Noyes
ae6f17625e
Support PacketBuffer's of arbitrary size
2019-07-11 17:35:37 -07:00
Andrew Noyes
59ddf091e8
Re-use writeToOffsets vector
2019-07-11 17:35:37 -07:00
Andrew Noyes
b9af1f43b7
Implement new dynamic_size_traits
2019-07-11 17:35:37 -07:00
Steve Atherton
1700d492cf
Merge pull request #1823 from ajbeamon/cache-hit-rate-in-status
...
Tweak cache hit calculations and add cache hit rate to status
2019-07-11 14:06:06 -07:00
Meng Xu
c6e42d6119
ReplicationPolicy:Add trace for the name of each keyIndex
2019-07-10 19:29:29 -07:00
A.J. Beamon
b4dbc6d7fa
Change the way cache hits and misses are tracked to avoid counting blind page writes as misses and count the results of partial page writes. Report cache hit rate in status.
2019-07-10 14:43:20 -07:00
Vishesh Yadav
c694931e33
sim2: Remove obsolete comment
2019-07-10 14:06:06 -07:00
Jon Fu
19a91765f6
introduced a knob that can turn locality on/off
2019-07-09 16:16:15 -07:00
Vishesh Yadav
4b8eb27134
fdbrpc: Move setStatus line in addPeerReference
2019-07-09 15:01:12 -07:00
Vishesh Yadav
983343978e
fdbrpc: ConnectionMonitor should close unreferenced after delay
...
Potentially for cases, where it goes up to 1 immediately.
2019-07-09 14:24:16 -07:00
Vishesh Yadav
22678267cd
fdbrpc: Don't drop idle connections from server
...
Instead try pinging the client and let that decide whether the client
is alive or not. Ideally, it should always be failed since a well
behaved client would have closed the connection.
2019-07-09 14:24:16 -07:00
Vishesh Yadav
1f9c80f633
fdbrpc: Instead of tracking last sent data, track last sent non-ping data
...
* This will allow client to continue monitoring peer connections while
connection stays open, so that there is no period of "uncertainty"
without previous no-monitoring approach.
* Use multiplier for incoming connection idle timeout
* Update idle connection timeout values and leaked connection timeout in
simulator.
2019-07-09 14:24:16 -07:00
Vishesh Yadav
867986cdea
fdbrpc: Reduced connection monitoring from clients
...
This patch makes two changes to connection monitoring:
1. Connection monitoring at client side will check if the connection
has stayed idle for some time. If the connection is unused for a
while, we close the connection. There is some weirdness involved here
as ping messages are themselves connection traffic. We get over
this by making it a two-phase process, first checking for idle
reliable traffic, followed by disabling pings and then checking for
idle unreliable traffic.
2. Connection monitoring of clients from server will no longer send
pings to clients. Instead, it keeps monitoring the received bytes and
closes the connection after a certain period of inactivity.
2019-07-09 14:24:16 -07:00
Vishesh Yadav
7647d3e3c0
fdbrpc: Don't use RequestStream for pings in ConnectionMonitor
...
RequestStream adds another count to peerReference, which means as long
as ConnectionMonitor is alive, we'll never get peerReference=0 keeping
unnecessary connections potentially alive.
2019-07-09 14:24:16 -07:00
Vishesh Yadav
3f4f71ff9f
fdbrpc: Increment peerReferences correctly
...
The constructor of FlowReceiver, which handles reference counting of
peerReferences, relied on calling a virtual method from the constructor,
whose behaviour isn't correct. This patch bubbles the result of that
virtual method down from the derived constructor to the base constructor.
2019-07-09 14:24:16 -07:00
Andrew Noyes
15c6f2b864
Explain SFINAE for has_serialization_done
2019-07-05 14:07:02 -07:00
Andrew Noyes
7350b3db30
Don't assume serializeReplicationPolicy succeeds
2019-07-05 14:07:02 -07:00
Andrew Noyes
e2ed56fa56
Convert ownedPtr to unownedPtr for IReplicationPolicy
...
Remove WriteRawMemory feature
Remove deserialization_done
2019-07-05 14:07:02 -07:00
Andrew Noyes
4c5ebd7609
Avoid assert when collecting vtables
2019-07-05 14:07:02 -07:00
Alex Miller
888f4f92e0
Fix errors and TaskPriority more priorities.
2019-07-03 21:03:58 -07:00
Alex Miller
8e1ab6e7db
Merge remote-tracking branch 'upstream/master' into flowlock-api
2019-06-28 17:32:54 -07:00
Evan Tschannen
5ccffe3cb3
Merge pull request #1684 from jzhou77/large-packet
...
Better handling for large packets
2019-06-28 16:19:01 -07:00
Alex Miller
bc4548e0d3
Fix sed accidentally rewriting a trace event to have an invalid field name.
2019-06-27 17:55:41 -07:00
Alex Miller
bf883d7055
Merge remote-tracking branch 'upstream/master' into flowlock-api
2019-06-25 14:26:50 -07:00
Evan Tschannen
0fe6edc254
Merge pull request #1678 from mpilman/features/external-workload
...
Features/external workload
2019-06-25 13:53:19 -07:00
Evan Tschannen
24937d8125
Merge pull request #1744 from vishesh/task/monitor-leader-on-demand
...
Fix setting enClientFailureMonitor global for client
2019-06-25 13:38:59 -07:00
Jingyu Zhou
e6ff23c420
Allow next packet size to be read when receiving data
...
For large packets, allocate sizeof(uint32_t) more bytes for next packet size.
Also add knob MIN_PACKET_BUFFER_FREE_BYTES, which is used to trigger allocation
of a new arena when free bytes are lower than this threshold.
2019-06-25 10:18:56 -07:00
Jingyu Zhou
c4e44e6697
Refactor: add missing include
2019-06-25 10:18:56 -07:00
Jingyu Zhou
54ac91342d
Refactor: include ordering in FlowTransport.actor.cpp
2019-06-25 10:18:56 -07:00
Jingyu Zhou
9d07e84dc8
Update fdbrpc/FlowTransport.actor.cpp
...
Co-Authored-By: Evan Tschannen <36455792+etschannen@users.noreply.github.com>
2019-06-25 10:18:56 -07:00
Jingyu Zhou
e0c3df899b
Break connection read into multiple smaller reads
...
And add yield between consecutive reads.
2019-06-25 10:18:56 -07:00
Jingyu Zhou
559a4b6e26
Add const qualifier
2019-06-25 10:18:56 -07:00
Jingyu Zhou
2363326ecb
Add knobs for various packet buffer sizes
2019-06-25 10:18:56 -07:00
Jingyu Zhou
b151141965
Better handling for large packets
...
On the sending side, a large packet is split into smaller pieces. On the
receiving side, use packet length to allocate buffer to avoid multiple memcpy
and allocations.
2019-06-25 10:18:56 -07:00
Alex Miller
7a500cd37f
A giant translation of TaskFooPriority -> TaskPriority::Foo
...
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
Vishesh Yadav
cbc2398254
fdbrpc: Remove default parameter from FlowTransport::createInstance
2019-06-25 01:17:38 -07:00
A.J. Beamon
189541d42c
LoadBalancedReplies were sending uninitialized versions of their subclasses when there was an error.
2019-06-20 13:57:56 -07:00
Alex Miller
12dbe13c9c
Provide a no-op O_CLOEXEC on windows to fix the build.
2019-06-19 17:16:06 -07:00
Alex Miller
df0baa0066
Merge pull request #1720 from mpilman/features/protocol-version
...
Make protocol version a type
2019-06-19 13:46:35 -07:00
mpilman
2eff2b7e21
First simple test is working (but very buggy)
2019-06-19 13:03:41 -07:00
mpilman
68ce9a5e75
ProtocolVersion type - second try
2019-06-18 17:55:27 -07:00
Alex Miller
4fa5dc0502
Merge remote-tracking branch 'upstream/master' into cloexec
2019-06-18 16:35:18 -07:00
mpilman
8576665a90
Revert "Revert "Make protocol version a type""
...
This reverts commit 455bf3b3ec.
2019-06-18 14:49:04 -07:00
Alex Miller
455bf3b3ec
Revert "Make protocol version a type"
2019-06-18 10:59:17 -07:00
mpilman
dc9522bd86
Address code-review comments
2019-06-16 09:59:15 -07:00
mpilman
da53a92bec
Make protocol version a type
...
This fixes #1214
The basic idea is that ProtocolVersion is now its own type. This
alone is an improvement as it makes many things more typesafe. For
each version, we can now add breaking features (for example Fearless).
After that, there's no need to test against actual (confusing) version
numbers. Instead a developer can simply test
`protocolVersion->hasFearless()` and this will return true iff the
protocolVersion is newer than the newest version that didn't support
fearless.
2019-06-16 09:59:15 -07:00
Evan Tschannen
6ececa94ce
Merge pull request #1640 from vishesh/task/client-failmon
...
Clients will no longer get failure monitoring info from cluster controller
2019-06-13 17:31:17 -07:00
Evan Tschannen
55f7e7d372
fix: The delay inside the disabledMap was causing the storage server updateStorage actor to run on the client process
2019-06-13 14:28:30 -07:00
Evan Tschannen
dccb9bc26d
fixed a number of correctness problems
2019-06-12 19:40:50 -07:00
Vishesh Yadav
42fafe8a42
Addressed review comments
2019-06-11 18:58:00 -07:00
Evan Tschannen
9fdbf0c846
Merge pull request #1477 from tclinken/features/local-rk
...
Local Ratekeeper
2019-06-11 13:24:08 -07:00
mpilman
4c62458172
Make valgrind work on Fedora 30
2019-06-11 10:27:08 -07:00
Trevor Clinkenbeard
8144882d7b
Merge branch 'apple-master' into features/local-rk
2019-06-10 19:40:25 -07:00
Vishesh Yadav
a8e408e268
run clang-format on changes
2019-06-10 14:10:24 -07:00
Vishesh Yadav
4316ef9ec6
failMon: For clients remove expireFailure and report failures only during connect
2019-06-09 00:43:38 -07:00
Vishesh Yadav
6fa7081a21
net: Don't make FailureMonitoring requests from client
...
This patch removes the need for clients to continuously contact
cluster coordinator for failure monitoring information. Instead, it
uses the FlowTransport to monitor the statuses of peers and update
FailureMonitor accordingly.
2019-06-09 00:43:38 -07:00
Vishesh Yadav
6b4d30c3ae
failmon: Identify client vs server when starting failure monitoring client
2019-06-09 00:43:12 -07:00
Evan Tschannen
29b96414e2
Merge branch 'release-6.1'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbclient/NativeAPI.actor.cpp
# fdbserver/Coordination.actor.cpp
# flow/Arena.h
# versions.target
2019-06-03 18:49:35 -07:00
Parallels
773f52d0a1
Merge remote-tracking branch 'upstream/master' into cloexec
2019-06-03 15:43:32 -07:00
Evan Tschannen
172a0a9009
added an extra trace event
2019-05-30 17:25:42 -07:00
sramamoorthy
d68a229772
makefile changes to accommodate boost/process.hpp
2019-05-28 22:07:46 -07:00
A.J. Beamon
603721e125
Merge branch 'master' into thread-safe-random-number-generation
...
# Conflicts:
# fdbclient/ManagementAPI.actor.cpp
# fdbrpc/AsyncFileCached.actor.h
# fdbrpc/genericactors.actor.cpp
# fdbrpc/sim2.actor.cpp
# fdbserver/DiskQueue.actor.cpp
# fdbserver/workloads/BulkSetup.actor.h
# flow/ActorCollection.actor.cpp
# flow/Net2.actor.cpp
# flow/Trace.cpp
# flow/flow.cpp
2019-05-23 08:35:47 -07:00
Evan Tschannen
f4fbaac6b0
Merge branch 'release-6.1'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# versions.target
2019-05-19 10:27:59 -07:00
Evan Tschannen
2b8b7954a9
in simulation, prevent data from being received over a connection 1 second after the connection is closed on the other end
2019-05-17 15:05:32 -07:00
Steve Atherton
5a8c97480a
Merge pull request #1506 from nikolas-ioannou/feature-pagecache-lru
...
AsyncFileCached: switch from a random to an LRU cache eviction policy
2019-05-17 13:42:21 -07:00
Alvin Moore
3acaa7343e
Enabled C++17 for all Windows projects
...
Set Visual Studio version to 2017 (first version to support C++17)
2019-05-16 17:44:13 -07:00
Evan Tschannen
a745a8094e
do not close a connection due to a missed ping if the process is still receiving data from the connection
2019-05-16 17:26:48 -07:00
Alvin Moore
94aed513c7
Switched Windows tools within projects to 2017
2019-05-16 15:05:11 -07:00
Alex Miller
69fb852ee0
Add more CLOEXEC-like things.
...
From missed call sites found during/after code review.
2019-05-14 20:30:58 -10:00
Alex Miller
1f02cd30e2
Mark all opened files as close-on-exec.
2019-05-13 16:05:49 -10:00
mpilman
46e7a0ca56
address reviews and make compile with `-Wunused-variable`
2019-05-13 14:15:23 -07:00
mpilman
0713e06efc
Started to work on Windows
2019-05-13 14:15:23 -07:00
mpilman
20c3f7f264
remove mixed-mode support
2019-05-13 14:15:23 -07:00
mpilman
42385c2f81
Fixed issues introduced during rebase
2019-05-13 14:15:23 -07:00
mpilman
f5fa3a65b4
some more fixes
2019-05-13 14:15:23 -07:00
mpilman
44db3450ec
Several flatbuffers bug fixes
2019-05-13 14:15:23 -07:00
mpilman
69fa3d3903
fixed compilation issues after rebase
2019-05-13 14:15:23 -07:00
mpilman
3bc5ee8c58
fix indentation
2019-05-13 14:15:22 -07:00
mpilman
47fdb78782
use EnsureTable for enums of enums
2019-05-13 14:15:22 -07:00
mpilman
642a96807b
Fixed compilation issues after rebase
2019-05-13 14:15:22 -07:00
mpilman
6afce01744
Implementation complete (not yet working)
2019-05-13 14:15:22 -07:00
mpilman
9eeb48c43d
Allow to turn on object serializer
...
This commit includes functionality to turn on
the object serializer for network communication.
This is done the following way:
- On incoming connections, a process will detect
whether the client supports the object serializer
and will only serialize responses with it, if it does
- On outgoing connections, the command line flag is used
to determine whether the object serializer should be used
to send data.
This way, a cluster can run in mixed mode. To upgrade one
can upgrade one process at a time and set the flag one process
at a time.
This is how this is tested on the simulator:
- The command line flag can take three options: on, off,
and random.
- For off, the object serializer will never be used.
- For on, the object serializer will always be used.
- For random, the simulator will flip a coin for each
process it starts up.
2019-05-13 14:15:22 -07:00
mpilman
ba83c458a6
types implemented
2019-05-13 14:15:22 -07:00
mpilman
fe81454ec2
basic functionality for object serializer
...
This commit includes:
- The flatbuffers implementation
- A draft on how it should be used for network messages
- A serializer that can be used independently
What is missing:
- All root objects will need a file identifier
- Many special classes can not be serialized yet as the
corresponding traits are not yet implemented
- Object serialization can not yet be turned on (this will
need a network option)
2019-05-13 14:15:22 -07:00
Evan Tschannen
8c3516951a
Merge branch 'release-6.1'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# versions.target
2019-05-12 20:13:49 -07:00
Alex Miller
c502ed3d15
Fix a variety of problems stemming from a wait() being added to push().
...
And that this code was previously insufficiently tested.
2019-05-10 14:55:11 -10:00
A.J. Beamon
5f55f3f613
Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used.
2019-05-10 14:01:52 -07:00
Alex Miller
510b0b2fcd
Fix DiskQueue not replaceFile'ing frequently enough for the final time.
2019-05-08 23:08:25 -10:00
Nikolas Ioannou
c2827f4fa3
Add page cache hit, miss, and eviction stats to SystemMonitor
2019-05-08 15:41:17 +02:00
Austin Seipp
af00248df6
fdbrpc: fix some print/scan format warnings
...
Signed-off-by: Austin Seipp <aseipp@pobox.com>
2019-05-06 13:35:29 -07:00