Evan Tschannen
3b5b03e435
ReplyPromise does not serialize an empty NetworkAddress
2019-03-26 12:05:43 -07:00
Evan Tschannen
d45159ebf7
Merge pull request #1307 from jzhou77/ratekeeper
...
Monitor placement of Ratekeeper and DataDistributor
2019-03-24 17:26:07 -07:00
Evan Tschannen
1fc6937802
changed NetworkAddressList to at most two addresses for performance
2019-03-23 17:54:46 -07:00
Evan Tschannen
36ab852bb1
Merge branch 'master' into ratekeeper
...
# Conflicts:
# fdbserver/ClusterController.actor.cpp
2019-03-22 18:41:00 -07:00
Evan Tschannen
efbcd18987
fixed a performance regression related to broadcasting a read version to too many transactions simultaneously
2019-03-22 16:05:20 -07:00
Jingyu Zhou
0fb6a03c07
First round of review comment fixes for PR#1307
2019-03-19 11:29:19 -07:00
Jingyu Zhou
254c78053c
Fix a segfault error
...
After wait, ServerDBInfo may have changed. Using the old copy is wrong.
2019-03-15 22:11:13 -07:00
A.J. Beamon
85b3f11e71
Fix various compiler warnings
2019-03-15 10:34:57 -07:00
Meng Xu
5a10bf5dfc
Merge branch 'master' into mengxu/tls-switch-status-PR
2019-03-14 10:35:12 -07:00
Steve Atherton
be0da73938
Merge pull request #1290 from etschannen/feature-cheap-policy
...
Optimized a few uses of the replication policy engine
2019-03-13 17:01:19 -07:00
Evan Tschannen
e7d1f9e5f1
fixed review comments
2019-03-13 15:59:03 -07:00
Evan Tschannen
e8cb85ed8e
optimize validateAllCombinations
2019-03-13 14:47:35 -07:00
Vishesh Yadav
c32504f705
io: Add DISABLE_POSIX_KERNEL_AIO knob to use EIO instead of Kernel AIO
...
- Some Linux filesystems don't support O_DIRECT, which is required by
Kernel AIO to function properly. Instead of using O_SYNC, EIO is a
much better option in terms of performance penalty.
- Some systems may not support AIO at all, e.g. Windows Subsystem for
Linux.
FIXES #842
RELATED #274
2019-03-13 13:39:45 -07:00
Evan Tschannen
a2108047aa
removed LocalitySetRef and IRepPolicyRef typedefs, because for clarity the Ref suffix is reserved for arena allocated objects instead of reference counted objects.
2019-03-13 13:14:39 -07:00
Evan Tschannen
e068c478b5
merge master
2019-03-12 18:31:25 -07:00
Evan Tschannen
a7e45cff91
Merge pull request #1176 from jzhou77/ratekeeper
...
Make Ratekeeper a separate role
2019-03-12 15:58:59 -07:00
Meng Xu
85c24b0067
Merge branch 'master' into mengxu/tls-switch-status-PR
2019-03-12 15:20:54 -07:00
Balachandar Namasivayam
880e8643d1
Fix Windows link errors
2019-03-11 17:49:03 -07:00
Evan Tschannen
044b6b4f8a
Merge branch 'master' into feature-degraded-tlog
...
# Conflicts:
# fdbserver/ClusterController.actor.cpp
2019-03-08 22:50:41 -05:00
Evan Tschannen
41c493f8d4
fix: connectPacket accessed uninitialized variables
2019-03-08 14:40:32 -05:00
Jingyu Zhou
5dcde9efe0
Fix locality per review comment and a mac compile error
2019-03-07 13:16:20 -08:00
Jingyu Zhou
3c86643822
Separate Ratekeeper from data distribution.
...
Add a new role for ratekeeper.
Remove StorageServerChanges from data distribution.
Ratekeeper monitors storage servers, which borrows the idea from
DataDistribution.
2019-03-07 13:16:20 -08:00
Meng Xu
04880e3d4d
Merge branch 'master' into mengxu/tls-switch-status-PR
2019-03-06 13:41:16 -08:00
Alex Miller
c6a65389ae
Remove noexcept macro and replace with BOOST_NOEXCEPT.
...
BOOST_NOEXCEPT does what the noexcept macro was supposed to do, but in a
way that is correctly maintained over time.
2019-03-05 22:06:12 -08:00
Alex Miller
af617d68e6
boost 1.52.0 -> 1.67.0 in all vcxproj files
2019-03-05 22:06:12 -08:00
Meng Xu
820548223a
Status: connected_coordinators misc minor changes
...
Change the rst document file;
Change the coding style to be consistent with the nearby code;
Ensure we always initialize the connectedCoordinatesNum to 0
even when the variable is not used.
2019-03-05 21:45:18 -08:00
anoyes
981426bac9
More ide fixes
2019-03-05 18:03:57 -08:00
Evan Tschannen
82d957e0bb
Merge pull request #1178 from vishesh/task/issue-963-IPv6
...
IPv6 Support
2019-03-05 17:14:16 -08:00
Meng Xu
afd7c1d497
AsynFileWinASIO: Make error checking consistent with Linux
...
In Linux, KAIO uses ASSERT to make sure open() flags have
OPEN_UNBUFFERED set.
In Windows, we use an if-condition and return io_errors() when the
flag is not set.
This PR makes the Windows implementation always use ASSERT to check the
flag.
2019-03-04 16:36:04 -08:00
Vishesh Yadav
5cd8bac6cb
fix: segfault due to external assignment of Endpoint::addresses #1201
...
isLocal() now checks if the address is equal to default
NetworkAddress() which should match the behaviour before TLS changes.
2019-03-04 15:49:11 -08:00
Vishesh Yadav
1d3e62c4e3
net: Don't use a union of IP in ConnectPacket #963
...
Keeping a union and using the packet size to figure out whether
the ConnectPacket is using an IPv6 or IPv4 address is not easily
maintainable. For simplicity, we just serialize everything in
ConnectPacket and stay backward compatible with the older format.
However, some code for much older stuff is removed.
2019-03-04 14:12:45 -08:00
Vishesh Yadav
e93cd0ff21
Add some checks and comments to IPv6 changes #963
2019-03-04 14:12:45 -08:00
Vishesh Yadav
592e224155
net: add/use formatIpPort to format IP:PORT pairs #963
2019-03-04 14:12:45 -08:00
Vishesh Yadav
cc9ad0e202
net: Use IPv6 in simulation testing #963
...
25% of the time we will use IPv6 addresses
2019-03-04 14:12:45 -08:00
Vishesh Yadav
57832e625d
net: Support IPv6 #963
...
- NetworkAddress now contains an IPAddress object which can be either
an IPv4 or IPv6 address. 128 bits are used even for IPv4 addresses,
however only 32 bits are used when using/serializing an IPv4 address.
- ConnectPacket is updated to store an IPv6 address. Backward compatible
with the old format since the first 32 bits of the IP address field are
used for serialization of IPv4.
- Mainly updates the rest of the code to use the IPAddress structure
instead of plain uint32_t.
- IPv6 address/port pairs should be represented as `[ip]:port` as per
convention. This applies to both cluster files and command line
arguments.
2019-03-04 14:12:41 -08:00
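Note on 57832e625d: the `[ip]:port` convention can be illustrated with a minimal sketch. `formatHostPort` below is a hypothetical helper written for this note, not the `formatIpPort` function added in 592e224155, and it assumes the address is already available as text.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (an assumption for illustration, not FoundationDB code).
// An IPv6 literal contains ':' itself, so it is wrapped in brackets to keep
// the port separator unambiguous; an IPv4 literal is left as-is.
std::string formatHostPort(const std::string& ip, uint16_t port) {
    assert(!ip.empty());
    std::string host = ip;
    if (ip.find(':') != std::string::npos) {
        host = "[" + ip + "]";
    }
    return host + ":" + std::to_string(port);
}
```

Without the brackets, an address such as `::1` followed by `:4500` could not be split back into host and port reliably.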
Meng Xu
94385447bc
Status: Get if client configured TLS
...
To understand if all clients have configured TLS,
we check the tlsoption when a client tries to open database.
This is similar to how we track the versions of multi-version clients.
2019-03-01 15:17:01 -08:00
Stephen Atherton
7d287c6999
Merge branch 'release-6.0'
...
# Conflicts:
# fdbclient/FileBackupAgent.actor.cpp
2019-02-28 14:01:00 -08:00
Stephen Atherton
887856b6b0
Bug fix in AsyncFileReadAhead where a file size that is an integer multiple of the read chunk size will cause a crash when reading the file's final block. BackupContainerLocalDirectory now uses AsyncFileReadAhead in simulation to get simulation coverage of that class, and FileBackup will generate file sizes which expose the bug.
2019-02-28 00:22:38 -08:00
Evan Tschannen
8afb7fbb9d
Merge pull request #1160 from alexmiller-apple/tstlog-fork
...
Spill-By-Reference TLog Part 2: New and Old TLogServers co-exist harmoniously
2019-02-26 18:00:04 -08:00
Alex Miller
2dc57568cb
Change many things about log_version.
...
* log_version in the database (`/conf/log_version`) is now a hint that gets
rounded to the nearest supported version.
* fdbcli and FDB enforce that only a valid log_version can be configured to
* TLogVersion is persisted in CoreTLogSet (and LogSet and TLogSet)
* Some comments here and there
* Add an assert on filename length to make sure KV-pairs in filename
don't exceed a maximum length.
2019-02-26 16:47:04 -08:00
Evan Tschannen
b8910ba7cd
Merge branch 'master' into feature-fix-force-recovery
...
# Conflicts:
# fdbclient/ManagementAPI.actor.h
# fdbserver/DataDistribution.actor.cpp
# fdbserver/storageserver.actor.cpp
# fdbserver/workloads/KillRegion.actor.cpp
2019-02-22 14:38:13 -08:00
Trevor Clinkenbeard
25b397977c
Never assign DataDistributor role to process of class CoordinatorClass
2019-02-20 17:22:01 -08:00
Trevor Clinkenbeard
1bb384db4d
Merge branch 'master' of https://github.com/apple/foundationdb into add-no-assign-class
2019-02-20 13:13:12 -08:00
mpilman
f14dee764b
Use fwd decl for connectionReader - fdbrpc compiling
2019-02-19 15:16:59 -08:00
mpilman
3bd9b9047b
Minor fixes - flow now compiling with intellisense
2019-02-19 15:16:59 -08:00
Evan Tschannen
065a45e05f
Merge branch 'master' into feature-fix-force-recovery
...
# Conflicts:
# fdbclient/ManagementAPI.actor.cpp
# fdbserver/ClusterController.actor.cpp
# fdbserver/workloads/KillRegion.actor.cpp
2019-02-18 17:09:06 -08:00
Vishesh Yadav
0898686c9b
Remove old TODO
2019-02-18 15:43:27 -08:00
Evan Tschannen
62603d11a1
updated the killRegion simulation test to test a much larger variety of failure scenarios
2019-02-18 15:32:51 -08:00
Vishesh Yadav
e05b53d755
Merge remote-tracking branch 'apple/master' into task/tls-upgrade
2019-02-15 20:37:07 -08:00
Vishesh Yadav
345fd7e4da
Prefer unencrypted ports at client side during transition
2019-02-15 20:23:07 -08:00
Evan Tschannen
83060c6e56
Merge pull request #1062 from jzhou77/PR
...
Add a new DataDistributor role.
2019-02-15 13:51:27 -08:00
mpilman
75f692b931
simplify actorcompiler and target to compile coveragetool
2019-02-15 00:01:42 -08:00
Jingyu Zhou
c35d1bf2ef
Fix according to Alex's comment
2019-02-14 16:30:13 -08:00
Jingyu Zhou
886e7ab2ba
Add a new DataDistributor role.
...
Let the cluster controller start a new data distributor role by sending a
message to a chosen worker.
Change MasterInterface usage in DataDistribution to masterId
Add DataDistributor rejoin handling.
This allows the data distributor to tell the new cluster controller of its
existence so that the controller doesn't spawn a new one. I.e., there should
be only ONE data distributor in the cluster.
If DataDistributor (DD) doesn't join in a while, then ClusterController (CC) tries
to recruit one as DD. CC also monitors DD and restarts one if it failed.
The Proxy is also monitoring the DD. If DD failed, the Proxy will ask CC for
the new DD.
Add GetRecoveryInfo RPC to master server, which is called by data distributor
to obtain the recovery Transaction version from the master server.
2019-02-14 16:30:13 -08:00
Vishesh Yadav
907446d0ce
Merge remote-tracking branch 'apple/master' into task/tls-upgrade
2019-02-14 11:37:38 -08:00
A.J. Beamon
9272a41e5f
Merge pull request #1146 from atn34/fix-actor-warning
...
Fix actor warning for cmake build
2019-02-13 11:01:37 -08:00
Andrew Noyes
3a38bff8ee
Use DISABLE_ACTOR_WITHOUT_WAIT_WARNING consistently
2019-02-13 10:30:35 -08:00
Andrew Noyes
067a445e06
Replace unused _ variables with wait(success(...))
2019-02-12 17:30:30 -08:00
Andrew Noyes
874a58cb4f
Suppress actor without wait for tests in cmake
2019-02-12 11:01:17 -08:00
mpilman
8a94d80deb
fdbservice and fdbrpc now compiling
2019-02-07 15:37:04 -08:00
Evan Tschannen
486e0e13c3
Merge pull request #1116 from alexmiller-apple/tstlog
...
Random cleanups that prepare for Spill-By-Reference TLog
2019-02-05 18:09:06 -08:00
A.J. Beamon
882f8d70b7
Merge pull request #1066 from etschannen/master
...
fix: coordinators auto could put two coordinators in the same zone
2019-02-05 11:52:04 -08:00
Alex Miller
6668b7c544
Make simulation enforce what KAIO requires.
2019-02-04 18:04:22 -08:00
Evan Tschannen
e9ddd94e27
The failure monitor is given a list of all IP addresses associated with a process
...
The connect packet includes the correct remote address
Did a lot of code cleanup
Simulation test mixed TLS and non-TLS listeners on the same process
2019-01-31 18:20:14 -08:00
Balachandar Namasivayam
9cf2b4e1e7
Improve TLS logging on error scenarios.
2019-01-29 17:04:09 -08:00
A.J. Beamon
05b38167d0
Update fdbrpc/sim2.actor.cpp
...
Co-Authored-By: etschannen <36455792+etschannen@users.noreply.github.com>
2019-01-29 11:35:02 -08:00
Trevor Clinkenbeard
2e0b3a7f1d
Added ProcessClass::CoordinatorClass, which can be used by coordinators, so that coordinators do not have to take on other roles if desired
2019-01-25 11:03:13 -08:00
Evan Tschannen
1d7fec3074
Merge commit '048bfc5c368063d9e009513078dab88be0cbd5b0' into task/tls-upgrade-2
...
# Conflicts:
# .gitignore
2019-01-24 17:43:06 -08:00
Evan Tschannen
9cf77d70bc
fix: getFirstLocalAddress has to be the same as primary address, because it is what we put in the connect packet, and we always connect from the primary address
2019-01-24 17:28:26 -08:00
Evan Tschannen
699f8dd617
fix: coordinators auto could put two coordinators in the same zone
...
simulation now tests two machines in the same zone
2019-01-18 15:42:48 -08:00
Evan Tschannen
4eb11d74af
Merge pull request #1029 from bnamasivayam/reenable-check_desired_classes
...
Re-enable CheckDesiredClasses after making necessary changes for mult…
2019-01-11 17:15:05 -08:00
Balachandar Namasivayam
a8e2e75cd5
Re-enable CheckDesiredClasses after making necessary changes for multi-region setup.
...
Fixed a couple of bugs
1) A rare race condition where a worker is being assigned roles even after it died.
2) Fix how RoleFitness is calculated for TLog and LogRouter. Only worst fitness is compared to see if a better fit is available.
2019-01-10 10:28:32 -08:00
Evan Tschannen
684a22a52b
Merge branch 'release-6.0'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbbackup/backup.actor.cpp
# fdbclient/BackupContainer.actor.cpp
# fdbclient/HTTP.actor.cpp
# fdbserver/storageserver.actor.cpp
# fdbserver/workloads/BackupCorrectness.actor.cpp
# versions.target
2019-01-09 16:14:46 -08:00
Vishesh Yadav
31c4ac07ac
WIP: FailureMonitoring use endpointAddressList (create individual endpoints for each address) WIP: g_currentDeliveryPeerAddress WIP: FlowTransport endpoint map WIP: Add peerReference to addressToEndpointMap
2019-01-09 07:46:01 -08:00
Vishesh Yadav
51b89ae083
WIP
2019-01-09 07:41:02 -08:00
Alex Miller
cebdb83def
Revert "Merge pull request #977 from alexmiller-apple/abspath"
...
This reverts commit 9881b1d074, reversing
changes made to 6d278e466b.
2019-01-08 16:52:09 -08:00
Evan Tschannen
57293a2db0
byte sample recovery did not use limits for its range reads, leading to slow tasks
2019-01-04 10:32:31 -08:00
Andrew Noyes
d5430d7bf8
Remove ignore "-Wreturn-local-addr" pragma
...
This seems to still build on gcc 8
2019-01-03 13:55:17 -08:00
Markus Pilman
dbe9baff1f
Several small compilation fixes for new versions of gcc
...
There are several missing includes for cmath in the code, I added those.
Next, Coro returns a reference to a stack variable and this causes a
warning. As this is probably ok for Coro, I disabled the warning in
that file for GCC. I want to have this warning in the build system as
it is generally a very useful warning to have.
Another change is that major and minor are deprecated for a while now.
I replaced those with gnu_dev_major and gnu_dev_minor.
ErrorOr currently implements operators ==, !=, and <. These do not
compile because Error does not implement ==. This compiles on older
versions of gcc and clang because ErrorOr<T>::operator== is not used
anywhere. It is still wrong though and newer gcc versions complain.
I simply removed these methods.
The most interesting fix is that TraceEvent::~TraceEvent is currently
throwing exceptions. This is illegal behavior in C++11 and a bad idea in
older versions of C++. For now I simply removed the throw, but this
might need some more thought.
2019-01-03 12:44:19 -08:00
Bhaskar Muppana
aa2a76ef4c
Merge pull request #981 from alexmiller-apple/cmake
...
Add a CMake build system
2019-01-02 18:50:15 -08:00
A.J. Beamon
d8f33a2419
Add parentheses to bitwise ops (turned up by clang after recent change)
2019-01-02 10:15:59 -08:00
anoyes
6a4d87802b
Replace & operator with variadic function
2018-12-28 11:33:42 -08:00
Steve Atherton
9881b1d074
Merge pull request #977 from alexmiller-apple/abspath
...
Use abspath when dealing with the simulator file-cache
2018-12-20 14:56:38 -08:00
Vishesh Yadav
209ecd09ee
Keep local addresses in a vector
2018-12-17 11:25:44 -08:00
Meng Xu
486a7b04fa
TeamCollection: Fix build in osX
...
On OS X, we cannot add an unsigned long to a string to append it to the string.
2018-12-14 13:44:11 -08:00
Markus Pilman
4ae701d8a9
minor bugfix to look up correct filename in cache
...
(manually cherry-picked from flat-buffers branch)
2018-12-13 22:21:25 -08:00
Markus Pilman
0207831fd6
Use abspath when dealing with the simulator file-cache
...
The simulator uses a hash table to cache all open files to make sure
that several simulated processes don't open the file more than once.
This currently doesn't work properly and deleted files are often kept
open forever. As a result, we often ran out of file descriptors.
The problem is luckily quite simple: files are often opened with an
absolute path but later a relative path is passed for deletion. This
is not working because the map that is used to store the file
descriptors is not aware of paths - so deleted files are often not
removed from this map. The fix that works for us is to just always
work with absolute paths when adding and removing files from this map.
2018-12-13 22:21:06 -08:00
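Note on 0207831fd6: the fix described above (always keying the file map by absolute path) can be sketched as follows. `OpenFileCache` and its members are hypothetical names used only for this illustration, and the sketch assumes C++17 `std::filesystem` rather than the simulator's own path helpers.

```cpp
#include <cstddef>
#include <filesystem>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for the simulator's open-file cache (an assumption,
// not the real class). Every path is converted to a normalized absolute path
// before it is used as a key, so adding with an absolute path and removing
// with a relative one refer to the same entry.
class OpenFileCache {
public:
    void add(const std::string& path, int handle) { files_[key(path)] = handle; }
    bool remove(const std::string& path) { return files_.erase(key(path)) > 0; }
    std::size_t size() const { return files_.size(); }

private:
    static std::string key(const std::string& path) {
        // absolute() resolves against the current working directory;
        // lexically_normal() drops "." and redundant separators.
        return fs::absolute(path).lexically_normal().string();
    }
    std::map<std::string, int> files_;
};
```

If the raw strings were used as keys instead, the relative path passed at deletion time would miss the entry and the handle would stay in the map, which matches the leak the commit message describes.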
Alex Miller
a982b9da72
Additional changes from a merge commit.
2018-12-13 17:13:41 -08:00
Alex Miller
e70e59a895
Change some file locations.
2018-12-13 14:53:19 -08:00
Markus Pilman
dce290909d
fdbserver now compiling
2018-12-13 14:13:47 -08:00
mpilman
51beb8b48c
fdbrpc compiling with cmake
2018-12-13 14:02:16 -08:00
Vishesh Yadav
e04abf25f7
simulator: Support multiple listeners on single process
...
Sim2Listener can now take the network address to listen on. This is
used to listen to multiple ports in simulator and test the patch
which added multiple network addresses to single endpoint.
2018-12-13 13:36:52 -08:00
Vishesh Yadav
3eb9b23024
Listen to multiple addresses and start using vector<NetworkAddress> in Endpoint
...
- This patch will make FDB listen to multiple addresses given via
command line. Although, we'll still use first address in most places,
this patch starts using vector<NetworkAddress> in Endpoint at some basic
places.
- When sending packets to an endpoint, pick a random network address in
endpoints
- Renames Endpoint::address to Endpoint::addresses since it
now holds a vector of addresses.
2018-12-13 13:36:52 -08:00
Vishesh Yadav
43e5a46f9b
Change Endpoint::address(NetworkAddress) to vector<NetworkAddress>
...
Extend the `Endpoint` class to take multiple NetworkAddresses instead of
just one. Hence, to talk to an endpoint, instead of one IP:PORT we'll
have multiple IP:PORT pairs.
This patch simply adds the field and makes changes to compile the
codebase. The first element of the `address` field is used everywhere.
Hence the way we talk to an endpoint remains the same with this patch.
NOTE:
Directly accessing the first member of Endpoint::address is unsafe
as Endpoint() doesn't enforce a non-empty address list. However, since
the correctness tests pass for now and we are replacing all those
unsafe accesses anyway with ones that consider the whole vector, this
patch does not bother to access them in a safe way.
2018-12-13 13:36:52 -08:00
Vishesh Yadav
e8e01b2406
Remove unused localAddress parameter from newNet2 and Net2 classes
2018-12-13 13:36:52 -08:00
Evan Tschannen
d9626895b1
Merge pull request #964 from xumengpanda/mengxu/teamcollection-release
...
TeamCollection: Use machine teams to create server teams to increase availability at scale when a machine has multiple servers
2018-12-13 13:18:54 -08:00
Meng Xu
e069b5c31c
TeamCollection: Use clang format
...
No functional change.
Signed-off-by: Meng Xu <meng_xu@apple.com>
2018-12-06 11:39:35 -08:00
Evan Tschannen
d2d68aa171
Merge branch 'release-6.0'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbclient/ManagementAPI.actor.cpp
# versions.target
2018-12-03 18:26:52 -08:00
Evan Tschannen
55a9c4a0f0
Merge pull request #955 from ajbeamon/fix-bad-error-creation-and-whitespace
...
throw platform_error; -> throw platform_error();. Convert some spaces to tabs.
2018-12-03 15:12:37 -08:00
A.J. Beamon
50c9dfdd01
Errors that occur in platform that are the result of IO issues are now raised as io_error rather than platform_error.
2018-11-30 10:55:19 -08:00
A.J. Beamon
97847f517b
throw platform_error; -> throw platform_error();. Convert some spaces to tabs.
2018-11-28 12:56:57 -08:00
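Note on 97847f517b: why the missing parentheses matter can be shown with a small sketch. It assumes, for illustration, that the error name is a factory function returning an error object; the `Error` struct, the error code, and `classify` below are simplified stand-ins, not the definitions in flow.

```cpp
#include <string>

// Simplified stand-ins (assumptions for illustration only).
struct Error {
    int code;
};

// Factory function: calling it produces an Error value. The code is arbitrary.
Error platform_error() { return Error{1}; }

// Reports which handler catches each form of the throw statement.
std::string classify(bool callFactory) {
    try {
        if (callFactory) {
            throw platform_error();  // throws an Error object
        }
        throw platform_error;        // throws a pointer to the function itself
    } catch (const Error&) {
        return "Error";
    } catch (...) {
        return "not an Error";
    }
}
```

Both forms compile, which is why the mistake is easy to miss: without the parentheses the thrown value is a function pointer, so handlers written for `Error` never see it.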
Meng Xu
8de031f9a6
TeamCollection: clang-format
...
Format the changes with git clang-format.
No functional changes.
Signed-off-by: Meng Xu <meng_xu@apple.com>
2018-11-21 11:18:26 -08:00
Meng Xu
f7a7e069f0
TeamCollection: Remove unnecessary comments
...
Pass 41806 tests with no failure
Signed-off-by: Meng Xu <meng_xu@apple.com>
2018-11-16 15:56:35 -08:00
Meng Xu
73c58852f0
TeamCollection: Resolve code review comments
...
Resolve code review comments:
1) Improve the code efficiency by avoiding unnecessary map search
and avoiding unnecessary checking
2) Remove or comment out trace events when they can be spammy
3) Improve coding style
Tested for 1 hour and no error was found.
KillRegionCycle.txt test was excluded from the test because
existing code cannot pass that test either
Signed-off-by: Meng Xu <meng_xu@apple.com>
2018-11-16 15:55:33 -08:00
Meng Xu
5051b35c61
TeamCollection: Use machine team to create server team
...
Current server team collection logic does not consider
the fact that multiple storage servers can run on the same machine.
When multiple machines fail, all servers on the machines will fail, and
the possibility of having one process team fail and lose data is very high.
To reduce the possibility of losing data when multiple machines fail,
we first create machine teams which span across different fault zones;
we then create server teams based on machine teams by
first picking 1 machine team, and then
picking 1 server from each machine in the machine team.
Signed-off-by: Meng Xu <meng_xu@apple.com>
2018-11-16 15:53:22 -08:00
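Note on 5051b35c61: the two-step selection described above (pick one machine team, then one server from each machine in it) can be sketched as below. The types and `buildServerTeam` are hypothetical and much simpler than the real TeamCollection code; how machine teams are formed across fault zones is not shown, and the `pick` chooser is an assumption standing in for whatever selection policy is used.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified types (assumptions for illustration only).
using ServerId = std::string;
struct Machine {
    std::string id;
    std::vector<ServerId> servers;  // storage servers running on this machine
};
using MachineTeam = std::vector<Machine>;

// pick(n) must return an index in [0, n); it is supplied by the caller so the
// sketch does not assume any particular selection policy.
template <class Pick>
std::vector<ServerId> buildServerTeam(const std::vector<MachineTeam>& machineTeams, Pick pick) {
    assert(!machineTeams.empty());
    // Step 1: pick one machine team.
    const MachineTeam& team = machineTeams[pick(machineTeams.size())];
    // Step 2: pick one server from each machine in that team.
    std::vector<ServerId> serverTeam;
    for (const Machine& m : team) {
        assert(!m.servers.empty());
        serverTeam.push_back(m.servers[pick(m.servers.size())]);
    }
    return serverTeam;
}
```

Because each member of the resulting server team comes from a different machine, no single machine failure can take out more than one member of the team.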
Evan Tschannen
4b5d0b4e2c
Merge branch 'release-6.0'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbclient/AsyncFileBlobStore.actor.cpp
# fdbclient/AsyncFileBlobStore.actor.h
# fdbclient/BlobStore.actor.cpp
# fdbclient/BlobStore.h
# fdbclient/HTTP.actor.cpp
# fdbclient/ManagementAPI.actor.cpp
# fdbclient/NativeAPI.actor.cpp
# fdbrpc/LoadBalance.actor.h
# fdbrpc/batcher.actor.h
# fdbrpc/fdbrpc.vcxproj
# fdbrpc/sim2.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/DataDistributionTracker.actor.cpp
# fdbserver/SimulatedCluster.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/masterserver.actor.cpp
2018-11-10 13:04:24 -08:00
Evan Tschannen
6f4ad84777
Merge pull request #903 from ajbeamon/move-batcher-into-proxy
...
Move the sort of generic batcher from fdbrpc and make it specific to …
2018-11-10 09:56:03 -08:00
Evan Tschannen
b8381b3cea
Merge branch 'release-6.0' of github.com:apple/foundationdb into release-6.0
2018-11-10 09:51:49 -08:00
A.J. Beamon
67a152ae9f
Move the sort of generic batcher from fdbrpc and make it specific to batching commits in master proxy. Also a couple minor formatting changes.
2018-11-09 14:19:18 -08:00
Evan Tschannen
56c51c1bb3
fix: usableRegions was uninitialized
2018-11-09 10:17:35 -08:00
Stephen Atherton
9d73166b3b
Many bug fixes related to concurrent page operations and pager shutdown.
2018-11-06 19:31:16 -08:00
Evan Tschannen
87295cc263
suppressed spammy trace events, and avoid reporting a long master recovery duration when the cluster is first created
2018-11-04 23:07:56 -08:00
Evan Tschannen
bf6545a9cf
clients cache storage server interfaces individually, instead of as a team. This is needed because in fearless every shard has storage servers from two separate teams, leading to a lot of possible combinations
...
allAlternatives failed logic was simplified, because we are already doing a global rate limiting, so a per shard limit is unnecessary
reduced unnecessary state variables in waitMetrics requests
2018-11-02 13:15:09 -07:00
Stephen Atherton
df3bdde50b
Many bug fixes. AsyncFileCached write() on a page with a zero-copy read in progress would orphan the old page before the read was finished. Pager file operations were not converting page id to int64 for byte offset calculation. Pager was not calling releaseZeroCopy() after readZeroCopy() if there was an error or cancellation. Pager reads were using some variables that could go out of scope. BusyPage's mechanism for notifying when a physical page is no longer in use is itself no longer in use and therefore removed. Pager shutdown now cancels all outstanding reads. Improved some debug output.
2018-10-31 02:14:55 -07:00
A.J. Beamon
776b289bfe
Move AsyncFileBlobStore and related files to fdbclient.
2018-10-26 13:49:42 -07:00
A.J. Beamon
58a0e22d3c
Remove sim2 dependency on fdbclient:
...
* Remove unused 'exclusionSet' that used a type from fdbclient.
* Replace usages of describe(x) with x.toString().
Also removed some using statements.
2018-10-26 09:23:12 -07:00
Alex Miller
6bb1f4093d
Merge pull request #856 from dropbox/pr/include-fix
...
Adjust all includes to be relative to the root.
2018-10-22 09:51:55 -07:00
Alex Miller
e2fc1c9b95
Remove specifying non-root directory as a path to search for includes.
2018-10-19 18:56:45 -07:00
Evan Tschannen
1ef29cbf0d
more windows build fixes
2018-10-19 17:00:24 -07:00
Robert Escriva
268093a96d
Adjust all includes to be relative to the root.
...
Remove the use of relative paths. A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h". Adjust so that every include references such a header with the
latter form.
Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
Evan Tschannen
db71b60d72
Merge pull request #819 from satherton/feature-redwood
...
Redwood storage engine, initial/experimental version
2018-10-18 18:38:11 -07:00
Evan Tschannen
0217aed74c
Merge branch 'release-6.0'
...
# Conflicts:
# bindings/go/README.md
# documentation/sphinx/source/release-notes.rst
# fdbserver/MasterProxyServer.actor.cpp
# versions.target
2018-10-15 18:38:51 -07:00
A.J. Beamon
a963ff7a64
Fix line endings
2018-10-08 09:30:09 -07:00
Stephen Atherton
22f8a4efa9
Normalized all unit test names to begin with "/" if they should be included in random unit testing.
2018-10-05 22:09:58 -07:00
A.J. Beamon
664f64881c
Port truncate optimization from Snowflake PR in order to make quick changes for a patch release.
2018-10-05 15:05:26 -07:00
Stephen Atherton
7c1dc305cb
Merge commit 'a72c8f5cb2e79a673abc0ed3d27ef1c51028fb13' into feature-redwood
2018-10-05 10:15:10 -07:00
Evan Tschannen
3922e477a5
Merge branch 'release-6.0'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbclient/ManagementAPI.actor.cpp
# fdbserver/ClusterController.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/LogSystemDiskQueueAdapter.actor.cpp
# fdbserver/SimulatedCluster.actor.cpp
# fdbserver/TLogServer.actor.cpp
2018-10-03 16:57:18 -07:00
A.J. Beamon
92990d6aef
Merge release-6.0 into master
2018-09-21 16:14:39 -07:00
Evan Tschannen
77e2fb787e
Merge branch 'release-6.0' into feature-fix-forced-recovery
2018-09-21 14:55:37 -07:00
Stephen Atherton
2fc86c5ff3
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# fdbrpc/AsyncFileCached.actor.h
# fdbserver/IKeyValueStore.h
# fdbserver/KeyValueStoreMemory.actor.cpp
# fdbserver/workloads/StatusWorkload.actor.cpp
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-09-20 03:39:55 -07:00
Evan Tschannen
42a67efb0c
the cluster controller should prefer to be located on a transaction class machine over a storage server class machine
2018-09-19 18:04:59 -07:00
Evan Tschannen
200e65fe61
added a workload which tests killing an entire region, and recovering from the failure with data loss.
...
fix: we cannot pop the txs tag from remote logs until they have a full copy of the txnStateStore
fix: we have to modify all of history, we cannot stop after finding a local remote
2018-09-17 18:32:39 -07:00
Evan Tschannen
4dd2dda0a3
Merge branch 'release-6.0'
...
# Conflicts:
# fdbserver/worker.actor.cpp
2018-09-05 16:11:06 -07:00
Evan Tschannen
df406a340e
Merge pull request #742 from ajbeamon/roles-in-trace-events
...
Add the roles running on a process as a field on trace events in the …
2018-09-05 16:08:12 -07:00
Evan Tschannen
90301f497f
Merge branch 'release-6.0'
...
# Conflicts:
# fdbclient/ManagementAPI.actor.cpp
# fdbrpc/FlowTransport.actor.cpp
# fdbrpc/TLSConnection.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/Status.actor.cpp
# fdbserver/storageserver.actor.cpp
# fdbserver/workloads/StatusWorkload.actor.cpp
# versions.target
2018-09-05 16:06:33 -07:00
A.J. Beamon
2de0b5d6d7
Add the roles running on a process as a field on trace events in the form of a comma delimited string of role abbreviations.
2018-09-05 15:06:14 -07:00
Evan Tschannen
dcdbb3ec4d
Merge branch 'release-6.0' of github.com:apple/foundationdb into feature-movekey-fixes
2018-09-05 10:29:13 -07:00
Evan Tschannen
21f5cf9ce9
suppress spammy trace events
2018-09-04 17:12:26 -07:00
Steve Atherton
89dd9cc4a3
Cherry-pick pull request #717 to release-6.0
...
Which contains:
* Improve TLS cert refresh logging.
* Loading a mismatching cert shouldn't prevent TLS connections.
* Initialize the cached copy of ca/cert/key data.
* Open certificates as uncached, which means they can be write-protected.
2018-08-23 16:53:40 -07:00
Steve Atherton
365fe992b4
Merge pull request #717 from alexmiller-apple/tls-refresh-fixes
...
Fix certificate reloading issues
2018-08-22 15:09:12 -07:00
Evan Tschannen
717c43a69f
merge 6.0 into master
2018-08-22 00:28:04 -07:00
Alex Miller
d2da969412
Improve TLS cert refresh logging.
...
Explicitly call out failure/success, and surface repeated cert
mismatches.
2018-08-21 15:05:41 -07:00
Alex Miller
4113b36df7
Loading a mismatching cert shouldn't prevent TLS connections.
...
set_{cert,key,ca}_data return pass/fail and do not throw. The existing
code wrongly assumed that they threw.
2018-08-21 15:02:54 -07:00
Evan Tschannen
26ec6ebac8
fixed line endings
2018-08-21 14:58:26 -07:00
Evan Tschannen
712aa00261
a better fix to the windows build issue
2018-08-21 14:54:38 -07:00
Alex Miller
4caacaaf4e
I would like to atone for my sins. But later.
...
This fixes the windows build. For some reason, MSVC believes that the
actor-compiled version of networkSender actually exists, but the
non-actor-compiled version doesn't exist.
This is a hackish workaround, as the largest reason to not include a
.g.h file is because it defines a POST_ACTOR_COMPILER define that messes
with actorcompiler.h's #defines. We can just undefine that after
including the file. ...but carefully.
2018-08-20 20:33:38 -07:00
Alex Miller
3ece3cf301
Initialize the cached copy of ca/cert/key data.
...
This was just purely an accidental oversight from before. The variables
were there and handled like they were actually initialized with the
contents of the various certificate files at start-up, but never
actually were.
And add a few trace events to make it easy to see when the system
noticed and tried to reload certificate data.
2018-08-20 19:09:34 -07:00
Alex Miller
fd866a3b47
Open certificates as uncached, which means they can be write-protected.
...
OPEN_READONLY still opens the file as read-write. To actually be
read-only, one needs to open the file as READONLY and UNCACHED.
2018-08-20 19:07:58 -07:00
Alex Miller
63b1e85338
Ban `Void _ = wait(...)` constructions, and require just `wait(...)`.
...
There's never any reason to save the value of a Void return, and it's
the easiest source of redefined variable bugs that will creep back in
over time. So just `wait(...)`, it's cleaner that way.
2018-08-14 15:50:26 -07:00
Alex Miller
86dbe1f0e9
Fix more instances of actorcompiler.h being in the wrong place.
2018-08-14 15:50:26 -07:00
Alex Miller
7feb5d8209
Remove including flow.h in actorcompiler.h, and fix resulting breakage.
...
For files that required flow.h, and only got it through actorcompiler.h,
their version of flow.h would have the actorcompiler #defines defined.
Then, if it included a STL/boost file, the same breakage would result.
This needs to not happen, so the include of flow.h in actorcompiler.h
was removed.
2018-08-14 15:50:26 -07:00
Alex Miller
bca324eaa6
More actorcompiler.h fixes and additions.
2018-08-14 15:50:26 -07:00
Alex Miller
fb31a6999f
Rewrite all files to have #include actorcompiler.h as the last include.
2018-08-14 15:50:26 -07:00
Alex Miller
07e5281142
Restrict actor keyword #defines to actor files.
...
This introduces a new rule in our codebase, that any file that #includes
actorcompiler.h needs to do it as the last #include, and it needs to
then #include unactorcompiler.h at the end of the file.
The point of this is that it prevents our actorcompiler.h #defines from
leaking into boost or the c++ standard library. Both of these start
throwing errors if you s/state// their code, which `#define state `
effectively does.
2018-08-14 15:50:26 -07:00
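The include-order rule described in 07e5281142 can be illustrated with a minimal sketch. It does not use the real actorcompiler.h or unactorcompiler.h; the bare `#define state` and `#undef state` below are stand-ins for what those headers are described as doing, and `counterAfterThreeSteps` and `Machine` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <string> // library headers come first, before any keyword macro exists

// Stand-in for the keyword define from actorcompiler.h (assumption: an empty macro).
#define state

// "Actor-style" code: the preprocessor deletes `state`, leaving a plain declaration.
int counterAfterThreeSteps() {
    state int counter = 0;
    for (int i = 0; i < 3; ++i) {
        counter += 1;
    }
    return counter;
}

// Stand-in for unactorcompiler.h: stop the macro from leaking any further.
#undef state

// With the macro gone, `state` is an ordinary identifier again. If the #undef
// above were missing, this member declaration would lose its name and fail to compile.
struct Machine {
    std::string state;
};
```

Moving the `Machine` definition above the `#undef` reproduces, in miniature, the kind of breakage the commit message attributes to boost and standard library headers.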
Alex Miller
535b5701e5
Rewrite all `Void _ = wait(...)` -> `wait(...)`.
...
This takes advantage of the new actorcompiler functionality to avoid
having duplicate definitions of `Void _` when trying to feed the
un-actorcompiled source through clang.
2018-08-14 15:50:26 -07:00
Evan Tschannen
cdcf056aef
Merge branch 'release-6.0'
2018-08-14 09:43:51 -07:00
A.J. Beamon
168dce94cb
Remove some trace event suppressions that were happening off the network thread. Downgrade some trace events related to trace logging problems from SevError to SevWarnAlways.
2018-08-14 09:00:43 -07:00
Evan Tschannen
3186fac397
Make sure we still accept some connections even if we are CPU bound by high priority work
2018-08-10 17:47:21 -07:00
A.J. Beamon
574c5576a2
Merge branch 'release-6.0' of github.com:apple/foundationdb
...
# Conflicts:
# fdbrpc/TLSConnection.actor.cpp
# versions.target
2018-08-10 14:31:58 -07:00
A.J. Beamon
3535ddad80
Merge pull request #674 from alexmiller-apple/glibcxx-debug-fixes
...
Fix bugs uncovered by -D_GLIBCXX_DEBUG
2018-08-09 08:18:51 -07:00
A.J. Beamon
24dec1529b
Merge pull request #673 from etschannen/release-6.0
...
A variety of bug fixes and performance improvements
2018-08-07 10:55:46 -07:00
Alex Miller
ff0e14d5a7
Fix a compilation error on windows.
2018-08-06 18:36:01 -07:00
Evan Tschannen
b5a133865d
Merge branch 'release-6.0' of github.com:apple/foundationdb into release-6.0
...
# Conflicts:
# fdbrpc/TLSConnection.actor.cpp
2018-08-06 18:26:54 -07:00
Evan Tschannen
22f2a1fedd
Merge pull request #676 from etschannen/master
...
fix: we should not free statdata ourselves, it will be deleted by libeio itself
2018-08-06 18:08:45 -07:00
Steve Atherton
fb46385a39
Merge pull request #628 from alexmiller-apple/reloadcertificates
...
Reload certificates if changed.
This is a cherry-pick of #628 back to release-6.0
2018-08-06 18:04:04 -07:00
Evan Tschannen
56e0b729c8
fix: we should not free statdata ourselves, it will be deleted by libeio itself
2018-08-06 17:46:53 -07:00
Alex Miller
d99592f8bd
Fix an out-of-bounds vector access.
2018-08-06 12:50:34 -07:00
Evan Tschannen
6f328d41ac
suppressed spammy trace events
2018-08-06 12:12:55 -07:00
Evan Tschannen
538e684f1c
Merge branch 'release-6.0'
...
# Conflicts:
# versions.target
2018-08-03 11:41:46 -07:00
Evan Tschannen
2619234477
Merge branch 'release-5.2' into release-6.0
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2018-08-03 11:40:24 -07:00
Evan Tschannen
21fe6adac4
fix: give time to do other work between accepting connections. It is expensive to accept TLS connections, so we have a slow task (which can kill other connections) if we accept too many connections in a row
2018-08-03 11:37:10 -07:00
Alex Miller
1a7cda4149
Stop performing self-moves. (e.g. a = std::move(a))
...
self-moves are frowned upon in C++, and in our code this generally happens from
calls to swap as part of trying to implement an "unordered erase" function via
swap-to-the-end-and-pop_back. For convenience, a swapAndPop() function is now
offered that performs this, while disallowing self-moves.
2018-08-01 18:09:54 -07:00
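A minimal sketch of the swap-to-the-end-and-pop_back idea from 1a7cda4149 follows. The signature is an assumption made for illustration and may differ from the helper actually added in that commit; only the name `swapAndPop` and the "no self-moves" requirement come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Unordered erase: remove the element at `index` in O(1) by swapping it with
// the last element and popping the back. Element order is not preserved.
template <class T>
void swapAndPop(std::vector<T>& container, std::size_t index) {
    assert(index < container.size());
    if (index != container.size() - 1) {
        // Swap only when the two positions differ, so an element is never
        // swapped (and therefore never moved) onto itself.
        using std::swap;
        swap(container[index], container.back());
    }
    container.pop_back();
}
```

Calling `swapAndPop(v, 1)` on a vector holding 1, 2, 3, 4 leaves 1, 4, 3; erasing the last index skips the swap entirely and just pops.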
Evan Tschannen
1c29275672
call all methods which could disable a trace event before it is initialized. In practice this means calling .error first, then .suppressFor, then all your details.
2018-08-01 14:30:57 -07:00
Alex Miller
f70f204d55
Fix a compilation error on windows.
2018-07-30 17:13:37 -07:00
Evan Tschannen
28a26d54f2
Merge commit 'ccf4384c79d026edbf76152e95e7410ebe621c1f' into release-6.0
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbrpc/FlowTransport.actor.cpp
2018-07-28 09:11:31 -07:00
Evan Tschannen
fa3b61508c
fix: do not increase numIncompatibleConnections if the connection was already incompatible
2018-07-28 08:50:54 -07:00
Stephen Atherton
4379a58bbe
Suppress potentially spammy event and don't log cancellation errors.
2018-07-27 21:03:10 -07:00
Stephen Atherton
59e005485d
Fixed bug where incompatible connection count was sometimes decremented twice for the same peer.
2018-07-27 20:48:14 -07:00
Stephen Atherton
6a3834c3f8
Fixed memory leak when destroying a FlowTransport.
2018-07-27 20:46:54 -07:00
Stephen Atherton
c593d1c6a2
Bug fix causing clients to sometimes (rarely) not reconnect to upgraded clusters. Reliable packets were being dropped to incompatible peers intentionally, but now this is only done if the peer is newer since successful communication with a newer peer will never be possible.
2018-07-27 20:42:06 -07:00
Steve Atherton
d1a877039d
Merge pull request #628 from alexmiller-apple/reloadcertificates
...
Reload certificates if changed.
2018-07-26 17:21:23 -07:00
Stephen Atherton
40762d9f9b
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
2018-07-25 17:58:52 -07:00
Evan Tschannen
95bc695f0e
Merge branch 'release-6.0' of github.com:apple/foundationdb into release-6.0
2018-07-25 13:06:54 -07:00
Evan Tschannen
89a3e2e1b4
Backed out connection closing changes because of upgrade problems
2018-07-25 13:06:13 -07:00
Alex Miller
262af775eb
Implement overly simple file write timestamps for simulation, and clean up code.
2018-07-24 17:20:31 -07:00
Alex Miller
168496f819
Poll the certificate files if TLS is enabled and reload them if changed.
...
This allows certificates to be changed/updated without having to restart fdbserver.
2018-07-20 19:00:32 -07:00
Alex Miller
2d26e98d07
Add a cross-platform getLastWrite() to get a file's mtime.
2018-07-20 19:00:32 -07:00
A.J. Beamon
a7a1124c11
Fix incompatible connection accounting that was incorrectly decrementing the incompatible count in some cases.
2018-07-17 11:36:05 -07:00
A.J. Beamon
8879954254
Merge pull request #609 from etschannen/release-6.0
...
Improved simulation strength by only removing datacenters that have been killed
2018-07-16 15:59:28 -07:00
Evan Tschannen
e0caa28758
code cleanup
2018-07-16 15:56:43 -07:00
AlvinMooreSr
aafb3c5c00
Merge pull request #593 from AlvinMooreSr/release-6.0-tls-funct
...
Replaced separate TLS Log function with FDB TraceEvent logger
2018-07-16 12:01:02 -07:00
Evan Tschannen
f72a9f60c0
only disable fearless if a datacenter has actually been killed
...
fix: we must prevent recovery into the dead datacenter while reducing usable_regions
2018-07-16 10:06:57 -07:00
Alvin Moore
a034acf3bd
Replaced separate TLS Log function with FDB TraceEvent logger
2018-07-11 18:41:46 -07:00
Stephen Atherton
96389c74cd
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-07-10 16:42:34 -07:00
Alec Grieser
d5a23642a1
Merge pull request #587 from etschannen/feature-remote-logs
...
close unneeded connections
2018-07-10 13:27:15 -07:00
Evan Tschannen
a35d5e30d9
Added a SevError trace event in case peer references become negative
2018-07-10 13:26:28 -07:00
Evan Tschannen
c25be5699a
close unneeded connections
2018-07-10 13:10:29 -07:00
Alec Grieser
be9c34c6f8
Merge remote-tracking branch 'upstream/release-5.2' into merge-release-5.2
2018-07-10 10:04:48 -07:00
Alec Grieser
ad37b1693d
Merge pull request #585 from etschannen/feature-remote-logs
...
A variety of cleanup and test strengthening commits
2018-07-10 09:58:44 -07:00
AlvinMooreSr
b3916a9b71
Merge pull request #409 from joelarmstrong/tlsconnection-clang-ub-warning
...
Fix compilation with clang from Apple LLVM 9.1.0
2018-07-10 09:32:24 -07:00
Stephen Atherton
1bc95862b7
Merge branch 'release-6.0' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-07-10 04:16:02 -07:00
Evan Tschannen
82cc30be62
added testing for two_satellite_fast and two_satellite_safe
2018-07-09 22:01:46 -07:00
Stephen Atherton
fddb3e87e2
Differentiate between a timeout in attempting to connect vs a timeout on an active connection by converting timeouts during connection attempts to connection_failed errors.
2018-07-09 19:40:01 -07:00
Stephen Atherton
3ce7c78d36
If an HTTP request fails due to a connection failure or a timeout, do not convert the error to the more generic http_request_failed.
2018-07-09 18:58:33 -07:00
Evan Tschannen
e503dc975c
fix: destroy peers that are inactive
...
do not open new connections to send replies
2018-07-09 13:37:06 -07:00
Evan Tschannen
5a2cb3037b
merge 5.2 into 6.0
2018-07-08 20:14:06 -07:00
Evan Tschannen
0e97ce79b4
fix: destroy peers that are inactive
...
do not open new connections to send replies
2018-07-08 10:26:41 -07:00
Stephen Atherton
a2f16e217e
Memory waste fix, when a Peer disconnects an extra packet buffer block is allocated to copy unsent reliable bytes to even if there aren't any.
2018-07-06 19:44:30 -07:00
Evan Tschannen
6d7172ef7e
fix: canKillProcesses did not take into account the remoteTLogPolicy when checking notEnoughLeft
2018-07-05 21:36:09 -07:00
Evan Tschannen
6f4ca2eba2
fix: get all processes did not include rebooting processes
2018-07-05 21:13:56 -07:00
Evan Tschannen
cd4fb9285a
waitForExlusion requires both regions to be healthy, which is only possible if we do not kill all logs in a region
2018-07-05 14:04:42 -07:00
Stephen Atherton
9d85a05372
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-07-05 12:52:06 -07:00
Stephen Atherton
2cb0362102
AsyncFileCached now allows writing and truncation of whole pages previously read using readZeroCopy and not yet released without prior readers seeing the effects of the write.
2018-07-05 02:59:13 -07:00
Evan Tschannen
7315e5da55
fix: isExcluded and isCleared were exactly wrong
...
fix: isCleared should mean the process is dead
2018-07-05 02:22:22 -07:00
Evan Tschannen
e17dfea3b6
fix: desiredTLogCount was used instead of getDesiredLogs(), which caused problems with recruitment when desiredTLogCount was -1.
...
canKillProcess logic was wrong.
We still need to configure usable_regions because if datacenterVersionDifference is too large we cannot complete data movement.
2018-07-04 16:22:32 -04:00
Stephen Atherton
2925b9b984
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
2018-07-03 23:03:56 -07:00
Alvin Moore
c3f88dbfe1
Merge branch 'master' of github.com:apple/foundationdb into tls-static
2018-07-01 23:13:57 -07:00
Alvin Moore
132e2d9267
Defined TLS build flags for projects
...
Updated TLS documentation
2018-07-01 22:49:39 -07:00
Stephen Atherton
b95a2bd6c1
Merge commit 'b17c8359ec22892ed4daeaa569f2f5e105477251' into feature-redwood
...
# Conflicts:
# flow/Trace.cpp
2018-06-30 23:18:29 -07:00
Evan Tschannen
899f880ce0
fix: log router class did not have the proper fitness for becoming the cluster controller
2018-06-28 23:20:01 -07:00
Alvin Moore
45849d1f95
Added support for no-op legacy TLS options
2018-06-27 09:25:05 -07:00
Alvin Moore
65d8b38ae9
Changed generic plugin code to work as expected plugin code except for TLS use case
...
Defined TLS plugin name constant
Changed TLS plugin name to get_tls_plugin
Fixed link script
Removed compilation flags from info make target
2018-06-26 16:01:25 -07:00
Alvin Moore
ef8de426d3
Changed the TLS_DISABLED macro
...
Disable TLS within Windows until working
2018-06-26 12:08:32 -07:00
Evan Tschannen
0123627d67
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-06-22 10:43:07 -07:00
Evan Tschannen
5fc8199abc
Swapped OkayFit and UnsetFit, because generally if machine classes are set on one machine they are set everywhere and it helps with wait_for_good_recruitment logic
...
wait_for_good_recruitment now requires that you have the desired count of each role
remote recruitment is given a much longer wait_for_good_recruitment time interval, which does not start until enough remote machines have registered
2018-06-22 10:15:24 -07:00
Evan Tschannen
1dce97f28c
Merge branch 'release-5.2'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbserver/SimulatedCluster.actor.cpp
# packaging/msi/FDBInstaller.wxs
# versions.target
2018-06-21 17:05:11 -07:00
Balachandar Namasivayam
d7dba11366
Throw tls_error instead of internal_error when not able to create a TLS connection.
2018-06-21 15:33:00 -07:00
Stephen Atherton
e9e1e194f0
Added operation-specific rate controls to blob store interface.
2018-06-20 20:34:34 -07:00
Richard Low
fff6a47c43
Validate certificates by default
2018-06-20 14:04:03 -07:00
Alvin Moore
f8ce1de601
Added support for compiling TLS into binaries
2018-06-20 09:21:23 -07:00
Stephen Atherton
e5c48d453a
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
2018-06-18 22:45:27 -07:00
Evan Tschannen
0913368651
added usable_regions to specify if we will replicate into a remote region
...
remote replication defaults to the primary replication
removed remote_logs, because they should be specified as an override in the regions object
2018-06-17 19:31:15 -07:00
Stephen Atherton
1eae9d621b
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
2018-06-13 15:58:21 -07:00
Stephen Atherton
2878f30f29
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# fdbserver/IKeyValueStore.h
# fdbserver/KeyValueStoreMemory.actor.cpp
# fdbserver/storageserver.actor.cpp
2018-06-13 15:56:06 -07:00
Alex Miller
6c2cb25c53
Rename BestOtherFit -> OkayFit.
...
The previous order of fitness was
BestFit > GoodFit > BestOtherFit > ...
which is baffling. It's now:
BestFit > GoodFit > OkayFit > ...
which won't break anyone's expectations.
2018-06-12 16:50:25 -07:00
Evan Tschannen
372ed67497
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# fdbserver/DataDistribution.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
2018-06-11 11:34:10 -07:00
Evan Tschannen
48fbc407fd
fix: we cannot kill all of the remote tlogs, because we still need their data to copy to the next generation in the same data center
2018-06-08 15:28:44 -07:00
A.J. Beamon
99c9958db7
Some more trace event normalization
2018-06-08 13:57:00 -07:00
A.J. Beamon
e5488419cc
Attempt to normalize trace events:
...
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.
Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.
This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
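The first two naming rules from e5488419cc can be sketched as a pair of predicates. This is not the check that commit added to simulation; `isValidDetailName` and `isValidTypeName` are hypothetical names, and the sketch covers only the uppercase and underscore rules (camel casing is not checked, as the commit notes it was harder to verify).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// True if `c` is an ASCII uppercase letter.
static bool isUpper(char c) {
    return std::isupper(static_cast<unsigned char>(c)) != 0;
}

// Detail names: first character uppercase, no underscores anywhere.
bool isValidDetailName(const std::string& name) {
    if (name.empty() || !isUpper(name[0])) {
        return false;
    }
    return name.find('_') == std::string::npos;
}

// Type names: first character uppercase, at most one underscore, and the
// character right after that underscore must also be uppercase
// (supports the Context_Type pattern).
bool isValidTypeName(const std::string& name) {
    if (name.empty() || !isUpper(name[0])) {
        return false;
    }
    const std::size_t first = name.find('_');
    if (first == std::string::npos) {
        return true;
    }
    if (name.find('_', first + 1) != std::string::npos) {
        return false; // more than one underscore
    }
    return first + 1 < name.size() && isUpper(name[first + 1]);
}
```

Under these rules `Context_Type` is accepted as a type name, while `Context_type`, `A_B_C`, and a detail name such as `Elapsed_Ms` are rejected.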
Balachandar Namasivayam
529d0497f1
Proxy going OOM when applying high volumes of writes to a proxy, particularly in a sudden fashion before ratekeeper can control the workload.
...
Address this issue by proactively monitoring the memory used by commit batches and dropping requests if a certain memory limit is exceeded.
2018-06-01 15:21:40 -07:00
A.J. Beamon
d9c702a9e3
Merge release-5.1 into release-5.2
2018-05-30 09:09:55 -07:00
Joel Armstrong
7c35ea6ba1
Fix use of bool in va_start causing undefined behavior
...
The version of clang included in Apple LLVM 9.1.0 complains about
passing the bool parameter `is_error` to va_start, which causes make
to fail:
fdbrpc/TLSConnection.actor.cpp:370:16: error: passing an object that undergoes
default argument promotion to 'va_start' has undefined behavior
[-Werror,-Wvarargs]
va_start( ap, is_error );
^
This just switches is_error back to the type it gets promoted to (int).
2018-05-24 16:37:11 -07:00
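The fix in 7c35ea6ba1 can be illustrated with a small variadic function. This is a sketch, not the logging function in fdbrpc/TLSConnection.actor.cpp: `formatEvent` and its parameter list are hypothetical. The point it shows is that the last named parameter, the one handed to `va_start`, is declared as `int` rather than `bool`, because a `bool` parameter undergoes default argument promotion and passing it to `va_start` is undefined behavior.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// `is_error` is the last named parameter, so it is the one passed to va_start.
// It is declared int (the type bool promotes to) to keep va_start well defined.
std::string formatEvent(const char* fmt, int is_error, ...) {
    char buf[256];
    va_list ap;
    va_start(ap, is_error);
    std::vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);
    return std::string(is_error ? "ERROR: " : "INFO: ") + buf;
}
```

Callers can still pass `true` or `false`; the argument converts to `int` at the call site, so `formatEvent("%s=%d", true, "code", 42)` yields `ERROR: code=42`.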
A.J. Beamon
026458baf3
Merge release-5.2 into master
2018-05-23 15:32:56 -07:00
Richard Low
84ed35b01f
Only log TLS verify failures if all verification fails; log failures at SevInfo
2018-05-21 10:58:59 -07:00
Richard Low
086700aeb1
Plumb through TLS key password to CLI and from environment
2018-05-21 10:56:10 -07:00
Evan Tschannen
520aaf731d
merge release 5.2 into master
2018-05-10 14:33:08 -07:00
Evan Tschannen
b5b8c5d587
fix: white space issue in getKnobDescription
2018-05-10 14:27:10 -07:00
Balachandar Namasivayam
b2c32ea4f2
Add secure_connection param to BlobStore to configure security.
...
Default is https. Setting secure_connection=0 makes it http.
2018-05-10 13:53:46 -07:00
Evan Tschannen
7bca7b80e6
fixed merge conflicts
2018-05-10 09:13:41 -07:00
Evan Tschannen
8f984cb2c9
Merge branch 'release-5.2'
...
# Conflicts:
# fdbrpc/TLSConnection.h
2018-05-10 09:13:22 -07:00