foundationdb

Commit Graph

Author	SHA1	Message	Date
Andrew Noyes	781b6ece77	Fix OPEN_FOR_IDE -Wunused-variable warnings CC #1255, #1173	2019-04-16 15:28:01 -07:00
Andrew Noyes	75b9369583	Make checksumHistoryBudget optional See https://github.com/apple/foundationdb/pull/1446#discussion_r275933381	2019-04-16 12:55:53 -07:00
Andrew Noyes	247f95a6e2	Add -Wunused-variable	2019-04-15 18:13:00 -07:00
Andrew Noyes	6207d724f8	Fix all -Wunused-variable warnings	2019-04-15 18:13:00 -07:00
Evan Tschannen	cd5c9d91fa	Merge pull request #1443 from etschannen/master Merge 6.1 into master	2019-04-10 17:43:07 -07:00
Evan Tschannen	8e05713a5d	do not log a SevError trace event if we cannot deserialize the connect packet	2019-04-10 17:41:02 -07:00
Evan Tschannen	6220a5ce0f	Merge pull request #1370 from jzhou77/fix-unreferenced Remove unused functions	2019-04-09 11:49:45 -07:00
A.J. Beamon	058d028099	Merge pull request #1301 from mpilman/features/cheaper-traces Defer formatting in traces to make them cheaper	2019-04-09 10:11:04 -07:00
A.J. Beamon	a7288e1325	Throw process_behind instead of future_version when all storage nodes on a team are behind. process_behind gets the same backoff behavior as not_committed. Add proxy_memory_limit_exceeded to the retryable predicate.	2019-04-08 14:21:24 -07:00
mpilman	bdba8e22eb	Added test and bugfixes	2019-04-08 11:05:29 -07:00
mpilman	b944e0b116	generalized read guards, allow for penalty+error	2019-04-08 11:04:44 -07:00
Evan Tschannen	1358603c7a	fix: getReplyUnlessFailedFor must still report endpoint failures even if the address is local	2019-04-08 10:42:58 -07:00
Evan Tschannen	1baae75ac9	merge 6.1	2019-04-07 23:24:31 -07:00
Balachandar Namasivayam	83e67d6b8f	Do not start failure monitoring for local endpoints in getReplyUnlessFailedFor.	2019-04-05 20:21:22 -07:00
Andrew Noyes	bd12e77213	Whitespace tweak	2019-04-05 16:30:42 -07:00
Andrew Noyes	c882743afa	Make actual build happy again	2019-04-05 16:30:42 -07:00
Andrew Noyes	d7612a4426	Fix OPEN_FOR_IDE build errors	2019-04-05 16:30:42 -07:00
mpilman	c008e16c81	Defer formatting in traces to make them cheaper This is the first part of making `TraceEvent` cheaper. The main idea is to defer calls to any code that formats strings. These are the main changes: - TraceEvent::detail now takes a c-string instead of std::string for literals. This prevents unnecessary allocations if the trace is not going to be printed in the first place (for example for SevDebug). Before that `detail` expected a `std::string` as key, which meant that any string literal would be copied on each call. - Templates Traceable and SpecialTraceMetricType. These templates can be specialized for any type that needs to be printed. The actual formatting will be deferred to after the `enabled` check. This provides two benefits: (1) if a TraceEvent is disabled, we don't pay for the formatting and (2) TraceEvent can trace types that it doesn't know about. - TraceEvent::enabled will be set in the constructor if the Severity is passed. This will make sure that `TraceEvent::init` is not called. - `TraceEvent::detail` will be inlined. So for disabled TraceEvent calls, a call to detail will only introduce an if-branch which is much cheaper than a function call.	2019-04-05 13:12:19 -07:00
Evan Tschannen	390ab9cfed	A process will mark itself as degraded if it continually disconnects from a different process which the failure monitor thinks is healthy	2019-04-04 14:11:12 -07:00
Jingyu Zhou	2b75c2e684	Restore removed functions. crc32c.cpp is 3rd party code. orYield() in genericactors.actor.h might be used in the future code.	2019-04-04 13:24:55 -07:00
Markus Pilman	101a05ae77	Merge branch 'master' into features/client-simulator	2019-04-03 10:03:56 -08:00
Alex Miller	45c466e269	Open incrementalDelete files with OPEN_UNBUFFERED This fixes crashes from AsyncFileWinASIO refusing to open a file that didn't have OPEN_UNBUFFERED.	2019-04-01 17:25:08 -07:00
Jingyu Zhou	47b4b82628	Merge branch 'master' into fix-unreferenced	2019-04-01 14:07:19 -07:00
Jingyu Zhou	3f76be8f45	Merge remote-tracking branch 'apple/master' into fix-unreferenced	2019-04-01 14:00:43 -07:00
Jingyu Zhou	f7f8ddd894	Fix warnings on unused variables Found by -Wunused-variable flag.	2019-04-01 14:00:20 -07:00
mpilman	e23e63c6ac	Implemented JavaWorkload This change allows a user to write a workload in Java. The way this is implemented is by creating a JVM within the simulator and calling the corresponding workload class. A workload can then run in the simulator or on a testing cluster. If the workload is executed within the simulator, the resulting test will not be deterministic anymore as it will execute in a different thread (and even without that it is not clear whether we could get determinism, as the JVM does a lot of things that are not deterministic). This is intended to get better testing of the Java client, and layer authors can use the simulator to test their layers on a single machine while still simulating failing machines etc.	2019-03-31 17:57:43 -07:00
Balachandar Namasivayam	0bbdc15f71	Multi-test processes waits until a timeout if any of the tester processes restarts. Use getReplyUnlessFailedFor instead of getReply to detect the restarts and fail quickly instead of waiting for a timeout which is usually large.	2019-03-28 17:05:30 -07:00
Evan Tschannen	b6008558d3	renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>() eliminated an unnecessary copy from the proxy commit path eliminated an unnecessary copy from buffered peek cursor	2019-03-28 11:52:50 -07:00
Evan Tschannen	34b9d5e722	Merge pull request #1364 from etschannen/feature-fast-serialize A few performance optimizations	2019-03-27 20:57:25 -07:00
Evan Tschannen	c10f1eea71	QueueModel changed to unordered_map	2019-03-27 20:56:44 -07:00
Evan Tschannen	f1a4bdd70d	changed failureMonitor to use an unordered_map	2019-03-27 19:17:08 -07:00
Evan Tschannen	e5a80f2c94	optimized IPaddress	2019-03-27 18:21:13 -07:00
Jingyu Zhou	a55f06e082	Remove unused functions Found with -Wunused-function flag.	2019-03-27 15:45:28 -07:00
A.J. Beamon	71e2fdafb8	Changes to ratekeeper camel case	2019-03-27 08:24:25 -07:00
Evan Tschannen	3b5b03e435	ReplyPromise does not serialize an empty NetworkAddress	2019-03-26 12:05:43 -07:00
Evan Tschannen	d45159ebf7	Merge pull request #1307 from jzhou77/ratekeeper Monitor placement of Ratekeeper and DataDistributor	2019-03-24 17:26:07 -07:00
Evan Tschannen	1fc6937802	changed NetworkAddressList to at most two addresses for performance	2019-03-23 17:54:46 -07:00
Evan Tschannen	36ab852bb1	Merge branch 'master' into ratekeeper # Conflicts: # fdbserver/ClusterController.actor.cpp	2019-03-22 18:41:00 -07:00
Evan Tschannen	efbcd18987	fixed a performance regression related to broadcasting a read version to too many transactions simultaneously	2019-03-22 16:05:20 -07:00
Jingyu Zhou	0fb6a03c07	First round of review comment fixes for PR#1307	2019-03-19 11:29:19 -07:00
Jingyu Zhou	254c78053c	Fix a segfault error After wait, ServerDBInfo may have changed. Using the old copy is wrong.	2019-03-15 22:11:13 -07:00
A.J. Beamon	85b3f11e71	Fix various compiler warnings	2019-03-15 10:34:57 -07:00
Meng Xu	5a10bf5dfc	Merge branch 'master' into mengxu/tls-switch-status-PR	2019-03-14 10:35:12 -07:00
Steve Atherton	be0da73938	Merge pull request #1290 from etschannen/feature-cheap-policy Optimized a few uses of the replication policy engine	2019-03-13 17:01:19 -07:00
Evan Tschannen	e7d1f9e5f1	fixed review comments	2019-03-13 15:59:03 -07:00
Evan Tschannen	e8cb85ed8e	optimize validateAllCombinations	2019-03-13 14:47:35 -07:00
Vishesh Yadav	c32504f705	io: Add DISABLE_POSIX_KERNEL_AIO knob to use EIO instead of Kernel AIO - Some Linux filesystems don't support O_DIRECT which is required by Kernel AIO to function properly. Instead of using O_SYNC, EIO is a much better option in terms of performance penalty. - Some systems may not support AIO at all, e.g. Windows Subsystem for Linux. FIXES #842 RELATED #274	2019-03-13 13:39:45 -07:00
Evan Tschannen	a2108047aa	removed LocalitySetRef and IRepPolicyRef typedefs, because for clarity the Ref suffix is reserved for arena allocated objects instead of reference counted objects.	2019-03-13 13:14:39 -07:00
Evan Tschannen	e068c478b5	merge master	2019-03-12 18:31:25 -07:00
Evan Tschannen	a7e45cff91	Merge pull request #1176 from jzhou77/ratekeeper Make Ratekeeper a separate role	2019-03-12 15:58:59 -07:00
Meng Xu	85c24b0067	Merge branch 'master' into mengxu/tls-switch-status-PR	2019-03-12 15:20:54 -07:00
Balachandar Namasivayam	880e8643d1	Fix Windows link errors	2019-03-11 17:49:03 -07:00
Evan Tschannen	044b6b4f8a	Merge branch 'master' into feature-degraded-tlog # Conflicts: # fdbserver/ClusterController.actor.cpp	2019-03-08 22:50:41 -05:00
Evan Tschannen	41c493f8d4	fix: connectPacket accessed uninitialized variables	2019-03-08 14:40:32 -05:00
Jingyu Zhou	5dcde9efe0	Fix locality per review comment and a mac compile error	2019-03-07 13:16:20 -08:00
Jingyu Zhou	3c86643822	Separate Ratekeeper from data distribution. Add a new role for ratekeeper. Remove StorageServerChanges from data distribution. Ratekeeper monitors storage servers, which borrows the idea from DataDistribution.	2019-03-07 13:16:20 -08:00
Meng Xu	04880e3d4d	Merge branch 'master' into mengxu/tls-switch-status-PR	2019-03-06 13:41:16 -08:00
Alex Miller	c6a65389ae	Remove noexcept macro and replace with BOOST_NOEXCEPT. BOOST_NOEXCEPT does what the noexcept macro was supposed to do, but in a way that is correctly maintained over time.	2019-03-05 22:06:12 -08:00
Alex Miller	af617d68e6	boost 1.52.0 -> 1.67.0 in all vcxproj files	2019-03-05 22:06:12 -08:00
Meng Xu	820548223a	Status: connected_coordinators misc minor changes Change the rst document file; Change the coding style to be consistent with the nearby code; Ensure we always initialize the connectedCoordinatesNum to 0 even when the variable is not used.	2019-03-05 21:45:18 -08:00
anoyes	981426bac9	More ide fixes	2019-03-05 18:03:57 -08:00
Evan Tschannen	82d957e0bb	Merge pull request #1178 from vishesh/task/issue-963-IPv6 IPv6 Support	2019-03-05 17:14:16 -08:00
Meng Xu	afd7c1d497	AsyncFileWinASIO: Make error checking consistent with Linux In Linux, KAIO uses ASSERT to make sure open() flags have OPEN_UNBUFFERED set. In Windows, we use an if-condition and return io_errors() when the flag is not set. This PR makes the Windows implementation always use ASSERT to check the flag.	2019-03-04 16:36:04 -08:00
Vishesh Yadav	5cd8bac6cb	fix: segfault due to external assignment of Endpoint::addresses #1201 isLocal() now checks if the address is equal to default NetworkAddress() which should match the behaviour before TLS changes.	2019-03-04 15:49:11 -08:00
Vishesh Yadav	1d3e62c4e3	net: Don't use a union of IP in ConnectPacket #963 Keeping a union and using the packet size to figure out whether the ConnectPacket is using an IPv6 or IPv4 address is not easily maintainable. For simplicity, we just serialize everything in ConnectPacket and stay backward compatible with the older format. However, some code for some much older stuff is removed.	2019-03-04 14:12:45 -08:00
Vishesh Yadav	e93cd0ff21	Add some checks and comments to IPv6 changes #963	2019-03-04 14:12:45 -08:00
Vishesh Yadav	592e224155	net: add/use formatIpPort to format IP:PORT pairs #963	2019-03-04 14:12:45 -08:00
Vishesh Yadav	cc9ad0e202	net: Use IPv6 in simulation testing #963 25% of the time we will use IPv6 addresses	2019-03-04 14:12:45 -08:00
Vishesh Yadav	57832e625d	net: Support IPv6 #963 - NetworkAddress now contains IPAddress object which can be either IPv4 or IPv6 address. 128bits are used even for IPv4 addresses, however only 32bits are used when using/serializing IPv4 address. - ConnectPacket is updated to store IPv6 address. Backward compatible with old format since the first 32bits of IP address field are used for serialization of IPv4. - Mainly updates rest of the code to use IPAddress structure instead of plain uint32_t. - IPv6 address/port pairs should be represented as `[ip]:port` as per convention. This applies to both cluster files and command line arguments.	2019-03-04 14:12:41 -08:00
Meng Xu	94385447bc	Status: Get if client configured TLS To understand if all clients have configured TLS, we check the tlsoption when a client tries to open database. This is similar to how we track the versions of multi-version clients.	2019-03-01 15:17:01 -08:00
Stephen Atherton	7d287c6999	Merge branch 'release-6.0' # Conflicts: # fdbclient/FileBackupAgent.actor.cpp	2019-02-28 14:01:00 -08:00
Stephen Atherton	887856b6b0	Bug fix in AsyncFileReadAhead where a file size that is an integer multiple of the read chunk size will cause a crash when reading the file's final block. BackupContainerLocalDirectory now uses AsyncFileReadAhead in simulation to get simulation coverage of that class, and FileBackup will generate file sizes which expose the bug.	2019-02-28 00:22:38 -08:00
Evan Tschannen	8afb7fbb9d	Merge pull request #1160 from alexmiller-apple/tstlog-fork Spill-By-Reference TLog Part 2: New and Old TLogServers co-exist harmoniously	2019-02-26 18:00:04 -08:00
Alex Miller	2dc57568cb	Change many things about log_version. * log_version in the database (`/conf/log_version`) is now a hint that gets rounded to the nearest supported version. * fdbcli and FDB enforce that only a valid log_version can be configured to * TLogVersion is persisted in CoreTLogSet (and LogSet and TLogSet) * Some comments here and there * Add an assert on filename length to make sure KV-pairs in filename don't exceed a maximum length.	2019-02-26 16:47:04 -08:00
Evan Tschannen	b8910ba7cd	Merge branch 'master' into feature-fix-force-recovery # Conflicts: # fdbclient/ManagementAPI.actor.h # fdbserver/DataDistribution.actor.cpp # fdbserver/storageserver.actor.cpp # fdbserver/workloads/KillRegion.actor.cpp	2019-02-22 14:38:13 -08:00
Trevor Clinkenbeard	25b397977c	Never assign DataDistributor role to process of class CoordinatorClass	2019-02-20 17:22:01 -08:00
Trevor Clinkenbeard	1bb384db4d	Merge branch 'master' of https://github.com/apple/foundationdb into add-no-assign-class	2019-02-20 13:13:12 -08:00
mpilman	f14dee764b	Use fwd decl for connectionReader - fdbrpc compiling	2019-02-19 15:16:59 -08:00
mpilman	3bd9b9047b	Minor fixes - flow now compiling with intellisense	2019-02-19 15:16:59 -08:00
Evan Tschannen	065a45e05f	Merge branch 'master' into feature-fix-force-recovery # Conflicts: # fdbclient/ManagementAPI.actor.cpp # fdbserver/ClusterController.actor.cpp # fdbserver/workloads/KillRegion.actor.cpp	2019-02-18 17:09:06 -08:00
Vishesh Yadav	0898686c9b	Remove old TODO	2019-02-18 15:43:27 -08:00
Evan Tschannen	62603d11a1	updated the killRegion simulation test to test a much larger variety of failure scenarios	2019-02-18 15:32:51 -08:00
Vishesh Yadav	e05b53d755	Merge remote-tracking branch 'apple/master' into task/tls-upgrade	2019-02-15 20:37:07 -08:00
Vishesh Yadav	345fd7e4da	Prefer unencrypted ports at client side during transition	2019-02-15 20:23:07 -08:00
Evan Tschannen	83060c6e56	Merge pull request #1062 from jzhou77/PR Add a new DataDistributor role.	2019-02-15 13:51:27 -08:00
mpilman	75f692b931	simplify actorcompiler and target to compile coveragetool	2019-02-15 00:01:42 -08:00
Jingyu Zhou	c35d1bf2ef	Fix according to Alex's comment	2019-02-14 16:30:13 -08:00
Jingyu Zhou	886e7ab2ba	Add a new DataDistributor role. Let the cluster controller start a new data distributor role by sending a message to a chosen worker. Change MasterInterface usage in DataDistribution to masterId. Add DataDistributor rejoin handling. This allows the data distributor to tell the new cluster controller of its existence so that the controller doesn't spawn a new one. I.e., there should be only ONE data distributor in the cluster. If DataDistributor (DD) doesn't join in a while, then ClusterController (CC) tries to recruit one as DD. CC also monitors DD and restarts one if it failed. The Proxy is also monitoring the DD. If DD failed, the Proxy will ask CC for the new DD. Add GetRecoveryInfo RPC to master server, which is called by data distributor to obtain the recovery Transaction version from the master server.	2019-02-14 16:30:13 -08:00
Vishesh Yadav	907446d0ce	Merge remote-tracking branch 'apple/master' into task/tls-upgrade	2019-02-14 11:37:38 -08:00
A.J. Beamon	9272a41e5f	Merge pull request #1146 from atn34/fix-actor-warning Fix actor warning for cmake build	2019-02-13 11:01:37 -08:00
Andrew Noyes	3a38bff8ee	Use DISABLE_ACTOR_WITHOUT_WAIT_WARNING consistently	2019-02-13 10:30:35 -08:00
Andrew Noyes	067a445e06	Replace unused _ variables with wait(success(...))	2019-02-12 17:30:30 -08:00
Andrew Noyes	874a58cb4f	Suppress actor without wait for tests in cmake	2019-02-12 11:01:17 -08:00
mpilman	8a94d80deb	fdbservice and fdbrpc now compiling	2019-02-07 15:37:04 -08:00
Evan Tschannen	486e0e13c3	Merge pull request #1116 from alexmiller-apple/tstlog Random cleanups that prepare for Spill-By-Reference TLog	2019-02-05 18:09:06 -08:00
A.J. Beamon	882f8d70b7	Merge pull request #1066 from etschannen/master fix: coordinators auto could put two coordinators in the same zone	2019-02-05 11:52:04 -08:00
Alex Miller	6668b7c544	Make simulation enforce what KAIO requires.	2019-02-04 18:04:22 -08:00
Evan Tschannen	e9ddd94e27	The failure monitor is given a list of all IP addresses associated with a process The connect packet includes the correct remote address Did a lot of code cleanup Simulation test mixed TLS and non-TLS listeners on the same process	2019-01-31 18:20:14 -08:00
Balachandar Namasivayam	9cf2b4e1e7	Improve TLS logging on error scenarios.	2019-01-29 17:04:09 -08:00
A.J. Beamon	05b38167d0	Update fdbrpc/sim2.actor.cpp Co-Authored-By: etschannen <36455792+etschannen@users.noreply.github.com>	2019-01-29 11:35:02 -08:00
Trevor Clinkenbeard	2e0b3a7f1d	Added ProcessClass::CoordinatorClass, which can be used by coordinators, so that coordinators do not have to take on other roles if desired	2019-01-25 11:03:13 -08:00
Evan Tschannen	1d7fec3074	Merge commit '048bfc5c368063d9e009513078dab88be0cbd5b0' into task/tls-upgrade-2 # Conflicts: # .gitignore	2019-01-24 17:43:06 -08:00
Evan Tschannen	9cf77d70bc	fix: getFirstLocalAddress has to be the same as primary address, because it is what we put in the connect packet, and we always connect from the primary address	2019-01-24 17:28:26 -08:00
Evan Tschannen	699f8dd617	fix: coordinators auto could put two coordinators in the same zone simulation now tests two machines in the same zone	2019-01-18 15:42:48 -08:00
Evan Tschannen	4eb11d74af	Merge pull request #1029 from bnamasivayam/reenable-check_desired_classes Re-enable CheckDesiredClasses after making necessary changes for mult…	2019-01-11 17:15:05 -08:00
Balachandar Namasivayam	a8e2e75cd5	Re-enable CheckDesiredClasses after making necessary changes for multi-region setup. Fixed a couple of bugs 1) A rare race condition where a worker is being roles even after it died. 2) Fix how RoleFitness is calculated for TLog and LogRouter. Only worst fitness is compared to see if a better fit is available.	2019-01-10 10:28:32 -08:00
Evan Tschannen	684a22a52b	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbbackup/backup.actor.cpp # fdbclient/BackupContainer.actor.cpp # fdbclient/HTTP.actor.cpp # fdbserver/storageserver.actor.cpp # fdbserver/workloads/BackupCorrectness.actor.cpp # versions.target	2019-01-09 16:14:46 -08:00
Vishesh Yadav	31c4ac07ac	WIP: FailureMonitoring use endpointAddressList (create individual endpoints for each address) WIP: g_currentDeliveryPeerAddress WIP: FlowTransport endpoint map WIP: Add peerReference to addressToEndpointMap	2019-01-09 07:46:01 -08:00
Vishesh Yadav	51b89ae083	WIP	2019-01-09 07:41:02 -08:00
Alex Miller	cebdb83def	Revert "Merge pull request #977 from alexmiller-apple/abspath" This reverts commit `9881b1d074`, reversing changes made to `6d278e466b`.	2019-01-08 16:52:09 -08:00
Evan Tschannen	57293a2db0	byte sample recovery did not use limits for its range reads, leading to slow tasks	2019-01-04 10:32:31 -08:00
Andrew Noyes	d5430d7bf8	Remove ignore "-Wreturn-local-addr" pragma This seems to still build on gcc 8	2019-01-03 13:55:17 -08:00
Markus Pilman	dbe9baff1f	Several small compilation fixes for new versions of gcc There are several missing includes for cmath in the code, I added those. Next, Coro returns a reference to a stack variable and this causes a warning. As this is probably ok for Coro, I disabled the warning in that file for GCC. I want to have this warning in the build system as it is generally a very useful warning to have. Another change is that major and minor are deprecated for a while now. I replaced those with gnu_dev_major and gnu_dev_minor. ErrorOr currently implements operators ==, !=, and <. These do not compile because Error does not implement ==. This compiles on older versions of gcc and clang because ErrorOr<T>::operator== is not used anywhere. It is still wrong though and newer gcc versions complain. I simply removed these methods. The most interesting fix is that TraceEvent::~TraceEvent is currently throwing exceptions. This is illegal behavior in C++11 and a bad idea in older versions of C++. For now I simply removed the throw, but this might need some more thought.	2019-01-03 12:44:19 -08:00
Bhaskar Muppana	aa2a76ef4c	Merge pull request #981 from alexmiller-apple/cmake Add a CMake build system	2019-01-02 18:50:15 -08:00
A.J. Beamon	d8f33a2419	Add parentheses to bitwise ops (turned up by clang after recent change)	2019-01-02 10:15:59 -08:00
anoyes	6a4d87802b	Replace & operator with variadic function	2018-12-28 11:33:42 -08:00
Steve Atherton	9881b1d074	Merge pull request #977 from alexmiller-apple/abspath Use abspath when dealing with the simulator file-cache	2018-12-20 14:56:38 -08:00
Vishesh Yadav	209ecd09ee	Keep local addresses in a vector	2018-12-17 11:25:44 -08:00
Meng Xu	486a7b04fa	TeamCollection: Fix build on OS X On OS X, we cannot add an unsigned long to a string to append it to the string.	2018-12-14 13:44:11 -08:00
Markus Pilman	4ae701d8a9	minor bugfix to look up correct filename in cache (manually cherry-picked from flat-buffers branch)	2018-12-13 22:21:25 -08:00
Markus Pilman	0207831fd6	Use abspath when dealing with the simulator file-cache The simulator uses a hash table to cache all open files to make sure that several simulated processes don't open the file more than once. This currently doesn't work properly and deleted files are often kept open forever. As a result, we often ran out of file descriptors. The problem is luckily quite simple: files are often opened with an absolute path but later a relative path is passed for deletion. This is not working because the map that is used to store the file descriptors is not aware of paths - so deleted files are often not removed from this map. The fix that works for us is to just always work with absolute paths when adding and removing files from this map.	2018-12-13 22:21:06 -08:00
Alex Miller	a982b9da72	Additional changes from a merge commit.	2018-12-13 17:13:41 -08:00
Alex Miller	e70e59a895	Change some file locations.	2018-12-13 14:53:19 -08:00
Markus Pilman	dce290909d	fdbserver now compiling	2018-12-13 14:13:47 -08:00
mpilman	51beb8b48c	fdbrpc compiling with cmake	2018-12-13 14:02:16 -08:00
Vishesh Yadav	e04abf25f7	simulator: Support multiple listeners on single process Sim2Listener can now take the network address to listen on. This is used to listen to multiple ports in simulator and test the patch which added multiple network addresses to single endpoint.	2018-12-13 13:36:52 -08:00
Vishesh Yadav	3eb9b23024	Listen to multiple addresses and start using vector<NetworkAddress> in Endpoint - This patch will make FDB listen to multiple addresses given via command line. Although we'll still use the first address in most places, this patch starts using vector<NetworkAddress> in Endpoint at some basic places. - When sending packets to an endpoint, pick a random network address in endpoints - Renames Endpoint::address to Endpoint::addresses since it now holds a vector of addresses.	2018-12-13 13:36:52 -08:00
Vishesh Yadav	43e5a46f9b	Change Endpoint::address(NetworkAddress) to vector<NetworkAddress> Extend `Endpoint` class to take multiple NetworkAddresses instead of just one. Hence, to talk to an endpoint instead of one IP:PORT, we'll have multiple IP:PORT pairs. This patch simply adds the field and makes changes to compile the codebase. The first element of the `address` field is used everywhere. Hence the way we talk to endpoints remains the same with this patch. NOTE: Directly accessing the first member of Endpoint::address is unsafe as Endpoint() doesn't enforce a non-empty address list. However, since the correctness tests pass for now and we are anyway replacing all those unsafe accesses with ones considering the whole vector, this patch does not try to access them in a safe way.	2018-12-13 13:36:52 -08:00
Vishesh Yadav	e8e01b2406	Remove unused localAddress parameter from newNet2 and Net2 classes	2018-12-13 13:36:52 -08:00
Evan Tschannen	d9626895b1	Merge pull request #964 from xumengpanda/mengxu/teamcollection-release TeamCollection: Use machine teams to create server teams to increase availability at scale when a machine has multiple servers	2018-12-13 13:18:54 -08:00
Meng Xu	e069b5c31c	TeamCollection: Use clang format No functional change. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-12-06 11:39:35 -08:00
Evan Tschannen	d2d68aa171	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbclient/ManagementAPI.actor.cpp # versions.target	2018-12-03 18:26:52 -08:00
Evan Tschannen	55a9c4a0f0	Merge pull request #955 from ajbeamon/fix-bad-error-creation-and-whitespace throw platform_error; -> throw platform_error();. Convert some spaces to tabs.	2018-12-03 15:12:37 -08:00
A.J. Beamon	50c9dfdd01	Errors that occur in platform that are the result of IO issues are now raised as io_error rather than platform_error.	2018-11-30 10:55:19 -08:00
A.J. Beamon	97847f517b	throw platform_error; -> throw platform_error();. Convert some spaces to tabs.	2018-11-28 12:56:57 -08:00
Meng Xu	8de031f9a6	TeamCollection: clang-format Format the changes with git clang-format. No functional changes. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-21 11:18:26 -08:00
Meng Xu	f7a7e069f0	TeamCollection: Remove unnecessary comments Pass 41806 tests with no failure Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-16 15:56:35 -08:00
Meng Xu	73c58852f0	TeamCollection: Resolve code review comments Resolve code review comments: 1) Improve the code efficiency by avoiding unnecessary map search and avoiding unnecessary checking 2) Remove or comment out trace events when they can be spammy 3) Improve coding style Tested for 1 hour and no error was found. KillRegionCycle.txt test was excluded from the test because existing code cannot pass that test either Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-16 15:55:33 -08:00
Meng Xu	5051b35c61	TeamCollection: Use machine team to create server team Current server team collection logic does not consider the fact that multiple storage servers can run on the same machine. When multiple machines fail, all servers on the machines will fail, and the possibility of having one process team fail and lose data is very high. To reduce the possibility of losing data when multiple machines fail, we first create machine teams which span across different fault zones; we then create server teams based on machine teams by first picking 1 machine team, and then picking 1 server from each machine in the machine team. Signed-off-by: Meng Xu <meng_xu@apple.com>	2018-11-16 15:53:22 -08:00
Evan Tschannen	4b5d0b4e2c	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbclient/AsyncFileBlobStore.actor.cpp # fdbclient/AsyncFileBlobStore.actor.h # fdbclient/BlobStore.actor.cpp # fdbclient/BlobStore.h # fdbclient/HTTP.actor.cpp # fdbclient/ManagementAPI.actor.cpp # fdbclient/NativeAPI.actor.cpp # fdbrpc/LoadBalance.actor.h # fdbrpc/batcher.actor.h # fdbrpc/fdbrpc.vcxproj # fdbrpc/sim2.actor.cpp # fdbserver/DataDistribution.actor.cpp # fdbserver/DataDistributionTracker.actor.cpp # fdbserver/SimulatedCluster.actor.cpp # fdbserver/TLogServer.actor.cpp # fdbserver/masterserver.actor.cpp	2018-11-10 13:04:24 -08:00
Evan Tschannen	6f4ad84777	Merge pull request #903 from ajbeamon/move-batcher-into-proxy Move the sort of generic batcher from fdbrpc and make it specific to …	2018-11-10 09:56:03 -08:00
Evan Tschannen	b8381b3cea	Merge branch 'release-6.0' of github.com:apple/foundationdb into release-6.0	2018-11-10 09:51:49 -08:00
A.J. Beamon	67a152ae9f	Move the sort of generic batcher from fdbrpc and make it specific to batching commits in master proxy. Also a couple minor formatting changes.	2018-11-09 14:19:18 -08:00
Evan Tschannen	56c51c1bb3	fix: usableRegions was uninitialized	2018-11-09 10:17:35 -08:00
Stephen Atherton	9d73166b3b	Many bug fixes related to concurrent page operations and pager shutdown.	2018-11-06 19:31:16 -08:00
Evan Tschannen	87295cc263	suppressed spammy trace events, and avoid reporting a long master recovery duration when the cluster is first created	2018-11-04 23:07:56 -08:00
Evan Tschannen	bf6545a9cf	clients cache storage server interfaces individually, instead of as a team. This is needed because in fearless every shard has storage servers from two separate teams, leading to a lot of possible combinations allAlternatives failed logic was simplified, because we are already doing a global rate limiting, so a per shard limit is unnecessary reduced unnecessary state variables in waitMetrics requests	2018-11-02 13:15:09 -07:00
Stephen Atherton	df3bdde50b	Many bug fixes. AsyncFileCached write() on a page with a zero-copy read in progress would orphan the old page before the read was finished. Pager file operations were not converting page id to int64 for byte offset calculation. Pager was not calling releaseZeroCopy() after readZeroCopy() if there was an error or cancellation. Pager reads were using some variables that could go out of scope. BusyPage's mechanism for notifying when a physical page is no longer in use is itself no longer in use and therefore removed. Pager shutdown now cancels all outstanding reads. Improved some debug output.	2018-10-31 02:14:55 -07:00
A.J. Beamon	776b289bfe	Move AsyncFileBlobStore and related files to fdbclient.	2018-10-26 13:49:42 -07:00
A.J. Beamon	58a0e22d3c	Remove sim2 dependency on fdbclient: * Remove unused 'exclusionSet' that used a type from fdbclient. * Replace usages of describe(x) with x.toString(). Also removed some using statements.	2018-10-26 09:23:12 -07:00
Alex Miller	6bb1f4093d	Merge pull request #856 from dropbox/pr/include-fix Adjust all includes to be relative to the root.	2018-10-22 09:51:55 -07:00
Alex Miller	e2fc1c9b95	Remove specifying non-root directory as a path to search for includes.	2018-10-19 18:56:45 -07:00
Evan Tschannen	1ef29cbf0d	more windows build fixes	2018-10-19 17:00:24 -07:00
Robert Escriva	268093a96d	Adjust all includes to be relative to the root. Remove the use of relative paths. A header at foo/bar.h could be included by files under foo/ with "bar.h", but would be included everywhere else as "foo/bar.h". Adjust so that every include references such a header with the latter form. Signed-off-by: Robert Escriva <rescriva@dropbox.com>	2018-10-19 17:35:33 +00:00
Evan Tschannen	db71b60d72	Merge pull request #819 from satherton/feature-redwood Redwood storage engine, initial/experimental version	2018-10-18 18:38:11 -07:00
Evan Tschannen	0217aed74c	Merge branch 'release-6.0' # Conflicts: # bindings/go/README.md # documentation/sphinx/source/release-notes.rst # fdbserver/MasterProxyServer.actor.cpp # versions.target	2018-10-15 18:38:51 -07:00
A.J. Beamon	a963ff7a64	Fix line endings	2018-10-08 09:30:09 -07:00
Stephen Atherton	22f8a4efa9	Normalized all unit test names to begin with "/" if they should be included in random unit testing.	2018-10-05 22:09:58 -07:00
A.J. Beamon	664f64881c	Port truncate optimization from Snowflake PR in order to make quick changes for a patch release.	2018-10-05 15:05:26 -07:00
Stephen Atherton	7c1dc305cb	Merge commit 'a72c8f5cb2e79a673abc0ed3d27ef1c51028fb13' into feature-redwood	2018-10-05 10:15:10 -07:00
Evan Tschannen	3922e477a5	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbclient/ManagementAPI.actor.cpp # fdbserver/ClusterController.actor.cpp # fdbserver/DataDistribution.actor.cpp # fdbserver/LogSystemDiskQueueAdapter.actor.cpp # fdbserver/SimulatedCluster.actor.cpp # fdbserver/TLogServer.actor.cpp	2018-10-03 16:57:18 -07:00
A.J. Beamon	92990d6aef	Merge release-6.0 into master	2018-09-21 16:14:39 -07:00
Evan Tschannen	77e2fb787e	Merge branch 'release-6.0' into feature-fix-forced-recovery	2018-09-21 14:55:37 -07:00
Stephen Atherton	2fc86c5ff3	Merge branch 'master' of github.com:apple/foundationdb into feature-redwood # Conflicts: # fdbrpc/AsyncFileCached.actor.h # fdbserver/IKeyValueStore.h # fdbserver/KeyValueStoreMemory.actor.cpp # fdbserver/workloads/StatusWorkload.actor.cpp # tests/fast/SidebandWithStatus.txt # tests/rare/LargeApiCorrectnessStatus.txt # tests/slow/DDBalanceAndRemoveStatus.txt	2018-09-20 03:39:55 -07:00
Evan Tschannen	42a67efb0c	the cluster controller should prefer to be located on a transaction class machine over a storage server class machine	2018-09-19 18:04:59 -07:00
Evan Tschannen	200e65fe61	added a workload which tests killing an entire region, and recovering from the failure with data loss. fix: we cannot pop the txs tag from remote logs until they have a full copy of the txnStateStore fix: we have to modify all of history, we cannot stop after finding a local remote	2018-09-17 18:32:39 -07:00
Evan Tschannen	4dd2dda0a3	Merge branch 'release-6.0' # Conflicts: # fdbserver/worker.actor.cpp	2018-09-05 16:11:06 -07:00
Evan Tschannen	df406a340e	Merge pull request #742 from ajbeamon/roles-in-trace-events Add the roles running on a process as a field on trace events in the …	2018-09-05 16:08:12 -07:00
Evan Tschannen	90301f497f	Merge branch 'release-6.0' # Conflicts: # fdbclient/ManagementAPI.actor.cpp # fdbrpc/FlowTransport.actor.cpp # fdbrpc/TLSConnection.actor.cpp # fdbserver/DataDistribution.actor.cpp # fdbserver/Status.actor.cpp # fdbserver/storageserver.actor.cpp # fdbserver/workloads/StatusWorkload.actor.cpp # versions.target	2018-09-05 16:06:33 -07:00
A.J. Beamon	2de0b5d6d7	Add the roles running on a process as a field on trace events in the form of a comma delimited string of role abbreviations.	2018-09-05 15:06:14 -07:00
Evan Tschannen	dcdbb3ec4d	Merge branch 'release-6.0' of github.com:apple/foundationdb into feature-movekey-fixes	2018-09-05 10:29:13 -07:00
Evan Tschannen	21f5cf9ce9	suppress spammy trace events	2018-09-04 17:12:26 -07:00
Steve Atherton	89dd9cc4a3	Cherry-pick pull request #717 to release-6.0 Which contains: * Improve TLS cert refresh logging. * Loading a mismatching cert shouldn't prevent TLS connections. * Initialize the cached copy of ca/cert/key data. * Open certificates as uncached, which means they can be write-protected.	2018-08-23 16:53:40 -07:00
Steve Atherton	365fe992b4	Merge pull request #717 from alexmiller-apple/tls-refresh-fixes Fix certificate reloading issues	2018-08-22 15:09:12 -07:00
Evan Tschannen	717c43a69f	merge 6.0 into master	2018-08-22 00:28:04 -07:00
Alex Miller	d2da969412	Improve TLS cert refresh logging. Explicitly call out failure/success, and surface repeated cert mismatches.	2018-08-21 15:05:41 -07:00
Alex Miller	4113b36df7	Loading a mismatching cert shouldn't prevent TLS connections. set_{cert,key,ca}_data returns pass/fail and not throw. The existing code wrongly assumed that they threw.	2018-08-21 15:02:54 -07:00
Evan Tschannen	26ec6ebac8	fixed line endings	2018-08-21 14:58:26 -07:00
Evan Tschannen	712aa00261	a better fix to the windows build issue	2018-08-21 14:54:38 -07:00
Alex Miller	4caacaaf4e	I would like to atone for my sins. But later. This fixes the windows build. For some reason, MSVC believes that the actor-compiled version of networkSender actually exists, but the non-actor-compiled version doesn't exist. This is a hackish workaround, as the largest reason to not include a .g.h file is because it defines a POST_ACTOR_COMPILER define that messes with actorcompiler.h's #defines. We can just undefine that after including the file. ...but carefully.	2018-08-20 20:33:38 -07:00
Alex Miller	3ece3cf301	Initialize the cached copy of ca/cert/key data. This was just purely an accidental oversight from before. The variables were there and handled like they were actually initialized with the contents of the various certificate files at start-up, but never actually were. And add a few trace events to make it easy to see when the system noticed and tried to reload certificate data.	2018-08-20 19:09:34 -07:00
Alex Miller	fd866a3b47	Open certificates as uncached, which means they can be write-protected. OPEN_READONLY still opens the file as read-write. To actually be read-only, one needs to open the file as READONLY and UNCACHED.	2018-08-20 19:07:58 -07:00
Alex Miller	63b1e85338	Ban `Void _ = wait(...)` constructions, and require just `wait(...)`. There's never any reason to save the value of a Void return, and it's the easiest source of redefined variable bugs that will creep back in over time. So just `wait(...)`, it's cleaner that way.	2018-08-14 15:50:26 -07:00
Alex Miller	86dbe1f0e9	Fix more instances of actorcompiler.h being in the wrong place.	2018-08-14 15:50:26 -07:00
Alex Miller	7feb5d8209	Remove including flow.h in actorcompiler.h, and fix resulting breakage. For files that required flow.h, and only got it through actorcompiler.h, their version of flow.h would have the actorcompiler #defines defined. Then, if it included a STL/boost file, the same breakage would result. This needs to not happen, so the include of flow.h in actorcompiler.h was removed.	2018-08-14 15:50:26 -07:00
Alex Miller	bca324eaa6	More actorcompiler.h fixes and additions.	2018-08-14 15:50:26 -07:00
Alex Miller	fb31a6999f	Rewrite all files to have #include actorcompiler.h as the last include.	2018-08-14 15:50:26 -07:00
Alex Miller	07e5281142	Restrict actor keyword #defines to actor files. This introduces a new rule in our codebase, that any file that #includes actorcompiler.h needs to do it as the last #include, and it needs to then #include unactorcompiler.h at the end of the file. The point of this is that it prevents our actorcompiler.h #defines from leaking into boost or the c++ standard library. Both of these start throwing errors if you s/state// their code, which `#define state ` effectively does.	2018-08-14 15:50:26 -07:00
Alex Miller	535b5701e5	Rewrite all `Void _ = wait(...)` -> `wait(...)`. This takes advantage of the new actorcompiler functionality to avoid having duplicate definitions of `Void _` when trying to feed the un-actorcompiled source through clang.	2018-08-14 15:50:26 -07:00
Evan Tschannen	cdcf056aef	Merge branch 'release-6.0'	2018-08-14 09:43:51 -07:00
A.J. Beamon	168dce94cb	Remove some trace event suppressions that were happening off the network thread. Downgrade some trace events related to trace logging problems from SevError to SevWarnAlways.	2018-08-14 09:00:43 -07:00
Evan Tschannen	3186fac397	Make sure we still accept some connections even if we are CPU bound by high priority work	2018-08-10 17:47:21 -07:00
A.J. Beamon	574c5576a2	Merge branch 'release-6.0' of github.com:apple/foundationdb # Conflicts: # fdbrpc/TLSConnection.actor.cpp # versions.target	2018-08-10 14:31:58 -07:00
A.J. Beamon	3535ddad80	Merge pull request #674 from alexmiller-apple/glibcxx-debug-fixes Fix bugs uncovered by -D_GLIBCXX_DEBUG	2018-08-09 08:18:51 -07:00
A.J. Beamon	24dec1529b	Merge pull request #673 from etschannen/release-6.0 A variety of bug fixes and performance improvements	2018-08-07 10:55:46 -07:00
Alex Miller	ff0e14d5a7	Fix a compilation error on windows.	2018-08-06 18:36:01 -07:00
Evan Tschannen	b5a133865d	Merge branch 'release-6.0' of github.com:apple/foundationdb into release-6.0 # Conflicts: # fdbrpc/TLSConnection.actor.cpp	2018-08-06 18:26:54 -07:00
Evan Tschannen	22f2a1fedd	Merge pull request #676 from etschannen/master fix: we should not free statdata ourselves, it will be deleted by libeio itself	2018-08-06 18:08:45 -07:00
Steve Atherton	fb46385a39	Merge pull request #628 from alexmiller-apple/reloadcertificates Reload certificates if changed. This is a cherry-pick of #628 back to release-6.0	2018-08-06 18:04:04 -07:00
Evan Tschannen	56e0b729c8	fix: we should not free statdata ourselves, it will be deleted by libeio itself	2018-08-06 17:46:53 -07:00
Alex Miller	d99592f8bd	Fix an out-of-bounds vector access.	2018-08-06 12:50:34 -07:00
Evan Tschannen	6f328d41ac	suppressed spammy trace events	2018-08-06 12:12:55 -07:00
Evan Tschannen	538e684f1c	Merge branch 'release-6.0' # Conflicts: # versions.target	2018-08-03 11:41:46 -07:00
Evan Tschannen	2619234477	Merge branch 'release-5.2' into release-6.0 # Conflicts: # documentation/sphinx/source/release-notes.rst	2018-08-03 11:40:24 -07:00
Evan Tschannen	21fe6adac4	fix: give time to do other work between accepting connections. It is expensive to accept TLS connections, so we have a slow task (which can kill other connections) if we accept too many connections in a row	2018-08-03 11:37:10 -07:00
Alex Miller	1a7cda4149	Stop performing self-moves (e.g. a = std::move(a)). Self-moves are frowned upon in C++, and in our code this generally happens from calls to swap as part of trying to implement an "unordered erase" function via swap-to-the-end-and-pop_back. For convenience, a swapAndPop() function is now offered that performs this, while disallowing self-moves.	2018-08-01 18:09:54 -07:00
Evan Tschannen	1c29275672	call all methods which could disable a trace event before it is initialized. In practice this means calling .error first, then .suppressFor, then all your details.	2018-08-01 14:30:57 -07:00
Alex Miller	f70f204d55	Fix a compilation error on windows.	2018-07-30 17:13:37 -07:00
Evan Tschannen	28a26d54f2	Merge commit 'ccf4384c79d026edbf76152e95e7410ebe621c1f' into release-6.0 # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbrpc/FlowTransport.actor.cpp	2018-07-28 09:11:31 -07:00
Evan Tschannen	fa3b61508c	fix: do not increase numIncompatibleConnections if the connect was already incompatible	2018-07-28 08:50:54 -07:00
Stephen Atherton	4379a58bbe	Suppress potentially spammy event and don't log cancellation errors.	2018-07-27 21:03:10 -07:00
Stephen Atherton	59e005485d	Fixed bug where incompatible connection count was sometimes decremented twice for the same peer.	2018-07-27 20:48:14 -07:00
Stephen Atherton	6a3834c3f8	Fixed memory leak when destroying a FlowTransport.	2018-07-27 20:46:54 -07:00
Stephen Atherton	c593d1c6a2	Bug fix causing clients to sometimes (rarely) not reconnect to upgraded clusters. Reliable packets were being dropped to incompatible peers intentionally, but now this is only done if the peer is newer since successful communication with a newer peer will never be possible.	2018-07-27 20:42:06 -07:00
Steve Atherton	d1a877039d	Merge pull request #628 from alexmiller-apple/reloadcertificates Reload certificates if changed.	2018-07-26 17:21:23 -07:00
Stephen Atherton	40762d9f9b	Merge branch 'master' of github.com:apple/foundationdb into feature-redwood	2018-07-25 17:58:52 -07:00
Evan Tschannen	95bc695f0e	Merge branch 'release-6.0' of github.com:apple/foundationdb into release-6.0	2018-07-25 13:06:54 -07:00
Evan Tschannen	89a3e2e1b4	Backed out connection closing changes because of upgrade problems	2018-07-25 13:06:13 -07:00
Alex Miller	262af775eb	Implement overly simple file write timestamps for simulation, and clean up code.	2018-07-24 17:20:31 -07:00
Alex Miller	168496f819	Poll the certificate files if TLS is enabled and reload them if changed. This allows certificates to be changed/updated without having to restart fdbserver.	2018-07-20 19:00:32 -07:00
Alex Miller	2d26e98d07	Add a cross-platform getLastWrite() to get a file's mtime.	2018-07-20 19:00:32 -07:00
A.J. Beamon	a7a1124c11	Fix incompatible connection accounting that was incorrectly decrementing the incompatible count in some cases.	2018-07-17 11:36:05 -07:00
A.J. Beamon	8879954254	Merge pull request #609 from etschannen/release-6.0 Improved simulation strength by only removing datacenters that have been killed	2018-07-16 15:59:28 -07:00
Evan Tschannen	e0caa28758	code cleanup	2018-07-16 15:56:43 -07:00
AlvinMooreSr	aafb3c5c00	Merge pull request #593 from AlvinMooreSr/release-6.0-tls-funct Replaced separate TLS Log function with FDB TraceEvent logger	2018-07-16 12:01:02 -07:00
Evan Tschannen	f72a9f60c0	only disable fearless if a datacenter has actually been killed fix: we must prevent recovery into the dead datacenter while reducing usable_regions	2018-07-16 10:06:57 -07:00
Alvin Moore	a034acf3bd	Replaced separate TLS Log function with FDB TraceEvent logger	2018-07-11 18:41:46 -07:00
Stephen Atherton	96389c74cd	Merge branch 'master' of github.com:apple/foundationdb into feature-redwood # Conflicts: # tests/fast/SidebandWithStatus.txt # tests/rare/LargeApiCorrectnessStatus.txt # tests/slow/DDBalanceAndRemoveStatus.txt	2018-07-10 16:42:34 -07:00
Alec Grieser	d5a23642a1	Merge pull request #587 from etschannen/feature-remote-logs close unneeded connections	2018-07-10 13:27:15 -07:00
Evan Tschannen	a35d5e30d9	Added a SevError trace event in case peer references becomes negative	2018-07-10 13:26:28 -07:00
Evan Tschannen	c25be5699a	close unneeded connections	2018-07-10 13:10:29 -07:00
Alec Grieser	be9c34c6f8	Merge remote-tracking branch 'upstream/release-5.2' into merge-release-5.2	2018-07-10 10:04:48 -07:00
Alec Grieser	ad37b1693d	Merge pull request #585 from etschannen/feature-remote-logs A variety of cleanup and test strengthening commits	2018-07-10 09:58:44 -07:00
AlvinMooreSr	b3916a9b71	Merge pull request #409 from joelarmstrong/tlsconnection-clang-ub-warning Fix compilation with clang from Apple LLVM 9.1.0	2018-07-10 09:32:24 -07:00
Stephen Atherton	1bc95862b7	Merge branch 'release-6.0' of github.com:apple/foundationdb into feature-redwood # Conflicts: # tests/fast/SidebandWithStatus.txt # tests/rare/LargeApiCorrectnessStatus.txt # tests/slow/DDBalanceAndRemoveStatus.txt	2018-07-10 04:16:02 -07:00
Evan Tschannen	82cc30be62	added testing for two_satellite_fast and two_satellite_safe	2018-07-09 22:01:46 -07:00
Stephen Atherton	fddb3e87e2	Differentiate between a timeout in attempting to connect vs a timeout on an active connection by converting timeouts during connection attempts to connection_failed errors.	2018-07-09 19:40:01 -07:00
Stephen Atherton	3ce7c78d36	If an HTTP request fails due to a connection failure or a timeout, do not convert the error to the more generic http_request_failed.	2018-07-09 18:58:33 -07:00
Evan Tschannen	e503dc975c	fix: destroy peers that are inactive; do not open new connections to send replies	2018-07-09 13:37:06 -07:00
Evan Tschannen	5a2cb3037b	merge 5.2 into 6.0	2018-07-08 20:14:06 -07:00
Evan Tschannen	0e97ce79b4	fix: destroy peers that are inactive; do not open new connections to send replies	2018-07-08 10:26:41 -07:00
Stephen Atherton	a2f16e217e	Memory waste fix, when a Peer disconnects an extra packet buffer block is allocated to copy unsent reliable bytes to even if there aren't any.	2018-07-06 19:44:30 -07:00
Evan Tschannen	6d7172ef7e	fix: canKillProcesses did not take into account the remoteTLogPolicy when checking notEnoughLeft	2018-07-05 21:36:09 -07:00
Evan Tschannen	6f4ca2eba2	fix: get all processes did not include rebooting processes	2018-07-05 21:13:56 -07:00
Evan Tschannen	cd4fb9285a	waitForExlusion requires both regions to be healthy, which is only possible if we do not kill all logs in a region	2018-07-05 14:04:42 -07:00
Stephen Atherton	9d85a05372	Merge branch 'master' of github.com:apple/foundationdb into feature-redwood # Conflicts: # tests/fast/SidebandWithStatus.txt # tests/rare/LargeApiCorrectnessStatus.txt # tests/slow/DDBalanceAndRemoveStatus.txt	2018-07-05 12:52:06 -07:00
Stephen Atherton	2cb0362102	AsyncFileCached now allows writing and truncation of whole pages previously read using readZeroCopy and not yet released without prior readers seeing the effects of the write.	2018-07-05 02:59:13 -07:00
Evan Tschannen	7315e5da55	fix: isExcluded and isCleared were exactly wrong fix: isCleared should mean the process is dead	2018-07-05 02:22:22 -07:00
Evan Tschannen	e17dfea3b6	fix: desiredTLogCount was used instead of getDesiredLogs(), which caused problems with recruitment when desiredTLogCount was -1. canKillProcess logic was wrong. We still need to configure usable_regions because if datacenterVersionDifference is too large we cannot complete data movement.	2018-07-04 16:22:32 -04:00
Stephen Atherton	2925b9b984	Merge branch 'master' of github.com:apple/foundationdb into feature-redwood	2018-07-03 23:03:56 -07:00
Alvin Moore	c3f88dbfe1	Merge branch 'master' of github.com:apple/foundationdb into tls-static	2018-07-01 23:13:57 -07:00
Alvin Moore	132e2d9267	Defined TLS build flags for projects. Updated TLS documentation.	2018-07-01 22:49:39 -07:00
Stephen Atherton	b95a2bd6c1	Merge commit 'b17c8359ec22892ed4daeaa569f2f5e105477251' into feature-redwood # Conflicts: # flow/Trace.cpp	2018-06-30 23:18:29 -07:00
Evan Tschannen	899f880ce0	fix: log router class did not have the proper fitness for becoming the cluster controller	2018-06-28 23:20:01 -07:00
Alvin Moore	45849d1f95	Added support for no-op legacy TLS options	2018-06-27 09:25:05 -07:00
Alvin Moore	65d8b38ae9	Changed generic plugin code to work as expected plugin code except for TLS use case. Defined TLS plugin name constant. Changed TLS plugin name to get_tls_plugin. Fixed link script. Removed compilation flags from info make target.	2018-06-26 16:01:25 -07:00
Alvin Moore	ef8de426d3	Changed the TLS_DISABLED macro. Disable TLS within Windows until working.	2018-06-26 12:08:32 -07:00
Evan Tschannen	0123627d67	Merge branch 'master' into feature-remote-logs # Conflicts: # tests/fast/SidebandWithStatus.txt # tests/rare/LargeApiCorrectnessStatus.txt # tests/slow/DDBalanceAndRemoveStatus.txt	2018-06-22 10:43:07 -07:00
Evan Tschannen	5fc8199abc	Swapped OkayFit and UnsetFit, because generally if machine classes are set on one machine they are set everywhere and it helps with wait_for_good_recruitment logic. wait_for_good_recruitment now requires that you have the desired count of each role. Remote recruitment is given a much longer wait_for_good_recruitment time interval, which does not start until enough remote machines have registered.	2018-06-22 10:15:24 -07:00
Evan Tschannen	1dce97f28c	Merge branch 'release-5.2' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbserver/SimulatedCluster.actor.cpp # packaging/msi/FDBInstaller.wxs # versions.target	2018-06-21 17:05:11 -07:00
Balachandar Namasivayam	d7dba11366	Throw tls_error instead of internal_error when not able to create a TLS connection.	2018-06-21 15:33:00 -07:00
Stephen Atherton	e9e1e194f0	Added operation-specific rate controls to blob store interface.	2018-06-20 20:34:34 -07:00
Richard Low	fff6a47c43	Validate certificates by default	2018-06-20 14:04:03 -07:00
Alvin Moore	f8ce1de601	Added support for compiling TLS into binaries	2018-06-20 09:21:23 -07:00
Stephen Atherton	e5c48d453a	Merge branch 'master' of github.com:apple/foundationdb into feature-redwood	2018-06-18 22:45:27 -07:00
Evan Tschannen	0913368651	added usable_regions to specify if we will replicate into a remote region. remote replication defaults to the primary replication. removed remote_logs, because they should be specified as an override in the regions object	2018-06-17 19:31:15 -07:00
Stephen Atherton	1eae9d621b	Merge branch 'master' of github.com:apple/foundationdb into feature-redwood	2018-06-13 15:58:21 -07:00
Stephen Atherton	2878f30f29	Merge branch 'master' of github.com:apple/foundationdb into feature-redwood # Conflicts: # fdbserver/IKeyValueStore.h # fdbserver/KeyValueStoreMemory.actor.cpp # fdbserver/storageserver.actor.cpp	2018-06-13 15:56:06 -07:00
Alex Miller	6c2cb25c53	Rename BestOtherFit -> OkayFit. The previous order of fitness was BestFit > GoodFit > BestOtherFit > ... which is baffling. It's now: BestFit > GoodFit > OkayFit > ... which won't break anyone's expectations.	2018-06-12 16:50:25 -07:00
Evan Tschannen	372ed67497	Merge branch 'master' into feature-remote-logs # Conflicts: # fdbserver/DataDistribution.actor.cpp # fdbserver/MasterProxyServer.actor.cpp # fdbserver/TLogServer.actor.cpp # fdbserver/TagPartitionedLogSystem.actor.cpp	2018-06-11 11:34:10 -07:00
Evan Tschannen	48fbc407fd	fix: we cannot kill all of the remote tlogs, because we still need their data to copy to the next generation in the same data center	2018-06-08 15:28:44 -07:00
A.J. Beamon	99c9958db7	Some more trace event normalization	2018-06-08 13:57:00 -07:00
A.J. Beamon	e5488419cc	Attempt to normalize trace events: * Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check. * Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase. * Use seconds instead of milliseconds in details. Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed. This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.	2018-06-08 11:11:08 -07:00
Balachandar Namasivayam	529d0497f1	Proxy going OOM when applying high volumes of writes to a proxy, particularly in a sudden fashion before ratekeeper can control the workload. Address this issue by proactively monitoring the memory used by commit batches and dropping requests if a certain memory limit is exceeded.	2018-06-01 15:21:40 -07:00
A.J. Beamon	d9c702a9e3	Merge release-5.1 into release-5.2	2018-05-30 09:09:55 -07:00
Joel Armstrong	7c35ea6ba1	Fix use of bool in va_start causing undefined behavior. The version of clang included in Apple LLVM 9.1.0 complains about passing the bool parameter `is_error` to va_start, which causes make to fail: `fdbrpc/TLSConnection.actor.cpp:370:16: error: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Werror,-Wvarargs]`, reported at `va_start( ap, is_error );`. This just switches is_error back to the type it gets promoted to (int).	2018-05-24 16:37:11 -07:00
A.J. Beamon	026458baf3	Merge release-5.2 into master	2018-05-23 15:32:56 -07:00
Richard Low	84ed35b01f	Only log TLS verify failures if all verification fails; log failures at SevInfo	2018-05-21 10:58:59 -07:00
Richard Low	086700aeb1	Plumb through TLS key password to CLI and from environment	2018-05-21 10:56:10 -07:00
Evan Tschannen	520aaf731d	merge release 5.2 into master	2018-05-10 14:33:08 -07:00
Evan Tschannen	b5b8c5d587	fix: white space issue in getKnobDescription	2018-05-10 14:27:10 -07:00
Balachandar Namasivayam	b2c32ea4f2	Add secure_connection param to BlobStore to configure security. Default is https. Setting secure_connection=0 makes it http.	2018-05-10 13:53:46 -07:00
Evan Tschannen	7bca7b80e6	fixed merge conflicts	2018-05-10 09:13:41 -07:00
Evan Tschannen	8f984cb2c9	Merge branch 'release-5.2' # Conflicts: # fdbrpc/TLSConnection.h	2018-05-10 09:13:22 -07:00
Evan Tschannen	d3450ce5b0	Merge pull request #343 from bnamasivayam/tls-plugin Tls plugin	2018-05-09 16:35:53 -07:00
Balachandar Namasivayam	479dbf4c04	Addressed review comments. Remove redundant FDBLibTLS/ITLSPlugin.h.	2018-05-09 16:16:09 -07:00
Balachandar Namasivayam	0c2960a221	Use smart pointer instead of naked ones in set_peer_verify() method.	2018-05-09 14:53:01 -07:00
Balachandar Namasivayam	7591931a09	Revert "Make tls_verify_peers as a comma separated string of constraints." This reverts commit `2033847e4b`.	2018-05-09 14:40:36 -07:00
Balachandar Namasivayam	2033847e4b	Make tls_verify_peers as a comma separated string of constraints.	2018-05-09 14:37:39 -07:00
Balachandar Namasivayam	e8b7f4b190	Add password support for tls.	2018-05-08 20:46:31 -07:00
Balachandar Namasivayam	49af5d685b	Restore previous behavior of not specifying peer_verify option means disable checking.	2018-05-08 18:54:44 -07:00
Balachandar Namasivayam	d3b5cfb93c	Support latest TLS plugin. Add support for https in backup.	2018-05-08 16:28:13 -07:00
Evan Tschannen	7acdc314e4	Merge branch 'release-5.2' # Conflicts: # fdbrpc/TLSConnection.actor.cpp	2018-05-08 13:22:53 -07:00
Evan Tschannen	1f6c6a886b	Merge branch 'release-5.1' into release-5.2	2018-05-08 13:08:11 -07:00
Alvin Moore	9aa94e87a3	Renamed the default TLS search plugin	2018-05-07 17:01:14 -07:00
Alex Miller	bc8e6acbe8	Fix the other half of simulation requiring a TLS Plugin. This commit: 1. Restores --tls_plugin as a way to provide the path to the TLS plugin when running in simulation. 2. Removes the TLS Plugin as being required for 5% of tests. 3. Standardizes on 'sslEnabled' as a variable name. And is a fix/improvement upon commit `f7733d1b`. (1) previously didn't work, because we would create multiple new TLSOptions instances and run init_plugin multiple times. Only the first call would use the argument specified on the command line. To fix this, the TLSOptions derived from the command line is threaded through all the simulation code that needs it. (2) was an oversight in `f7733d1b`, which didn't actually make "should we be TLS" dependent on whether the TLS plugin was available or not. (3) is just nice for trying to grep around in the codebase.	2018-04-30 18:26:29 -07:00
Stephen Atherton	af61d3596d	Merge branch 'public-master' into feature-redwood # Conflicts: # fdbserver/DatabaseConfiguration.cpp # fdbserver/OldTLogServer.actor.cpp # fdbserver/fdbserver.vcxproj # fdbserver/fdbserver.vcxproj.filters	2018-04-24 17:22:21 -07:00
Alex Miller	f7733d1bd0	Do not require the TLS Plugin for simulation. It appears that explicit calls to TLS-related things had snuck in over time, which meant that simulation runs that weren't even configured to use SSL still wanted and required the TLS plugin. This commit instead threads through the understanding of if any TLS-related options were provided, and if not, then don't call anything TLS-related so that we don't require the TLS plugin. Hopefully this makes life easier for the opensource folk. :)	2018-04-24 16:53:30 -07:00
Dennis Schafroth	290122637b	Using ASSERT_ABORT in destructors	2018-04-23 14:05:10 +02:00
Evan Tschannen	c1ccc8522c	Merge branch 'release-5.2'	2018-04-17 18:38:12 -07:00
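
Commit 1a7cda4149 above describes a swapAndPop() helper that performs an "unordered erase" via swap-to-the-end-and-pop_back while disallowing self-moves. The log does not show the implementation, so the following is only a minimal sketch of how such a helper might look; the signature, the template parameter name, and the bounds assertion are assumptions, not FoundationDB's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch, not the FoundationDB implementation.
// Erases container[index] in O(1) by swapping it with the last element and
// popping the back. The relative order of the remaining elements changes.
template <class Container>
void swapAndPop(Container& container, std::size_t index) {
    assert(index < container.size()); // assumed precondition
    const std::size_t last = container.size() - 1;
    if (index != last) {
        // Guard: swapping an element with itself would perform a self-move
        // (a = std::move(a)) inside std::swap, so skip the swap in that case.
        using std::swap;
        swap(container[index], container[last]);
    }
    container.pop_back();
}
```

Under these assumptions, calling `swapAndPop(v, 1)` on a `std::vector<int>` holding 10, 20, 30, 40 leaves 10, 40, 30. Erasing the last index only pops, with no swap at all.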

747 Commits