Young Liu
3728ed03dd
Resolve comments
2020-09-05 18:55:09 -07:00
Young Liu
1ad5e17458
add support for comparing original and current impls
2020-09-05 11:14:59 -07:00
Meng Xu
1ba9b6b07f
DD:Change SendRelocateToDDQx100 to SendRelocateToDDQueue
2020-07-17 14:10:17 -07:00
Meng Xu
42a7b91647
Turn clang-format off for error_definition.h
2020-07-17 11:11:30 -07:00
Meng Xu
098cdfb558
Replace actor_cancelled error with dd_cancelled
2020-07-16 20:26:07 -07:00
Meng Xu
59132e2cc8
Merge pull request #3237 from ajbeamon/tag-throttling-documentation
...
Add documentation for the tag throttling feature
2020-05-27 09:33:30 -07:00
A.J. Beamon
530da59d62
Add documentation for the tag throttling feature
2020-05-26 15:27:06 -07:00
A.J. Beamon
86f712657e
Eliminate the undefined behavior of calling run_network twice, instead returning an error.
2020-05-22 13:31:06 -07:00
Andrew Noyes
8bd5dcaff8
Merge branch 'master' into atn34/special-key-versioning
2020-05-09 15:34:20 -07:00
Andrew Noyes
6b35b1b686
Disallow no-module read by default
2020-05-08 05:37:37 +00:00
Andrew Noyes
1d6209e304
Check for cross-module reads
2020-05-08 05:37:37 +00:00
A.J. Beamon
aed97a9f20
Merge branch 'master' into transaction-tagging
2020-05-07 14:52:22 -07:00
tclinken
4ec652f59f
Fixed backup_invalid_info error message
2020-05-01 10:33:13 -07:00
A.J. Beamon
dfec896438
Enforce a throttle limit. Don't count transaction tags on RK if the proxy hasn't updated us in a while.
2020-04-17 11:48:02 -07:00
A.J. Beamon
ebeca10bce
Change the serialization of tags sent in some messages. Add communication of the sampling rate from cluster to clients.
2020-04-09 16:55:56 -07:00
A.J. Beamon
26b7e02d4c
Some initial work to support tagging transactions and passing them around.
2020-03-20 11:23:11 -07:00
Xin Dong
e21426d12a
Send an error back to GRV requests with batch priority when the cluster is saturated, instead of blindly enqueueing the requests and letting the client time out.
2020-01-30 14:13:56 -08:00
Evan Tschannen
6c0b934dda
Merge pull request #2242 from alexmiller-apple/fix-10min-stall-again
...
Fix the 10min multi-region recovery stall again
2020-01-23 17:53:02 -08:00
Jingyu Zhou
a4d6ebe79e
Recruit backup worker in newEpoch
2020-01-22 19:37:48 -08:00
Alex Miller
1cb311fcb8
Add an ASSERT_WE_THINK that peek cursors don't get timed_out()
...
This should prevent us from regressing and having multi-region
recoveries hang for 10min again.
2020-01-21 17:07:37 -08:00
sramamoorthy
c59168fd07
error msg: Snapshot error -> Disk Snapshot error
2019-08-30 08:46:36 -07:00
sramamoorthy
5d87443323
improved error msgs for snapshot cmd
2019-08-27 16:43:52 -07:00
Evan Tschannen
7a932479dd
throw away state if we ever read popped data from the disk queue adapter
2019-07-30 10:14:39 -07:00
Vishesh Yadav
867986cdea
fdbrpc: Reduced connection monitoring from clients
...
This patch makes two changes to connection monitoring:
1. Connection monitoring on the client side will check whether the
connection has stayed idle for some time. If the connection is unused
for a while, we close it. There is some weirdness involved here,
as ping messages are themselves connection traffic. We get around
this by making it a two-phase process: first checking for idle
reliable traffic, then disabling pings and checking for
idle unreliable traffic.
2. Connection monitoring of clients from the server will no longer send
pings to clients. Instead, it monitors the received bytes and
closes the connection after a certain period of inactivity.
2019-07-09 14:24:16 -07:00
Vishesh Yadav
6fa7081a21
net: Don't make FailureMonitoring requests from client
...
This patch removes the need for clients to continuously contact the
cluster coordinator for failure monitoring information. Instead, it
uses the FlowTransport to monitor the statuses of peers and update
FailureMonitor accordingly.
2019-06-09 00:43:38 -07:00
sramamoorthy
b17ad85497
exec op not supported when log_anti_quorum > 0
2019-05-28 22:07:46 -07:00
sramamoorthy
bb474dc323
if recovery < fully_recovered then fail the exec
...
Will do more cleanup, pushing it for a test run in CI
2019-05-28 22:07:46 -07:00
sramamoorthy
925499954b
New status cluster_not_fully_recovered
2019-05-28 22:07:46 -07:00
sramamoorthy
898bed66c1
Allow only whitelisted binary path for exec op
2019-05-28 22:07:46 -07:00
Nikolas Ioannou
d6170210e7
AsyncFileCached: throw (new) exception, through helper static fn, if cache eviction policy not recognized.
2019-05-06 10:11:46 +02:00
Evan Tschannen
8e05713a5d
do not log a SevError trace event if we cannot deserialize the connect packet
2019-04-10 17:41:02 -07:00
Trevor Clinkenbeard
dc2b740415
Added server_overloaded error and client message
2019-01-25 13:15:19 -08:00
Evan Tschannen
684a22a52b
Merge branch 'release-6.0'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbbackup/backup.actor.cpp
# fdbclient/BackupContainer.actor.cpp
# fdbclient/HTTP.actor.cpp
# fdbserver/storageserver.actor.cpp
# fdbserver/workloads/BackupCorrectness.actor.cpp
# versions.target
2019-01-09 16:14:46 -08:00
Stephen Atherton
0ec216a1fa
Added X-Request-ID header to HTTP requests and verification of matching ID in response, if present.
2019-01-07 17:56:38 -08:00
Stephen Atherton
f64d5321e9
Backup describe, expire, and delete will now clearly indicate when there is no backup at the given URL.
2018-12-20 18:05:23 -08:00
Stephen Atherton
5d9cd9acdc
Correctness test now has additional random reader which doesn't do verification but isn't stopped when the btree is closed. Fixed bug exposed by this where pager snapshots will still try to read pages after the pager has been shut down or even destroyed. Added new error type, shutdown_in_progress.
2018-10-04 23:46:37 -07:00
Stephen Atherton
ce3f01a0cf
Added concept of type to JsonString. Appending single items or key/value pairs is now type-safe and only allowed in certain cases. JsonString will refuse to produce invalid JSON. All duplicative functions have been replaced with templates. Encoding of values uses json_spirit's value writer, which should perform no worse than format(), and it will escape everything properly. Final string form is now built directly using knowledge of type, such as when an instance becomes an Array or Object the appropriate opening character is written. This avoids a full copy just to prepend the opening character later. Index interface for key/value pairs no longer makes a temporary copy of the key string. JsonString is now only needed by Status.actor.cpp. Still more work to be done here.
2018-09-08 07:15:28 -07:00
Alec Grieser
be9c34c6f8
Merge remote-tracking branch 'upstream/release-5.2' into merge-release-5.2
2018-07-10 10:04:48 -07:00
Stephen Atherton
3ce7c78d36
If an HTTP request fails due to a connection failure or a timeout, do not convert the error to the more generic http_request_failed.
2018-07-09 18:58:33 -07:00
A.J. Beamon
0ca51989bb
Merge branch 'master' into trace-log-refactor
...
# Conflicts:
# fdbserver/QuietDatabase.actor.cpp
# fdbserver/Status.actor.cpp
# flow/Trace.cpp
2018-06-08 13:24:30 -07:00
Balachandar Namasivayam
529d0497f1
Proxy goes OOM when high volumes of writes are applied to it, particularly in a sudden fashion before ratekeeper can control the workload.
...
Address this issue by proactively monitoring the memory used by commit batches and dropping requests if a certain memory limit is exceeded.
2018-06-01 15:21:40 -07:00
A.J. Beamon
ce0c991e78
Refactor trace events to store a vector of fields that aren't encoded until write time. Better support for pre-network trace events. Rework how trace events are queried. Some initial work towards pluggable formatting of logs.
2018-05-02 10:44:38 -07:00
Alec Grieser
0bae9880f1
remove trailing whitespace from our copyright headers; fixed formatting of python setup.py
2018-02-21 10:25:11 -08:00
Stephen Atherton
69425a303b
Improved error handling for cases where blob account credentials are not found in the provided credentials sources and/or some of the credentials sources provided are not readable or parseable.
2018-02-07 21:50:43 -08:00
Stephen Atherton
66de9d392b
New error code, http_auth_failed, which is used when blob authentication fails instead of the previous generic http_request_failed.
2018-01-22 14:58:56 -08:00
Stephen Atherton
93b34a945f
Major usability and performance improvements to backup management. Backup descriptions now calculate and display timestamps using TimeKeeper data (if given a cluster) and restorability of snapshots. Expire now requires a --force option to leave a backup unrestorable or unrestorable after a given point in time, specified by version or timestamp. BackupContainerFilesystem now maintains metadata on key version boundaries in order to avoid large list operations for describe and expire operations. Blob parallel recursive list operations can now take a path (aka prefix) filter function. New describe and expire options are available in fdbbackup.
2018-01-17 04:09:43 -08:00
A.J. Beamon
653a46f12f
Update error string for cluster_version_changed error
2018-01-04 15:06:09 -08:00
Stephen Atherton
aeebe711ce
TaskBucket’s saveAndExtend() is now accomplished through extendTimeout() with an option to save parameters. SaveAndExtendIncrementally() has been removed as it is no longer needed because TaskBucket’s normal execution loop calls extendTimeout() periodically as long as the TaskFunc’s execute() actor has not finished or thrown. If a TaskFunc wants to save changes to task parameters, checkpointing progress so that task restarts can benefit from it, it can call extendTimeout() explicitly with the updateParams flag set to true.
2017-11-30 17:18:57 -08:00
Stephen Atherton
3dfaf13b67
IBackupContainer has been rewritten to be a logical interface for storing, reading, deleting, expiring, and querying backup data. The details of how the data is organized or stored are now hidden from users of the interface. Both the local and blobstore containers have been rewritten, the key changes being a multi-level directory structure and no more use of temporary files or pseudo-symlinks in the blob store implementation. This refactor has a large impact radius as the previous backup container was just a thin wrapper that presented a single-level list of files and offered no methods for managing or interpreting the file structure, so all of that logic was spread around other places in the code base. This made moving to the new blob store schema very messy, and without this refactor further changes in the future would only be worse.
...
Several backup tasks have been cleaned up / simplified because they no longer need to manage the ‘raw’ structure of the backup. The addition of IBackupFile and its finish() method simplified the log and range writer tasks. Updated BlobStoreEndpoint to support now-required bucket creation and bucket listing prefix/delimiter options for finding common prefixes. Added KeyBackedSet<T> type. Moved JSONDoc to its own header. Added platform::findFilesRecursively().
Still to do: update command line tool to use new IBackupContainer interface, fix bugs in Restore startup.
2017-11-14 23:33:17 -08:00
Stephen Atherton
3afc85881e
Merge branch 'master' into backup-container-refactor
...
# Conflicts:
# fdbrpc/BlobStore.actor.cpp
2017-10-20 21:38:28 -07:00