foundationdb

Commit Graph

Author	SHA1	Message	Date
Alex Miller	da73164eda	Move crc32c from fdbrpc to flow So that we can use it from a piece of flow code without breaking module boundaries. Also rename generated-constants to crc32c-generated-constants so that it's more apparent that they're related files.	2020-01-13 18:19:30 -08:00
Alvin Moore	3bf971ba8b	Merge branch 'release-6.2' of github.com:apple/foundationdb into release_6.2_merge # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbserver/storageserver.actor.cpp	2019-12-12 07:13:12 -08:00
Andrew Noyes	46d10dc7dc	Fix "null passed as argument declared not null" Fix several such reports from ubsan E.g. /Users/anoyes/workspace/foundationdb/flow/Arena.h:794:16: runtime error: null pointer passed as argument 1, which is declared to never be null	2019-12-03 14:46:53 -08:00
Alex Miller	35a0fc948d	Make DiskQueue V1 not ignore min recovery location. I can't figure out why I made this branch on version, and it's breaking having value and reference tlogs in the same SharedTLog	2019-10-03 01:45:10 -07:00
Meng Xu	c9c50ceff8	Comments:Add comments to DiskQueue No functional change.	2019-08-01 15:20:01 -07:00
A.J. Beamon	e5381e0612	Fix some new usages of g_random	2019-05-23 09:23:27 -07:00
A.J. Beamon	603721e125	Merge branch 'master' into thread-safe-random-number-generation # Conflicts: # fdbclient/ManagementAPI.actor.cpp # fdbrpc/AsyncFileCached.actor.h # fdbrpc/genericactors.actor.cpp # fdbrpc/sim2.actor.cpp # fdbserver/DiskQueue.actor.cpp # fdbserver/workloads/BulkSetup.actor.h # flow/ActorCollection.actor.cpp # flow/Net2.actor.cpp # flow/Trace.cpp # flow/flow.cpp	2019-05-23 08:35:47 -07:00
Alex Miller	4a7e0319c7	Refactor away pushlock. Pushing was already a serialized, sequential operation. Instead make it explicit that there are two waits as part of a push: 1. The setup work to reserve a spot in the file 2. The work of writing and sync'ing the data And we return a Future<Future<Void>> to force these to be done sequentially.	2019-05-10 20:30:52 -10:00
Alex Miller	ea12a54946	Rename DISK_QUEUE_MAX_TRUNCATE_EXTENTS -> ..._BYTES So as to not make filesystem assumptions. This knob did technically appear in (only the) 6.1.5 release, but this feature was broken in 6.1.5, and thus impossible to use anyway.	2019-05-10 18:26:22 -10:00
Alex Miller	c95d09f9fd	Convert truncate(0) to truncate(4KB) on Windows. Blindly, in case Windows doesn't like 0 length truncates too.	2019-05-10 14:55:11 -10:00
Alex Miller	c502ed3d15	Fix a variety of problems stemming from a wait() being added to push(). And that this code was previously insufficiently tested.	2019-05-10 14:55:11 -10:00
A.J. Beamon	5f55f3f613	Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used.	2019-05-10 14:01:52 -07:00
Alex Miller	510b0b2fcd	Fix DiskQueue not replaceFile'ing frequently enough for the final time.	2019-05-08 23:08:25 -10:00
Alex Miller	c6c33a4daa	Make replaceFile more likely to be tested.	2019-05-08 21:23:42 -10:00
Alex Miller	0d0f54d1e6	Fix IAsyncFileSystem::open() flags to stop a crash. OPEN_ATOMIC_WRITE_AND_CREATE was missing a required OPEN_CREATE. I'm honestly baffled how this was missed in testing.	2019-05-08 21:22:40 -10:00
Alex Miller	b50926c792	replaceFile is truncate(0) on windows	2019-05-08 21:22:14 -10:00
Alex Miller	e4ba2f5788	Add an ending TraceEvent.	2019-05-08 12:35:12 -10:00
Alex Miller	c093017c2f	Add a TraceEvent and release note.	2019-05-08 12:34:25 -10:00
Alex Miller	0685e6c1c7	Avoid large truncates in the DiskQueue. And instead create a new file while incrementally truncating the old one down. This avoids queueing up a massive number of filesystem metadata operations in one call, thus flooding the disk with requests and stalling out all other filesystem operations. This sets the knobs so that a truncate of >10GB causes us to create a new file rather than trying to truncate the old one.	2019-05-08 12:33:31 -10:00
Alex Miller	36dfbf4fb3	Only truncate DiskQueues down to TLOG_HARD_LIMIT*2. DiskQueue shrinking was implemented for spill-by-reference, as now a DiskQueue could grow "unboundedly" large. Without a minimum file size, write burst workloads would cause the DiskQueue to shrink down to 100MB, and then grow back to its usual ~4GB size in a cycle. File growth means filesystem metadata mutations, which we'd prefer to avoid if possible since they're more unpredictable in terms of latency. In a healthy cluster, the TLog never spills, so the disk of a single DiskQueue file should stay less than 2*TLOG_SPILL_THRESHOLD. In the worst case of spill-by-value, the DiskQueue could grow to 2*TLOG_HARD_LIMIT. Therefore, having this limit will cause DiskQueue shrinking to never behave sub-optimally for spill-by-value, and will cause the DiskQueue files to return to the optimal size with spill-by-reference.	2019-05-08 12:33:31 -10:00
Alex Miller	a269a784cc	Convert push() into an actor.	2019-05-08 12:33:31 -10:00
Alex Miller	37ea71b117	Implement limiting how many bytes recovery will read. This time, track what location in the DiskQueue has been spilled in persistent state, and then feed it back into the disk queue before recovery. This also introduces an ASSERT that recovery only reads exactly the bytes that it needs to have in memory.	2019-03-18 15:09:43 -07:00
Alex Miller	ee4721a63f	Make checking or ignoring checksums part of the IDiskQueue::read API.	2019-03-15 21:01:18 -07:00
Alex Miller	bf247eeed0	If TLogVersion >= 3, use crc32c for the DiskQueue hash for TLogs. We don't have a forward compatibility story for the memory storage engine, so its DiskQueue will still be hashlittle2 until one exists.	2019-03-15 21:01:16 -07:00
Alex Miller	686b097397	Remove verification code from DiskQueue and TLogServer.	2019-03-15 21:01:15 -07:00
Alex Miller	bdd7d5d3df	Initialize firstPages with 0xFF. There are various ASSERT()s that assume firstPages is empty and enforce things about `seq`. Some of these asserts have spuriously passed, since uninitialized pages look like they have a `seq` of 0, which would be the beginning of the disk queue. Now they'll look like the end of the disk queue, which is far easier to fail on.	2019-03-15 21:01:14 -07:00
Alex Miller	baa3e1af2c	Replace `/sizeof(Page)*sizeof(Page)` with `pageFloor()`.	2019-03-04 01:42:39 -08:00
Alex Miller	ee64b43366	Change DQ shrink logic to consider "active" bytes rather than file size. We know what the current ideal size of the DQ file should be, so we should use it.	2019-03-04 01:42:39 -08:00
Alex Miller	94bf75cb00	Allow the disk queue to shrink if it has unneeded slack space.	2019-03-04 01:42:38 -08:00
Alex Miller	52d5a721a6	Don't allocate 2x the memory for a read to save 1% of allocated memory.	2019-03-04 01:42:38 -08:00
Alex Miller	ee8964c8ec	Plumb through getNext{Commit,Push}Location	2019-02-26 18:00:55 -08:00
Alex Miller	b725d841ea	Restore a hash check as an ASSERT_WE_THINK	2019-02-19 22:30:15 -08:00
Alex Miller	334730ce7d	Do not re-hash firstPages. They're already known to be valid.	2019-02-19 22:10:46 -08:00
Alex Miller	12123f41d6	Plumb a read function up the stack to IDiskQueue	2019-02-12 23:44:13 -08:00
Alex Miller	6c7229ec07	read fix while recovery	2019-02-12 23:44:13 -08:00
Alex Miller	8b21d1ac8f	Add a standalone recovery initialization function.	2019-02-12 23:44:13 -08:00
Alex Miller	2f49acc8a0	Add a read function.	2019-02-12 23:44:13 -08:00
Alex Miller	63eb62cd36	Fix a bug when a read was delayed until after the entire disk queue has been rewritten.	2019-02-12 23:44:13 -08:00
Alex Miller	9886386a83	temporarily verify committed data as a test for read	2019-02-12 23:44:13 -08:00
Alex Miller	efa8aa7e2e	Adjust findPhysicalLocation to not spam. Context is now optional, so that our high-volume calls don't get logged, but low-volume calls still get logged the same way that they did before.	2019-02-12 23:44:13 -08:00
Alex Miller	f1c31e2305	Add a read function to disk queue	2019-02-12 23:44:13 -08:00
Alex Miller	2d2b03a9ff	prepare DiskQueue for actors	2019-02-12 23:44:13 -08:00
Alex Miller	40fe29c29b	Abstract TrackMe into a reusable CRTP class.	2019-02-12 23:44:13 -08:00
Alex Miller	018d12fe90	use firstpages instead of recoveryfirstpages	2019-02-12 23:43:10 -08:00
Alex Miller	dbf7cefcd8	Add firstPages to DiskQueue	2019-02-12 23:43:10 -08:00
Alex Miller	2570b37e6e	Add function to read pages from RawDiskQueue_TwoFiles	2019-02-12 23:43:10 -08:00
Andrew Noyes	067a445e06	Replace unused _ variables with wait(success(...))	2019-02-12 17:30:30 -08:00
Alex Miller	22a08b2b4e	Change mutable ref to pointer so outparams are obvious.	2019-02-04 18:04:22 -08:00
Alex Miller	0efcccc06f	Fix a long-standing minor bug in disk queue that could lead to unnecessary commits. If the disk queue is called with the following series of operations: Push(a) -> 1 Commit() Pop(1) Push(b) Commit() Commit() Then the last Commit() should be a no-op, and not actually run accordingly. However, anyPopped was only set to `false` if no pages were pushed, and thus we'd falsely think that an extra empty page commit needed to happen to record the new popped position, but there actually was no new popped page position to record. Aside from the extra commit, it maybe makes getCommitOverhead slightly inaccurate, but that's only used for some accounting inside of the memory storage engine and at a quick glance doesn't look like it should have caused any bad effects. I dug through history, and this code has been this way since the initial commit by Dave, and then no one has touched the anyPopped logic since.	2019-02-04 18:04:22 -08:00
Alex Miller	bfab7c150a	Require PageHeader to be 36 bytes, and don't use magic numbers.	2018-12-17 13:37:44 -08:00

73 Commits