[Modules] Start making explicit modules produce deterministic output.

There are two aspects of non-determinism fixed here, which was the
minimum required to cause at least an empty module to be deterministic.

First, the random number signature is only inserted into the module when
we are building modules implicitly. The use case for these random
signatures is to work around the very fact that modules are not
deterministic in their output when working with the implicitly built and
populated module cache. Eventually this should go away entirely when
we're confident that Clang is producing deterministic output.

Second, the on-disk hash table is populated based on the order of
iteration over a DenseMap. Instead, use a MapVector so that we can walk
it in insertion order.

I've added a test that an empty module, when built twice, produces the
same binary PCM file.

llvm-svn: 233115
This commit is contained in:
Chandler Carruth 2015-03-24 21:18:10 +00:00
parent d0aa59f77f
commit 885e78cb22
4 changed files with 33 additions and 14 deletions

View File

@ -225,7 +225,7 @@ private:
/// The ID numbers for identifiers are consecutive (in order of
/// discovery), starting at 1. An ID of zero refers to a NULL
/// IdentifierInfo.
llvm::DenseMap<const IdentifierInfo *, serialization::IdentID> IdentifierIDs;
llvm::MapVector<const IdentifierInfo *, serialization::IdentID> IdentifierIDs;
/// \brief The first ID number we can use for our own macros.
serialization::MacroID FirstMacroID;

View File

@ -1160,12 +1160,17 @@ void ASTWriter::WriteControlBlock(Preprocessor &PP, ASTContext &Context,
Stream.EmitRecordWithBlob(MetadataAbbrevCode, Record,
getClangFullRepositoryVersion());
// Signature
Record.clear();
Record.push_back(getSignature());
Stream.EmitRecord(SIGNATURE, Record);
if (WritingModule) {
// For implicit modules we output a signature that we can use to ensure
// duplicate module builds don't collide in the cache as their output order
// is non-deterministic.
// FIXME: Remove this when output is deterministic.
if (Context.getLangOpts().ImplicitModules) {
Record.clear();
Record.push_back(getSignature());
Stream.EmitRecord(SIGNATURE, Record);
}
// Module name
BitCodeAbbrev *Abbrev = new BitCodeAbbrev();
Abbrev->Add(BitCodeAbbrevOp(MODULE_NAME));
@ -3507,14 +3512,12 @@ void ASTWriter::WriteIdentifierTable(Preprocessor &PP,
// Create the on-disk hash table representation. We only store offsets
// for identifiers that appear here for the first time.
IdentifierOffsets.resize(NextIdentID - FirstIdentID);
for (llvm::DenseMap<const IdentifierInfo *, IdentID>::iterator
ID = IdentifierIDs.begin(), IDEnd = IdentifierIDs.end();
ID != IDEnd; ++ID) {
assert(ID->first && "NULL identifier in identifier table");
if (!Chain || !ID->first->isFromAST() ||
ID->first->hasChangedSinceDeserialization())
Generator.insert(const_cast<IdentifierInfo *>(ID->first), ID->second,
Trait);
for (auto IdentIDPair : IdentifierIDs) {
IdentifierInfo *II = const_cast<IdentifierInfo *>(IdentIDPair.first);
IdentID ID = IdentIDPair.second;
assert(II && "NULL identifier in identifier table");
if (!Chain || !II->isFromAST() || II->hasChangedSinceDeserialization())
Generator.insert(II, ID, Trait);
}
// Create the on-disk hash table in a buffer.

View File

@ -0,0 +1 @@
// This file intentionally left empty.

View File

@ -0,0 +1,15 @@
// RUN: rm -rf %t
//
// RUN: %clang_cc1 -fmodules -x c++ -fmodules-cache-path=%t \
// RUN: -fno-implicit-modules -fno-modules-implicit-maps \
// RUN: -emit-module -fmodule-name=empty -o %t/base.pcm \
// RUN: %s
//
// RUN: %clang_cc1 -fmodules -x c++ -fmodules-cache-path=%t \
// RUN: -fno-implicit-modules -fno-modules-implicit-maps \
// RUN: -emit-module -fmodule-name=empty -o %t/check.pcm \
// RUN: %s
//
// RUN: diff %t/base.pcm %t/check.pcm
module empty { header "Inputs/empty.h" export * }