612 Commits

Author SHA1 Message Date
Richard Smith 66a8186ed4 Rename MacroDefinition -> MacroDefinitionRecord, Preprocessor::MacroDefinition -> MacroDefinition.
clang::MacroDefinition now models the currently-defined value of a macro. The
previous MacroDefinition type, which represented a record of a macro definition
directive for a detailed preprocessing record, is now called MacroDefinitionRecord.

llvm-svn: 236400
2015-05-04 02:25:31 +00:00
Richard Smith 10434f307c [modules] Remove dead code from Module for tracking macro import locations.
llvm-svn: 236376
2015-05-02 02:08:26 +00:00
Richard Smith ee977933f7 [modules] Add -fmodules-local-submodule-visibility flag.
This flag specifies that the normal visibility rules should be used even for
local submodules (submodules of the currently-being-built module). Thus names
will only be visible if a header / module that declares them has actually been
included / imported, and not merely because a submodule that happened to be
built earlier declared those names. This also removes the need to modularize
bottom-up: textually-included headers will be included into every submodule
that includes them, since their include guards will not leak between modules.

So far, this only governs visibility of macros, not of declarations, so is not
ready for real use yet.

llvm-svn: 236350
2015-05-01 21:22:17 +00:00
Richard Smith a7e2cc684f [modules] Start moving the module visibility information off the Module itself.
It has no place there; it's not a property of the Module, and it makes
restoring the visibility set when we leave a submodule more difficult.

llvm-svn: 236300
2015-05-01 01:53:09 +00:00
Richard Smith 3981b17709 Remove dead code: a MacroDirective can't be imported or ambiguous any more.
llvm-svn: 236197
2015-04-30 02:16:23 +00:00
Richard Smith 20e883e59b [modules] Stop trying to fake up a linear MacroDirective history.
Modules builds fundamentally have a non-linear macro history. In the interest
of better source fidelity, represent the macro definition information
faithfully: we have a linear macro directive history within each module, and at
any point we have a unique "latest" local macro directive and a collection of
visible imported directives. This also removes the attendant complexity of
attempting to create a correct MacroDirective history (which we got wrong
in the general case).

No functionality change intended.

llvm-svn: 236176
2015-04-29 23:20:19 +00:00
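
The shape described in r236176 above (a linear directive history within each module, plus a "latest" local directive and a set of visible imported directives) can be pictured roughly as follows. This is a minimal sketch, not Clang's actual data structures: `Directive`, `MacroState`, `candidateCount`, and `localHistoryLength` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one #define/#undef within a single module.
struct Directive {
  std::string body;                    // replacement text; empty for #undef
  const Directive *previous = nullptr; // older directive in the same module
};

// Hypothetical per-macro state: one linear local history plus a set of
// visible imported directives, with no attempt to linearize across modules.
struct MacroState {
  const Directive *latestLocal = nullptr;
  std::vector<const Directive *> visibleImported;

  // Number of definitions that are candidates at this point.
  std::size_t candidateCount() const {
    return (latestLocal ? 1u : 0u) + visibleImported.size();
  }
};

// Length of the linear local history, walking back from the latest directive.
inline std::size_t localHistoryLength(const MacroState &state) {
  std::size_t n = 0;
  for (const Directive *d = state.latestLocal; d; d = d->previous)
    ++n;
  return n;
}
```

Only the local chain is ordered in this sketch; the imported directives are kept as an unordered collection, which is the point of not faking a single linear history.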
Richard Smith 713369b057 [modules] Store a ModuleMacro* on an imported macro directive rather than duplicating the info within it.
llvm-svn: 235644
2015-04-23 20:40:50 +00:00
Richard Smith b8b2ed6529 [modules] Determine the set of macros exported by a submodule at the end of that submodule.
Previously we'd defer this determination until writing the AST, which doesn't
allow us to use this information when building other submodules of the same
module. This change also allows us to use a uniform mechanism for writing
module macro records, independent of whether they are local or imported.

llvm-svn: 235614
2015-04-23 18:18:26 +00:00
Richard Smith d732939592 [modules] Move list of exported module macros from IdentifierInfo lookup table to separate storage, adjacent to the macro directive history.
This is substantially simpler, provides better space usage accounting in bcanalyzer,
and gives a more compact representation. No functionality change intended.

llvm-svn: 235420
2015-04-21 21:46:32 +00:00
Richard Smith 9eb309371b Try to work around failure to convert this lambda to a function pointer in some versions of GCC.
llvm-svn: 235264
2015-04-19 01:47:53 +00:00
Richard Smith cf97cf6693 [modules] Refactor macro emission. No functionality change.
llvm-svn: 235263
2015-04-19 01:34:23 +00:00
Richard Smith 13090a2304 [modules] Remove unused MACRO_TABLE record.
llvm-svn: 234555
2015-04-10 02:02:24 +00:00
Chandler Carruth 12c8f65408 [Modules] Make Sema's map of referenced selectors have a deterministic
order based on order of insertion.

This should cause both our warnings about these and the modules
serialization to be deterministic as a consequence.

Found by inspection.

llvm-svn: 233343
2015-03-27 00:55:05 +00:00
Chandler Carruth acbbeb9782 [Modules] Make our on-disk hash table of selector IDs be built in
a deterministic order.

This uses a MapVector to track the insertion order of selectors.

Found by inspection.

llvm-svn: 233342
2015-03-27 00:47:43 +00:00
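
The idea behind using a MapVector for insertion order, as in r233342 above, can be sketched without LLVM headers. The class below is a hypothetical, simplified analogue written against the standard library; `InsertionOrderedMap` and `emitOrder` are illustrative names and are not LLVM's API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical simplified analogue of an insertion-ordered map: a hash map
// gives fast lookup, a vector remembers the order keys first appeared in.
class InsertionOrderedMap {
public:
  // Returns the value slot for `key`, appending a new entry if it is unseen.
  unsigned &operator[](const std::string &key) {
    auto found = index_.find(key);
    if (found == index_.end()) {
      found = index_.emplace(key, entries_.size()).first;
      entries_.emplace_back(key, 0u);
    }
    return entries_[found->second].second;
  }

  // Iteration visits entries in insertion order, never in hash order.
  const std::vector<std::pair<std::string, unsigned>> &entries() const {
    return entries_;
  }

private:
  std::unordered_map<std::string, std::size_t> index_;
  std::vector<std::pair<std::string, unsigned>> entries_;
};

// Collects keys in iteration order, as a serializer would when emitting them.
inline std::vector<std::string> emitOrder(const InsertionOrderedMap &map) {
  std::vector<std::string> order;
  for (const auto &entry : map.entries())
    order.push_back(entry.first);
  return order;
}
```

Because the emitted order depends only on the sequence of insertions, two runs that insert the same keys in the same order produce the same output regardless of how the hash table lays out its buckets.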
Chandler Carruth b306d73a2d [Modules] Sort the file IDs prior to building the flattened array of
DeclIDs so that in addition to being grouped by file, the order of these
groups is stable.

Found by inspection, no test case. Not sure this can be observed without
a randomized seed for the hash table, but we shouldn't be relying on the
hash table layout under any circumstances.

llvm-svn: 233339
2015-03-27 00:31:20 +00:00
Chandler Carruth 8440c98642 [Modules] Make the AST serialization always use lexicographic order when
traversing the identifier table.

No easy test case as this table is somewhere between hard and impossible
to observe as non-deterministically ordered. The table is a hash table
but we hash the string contents and never remove entries from the table
so the growth pattern, etc, is all completely fixed. However, relying on
the hash function being deterministic is specifically against the
long-term direction of LLVM's hashing datastructures, which are intended
to provide *no* ordering guarantees. As such, this defends against these
things by sorting the identifiers. Sorting identifiers right before we
emit them to a serialized form seems a low cost for predictability here.

llvm-svn: 233332
2015-03-26 23:54:15 +00:00
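
Sorting identifiers just before emitting them, as r233332 above describes, might look like the following. This is a sketch using `std::unordered_map` as a stand-in for the identifier table; `sortedIdentifiers` is a hypothetical helper, not a Clang function.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Returns the identifiers in lexicographic order so that emission does not
// depend on how the hash table happens to lay out its buckets.
inline std::vector<std::string>
sortedIdentifiers(const std::unordered_map<std::string, unsigned> &table) {
  std::vector<std::string> names;
  names.reserve(table.size());
  for (const auto &entry : table)
    names.push_back(entry.first);
  std::sort(names.begin(), names.end());
  return names;
}
```

The extra cost is one sort over the names at serialization time, which matches the commit's argument that this is a low price for predictability.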
Chandler Carruth 7bcfdd516c [Modules] Delete stale, pointless code. All tests still pass with this
logic removed.

This logic was both inserting all builtins into the identifier table and
ensuring they would get serialized. The first happens unconditionally
now, and we always write out the entire identifier table. This code can
simply go away.

llvm-svn: 233331
2015-03-26 23:45:40 +00:00
Chandler Carruth ffbf705cc3 [Modules] Fix a sneaky bug in r233249 where we would look for implicit
constructors in the current lexical context even though name lookup
found them via some other context merged into the redecl chain.

This can only happen for implicit constructors which can only have the
name of the type of the current context, so we can fix this by simply
*always* merging those names first. This also has the advantage of
removing the walk of the current lexical context from the common case
when this is the only constructor name we need to deal with (implicit or
otherwise).

I've enhanced the tests to cover this case (and uncovered an unrelated
bug which I fixed in r233325).

llvm-svn: 233327
2015-03-26 22:27:09 +00:00
Chandler Carruth 52cee4dad2 [Modules] Preserve source order for the map of late parsed templates.
Clang was inserting these into a dense map. While it never iterated the
dense map during normal compilation, it did when emitting a module. Fix
this by using a standard MapVector to preserve the order in which we
encounter the late parsed templates.

I suspect this still isn't ideal, as we don't seem to remove things from
this map even when we mark the templates as no longer late parsed. But
I don't know enough about this particular extension to craft a nice,
subtle test case covering this. I've managed to get the stress test to
at least do some late parsing and demonstrate the core problem here.
This patch fixes the test and provides deterministic behavior which is
a strict improvement over the prior state.

I've cleaned up some of the code here as well to be explicit about
inserting when that is what is actually going on.

llvm-svn: 233264
2015-03-26 09:08:15 +00:00
Chandler Carruth f85d98285e [Modules] Make "#pragma weak" undeclared identifiers be tracked
deterministically.

This fixes a latent issue where even Clang's Sema (and diagnostics) were
non-deterministic in the face of this pragma. The fix is super simple --
just use a MapVector so we track the order in which these are parsed (or
imported). Especially considering how rare they are, this seems like the
perfect tradeoff. I've also simplified the client code with judicious
use of auto and range based for loops.

I've added some pretty hilarious code to my stress test which now
survives the binary diff without issue.

llvm-svn: 233261
2015-03-26 08:32:49 +00:00
Chandler Carruth 8a3d24dcf8 [Modules] Delete a bunch of complex code for ensuring visible decls in
updated decl contexts get emitted.

Since this code was added, we have newer vastly simpler code for
handling this. The code I'm removing was very expensive and also
generated unstable order of declarations which made module outputs
non-deterministic.

All of the tests continue to pass for me and I'm able to check the
difference between the .pcm files after merging modules together.

llvm-svn: 233251
2015-03-26 04:27:10 +00:00
Richard Smith 65ebb4ac8a [modules] If we reach a definition of a class for which we already have a
non-visible definition, skip the new definition and make the old one visible
instead of trying to parse it again and failing horribly. C++'s ODR allows
us to assume that the two definitions are identical.

llvm-svn: 233250
2015-03-26 04:09:53 +00:00
Chandler Carruth e972c36221 [Modules] A second attempt at writing out on-disk hash tables for the
decl context lookup tables.

The first attempt at this caused problems. We had significantly more
sources of non-determinism than I realized at first, and my change
essentially turned them from non-deterministic output into
use-after-free. Except that they weren't necessarily caught by tools
because the data wasn't really freed.

The new approach is much simpler. The first big simplification is to
inline the "visit" code and handle this directly. That works much
better, and I'll try to go and clean up the other caller of the visit
logic similarly.

The second key to the entire approach is that we need to *only* collect
names into a stable order at first. We then need to issue all of the
actual 'lookup()' calls in the stable order of the names so that we load
external results in a stable order. Once we have loaded all the results,
the table of results will stop being invalidated and we can walk all of
the names again and use the cheap 'noload_lookup()' method to quickly
get the results and serialize them.

To handle constructors and conversion functions (whose names can't be
stably ordered) in this approach, what we do is record only the visible
constructor and conversion function names at first. Then, if we have
any, we walk the decls of the class and add those names in the order
they occur in the AST. The rest falls out naturally.

This actually ends up simpler than the previous approach and seems much
more robust.

It uncovered a latent issue where we were building on-disk hash tables
for lookup results when the context was a linkage spec! This happened to
dodge all of the asserts by some miracle. Instead, add a proper predicate
to the DeclContext class and use that which tests both for function
contexts and linkage specs.

It also uncovered PR23030 where we are forming somewhat bizarre negative
lookup results. I've just worked around this with a FIXME in place
because fixing this particular Clang bug seems quite hard.

I've flipped the first part of the test case I added for stability back
on in this commit. I'm taking it gradually to try and make sure the
build bots are happy this time.

llvm-svn: 233249
2015-03-26 03:11:40 +00:00
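
The ordering discipline described in r233249 above (collect names first, issue the loading lookups in a stable order, then serialize with non-loading lookups) can be sketched as below. Everything here is hypothetical: `Context`, `lookup`, `noloadLookup`, and `serializeLookups` only mimic the roles of the real `lookup()` and `noload_lookup()` calls, sorting by name is an assumed choice of stable order, and the special handling of constructors and conversion functions is not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a declaration context with an external source.
struct Context {
  std::unordered_map<std::string, std::vector<int>> external; // not yet loaded
  std::map<std::string, std::vector<int>> loaded;             // local table
  std::vector<std::string> loadOrder;                         // for inspection

  // Loading lookup: pulls external results in, mutating the local table.
  const std::vector<int> &lookup(const std::string &name) {
    auto it = external.find(name);
    if (it != external.end()) {
      std::vector<int> &slot = loaded[name];
      slot.insert(slot.end(), it->second.begin(), it->second.end());
      loadOrder.push_back(name);
      external.erase(it);
    }
    return loaded[name];
  }

  // Cheap lookup: reads only what is already loaded.
  std::vector<int> noloadLookup(const std::string &name) const {
    auto it = loaded.find(name);
    return it == loaded.end() ? std::vector<int>() : it->second;
  }
};

// Phase 1 collects and orders the names, phase 2 loads in that order, and
// phase 3 serializes using only non-loading lookups.
inline std::vector<int> serializeLookups(Context &ctx) {
  std::vector<std::string> names;
  for (const auto &entry : ctx.external)
    names.push_back(entry.first);
  for (const auto &entry : ctx.loaded)
    names.push_back(entry.first);
  std::sort(names.begin(), names.end());
  names.erase(std::unique(names.begin(), names.end()), names.end());

  for (const auto &name : names)
    ctx.lookup(name); // may mutate the table, so nothing is read yet

  std::vector<int> out;
  for (const auto &name : names)
    for (int id : ctx.noloadLookup(name))
      out.push_back(id);
  return out;
}
```

Separating the loading pass from the reading pass means no result is read while the table can still change, which is the property the commit message relies on.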
Rafael Espindola c606b3e6d0 Revert "[Modules] When writing out the on-disk hash table for the decl context lookup tables, we need to establish a stable ordering for constructing the hash table. This is trickier than it might seem."
This reverts commit r233156. It broke the bots.

llvm-svn: 233172
2015-03-25 04:43:15 +00:00
Chandler Carruth 75c9f13aa8 [Modules] When writing out the on-disk hash table for the decl context
lookup tables, we need to establish a stable ordering for constructing
the hash table. This is trickier than it might seem.

Most of these cases are easily handled by sorting the lookup results
associated with a specific name that has an identifier. However for
constructors and conversion functions, the story is more complicated.
Here we need to merge all of the constructors or conversion functions
together and this merge needs to be stable. We don't have any stable
ordering for either constructors or conversion functions as both would
require a stable ordering across types.

Instead, when we have constructors or conversion functions in the
results, we reconstruct a stable order by walking the decl context in
lexical order and merging them in the order their particular declaration
names are encountered. This doesn't generalize as there might be found
declaration names which don't actually occur within the lexical context,
but for constructors and conversion functions it is safe. It does
require loading the entire decl context if necessary to establish the
ordering but there doesn't seem to be a meaningful way around that.

Many thanks to Richard for talking through all of the design choices
here. While I wrote the code, he guided all the actual decisions about
how to establish the order of things.

No test case yet because the test case I have doesn't pass yet -- there
are still more sources of non-determinism. However, this is complex
enough that I wanted it to go into its own commit in case it causes some
unforeseen issue or needs to be reverted.

llvm-svn: 233156
2015-03-25 00:34:51 +00:00
Chandler Carruth 885e78cb22 [Modules] Start making explicit modules produce deterministic output.
There are two aspects of non-determinism fixed here, which was the
minimum required to cause at least an empty module to be deterministic.

First, the random number signature is only inserted into the module when
we are building modules implicitly. The use case for these random
signatures is to work around the very fact that modules are not
deterministic in their output when working with the implicitly built and
populated module cache. Eventually this should go away entirely when
we're confident that Clang is producing deterministic output.

Second, the on-disk hash table is populated based on the order of
iteration over a DenseMap. Instead, use a MapVector so that we can walk
it in insertion order.

I've added a test that an empty module, when built twice, produces the
same binary PCM file.

llvm-svn: 233115
2015-03-24 21:18:10 +00:00
Daniel Jasper c77b0a1b60 Silence unused warning in non-assert builds.
llvm-svn: 233053
2015-03-24 08:06:38 +00:00
Richard Smith c2bb81860b [modules] Deserialize CXXCtorInitializer list for a constructor lazily.
Previously we'd deserialize the list of mem-initializers for a constructor when
we deserialized the declaration of the constructor. That could trigger a
significant amount of unnecessary work (pulling in all base classes
recursively, for a start) and was causing problems for the modules buildbot due
to cyclic deserializations. We now deserialize these on demand.

This creates a certain amount of duplication with the handling of
CXXBaseSpecifiers; I'll look into reducing that next.

llvm-svn: 233052
2015-03-24 06:36:48 +00:00
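
On-demand deserialization of the kind r233052 above describes can be illustrated with a small lazy container. This is a generic sketch, not the mechanism Clang uses: `LazyList`, `isLoaded`, and `get` are hypothetical names, and the loader callback stands in for whatever reads the initializer list from the module file.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical lazily-loaded list: stores a loader until first use.
template <typename T> class LazyList {
public:
  explicit LazyList(std::function<std::vector<T>()> loader)
      : loader_(std::move(loader)) {}

  bool isLoaded() const { return loaded_; }

  // Runs the loader at most once, on first access.
  const std::vector<T> &get() {
    if (!loaded_) {
      items_ = loader_();
      loaded_ = true;
      loader_ = nullptr; // release anything the loader captured
    }
    return items_;
  }

private:
  std::function<std::vector<T>()> loader_;
  std::vector<T> items_;
  bool loaded_ = false;
};
```

Constructing the object does no loading work at all, so creating the owning declaration cannot by itself trigger a chain of further deserialization.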
Richard Smith 9e2341d093 [modules] Remove redundant import of lexical decls when building a lookup table
for a DeclContext, and fix propagation of exception specifications along
redeclaration chains.

This reverts r232905 and r232907, which reverted r232793, r232853,
and r232870.

One additional change is present here to resolve issues with LLDB: distinguish
between whether lexical decls missing from the lookup table are local or are
provided by the external AST source, and still look in the external source if
that's where they came from.

llvm-svn: 232928
2015-03-23 03:25:59 +00:00
Vince Harron 08dcf60295 Reverting 232853 and 232870 because they depend on 232793,
which was reverted because it was causing LLDB test failures

llvm-svn: 232907
2015-03-22 08:47:07 +00:00
Richard Smith decef8007f [modules] When either redecl chain merging or an update record causes us to
give an exception specification to a declaration that didn't have an exception
specification in any of our imported modules, emit an update record ourselves.
Without this, code importing the current module would not see an exception
specification that we could see and might have relied on.

llvm-svn: 232870
2015-03-21 00:58:54 +00:00
Yaron Keren 92e1b62d45 Remove many superfluous SmallString::str() calls.
Now that SmallString is a first-class citizen, most SmallString::str()
calls are not required. This patch removes a whole bunch of them, yet
there are lots more.

There are two use cases where str() is really needed:
1) To use one of StringRef member functions which is not available in
SmallString.
2) To convert to std::string, as StringRef implicitly converts while
SmallString does not. We may wish to change this, but it may introduce
ambiguity.

llvm-svn: 232622
2015-03-18 10:17:07 +00:00
Richard Smith 7f330cdb31 Make module files passed to a module build via -fmodule-file= available to
consumers of that module.

Previously, such a file would only be available if the module happened to
actually import something from that module.

llvm-svn: 232583
2015-03-18 01:42:29 +00:00
Richard Smith 2095ffea41 Lambdaify some helper functions. No functionality change.
llvm-svn: 232407
2015-03-16 20:11:03 +00:00
Richard Smith df8a83127f Deduplicate #undef directives imported from multiple modules.
No functionality change, but deeply-importing module files are smaller and
faster now.

llvm-svn: 232140
2015-03-13 04:05:01 +00:00
Richard Smith 9ab4ccecb3 [modules] Fix iterator invalidation issue with names being added to a module
while we're writing out the identifier table.

llvm-svn: 231890
2015-03-11 00:00:51 +00:00
Richard Smith f81340096d [modules] Don't clobber a destructor's operator delete when adding another one;
move the operator delete updating into a separate update record so we can cope
with updating another module's destructor's operator delete.

llvm-svn: 231735
2015-03-10 01:41:22 +00:00
Richard Smith f19e12794d Replace Sema's map of locally-scoped extern "C" declarations with a DeclContext
of extern "C" declarations. This is simpler and vastly more efficient for
modules builds (we no longer need to load *all* extern "C" declarations to
determine if we have a redeclaration).

No functionality change intended.

llvm-svn: 231538
2015-03-07 00:04:49 +00:00
Aaron Ballman 5a2046a948 Removing code that is unused after r231424; NFC.
llvm-svn: 231477
2015-03-06 14:24:53 +00:00
Richard Smith fe620d26ea [modules] Rework merging of redeclaration chains on module import.
We used to save out and eagerly load a (potentially huge) table of merged
formerly-canonical declarations when we loaded each module. This was extremely
inefficient in the presence of large amounts of merging, and didn't actually
save any merging lookup work, because we still needed to perform name lookup to
check that our merged declaration lists were complete. This also resulted in a
loss of laziness -- even if we only needed an early declaration of an entity, we
would eagerly pull in all declarations that had been merged into it regardless.

We now store the relevant fragments of the table within the declarations
themselves. In detail:

 * The first declaration of each entity within a module stores a list of first
   declarations from imported modules that are merged into it.
 * Loading that declaration pre-loads those other entities, so that they appear
   earlier within the redeclaration chain.
 * The name lookup tables list the most recent local lookup result, if there
   is one, or all directly-imported lookup results if not.

llvm-svn: 231424
2015-03-05 23:24:12 +00:00
Richard Smith 91c18de755 Rework our handling of key functions. We used to track a complete list of all
dynamic classes in the translation unit and check whether each one's key
function is defined when we got to the end of the TU (and when we got to the
end of each module). This is really terrible for modules performance, since it
causes unnecessary deserialization of every dynamic class in every compilation.

We now use a much simpler (and, in a modules build, vastly more efficient)
system: when we see an out-of-line definition of a virtual function, we check
whether that function was in fact its class's key function. (If so, we need to
emit the vtable.)

llvm-svn: 230830
2015-02-28 01:01:56 +00:00
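
The check described in r230830 above can be sketched as follows, assuming the Itanium C++ ABI's definition of a key function: the first virtual member function that is neither pure nor inline. The `Method` record and both helpers are hypothetical and exist only to show the decision; they are not Clang's API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one member function, in declaration order.
struct Method {
  std::string name;
  bool isVirtual;
  bool isPure;
  bool isInline;
};

// Assumed rule (Itanium C++ ABI): the key function is the first virtual
// member function that is neither pure nor inline. Returns nullptr if the
// class has no such function.
inline const Method *keyFunction(const std::vector<Method> &methods) {
  for (const Method &m : methods)
    if (m.isVirtual && !m.isPure && !m.isInline)
      return &m;
  return nullptr;
}

// The check made when an out-of-line definition of `definedName` is seen:
// the vtable is needed only if that function is the class's key function.
inline bool definitionTriggersVTable(const std::vector<Method> &methods,
                                     const std::string &definedName) {
  const Method *key = keyFunction(methods);
  return key && key->name == definedName;
}
```

The decision is made per definition as it is encountered, so no global list of dynamic classes has to be kept or revisited at the end of the translation unit.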
Richard Smith 4a7e390c12 [modules] Don't write out name lookup table entries merely because the module
happened to query them; only write them out if something new was added.

llvm-svn: 230727
2015-02-27 03:40:09 +00:00
Richard Smith c4c34e2722 Remove slow and apparently pointless updating of all identifiers at the start
of writing out an AST file.

llvm-svn: 230428
2015-02-25 01:45:32 +00:00
Richard Smith cf4bdde33a Cleanup: remove artificial division between lookup results and const lookup
results. No-one was ever modifying a lookup result, and it would not be
reasonable to do so.

llvm-svn: 230123
2015-02-21 02:45:19 +00:00
Richard Smith 40c7806451 Revert r167816 and replace it with a proper fix for the issue: do not
invalidate lookup_iterators and lookup_results for some name within a
DeclContext if the lookup results for a *different* name change.

llvm-svn: 230121
2015-02-21 02:31:57 +00:00
Ben Langmuir d89dc561c7 Revert "Mangle the IsSystem bit into the .pcm file name"
While I investigate some possible problems with this patch.

This reverts commit r228966

llvm-svn: 229910
2015-02-19 20:23:22 +00:00
Argyrios Kyrtzidis bd0b651bd2 [PCH/Modules] Check that the specific module cache path the PCH was built with is the same as
the one in the current compiler invocation. If they differ, reject the PCH.

This protects against the badness occurring from getting modules loaded from different module caches (see crashes).

rdar://19889860

llvm-svn: 229909
2015-02-19 20:12:20 +00:00
Benjamin Kramer f989042f18 Prefer SmallVector::append/insert over push_back loops. Clang edition.
Same functionality, but hoists the vector growth out of the loop.

llvm-svn: 229508
2015-02-17 16:48:30 +00:00
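
The difference r229508 above refers to can be shown with `std::vector`, used here as a standard-library analogue of SmallVector. The two helper names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Element-by-element copy: the destination may reallocate several times.
inline void appendWithLoop(std::vector<int> &dst, const std::vector<int> &src) {
  for (int value : src)
    dst.push_back(value);
}

// Range insert: the destination can size itself once for the whole range.
inline void appendWithRange(std::vector<int> &dst,
                            const std::vector<int> &src) {
  dst.insert(dst.end(), src.begin(), src.end());
}
```

Both functions leave `dst` with the same contents; the range form simply gives the container the full length up front, so the growth decision is made once rather than inside the loop.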
Richard Smith ec216500f2 [modules] Improve llvm-bcanalyzer output on AST files a little. No functionality change.
llvm-svn: 229145
2015-02-13 19:48:37 +00:00
Ben Langmuir 18dd78a8fd Mangle the IsSystem bit into the .pcm file name
When mangling the module map path into a .pcm file name, also mangle the
IsSystem bit, which can also depend on the header search paths. For
example, the user may change from -I to -isystem.  This can affect
diagnostics in the importing TU.

llvm-svn: 228966
2015-02-12 21:51:31 +00:00