Introduced DIMacro and DIMacroFile debug info metadata in the LLVM IR to support macros.
Differential Revision: http://reviews.llvm.org/D14687
llvm-svn: 255245
There is no real reason the index has to have the concept of an
exporting Module. We should be able to have one single unique
instance of the Index, and it should be read-only after creation
for the whole ThinLTO processing.
The linker plugin should be able to process multiple modules (in
parallel or in sequence) with the same index.
The only reason the ExportingModule was present seems to be to
implement hasExportedFunctions() that is used by the Module linker
to decide what to do with the current Module.
For now I replaced it with a query to the map of Modules path to
see if this module was declared in the Index and consider that if
it is the case then it is probably exporting function.
On the long term the Linker interface needs to evolve and this
call should not be needed anymore.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254581
Summary:
Several fixes to the handling of bitcode files without function summary
sections so that they are skipped during ThinLTO processing in llvm-lto
and the gold plugin when appropriate instead of aborting.
1 Don't assert when trying to add a FunctionInfo that doesn't have
a summary attached.
2 Skip FunctionInfo structures that don't have attached function summary
sections when trying to create the combined function summary.
3 In both llvm-lto and gold-plugin, check whether a bitcode file has
a function summary section before trying to parse the index, and skip
the bitcode file if it does not.
4 Fix hasFunctionSummaryInMemBuffer in BitcodeReader, which had a bug
where we returned to early while looking for the summary section.
Also added llvm-lto and gold-plugin based tests for cases where we
don't have function summaries in the bitcode file. I verified that
either the first couple fixes described above are enough to avoid the
crashes, or fixes 1,3,4. But have combined them all here for added
robustness.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D14903
llvm-svn: 253796
This assert was meant to execute at the end of parseMetadata, but
we return early and never reach the end of the function. Caught
by a compile-time warning since the function doesn't return a value
from that location.
llvm-svn: 253762
Summary:
This is split out from the ThinLTO metadata mapping patch
http://reviews.llvm.org/D14752.
To avoid needing to parse the module level metadata during function
importing, a new module-level record is added which holds the
number of module-level metadata values. This is required because
metadata value ids are assigned implicitly during parsing, and the
function-level metadata ids start after the module-level metadata ids.
I made a change to this version of the code compared to D14752
in order to add more consistent and thorough assertion checking of the
new record value. We now unconditionally use the record value to
initialize the MDValueList size, and handle it the same in parseMetadata
for all module level metadata cases (lazy loading or not).
Reviewers: dexonsmith, joker.eph
Subscribers: davidxl, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D14825
llvm-svn: 253668
The LLVMContext was only used for Diagnostic. Pass a DiagnosticHandler
instead.
Differential Revision: http://reviews.llvm.org/D14794
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 253540
Summary:
There are currently two blocks with the METADATA_BLOCK id at module
scope. The first has the module-level metadata values (consisting of
some combination of METADATA_* record codes except for METADATA_KIND).
The second consists only of METADATA_KIND records. The latter is used
only in the METADATA_ATTACHMENT block within function blocks (for
metadata attached to instructions).
For ThinLTO we want to delay the parsing of module level metadata
until all functions have been imported from that module (there is some
bookkeeping used to suture it up when we read it during a post-pass).
However, we do need the METADATA_KIND records when parsing the function
body during importing, since those kinds are used as described above.
To simplify identification and parsing of just the block containing
the metadata kinds, use a different block id (METADATA_KIND_BLOCK_ID).
Support older bitcode without the new block id as well.
Reviewers: dexonsmith, joker.eph
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14654
llvm-svn: 253154
Summary: Mimic parseTriple(); and exposes it to LTOModule.cpp
Reviewers: dexonsmith, rafael
Subscribers: llvm-commits
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 252442
This commit adds enums in LLVMBitCodes.h to improve readability and
maintainability. This is a follow-up to r252368 which was discussed
here:
http://reviews.llvm.org/D12923
llvm-svn: 252395
This marker prevents optimization passes from adding 'tail' or
'musttail' markers to a call. Is is used to prevent tail call
optimization from being performed on the call.
rdar://problem/22667622
Differential Revision: http://reviews.llvm.org/D12923
llvm-svn: 252368
This attribute allows the compiler to assume that the function never recurses into itself, either directly or indirectly (transitively). This can be used among other things to demote global variables to locals.
llvm-svn: 252282
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.
For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.
This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.
Since this is an IR change, a bitcode upgrade has been provided.
Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.
Differential Revision: http://reviews.llvm.org/D14265
llvm-svn: 252219
No test, since it would depend on what the compiler can optimize/reuse.
My next commit made this bug visible on Linux Release compiles with some
versions of gcc.
llvm-svn: 251909
This reverts commit r251837, due to a number of bot failures of the form:
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to
'llvm::object::FunctionIndexObjectFile::create(llvm::MemoryBufferRef,
llvm::LLVMContext&, llvm::Module const*, bool)'
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()'
I'm not sure why these are happening - I added Object to the requred
libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS
in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these
symbols come out of libLLVMObject.a. What am I missing?
llvm-svn: 251841
Summary:
Support for necessary linkage changes and symbol renaming during
ThinLTO function importing.
Also includes llvm-link support for manually importing functions
and associated llvm-link based tests.
Note that this does not include support for intelligently importing
metadata, which is currently imported duplicate times. That support will
be in the follow-on patch, and currently is ignored by the tests.
Reviewers: dexonsmith, joker.eph, davidxl
Subscribers: tobiasvk, tejohnson, llvm-commits
Differential Revision: http://reviews.llvm.org/D13515
llvm-svn: 251837
Use 10 bits to represent calling convention ID's instead of 13, and
update the bitcode compatibility tests accordingly. We now error-out in
the bitcode reader when we see bad calling conv ID's.
Thanks to rnk and dexonsmith for feedback!
Differential Revision: http://reviews.llvm.org/D13826
llvm-svn: 251452
Processing bitcode from a different LLVM version can lead to
unexpected behavior. The LLVM project guarantees autoupdating
bitcode from a previous minor revision for the same major, but
can't make any promise when reading bitcode generated from a
either a non-released LLVM, a vendor toolchain, or a "future"
LLVM release. This patch aims at being more user-friendly and
allows a bitcode produce to emit an optional block at the
beginning of the bitcode that will contains an opaque string
intended to describe the bitcode producer information. The
bitcode reader will dump this information alongside any error it
reports.
The optional block also includes an "epoch" number, monotonically
increasing when incompatible changes are made to the bitcode. The
reader will reject bitcode whose epoch is different from the one
expected.
Differential Revision: http://reviews.llvm.org/D13666
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 251325
Now LLVMBitWriter compiles without implicit ilist iterator conversions.
In these cases, the cleanest thing was to switch to range-based for
loops. Since there wasn't much noise I converted sub-loops and parent
loops as a drive-by.
llvm-svn: 250144
Summary:
The change to use the VST function entries for lazy deserialization did
not handle the case of anonymous functions without aliases. In that case
we must fall back to scanning the function blocks as there is no VST
entry.
Reviewers: dexonsmith, joker.eph, davidxl
Subscribers: tstellarAMD, llvm-commits
Differential Revision: http://reviews.llvm.org/D13596
llvm-svn: 249947
Removed an unused abbrev op in the VST_CODE_COMBINED_FNENTRY abbrev.
I noticed while writing/testing an array string dumper for
llvm-bcanalyze that the combined function's VST entry abbrevs contained
an old field that I am not using. Everything was working fine since the
bitcode writer and reader were in sync on how the record fields were
actually being set up and interpreted.
llvm-svn: 249691
Summary:
The bitcode format is described in this document:
https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view
For more info on ThinLTO see:
https://sites.google.com/site/llvmthinlto
The first customer is ThinLTO, however the data structures are designed
and named more generally based on prior feedback. There are a few
comments regarding how certain interfaces are used by ThinLTO, and the
options added here to gold currently have ThinLTO-specific names as the
behavior they provoke is currently ThinLTO-specific.
This patch includes support for generating per-module function indexes,
the combined index file via the gold plugin, and several tests
(more are included with the associated clang patch D11908).
Reviewers: dexonsmith, davidxl, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13107
llvm-svn: 249270
Summary:
This also adds the first set of tests for operand bundles.
The optimizer has not been audited to ensure that it does the right
thing with operand bundles.
Depends on D12456.
Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner
Subscribers: maksfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D12457
llvm-svn: 248551
Since aliases actually use and verify their explicit type already, no
further invalid testing is required here. The
invalid.test:ALIAS-TYPE-MISMATCH case catches errors due to emitting a
non-pointee type in the new format or a non-pointer type in the old
format.
llvm-svn: 247952
This reverts commit r247898 (which reverted r247894).
Patch fixed to address two issues exposed by buildbots:
- unused variable warning in NDEBUG mode
- std::initializer_list lifetime issue causing test failures
Original Summary:
Support for including the function bitcode indices in the Value Symbol
Table. This requires writing the VST after the function blocks, which in
turn requires a new VST forward declaration record encoding the offset of
the full VST (which is backpatched to contain the offset after the VST
is written).
This patch also enables the lazy function reader to use the new function
indices out of the VST. This support will be used by ThinLTO as well, which
will be in a follow on patch. Backwards compatibility with older bitcode
files is maintained.
A new test is also included.
The bitcode format (used for the lazy reader as well as the upcoming
ThinLTO patches) came out of discussions with Duncan and others and is
described here:
https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view
Reviewers: dexonsmith, davidxl, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12536
llvm-svn: 247927
Temporarily revert to fix some buildbot issues. One is a minor issue
with a variable unused in NDEBUG mode. More concerning are some test
failures on win7 that I need to dig into.
This reverts commit 4e66a74543459832cfd571db42b4543580ae1d1d.
llvm-svn: 247898
Summary:
Support for including the function bitcode indices in the Value Symbol
Table. This requires writing the VST after the function blocks, which in
turn requires a new VST forward declaration record encoding the offset of
the full VST (which is backpatched to contain the offset after the VST
is written).
This patch also enables the lazy function reader to use the new function
indices out of the VST. This support will be used by ThinLTO as well, which
will be in a follow on patch. Backwards compatibility with older bitcode
files is maintained.
A new test is also included.
The bitcode format (used for the lazy reader as well as the upcoming
ThinLTO patches) came out of discussions with Duncan and others and is
described here:
https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view
Reviewers: dexonsmith, davidxl, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12536
llvm-svn: 247894
This was a flawed change - it just caused the getElementType call to be
deferred until later, when we really need to remove it. Now that the IR
for GlobalAliases has been updated, the root cause is addressed that way
instead and this change is no longer needed (and in fact gets in the way
- because we want to pass the pointee type directly down further).
Follow up patches to push this through GlobalValue, bitcode format, etc,
will come along soon.
This reverts commit 236160.
llvm-svn: 247585
Summary:
Add a `cleanupendpad` instruction, used to mark exceptional exits out of
cleanups (for languages/targets that can abort a cleanup with another
exception). The `cleanupendpad` instruction is similar to the `catchendpad`
instruction in that it is an EH pad which is the target of unwind edges in
the handler and which itself has an unwind edge to the next EH action.
The `cleanupendpad` instruction, similar to `cleanupret` has a `cleanuppad`
argument indicating which cleanup it exits. The unwind successors of a
`cleanuppad`'s `cleanupendpad`s must agree with each other and with its
`cleanupret`s.
Update WinEHPrepare (and docs/tests) to accomodate `cleanupendpad`.
Reviewers: rnk, andrew.w.kaylor, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12433
llvm-svn: 246751
Summary:
Constant vectors weren't allowed to have an i1 condition in the
BitcodeReader. Make sure we have the same restrictions that are
documented, not more.
Reviewers: nlewycky, rafael, kschimpf
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12440
llvm-svn: 246459
As a follow-up to r246098, require `DISubprogram` definitions
(`isDefinition: true`) to be 'distinct'. Specifically, add an assembler
check, a verifier check, and bitcode upgrading logic to combat testcase
bitrot after the `DIBuilder` change.
While working on the testcases, I realized that
test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore. Its
purpose was to check for a corner case in PR22792 where two subprogram
definitions match exactly and share the same metadata node. The new
verifier check, requiring that subprogram definitions are 'distinct',
precludes that possibility.
I updated almost all the IR with the following script:
git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' |
grep -v test/Bitcode |
xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/'
Likely some variant of would work for out-of-tree testcases.
llvm-svn: 246327
Summary:
WinEHPrepare is going to require that cleanuppad and catchpad produce values
of token type which are consumed by any cleanupret or catchret exiting the
pad. This change updates the signatures of those operators to require/enforce
that the type produced by the pads is token type and that the rets have an
appropriate argument.
The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and
similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that
restriction, this change adds a notion of an operator constraint to both
LLParser and BitcodeReader, allowing appropriate sentinels to be constructed
for forward references and appropriate error messages to be emitted for
illegal inputs.
Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad
predecessor must have no other predecessors; this ensures that WinEHPrepare
will see the expected linear relationship between sibling catches on the
same try.
Lastly, remove some superfluous/vestigial casts from instruction operand
setters operating on BasicBlocks.
Reviewers: rnk, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12108
llvm-svn: 245797
Some personality routines require funclet exit points to be clearly
marked, this is done by producing a token at the funclet pad and
consuming it at the corresponding ret instruction. CleanupReturnInst
already had a spot for this operand but CatchReturnInst did not.
Other personality routines don't need to use this which is why it has
been made optional.
llvm-svn: 245149
This introduces the basic functionality to support "token types".
The motivation stems from the need to perform operations on a Value
whose provenance cannot be obscured.
There are several applications for such a type but my immediate
motivation stems from WinEH. Our personality routine enforces a
single-entry - single-exit regime for cleanups. After several rounds of
optimizations, we may be left with a terminator whose "cleanup-entry
block" is not entirely clear because control flow has merged two
cleanups together. We have experimented with using labels as operands
inside of instructions which are not terminators to indicate where we
came from but found that LLVM does not expect such exotic uses of
BasicBlocks.
Instead, we can use this new type to clearly associate the "entry point"
and "exit point" of our cleanup. This is done by having the cleanuppad
yield a Token and consuming it at the cleanupret.
The token type makes it impossible to obscure or otherwise hide the
Value, making it trivial to track the relationship between the two
points.
What is the burden to the optimizer? Well, it turns out we have already
paid down this cost by accepting that there are certain calls that we
are not permitted to duplicate, optimizations have to watch out for
such instructions anyway. There are additional places in the optimizer
that we will probably have to update but early examination has given me
the impression that this will not be heroic.
Differential Revision: http://reviews.llvm.org/D11861
llvm-svn: 245029
Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s.
The backend is liable to start relying on that (if it hasn't already),
so make uniquable `DICompileUnit`s illegal and automatically upgrade old
bitcode. This is a nice cleanup, since we can remove an unnecessary
`DenseSet` (and the associated uniquing info) from `LLVMContextImpl`.
Almost all the testcases were updated with this script:
git grep -e '= !DICompileUnit' -l -- test |
grep -v test/Bitcode |
xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,'
I imagine something similar should work for out-of-tree testcases.
llvm-svn: 243885
Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags,
using `DW_TAG_variable` in their place Stop exposing the `tag:` field at
all in the assembly format for `DILocalVariable`.
Most of the testcase updates were generated by the following sed script:
find test/ -name "*.ll" -o -name "*.mir" |
xargs grep -l 'DILocalVariable' |
xargs sed -i '' \
-e 's/tag: DW_TAG_arg_variable, //' \
-e 's/tag: DW_TAG_auto_variable, //'
There were only a handful of tests in `test/Assembly` that I needed to
update by hand.
(Note: a follow-up could change `DILocalVariable::DILocalVariable()` to
set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable`
(as appropriate), instead of having that logic magically in the backend
in `DbgVariable`. I've added a FIXME to that effect.)
llvm-svn: 243774
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support. Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.
Differential Revision: http://reviews.llvm.org/D11097
llvm-svn: 243766
Swift has a custom calling convention that also requires some new flags
on arguments and one new attribute on alloca instructions. This patch
does not include the implementation of that calling convention - that
will be provided as part of the open-source release of Swift; this only
reserves the bitcode constant values so that they are not used for
other purposes.
llvm-svn: 243379
This change adds new attribute called "argmemonly". Function marked with this attribute can only access memory through it's argument pointers. This attribute directly corresponds to the "OnlyAccessesArgumentPointees" ModRef behaviour in alias analysis.
Differential Revision: http://reviews.llvm.org/D10398
llvm-svn: 241979
FCmp behaves a lot like a floating-point binary operator in many ways,
and can benefit from fast-math information. Flags such as nsz and nnan
can affect if this fcmp (in combination with a select) can be treated
as a fminnum/fmaxnum operation.
This adds backwards-compatible bitcode support, IR parsing and writing,
LangRef changes and IRBuilder changes. I'll need to audit InstSimplify
and InstCombine in a followup to find places where flags should be
copied.
llvm-svn: 241901
Summary:
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support. Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.
Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11041
llvm-svn: 241888
Summary:
Looking at r241279, I noticed that UpgradedIntrinsics only gets written
to in the following code:
if (UpgradeIntrinsicFunction(&F, NewFn))
UpgradedIntrinsics[&F] = NewFn;
Looking through UpgradeIntrinsicFunction, we always return false OR
NewFn will be set to a different function from our source.
This patch pulls the F != NewFn into UpgradeIntrinsicFunction as an
assert, and removes the check from callers of UpgradeIntrinsicFunction.
Reviewers: rafael, chandlerc
Subscribers: llvm-commits-list
Differential Revision: http://reviews.llvm.org/D10915
llvm-svn: 241369
When trying to upgrade @llvm.x86.sse2.psrl.dq while parsing a module,
BitcodeReader adds the function to its worklist twice, resulting in a
crash when accessing it the second time.
This patch replaces the worklist vector by a map.
Patch by Philip Pfaffe.
llvm-svn: 241281
It is meant to be used to record modules @imported by the current
compile unit, so a debugger an import the same modules to replicate this
environment before dropping into the expression evaluator.
DIModule is a sibling to DINamespace and behaves quite similarly.
In addition to the name of the module it also records the module
configuration details that are necessary to uniquely identify the module.
This includes the configuration macros (e.g., -DNDEBUG), the include path
where the module.map file is to be found, and the isysroot.
The idea is that the backend will turn this into a DW_TAG_module.
http://reviews.llvm.org/D9614
rdar://problem/20965932
llvm-svn: 241017
Having different code paths for streamed and regular bitcode reading was a
source of bugs in the past and this defines them away.
It has a small but noticeable impact on performance. I timed running
"opt -disable-output -disable-verify" on a ltoed clang. It goes from
14.752845231 seconds time elapsed ( +- 0.16% )
to
15.012463721 seconds time elapsed ( +- 0.11% )
Extracted from a patch by Karl Schimpf.
llvm-svn: 240305
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
The personality routine currently lives in the LandingPadInst.
This isn't desirable because:
- All LandingPadInsts in the same function must have the same
personality routine. This means that each LandingPadInst beyond the
first has an operand which produces no additional information.
- There is ongoing work to introduce EH IR constructs other than
LandingPadInst. Moving the personality routine off of any one
particular Instruction and onto the parent function seems a lot better
than have N different places a personality function can sneak onto an
exceptional function.
Differential Revision: http://reviews.llvm.org/D10429
llvm-svn: 239940
Before this patch the bitcode reader would read a module from a file
that contained in order:
* Any number of non MODULE_BLOCK sub blocks.
* One MODULE_BLOCK
* Any number of non MODULE_BLOCK sub blocks.
* 4 '\n' characters to handle OS X's ranlib.
Since we support lazy reading of modules, any information that is relevant
for the module has to be in the MODULE_BLOCK or before it. We don't gain
anything from checking what is after.
This patch then changes the reader to stop once the MODULE_BLOCK has been
successfully parsed.
This avoids the ugly special case for .bc files in an archive and makes it
easier to embed bitcode files.
llvm-svn: 239845
`LLVM_ENABLE_MODULES` builds sometimes fail because `Intrinsics.td`
needs to regenerate `Instrinsics.h` before anyone can include anything
from the LLVM_IR module. Represent the dependency explicitly to prevent
that.
llvm-svn: 239796
This patch adds the safe stack instrumentation pass to LLVM, which separates
the program stack into a safe stack, which stores return addresses, register
spills, and local variables that are statically verified to be accessed
in a safe way, and the unsafe stack, which stores everything else. Such
separation makes it much harder for an attacker to corrupt objects on the
safe stack, including function pointers stored in spilled registers and
return addresses. You can find more information about the safe stack, as
well as other parts of or control-flow hijack protection technique in our
OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf)
and our project website (http://levee.epfl.ch).
The overhead of our implementation of the safe stack is very close to zero
(0.01% on the Phoronix benchmarks). This is lower than the overhead of
stack cookies, which are supported by LLVM and are commonly used today,
yet the security guarantees of the safe stack are strictly stronger than
stack cookies. In some cases, the safe stack improves performance due to
better cache locality.
Our current implementation of the safe stack is stable and robust, we
used it to recompile multiple projects on Linux including Chromium, and
we also recompiled the entire FreeBSD user-space system and more than 100
packages. We ran unit tests on the FreeBSD system and many of the packages
and observed no errors caused by the safe stack. The safe stack is also fully
binary compatible with non-instrumented code and can be applied to parts of
a program selectively.
This patch is our implementation of the safe stack on top of LLVM. The
patches make the following changes:
- Add the safestack function attribute, similar to the ssp, sspstrong and
sspreq attributes.
- Add the SafeStack instrumentation pass that applies the safe stack to all
functions that have the safestack attribute. This pass moves all unsafe local
variables to the unsafe stack with a separate stack pointer, whereas all
safe variables remain on the regular stack that is managed by LLVM as usual.
- Invoke the pass as the last stage before code generation (at the same time
the existing cookie-based stack protector pass is invoked).
- Add unit tests for the safe stack.
Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.
Differential Revision: http://reviews.llvm.org/D6094
llvm-svn: 239761
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.
Call sites were found with the ASTMatcher + some semi-automated cleanup.
memberCallExpr(
argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
hasArgument(0, bindTemporaryExpr(
hasType(recordDecl(hasNonTrivialDestructor())),
has(constructExpr()))),
unless(isInTemplateInstantiation()))
No functional change intended.
llvm-svn: 238602
so DWARF skeleton CUs can be expression in IR. A skeleton CU is a
(typically empty) DW_TAG_compile_unit that has a DW_AT_(GNU)_dwo_name and
a DW_AT_(GNU)_dwo_id attribute. It is used to refer to external debug info.
This is a prerequisite for clang module debugging as discussed in
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html.
In order to refer to external types stored in split DWARF (dwo) objects,
such as clang modules, we need to emit skeleton CUs, which identify the
dwarf object (i.e., the clang module) by filename (the SplitDebugFilename)
and a hash value, the dwo_id.
This patch only contains the IR changes. The idea is that a CUs with a
non-zero dwo_id field will be emitted together with a DW_AT_GNU_dwo_name
and DW_AT_GNU_dwo_id attribute.
http://reviews.llvm.org/D9488
rdar://problem/20091852
llvm-svn: 237949
Summary:
Added isLoadableOrStorableType to PointerType.
We were doing some checks in some places, occasionally assert()ing instead
of telling the caller. With this patch, I'm putting all type checking in
the same place for load/store type instructions, and verifying the same
thing every time.
I also added a check for load/store of a function type.
Applied extracted check to Load, Store, and Cmpxcg.
I don't have exhaustive tests for all of these, but all Error() calls in
TypeCheckLoadStoreInst are being tested (in invalid.test).
Reviewers: dblaikie, rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9785
llvm-svn: 237619
Somehow I dropped this in r233585, and we haven't had `DEBUG_LOC_AGAIN`
records since. Add it back. Also tests that the output assembly looks
okay.
Fixes PR23436.
llvm-svn: 236661
Many of the callers already have the pointer type anyway, and for the
couple of callers that don't it's pretty easy to call PointerType::get
on the pointee type and address space.
This avoids LLParser from using PointerType::getElementType when parsing
GlobalAliases from IR.
llvm-svn: 236160
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`. The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.
Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one. It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs. YMMV of
course.
Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py. I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three. It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).
Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.
llvm-svn: 236120
Summary:
We don't seem to need to assert here, since this function's callers expect
to get a nullptr on error. This way we don't assert on user input.
Bug found with AFL fuzz.
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9308
llvm-svn: 236027
As a space optimization, this instruction would just encode the pointer
type of the first operand and use the knowledge that the second and
third operands would be of the pointee type of the first. When typed
pointers go away, this assumption will no longer be available - so
encode the type of the second operand explicitly and rely on that for
the third.
Test case added to demonstrate the backwards compatibility concern,
which only comes up when the definition of the second operand comes
after the use (hence the weird basic block sequence) - at which point
the type needs to be explicitly encoded in the bitcode and the record
length changes to accommodate this.
llvm-svn: 235966
Use a few extra bits in the const field (after widening it from a fixed
single bit) to stash the address space which is no longer provided by
the type (and an extra bit in there to specify that we're using that new
encoding).
llvm-svn: 235911
Add serialization support for function metadata attachments (added in
r235783). The syntax is:
define @foo() !attach !0 {
Metadata attachments are only allowed on functions with bodies. Since
they come before the `{`, they're not really part of the body; since
they require a body, they're not really part of the header. In
`LLParser` I gave them a separate function called from `ParseDefine()`,
`ParseOptionalFunctionMetadata()`.
In bitcode, I'm using the same `METADATA_ATTACHMENT` record used by
instructions. Instruction metadata attachments are included in a
special "attachment" block at the end of a `Function`. The attachment
records are laid out like this:
InstID (KindID MetadataID)+
Note that these records always have an odd number of fields. The new
code takes advantage of this to recognize function attachments (which
don't need an instruction ID):
(KindID MetadataID)+
This means we can use the same attachment block already used for
instructions.
This is part of PR23340.
llvm-svn: 235785
(reverted in r235533)
Original commit message:
"Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)"
The remapping done in ValueMapper for LTO was insufficient as the types
weren't correctly mapped (though I was using the post-mapped operands,
some of those operands might not have been mapped yet so the type
wouldn't be post-mapped yet). Instead use the pre-mapped type and
explicitly map all the types.
llvm-svn: 235651
Summary:
Make sure the abbrev operands are valid and that we can read/skip them
afterwards.
Bug found with AFL fuzz.
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9030
llvm-svn: 235595
This reverts commit r235458.
It looks like this might be breaking something LTO-ish. Looking into it
& will recommit with a fix/test case/etc once I've got more to go on.
llvm-svn: 235533
Without pointee types the space optimization of storing only the pointer
type and not the value type won't be viable - so add the extra type
information that would be missing.
llvm-svn: 235475
Without pointee types the space optimization of storing only the pointer
type and not the value type won't be viable - so add the extra type
information that would be missing.
Storeatomic coming soon.
llvm-svn: 235474
Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)
llvm-svn: 235458
Now (with a few carefully placed suppressions relating to general type
serialization, etc) we can round trip a simple load through bitcode and
textual IR without calling getElementType on a PointerType.
llvm-svn: 235221
Use an extra bit in the CCInfo to flag the newer version of the
instructiont hat includes the type explicitly.
Tested the newer error cases I added, but didn't add tests for the finer
granularity improvements to existing error paths.
llvm-svn: 235160
Summary:
If a pointer is marked as dereferenceable_or_null(N), LLVM assumes it
is either `null` or `dereferenceable(N)` or both. This change only
introduces the attribute and adds a token test case for the `llvm-as`
/ `llvm-dis`. It does not hook up other parts of the optimizer to
actually exploit the attribute -- those changes will come later.
For pointers in address space 0, `dereferenceable(N)` is now exactly
equivalent to `dereferenceable_or_null(N)` && `nonnull`. For other
address spaces, `dereferenceable(N)` is potentially weaker than
`dereferenceable_or_null(N)` && `nonnull` (since we could have a null
`dereferenceable(N)` pointer).
The motivating case for this change is Java (and other managed
languages), where pointers are either `null` or dereferenceable up to
some usually known-at-compile-time constant offset.
Reviewers: rafael, hfinkel
Reviewed By: hfinkel
Subscribers: nicholas, llvm-commits
Differential Revision: http://reviews.llvm.org/D8650
llvm-svn: 235132
Remove 'inlinedAt:' from MDLocalVariable. Besides saving some memory
(variables with it seem to be single largest `Metadata` contributer to
memory usage right now in -g -flto builds), this stops optimization and
backend passes from having to change local variables.
The 'inlinedAt:' field was used by the backend in two ways:
1. To tell the backend whether and into what a variable was inlined.
2. To create a unique id for each inlined variable.
Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg`
attachment, and change the DWARF backend to use a typedef called
`InlinedVariable` which is `std::pair<MDLocalVariable*, MDLocation*>`.
This `DebugLoc` is already passed reliably through the backend (as
verified by r234021).
This commit removes the check from r234021, but I added a new check
(that will survive) in r235048, and changed the `DIBuilder` API in
r235041 to require a `!dbg` attachment whose 'scope:` is in the same
`MDSubprogram` as the variable's.
If this breaks your out-of-tree testcases, perhaps the script I used
(mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778
in a moment.
llvm-svn: 235050
Remove all the global bits to do with preserving use-list order by
moving the `cl::opt`s to the individual tools that want them. There's a
minor functionality change to `libLTO`, in that you can't send in
`-preserve-bc-uselistorder=false`, but making that bit settable (if it's
worth doing) should be through explicit LTO API.
As a drive-by fix, I removed some includes of `UseListOrder.h` that were
made unnecessary by recent commits.
llvm-svn: 234973
Change the callers of `WriteToBitcodeFile()` to pass `true` or
`shouldPreserveBitcodeUseListOrder()` explicitly. I left the callers
that want to send `false` alone.
I'll keep pushing the bit higher until hopefully I can delete the global
`cl::opt` entirely.
llvm-svn: 234957
Canonicalize access to whether to preserve use-list order in bitcode on
a `bool` stored in `ValueEnumerator`. Next step, expose this as a
`bool` through `WriteBitcodeToFile()`.
llvm-svn: 234956
Summary:
Without this check the following case failed:
Skip a SubBlock which is not a MODULE_BLOCK_ID nor a BLOCKINFO_BLOCK_ID
Got to end of file
TheModule would still be == nullptr, and we would subsequentially fail
when materializing the Module (assert at the start of
BitcodeReader::MaterializeModule).
Bug found with AFL.
Reviewers: dexonsmith, rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9014
llvm-svn: 234887
Change `MDSubprogram::getFunction()` and
`MDGlobalVariable::getConstant()` to return a `Constant`. Previously,
both returned `ConstantAsMetadata`.
llvm-svn: 234699
The patch is generated using clang-tidy misc-use-override check.
This command was used:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
-checks='-*,misc-use-override' -header-filter='llvm|clang' \
-j=32 -fix -format
http://reviews.llvm.org/D8925
llvm-svn: 234679
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the cahnce to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.
llvm-svn: 233938
Keep a note in the materializer that we are stripping debug info so that
user doing a lazy read of the module don't hit outdated formats.
Thanks to Duncan for suggesting the fix.
llvm-svn: 233603
Update lib/IR and lib/Bitcode to use the new `DebugLoc` API. Added an
explicit conversion to `bool` (avoiding a conversion to `MDLocation`),
since a couple of these use cases need to handle broken code.
llvm-svn: 233585
Check accessors of `MDLocation`, and change them to `cast<>` down to the
right types. Also add type-safe factory functions.
All the callers that handle broken code need to use the new versions of
the accessors (`getRawScope()` instead of `getScope()`) that still
return `Metadata*`. This is also necessary for things like
`MDNodeKeyImpl<MDLocation>` (in LLVMContextImpl.h) that need to unique
the nodes when their operands might still be forward references of the
wrong type.
In the `Value` hierarchy, consumers that handle broken code use
`getOperand()` directly. However, debug info nodes have a ton of
operands, and their order (even their existence) isn't stable yet. It's
safer and more maintainable to add an explicit "raw" accessor on the
class itself.
llvm-svn: 233322
Assert that `MDNode::isResolved()`. While in theory the `Verifier`
should catch this, it doesn't descend into all debug info, and the
`DebugInfoVerifier` doesn't call into the `Verifier`. Besides, this
helps to catch bugs when `-disable-verify=true`.
Note that I haven't come across a place where this fails with clang
today, so no testcase.
llvm-svn: 232442
(turns out I had regressed this when sinking handling of this type down
into GetElementPtrInst::Create - since that asserted before the error
handling was performed)
llvm-svn: 232420
This happened to be fairly easy to support backwards compatibility based
on the number of operands (old format had an even number, new format has
one more operand so an odd number).
test/Bitcode/old-aliases.ll already appears to test old gep operators
(if I remove the backwards compatibility in the BitcodeReader, this and
another test fail) so I'm not adding extra test coverage here.
llvm-svn: 232216
I don't think we test invalid bitcode records in any detail, so no test
here - just a change for consistency with existing error checks in
surrounding code.
llvm-svn: 232215
We only defer loading metadata inside ParseModule when ShouldLazyLoadMetadata
is true and we have not loaded any Metadata block yet.
This commit implements all-or-nothing loading of Metadata. If there is a
request to load any metadata block, we will load all deferred metadata blocks.
We make sure the deferred metadata blocks are loaded before we materialize any
function or a module.
The default value of the added parameter ShouldLazyLoadMetadata for
getLazyBitcodeModule is false, so the default behavior stays the same.
We only set the parameter to true when creating LTOModule in local contexts.
These can only really be used for parsing symbols, so it's unnecessary to ever
load the metadata blocks.
If we are going to enable lazy-loading of Metadata for other usages of
getLazyBitcodeModule, where deferred metadata blocks need to be loaded, we can
expose BitcodeReader::materializeMetadata to Module, similar to
Module::materialize.
rdar://19804575
llvm-svn: 232198
Like r230414, add bitcode support including backwards compatibility, for
an explicit type parameter to GEP.
At the suggestion of Duncan I tried coalescing the two older bitcodes into a
single new bitcode, though I did hit a wrinkle: I couldn't figure out how to
create an explicit abbreviation for a record with a variable number of
arguments (the indicies to the gep). This means the discriminator between
inbounds and non-inbounds gep is a full variable-length field I believe? Is my
understanding correct? Is there a way to create such an abbreviation? Should I
just use two bitcodes as before?
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D7736
llvm-svn: 230415
Summary:
I've taken my best guess at this, but I've cargo culted in places & so
explanations/corrections would be great.
This seems to pass all the tests (check-all, covering clang and llvm) so I
believe that pretty well exercises both the backwards compatibility and common
(same version) compatibility given the number of checked in bitcode files we
already have. Is that a reasonable approach to testing here? Would some more
explicit tests be desired?
1) is this the right way to do back-compat in this case (looking at the number
of entries in the bitcode record to disambiguate between the old schema and
the new?)
2) I don't quite understand the logarithm logic to choose the encoding type of
the type parameter in the abbreviation description, but I found another
instruction doing the same thing & it seems to work. Is that the right
approach?
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D7655
llvm-svn: 230414
While fuzzing LLVM bitcode files, I discovered that (1) the bitcode reader doesn't check that alignments are no larger than 2**29; (2) downstream code doesn't check the range; and (3) for values out of range, corresponding large memory requests (based on alignment size) will fail. This code fixes the bitcode reader to check for valid alignments, fixing this problem.
This CL fixes alignment value on global variables, functions, and instructions: alloca, load, load atomic, store, store atomic.
Patch by Karl Schimpf (kschimpf@google.com).
llvm-svn: 230180
When writing the bitcode serialization for the new debug info hierarchy,
I assumed two fields would never be null.
Drop that assumption, since it's brittle (and crashes the
`BitcodeWriter` if wrong), and is a check better left for the verifier
anyway. (No need for a bitcode upgrade here, since the new hierarchy is
still not in place.)
The fields in question are `MDCompileUnit::getFile()` and
`MDDerivedType::getBaseType()`, the latter of which isn't null in
test/Transforms/Mem2Reg/ConvertDebugInfo2.ll (see !14, a pointer to
nothing). While the testcase might have bitrotted, there's no reason
for the bitcode format to rely on non-null for metadata operands.
This also fixes a bug in `AsmWriter` where if the `file:` is null it
isn't emitted (caught by the double-round trip in the testcase I'm
adding) -- this is a required field in `LLParser`.
I'll circle back to ConvertDebugInfo2. Once the specialized nodes are
in place, I'll be trying to turn the debug info verifier back on by
default (in the newer module pass form committed r206300) and throwing
more logic in there. If the testcase has bitrotted (as opposed to me
not understanding the schema correctly) I'll fix it then.
llvm-svn: 229960
Follow-up to r229740, which removed `DITemplate*::getContext()` after my
upgrade script revealed that scopes are always `nullptr` for template
parameters. This is the other shoe: drop `scope:` from
`MDTemplateParameter` and its two subclasses. (Note: a bitcode upgrade
would be pointless, since the hierarchy hasn't been moved into place.)
llvm-svn: 229791
The metadata/value split introduced a major regression reading large
bitcode files that contain debug info (or other cyclic (non-self
reference) metadata graphs). For the first time in a while, I dropped
from libLTO.dylib down to `llvm-lto` with a non-trivial bitcode file
(~350MB), and I hit this when reading the result of ld64's `-save-temps`
in `llvm-lto`.
Here's pseudo-code for what was going on:
read-main-metadata-block:
for each md:
if has-fwd-ref: // Only true for cyclic graphs.
any-fwd-refs <- true
if any-fwd-refs:
foreach md:
resolve-cycles(md) // Handle cycles.
foreach function:
read-function-metadata-block: // Such as !alias, !loop
if any-fwd-refs:
foreach md: // (all metadata, not just this block)
resolve-cycles(md) // A no-op, but the loop is expensive!!
This commit resets the `AnyFwdRefs` flag to `false`. This on its own
was enough to change my Release+Asserts `llvm-lto` time for reading this
bitcode from over 20 minutes (I gave up on it) to 20 seconds. I've gone
further by tracking the min/max metadata forward-references in a
metadata block. This protects against a schema that has lots of
functions that each reference their own metadata cycle.
Unfortunately, this regression is in the 3.6 branch as well.
llvm-svn: 229421
Summary:
When creating {insert,extract}value instructions from a BitcodeReader, we
weren't verifying the fields were valid.
Bugs found with afl-fuzz
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7325
llvm-svn: 229345
This allows IDEs to recognize the entire set of header files for
each of the core LLVM projects.
Differential Revision: http://reviews.llvm.org/D7526
Reviewed By: Chris Bieneman
llvm-svn: 228798
Add specialized debug info metadata nodes that match the `DIDescriptor`
wrappers (used by `DIBuilder`) closely. Assembly and bitcode support to
follow soon (it'll mostly just be obvious), but this sketches in today's
schema. This is the first big commit (well, the only *big* one aside
from the testcase changes that'll come when I move this into place) for
PR22464.
I've marked a bunch of obvious changes as `TODO`s in the source; I plan
to make those changes promptly after this hierarchy is moved underneath
`DIDescriptor`, but for now I'm aiming mostly to match the status quo.
llvm-svn: 228640
Move debug-info-centred `Metadata` subclasses into their own
header/source file. A couple of private template functions are needed
from both `Metadata.cpp` and `DebugInfoMetadata.cpp`, so I've moved them
to `lib/IR/MetadataImpl.h`.
llvm-svn: 227835
Eventually we can make some of these pass the error along to the caller.
Reports a fatal error if:
We find an invalid abbrev record
We try to get an invalid abbrev number
We can't fill the current word due to an EOF
Fixed an invalid bitcode test to check for output with FileCheck
Bugs found with afl-fuzz
llvm-svn: 226986
These things are potentially used for non-DWARF data (see the discussion
in PR22235), so take the `Dwarf` out of the name. Since the new name
gives fewer clues, update the doxygen to properly describe what they
are.
llvm-svn: 226874
As pointed out in r226501, the distinction between `MDNode` and
`UniquableMDNode` is confusing. When we need subclasses of `MDNode`
that don't use all its functionality it might make sense to break it
apart again, but until then this makes the code clearer.
llvm-svn: 226520
Change `MDTuple::getTemporary()` and `MDLocation::getTemporary()` to
return (effectively) `std::unique_ptr<T, MDNode::deleteTemporary>`, and
clean up call sites. (For now, `DIBuilder` call sites just call
`release()` immediately.)
There's an accompanying change in each of clang and polly to use the new
API.
llvm-svn: 226504
Remove `MDNodeFwdDecl` (as promised in r226481). Aside from API
changes, there's no real functionality change here.
`MDNode::getTemporary()` now forwards to `MDTuple::getTemporary()`,
which returns a tuple with `isTemporary()` equal to true.
The main point is that we can now add temporaries of other `MDNode`
subclasses, needed for PR22235 (I introduced `MDNodeFwdDecl` in the
first place because I didn't recognize this need, and thought they were
only needed to handle forward references).
A few things left out of (or highlighted by) this commit:
- I've had to remove the (few) uses of `std::unique_ptr<>` to deal
with temporaries, since the destructor is no longer public.
`getTemporary()` should probably return the equivalent of
`std::unique_ptr<T, MDNode::deleteTemporary>`.
- `MDLocation::getTemporary()` doesn't exist yet (worse, it actually
does exist, but does the wrong thing: `MDNode::getTemporary()` is
inherited and returns an `MDTuple`).
- `MDNode` now only has one subclass, `UniquableMDNode`, and the
distinction between them is actually somewhat confusing.
I'll fix those up next.
llvm-svn: 226501
No change in this commit, but clang was changed to also produce trivial comdats when
needed.
Original message:
Don't create new comdats in CodeGen.
This patch stops the implicit creation of comdats during codegen.
Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.
llvm-svn: 226467
This reverts commit r226173, adding r226038 back.
No change in this commit, but clang was changed to also produce trivial comdats for
costructors, destructors and vtables when needed.
Original message:
Don't create new comdats in CodeGen.
This patch stops the implicit creation of comdats during codegen.
Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.
llvm-svn: 226242
This patch stops the implicit creation of comdats during codegen.
Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.
llvm-svn: 226038
utils/sort_includes.py.
I clearly haven't done this in a while, so more changed than usual. This
even uncovered a missing include from the InstrProf library that I've
added. No functionality changed here, just mechanical cleanup of the
include order.
llvm-svn: 225974
This adds assembly and bitcode support for `MDLocation`. The assembly
side is rather big, since this is the first `MDNode` subclass (that
isn't `MDTuple`). Part of PR21433.
(If you're wondering where the mountains of testcase updates are, we
don't need them until I update `DILocation` and `DebugLoc` to actually
use this class.)
llvm-svn: 225830
Refactor logic so that we know up-front whether to open a block and
whether we need an MDString abbreviation.
This is almost NFC, but will start emitting `MDString` abbreviations
when the first record is not an `MDString`.
llvm-svn: 225712
Split `GenericMDNode` into two classes (with more descriptive names).
- `UniquableMDNode` will be a common subclass for `MDNode`s that are
sometimes uniqued like constants, and sometimes 'distinct'.
This class gets the (short-lived) RAUW support and related API.
- `MDTuple` is the basic tuple that has always been returned by
`MDNode::get()`. This is as opposed to more specific nodes to be
added soon, which have additional fields, custom assembly syntax,
and extra semantics.
This class gets the hash-related logic, since other sublcasses of
`UniquableMDNode` may need to hash based on other fields.
To keep this diff from getting too big, I've added casts to `MDTuple`
that won't really scale as new subclasses of `UniquableMDNode` are
added, but I'll clean those up incrementally.
(No functionality change intended.)
llvm-svn: 225682
The bitcode reading interface used std::error_code to report an error to the
callers and it is the callers job to print diagnostics.
This is not ideal for error handling or diagnostic reporting:
* For error handling, all that the callers care about is 3 possibilities:
* It worked
* The bitcode file is corrupted/invalid.
* The file is not bitcode at all.
* For diagnostic, it is user friendly to include far more information
about the invalid case so the user can find out what is wrong with the
bitcode file. This comes up, for example, when a developer introduces a
bug while extending the format.
The compromise we had was to have a lot of error codes.
With this patch we use the DiagnosticHandler to communicate with the
human and std::error_code to communicate with the caller.
This allows us to have far fewer error codes and adds the infrastructure to
print better diagnostics. This is so because the diagnostics are printed when
he issue is found. The code that detected the problem in alive in the stack and
can pass down as much context as needed. As an example the patch updates
test/Bitcode/invalid.ll.
Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the
caller. A simple one like llvm-dis can just use fatal errors. The gold plugin
needs a bit more complex treatment because of being passed non-bitcode files. An
hypothetical interactive tool would make all bitcode errors non-fatal.
llvm-svn: 225562
This reverts commit r225498 (but leaves r225499, which was a worthy
cleanup).
My plan was to change `DEBUG_LOC` to store the `MDNode` directly rather
than its operands (patch was to go out this morning), but on reflection
it's not clear that it's strictly better. (I had missed that the
current code is unlikely to emit the `MDNode` at all.)
Conflicts:
lib/Bitcode/Reader/BitcodeReader.cpp (due to r225499)
llvm-svn: 225531
Propagate whether `MDNode`s are 'distinct' through the other types of IR
(assembly and bitcode). This adds the `distinct` keyword to assembly.
Currently, no one actually calls `MDNode::getDistinct()`, so these nodes
only get created for:
- self-references, which are never uniqued, and
- nodes whose operands are replaced that hit a uniquing collision.
The concept of distinct nodes is still not quite first-class, since
distinct-ness doesn't yet survive across `MapMetadata()`.
Part of PR22111.
llvm-svn: 225474
units.
This was debated back and forth a bunch, but using references is now
clearly cleaner. Of all the code written using pointers thus far, in
only one place did it really make more sense to have a pointer. In most
cases, this just removes immediate dereferencing from the code. I think
it is much better to get errors on null IR units earlier, potentially
at compile time, than to delay it.
Most notably, the legacy pass manager uses references for its routines
and so as more and more code works with both, the use of pointers was
likely to become really annoying. I noticed this when I ported the
domtree analysis over and wrote the entire thing with references only to
have it fail to compile. =/ It seemed better to switch now than to
delay. We can, of course, revisit this is we learn that references are
really problematic in the API.
llvm-svn: 225145
`MDString`s can have arbitrary characters in them. Prevent an assertion
that fired in `BitcodeWriter` because of sign extension by copying the
characters into the record as `unsigned char`s.
Based on a patch by Keno Fischer; fixes PR21882.
llvm-svn: 224077
This reflects the typelessness of `Metadata` in the bitcode format,
removing types from all metadata operands.
`METADATA_VALUE` represents a `ValueAsMetadata`, and always has two
fields: the type and the value.
`METADATA_NODE` represents an `MDNode`, and unlike `METADATA_OLD_NODE`,
doesn't store types. It stores operands at their ID+1 so that `0` can
reference `nullptr` operands.
Part of PR21532.
llvm-svn: 224073
Split `Metadata` away from the `Value` class hierarchy, as part of
PR21532. Assembly and bitcode changes are in the wings, but this is the
bulk of the change for the IR C++ API.
I have a follow-up patch prepared for `clang`. If this breaks other
sub-projects, I apologize in advance :(. Help me compile it on Darwin
I'll try to fix it. FWIW, the errors should be easy to fix, so it may
be simpler to just fix it yourself.
This breaks the build for all metadata-related code that's out-of-tree.
Rest assured the transition is mechanical and the compiler should catch
almost all of the problems.
Here's a quick guide for updating your code:
- `Metadata` is the root of a class hierarchy with three main classes:
`MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from
the `Value` class hierarchy. It is typeless -- i.e., instances do
*not* have a `Type`.
- `MDNode`'s operands are all `Metadata *` (instead of `Value *`).
- `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be
replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.
If you're referring solely to resolved `MDNode`s -- post graph
construction -- just use `MDNode*`.
- `MDNode` (and the rest of `Metadata`) have only limited support for
`replaceAllUsesWith()`.
As long as an `MDNode` is pointing at a forward declaration -- the
result of `MDNode::getTemporary()` -- it maintains a side map of its
uses and can RAUW itself. Once the forward declarations are fully
resolved RAUW support is dropped on the ground. This means that
uniquing collisions on changing operands cause nodes to become
"distinct". (This already happened fairly commonly, whenever an
operand went to null.)
If you're constructing complex (non self-reference) `MDNode` cycles,
you need to call `MDNode::resolveCycles()` on each node (or on a
top-level node that somehow references all of the nodes). Also,
don't do that. Metadata cycles (and the RAUW machinery needed to
construct them) are expensive.
- An `MDNode` can only refer to a `Constant` through a bridge called
`ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).
As a side effect, accessing an operand of an `MDNode` that is known
to be, e.g., `ConstantInt`, takes three steps: first, cast from
`Metadata` to `ConstantAsMetadata`; second, extract the `Constant`;
third, cast down to `ConstantInt`.
The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have
metadata schema owners transition away from using `Constant`s when
the type isn't important (and they don't care about referring to
`GlobalValue`s).
In the meantime, I've added transitional API to the `mdconst`
namespace that matches semantics with the old code, in order to
avoid adding the error-prone three-step equivalent to every call
site. If your old code was:
MDNode *N = foo();
bar(isa <ConstantInt>(N->getOperand(0)));
baz(cast <ConstantInt>(N->getOperand(1)));
bak(cast_or_null <ConstantInt>(N->getOperand(2)));
bat(dyn_cast <ConstantInt>(N->getOperand(3)));
bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));
you can trivially match its semantics with:
MDNode *N = foo();
bar(mdconst::hasa <ConstantInt>(N->getOperand(0)));
baz(mdconst::extract <ConstantInt>(N->getOperand(1)));
bak(mdconst::extract_or_null <ConstantInt>(N->getOperand(2)));
bat(mdconst::dyn_extract <ConstantInt>(N->getOperand(3)));
bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));
and when you transition your metadata schema to `MDInt`:
MDNode *N = foo();
bar(isa <MDInt>(N->getOperand(0)));
baz(cast <MDInt>(N->getOperand(1)));
bak(cast_or_null <MDInt>(N->getOperand(2)));
bat(dyn_cast <MDInt>(N->getOperand(3)));
bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));
- A `CallInst` -- specifically, intrinsic instructions -- can refer to
metadata through a bridge called `MetadataAsValue`. This is a
subclass of `Value` where `getType()->isMetadataTy()`.
`MetadataAsValue` is the *only* class that can legally refer to a
`LocalAsMetadata`, which is a bridged form of non-`Constant` values
like `Argument` and `Instruction`. It can also refer to any other
`Metadata` subclass.
(I'll break all your testcases in a follow-up commit, when I propagate
this change to assembly.)
llvm-svn: 223802
Disallow complex types of function-local metadata. The only valid
function-local metadata is an `MDNode` whose sole argument is a
non-metadata function-local value.
Part of PR21532.
llvm-svn: 223564
When lazy reading a module, the types used in a function will not be visible to
a TypeFinder until the body is read.
This patch fixes that by asking the module for its identified struct types.
If a materializer is present, the module asks it. If not, it uses a TypeFinder.
This fixes pr21374.
I will be the first to say that this is ugly, but it was the best I could find.
Some of the options I looked at:
* Asking the LLVMContext. This could be made to work for gold, but not currently
for ld64. ld64 will load multiple modules into a single context before merging
them. This causes us to see types from future merges. Unfortunately,
MappedTypes is not just a cache when it comes to opaque types. Once the
mapping has been made, we have to remember it for as long as the key may
be used. This would mean moving MappedTypes to the Linker class and having
to drop the Linker::LinkModules static methods, which are visible from C.
* Adding an option to ignore function bodies in the TypeFinder. This would
fix the PR by picking the worst result. It would work, but unfortunately
we are currently quite dependent on the upfront type merging. I will
try to reduce our dependency, but it is not clear that we will be able
to get rid of it for now.
The only clean solution I could think of is making the Module own the types.
This would have other advantages, but it is a much bigger change. I will
propose it, but it is nice to have this fixed while that is discussed.
With the gold plugin, this patch takes the number of types in the LTO clang
binary from 52817 to 49669.
llvm-svn: 223215
Patch by Ben Gamari!
This redefines the `prefix` attribute introduced previously and
introduces a `prologue` attribute. There are a two primary usecases
that these attributes aim to serve,
1. Function prologue sigils
2. Function hot-patching: Enable the user to insert `nop` operations
at the beginning of the function which can later be safely replaced
with a call to some instrumentation facility
3. Runtime metadata: Allow a compiler to insert data for use by the
runtime during execution. GHC is one example of a compiler that
needs this functionality for its tables-next-to-code functionality.
Previously `prefix` served cases (1) and (2) quite well by allowing the user
to introduce arbitrary data at the entrypoint but before the function
body. Case (3), however, was poorly handled by this approach as it
required that prefix data was valid executable code.
Here we redefine the notion of prefix data to instead be data which
occurs immediately before the function entrypoint (i.e. the symbol
address). Since prefix data now occurs before the function entrypoint,
there is no need for the data to be valid code.
The previous notion of prefix data now goes under the name "prologue
data" to emphasize its duality with the function epilogue.
The intention here is to handle cases (1) and (2) with prologue data and
case (3) with prefix data.
References
----------
This idea arose out of discussions[1] with Reid Kleckner in response to a
proposal to introduce the notion of symbol offsets to enable handling of
case (3).
[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html
Test Plan: testsuite
Differential Revision: http://reviews.llvm.org/D6454
llvm-svn: 223189
Instead, we're going to separate metadata from the Value hierarchy. See
PR21532.
This reverts commit r221375.
This reverts commit r221373.
This reverts commit r221359.
This reverts commit r221167.
This reverts commit r221027.
This reverts commit r221024.
This reverts commit r221023.
This reverts commit r220995.
This reverts commit r220994.
llvm-svn: 221711
This removes calls to isMaterializable in the following cases:
* It was redundant with a call to isDeclaration now that isDeclaration returns
the correct answer for materializable functions.
* It was followed by a call to Materialize. Just call Materialize and check EC.
llvm-svn: 221050
To do this, change the representation of lazy loaded functions.
The previous representation cannot differentiate between a function whose body
has been removed and one whose body hasn't been read from the .bc file. That
means that in order to drop a function, the entire body had to be read.
llvm-svn: 220580
Enumerate `MDNode`'s operands *before* the node itself, so that the
reader requires less RAUW. Although this will cause different code
paths to be hit in the reader, this should effectively be no
functionality change.
llvm-svn: 220340
1. Use const with autos.
2. Don't bother with explicit const in cast ops because they do it automagically.
Thanks, David B. / Aaron B. / Reid K.
llvm-svn: 219817
The function deleteBody() converts the linkage to external and thus destroys
original linkage type value. Lack of correct linkage type causes wrong
relocations to be emitted later.
Calling dropAllReferences() instead of deleteBody() will fix the issue.
Differential Revision: http://reviews.llvm.org/D5415
llvm-svn: 218302
Summary: This is part of the overall goal of removing static initializers from LLVM.
Reviewers: chandlerc
Reviewed By: chandlerc
Subscribers: chandlerc, llvm-commits
Differential Revision: http://reviews.llvm.org/D5416
llvm-svn: 218149
This doesn't change the interface or gives additional safety but removes
a ton of retain/release boilerplate.
No functionality change.
llvm-svn: 217778
This forces callers to use std::move when calling it. It is somewhat odd to have
code with std::move that doesn't always move, but it is also odd to have code
without std::move that sometimes moves.
llvm-svn: 217049
The attached patch simplifies a few interfaces that don't need to take
ownership of a buffer.
For example, both parseAssembly and parseBitcodeFile will parse the
entire buffer before returning. There is no need to take ownership.
Using a MemoryBufferRef makes it obvious in the type signature that
there is no ownership transfer.
llvm-svn: 216488
Take a StringRef instead of a "const char *".
Take a "std::error_code &" instead of a "std::string &" for error.
A create static method would be even better, but this patch is already a bit too
big.
llvm-svn: 216393
Block address forward-references are implemented by creating a
`BasicBlock` ahead of time that gets inserted in the `Function` when
it's eventually encountered.
However, if the same blockaddress was used in two separate functions
that were parsed *before* the referenced function (and the blockaddress
was never used at global scope), two separate basic blocks would get
created, one of which would be forgotten creating invalid IR.
This commit changes the forward-reference logic to create only one basic
block (and always return the same blockaddress).
llvm-svn: 215805
This is an off-by-one bug I found by inspection, which would only
trigger if the bitcode writer sees more uses of a `Value` than the
reader. Since this is only relevant when an instruction gets upgraded
somehow, there unfortunately isn't a reasonable way to add test
coverage.
llvm-svn: 215804
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
llvm-svn: 215558
`BasicBlockFwdRefs` (and `BlockAddrFwdRefs` before it) was being emptied
in a non-deterministic order. When predicting use-list order I've
worked around this another way, but even when parsing lazily (and we
can't recreate use-list order) use-lists should be deterministic.
Make them so by using a side-queue of functions with forward-referenced
blocks that gets visited in order.
llvm-svn: 214899
`parseBitcodeFile()` uses the generic `getLazyBitcodeFile()` function as
a helper. Since `parseBitcodeFile()` isn't actually lazy -- it calls
`MaterializeAllPermanently()` -- bypass the unnecessary call to
`materializeForwardReferencedFunctions()` by extracting out a common
helper function. This removes the last of the use-list churn caused by
blockaddresses.
This highlights that we can't reproduce use-list order of globals and
constants when parsing lazily -- but that's necessarily out of scope.
When we're parsing lazily, we never have all the functions in memory, so
the use-lists of globals (and constants that reference globals) are
always incomplete.
This is part of PR5680.
llvm-svn: 214581
Now that we can reliably handle forward references to `BlockAddress`
(r214563), change the mechanics to simplify predicting use-list order.
Previously, we created dummy `GlobalVariable`s to represent block
addresses. After every function was materialized, we'd go through any
forward references to its blocks and RAUW them with a proper
`BlockAddress` constant. This causes some (potentially a lot of)
unnecessary use-list churn, since any constant expression that it's a
part of will need to be rematerialized as well.
Instead, pre-construct a `BasicBlock` immediately -- without attaching
it to its (empty) `Function` -- and use that to construct a
`BlockAddress`. This constant will not have to be regenerated. When
the function body is parsed, hook this pre-constructed basic block up
in the right place using `BasicBlock::insertInto()`.
Both before and after this change, the IR is temporarily in an invalid
state that gets resolved when `materializeForwardReferencedFunctions()`
gets called.
This is a prep commit that's part of PR5680, but the only functionality
change is the reduction of churn in the constant pool.
llvm-svn: 214570
`BlockAddress`es are interesting in that they can reference basic blocks
from *outside* the block's function. Since basic blocks are not global
values, this presents particular challenges for lazy parsing.
One corner case was found in PR11677 and fixed in r147425. In that
case, a global variable references a block address. It's necessary to
load the relevant function to resolve the forward reference before doing
anything with the module.
By inspection, I found (and have fixed here) two other cases:
- An instruction from one function references a block address from
another function, and only the first function is lazily loaded.
I fixed this the same way as PR11677: by eagerly loading the
referenced function.
- A function whose block address is taken is dematerialized, leaving
invalid references to it.
I fixed this by refusing to dematerialize functions whose block
addresses are taken (if you have to load it, you can't unload it).
llvm-svn: 214559
Correctly sort self-users (such as PHI nodes). I added a targeted test
in `test/Bitcode/use-list-order.ll` and the final missing RUN line to
tests in `test/Assembly`.
This is part of PR5680.
llvm-svn: 214417
Since initializers of GlobalValues are being assigned IDs before
GlobalValues themselves, explicitly exclude GlobalValues from the
constant pool. Added targeted test in `test/Bitcode/use-list-order.ll`
and added two more RUN lines in `test/Assembly`.
This is part of PR5680.
llvm-svn: 214368
When predicting use-list order, we visit functions in reverse order
followed by `GlobalValue`s and write out use-lists at the first
opportunity. In the reader, this will translate to *after* the last use
has been added.
For this to work, we actually need to descend into `GlobalValue`s.
Added a targeted test in `use-list-order.ll` and `RUN` lines to the
newly passing tests in `test/Bitcode`.
There are two remaining failures in `test/Bitcode`:
- blockaddress.ll: I haven't thought through how to model the way
block addresses change the order of use-lists (or how to work around
it).
- metadata-2.ll: There's an old-style `@llvm.used` global array here
that I suspect the .ll parser isn't upgrading properly. When it
round-trips through bitcode, the .bc reader *does* upgrade it, so
the extra variable (`i8* null`) has an extra use, and the shuffle
vector doesn't match.
I think the fix is to upgrade old-style global arrays (or reject
them?) in the .ll parser.
This is part of PR5680.
llvm-svn: 214321
This commit fixes undefined behaviour that caused the revert in r214249.
The problem was two unsequenced operations on a `DenseMap<>`, giving
different behaviour in GCC and Clang. This:
DenseMap<T*, unsigned> DM;
for (auto &X : ...)
DM[&X] = DM.size() + 1;
should have been:
DenseMap<T*, unsigned> DM;
for (auto &X : ...) {
unsigned Size = DM.size();
DM[&X] = Size + 1;
}
Until r214242, this difference between compilers didn't matter. In
r214242, `OrderMap::LastGlobalValueID` was introduced and compared
against IDs, which in GCC were off-by-one my expectations.
llvm-svn: 214270
To avoid unnecessary forward references, the reader doesn't process
initializers of `GlobalValue`s until after the constant pool has been
processed, and then in reverse order. Model this when predicting
use-list order. This gets two more Bitcode tests passing with
`llvm-uselistorder`.
Part of PR5680.
llvm-svn: 214242
This will let users in other libraries know which error occurred. In particular,
it will be possible to check if the parsing failed or if the file is not
bitcode.
llvm-svn: 214209
Fix the sort of expected order in the reader to correctly return `false`
when comparing a `Use` against itself.
This was caught by test/Bitcode/binaryIntInstructions.3.2.ll, so I'm
adding a `RUN` line using `llvm-uselistorder` for every test in
`test/Bitcode` that passes.
A few tests still fail, so I'll investigate those next.
This is part of PR5680.
llvm-svn: 214157