Add serialization support for function metadata attachments (added in
r235783). The syntax is:
define @foo() !attach !0 {
Metadata attachments are only allowed on functions with bodies. Since
they come before the `{`, they're not really part of the body; since
they require a body, they're not really part of the header. In
`LLParser` I gave them a separate function called from `ParseDefine()`,
`ParseOptionalFunctionMetadata()`.
In bitcode, I'm using the same `METADATA_ATTACHMENT` record used by
instructions. Instruction metadata attachments are included in a
special "attachment" block at the end of a `Function`. The attachment
records are laid out like this:
InstID (KindID MetadataID)+
Note that these records always have an odd number of fields. The new
code takes advantage of this to recognize function attachments (which
don't need an instruction ID):
(KindID MetadataID)+
This means we can use the same attachment block already used for
instructions.
This is part of PR23340.
llvm-svn: 235785
(reverted in r235533)
Original commit message:
"Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)"
The remapping done in ValueMapper for LTO was insufficient as the types
weren't correctly mapped (though I was using the post-mapped operands,
some of those operands might not have been mapped yet so the type
wouldn't be post-mapped yet). Instead use the pre-mapped type and
explicitly map all the types.
llvm-svn: 235651
Summary:
Make sure the abbrev operands are valid and that we can read/skip them
afterwards.
Bug found with AFL fuzz.
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9030
llvm-svn: 235595
This reverts commit r235458.
It looks like this might be breaking something LTO-ish. Looking into it
& will recommit with a fix/test case/etc once I've got more to go on.
llvm-svn: 235533
Without pointee types the space optimization of storing only the pointer
type and not the value type won't be viable - so add the extra type
information that would be missing.
llvm-svn: 235475
Without pointee types the space optimization of storing only the pointer
type and not the value type won't be viable - so add the extra type
information that would be missing.
Storeatomic coming soon.
llvm-svn: 235474
Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)
llvm-svn: 235458
Now (with a few carefully placed suppressions relating to general type
serialization, etc) we can round trip a simple load through bitcode and
textual IR without calling getElementType on a PointerType.
llvm-svn: 235221
Use an extra bit in the CCInfo to flag the newer version of the
instructiont hat includes the type explicitly.
Tested the newer error cases I added, but didn't add tests for the finer
granularity improvements to existing error paths.
llvm-svn: 235160
Summary:
If a pointer is marked as dereferenceable_or_null(N), LLVM assumes it
is either `null` or `dereferenceable(N)` or both. This change only
introduces the attribute and adds a token test case for the `llvm-as`
/ `llvm-dis`. It does not hook up other parts of the optimizer to
actually exploit the attribute -- those changes will come later.
For pointers in address space 0, `dereferenceable(N)` is now exactly
equivalent to `dereferenceable_or_null(N)` && `nonnull`. For other
address spaces, `dereferenceable(N)` is potentially weaker than
`dereferenceable_or_null(N)` && `nonnull` (since we could have a null
`dereferenceable(N)` pointer).
The motivating case for this change is Java (and other managed
languages), where pointers are either `null` or dereferenceable up to
some usually known-at-compile-time constant offset.
Reviewers: rafael, hfinkel
Reviewed By: hfinkel
Subscribers: nicholas, llvm-commits
Differential Revision: http://reviews.llvm.org/D8650
llvm-svn: 235132
Remove 'inlinedAt:' from MDLocalVariable. Besides saving some memory
(variables with it seem to be single largest `Metadata` contributer to
memory usage right now in -g -flto builds), this stops optimization and
backend passes from having to change local variables.
The 'inlinedAt:' field was used by the backend in two ways:
1. To tell the backend whether and into what a variable was inlined.
2. To create a unique id for each inlined variable.
Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg`
attachment, and change the DWARF backend to use a typedef called
`InlinedVariable` which is `std::pair<MDLocalVariable*, MDLocation*>`.
This `DebugLoc` is already passed reliably through the backend (as
verified by r234021).
This commit removes the check from r234021, but I added a new check
(that will survive) in r235048, and changed the `DIBuilder` API in
r235041 to require a `!dbg` attachment whose 'scope:` is in the same
`MDSubprogram` as the variable's.
If this breaks your out-of-tree testcases, perhaps the script I used
(mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778
in a moment.
llvm-svn: 235050
Summary:
Without this check the following case failed:
Skip a SubBlock which is not a MODULE_BLOCK_ID nor a BLOCKINFO_BLOCK_ID
Got to end of file
TheModule would still be == nullptr, and we would subsequentially fail
when materializing the Module (assert at the start of
BitcodeReader::MaterializeModule).
Bug found with AFL.
Reviewers: dexonsmith, rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9014
llvm-svn: 234887
The patch is generated using clang-tidy misc-use-override check.
This command was used:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
-checks='-*,misc-use-override' -header-filter='llvm|clang' \
-j=32 -fix -format
http://reviews.llvm.org/D8925
llvm-svn: 234679
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the cahnce to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.
llvm-svn: 233938
Keep a note in the materializer that we are stripping debug info so that
user doing a lazy read of the module don't hit outdated formats.
Thanks to Duncan for suggesting the fix.
llvm-svn: 233603
Check accessors of `MDLocation`, and change them to `cast<>` down to the
right types. Also add type-safe factory functions.
All the callers that handle broken code need to use the new versions of
the accessors (`getRawScope()` instead of `getScope()`) that still
return `Metadata*`. This is also necessary for things like
`MDNodeKeyImpl<MDLocation>` (in LLVMContextImpl.h) that need to unique
the nodes when their operands might still be forward references of the
wrong type.
In the `Value` hierarchy, consumers that handle broken code use
`getOperand()` directly. However, debug info nodes have a ton of
operands, and their order (even their existence) isn't stable yet. It's
safer and more maintainable to add an explicit "raw" accessor on the
class itself.
llvm-svn: 233322
(turns out I had regressed this when sinking handling of this type down
into GetElementPtrInst::Create - since that asserted before the error
handling was performed)
llvm-svn: 232420
This happened to be fairly easy to support backwards compatibility based
on the number of operands (old format had an even number, new format has
one more operand so an odd number).
test/Bitcode/old-aliases.ll already appears to test old gep operators
(if I remove the backwards compatibility in the BitcodeReader, this and
another test fail) so I'm not adding extra test coverage here.
llvm-svn: 232216
I don't think we test invalid bitcode records in any detail, so no test
here - just a change for consistency with existing error checks in
surrounding code.
llvm-svn: 232215
We only defer loading metadata inside ParseModule when ShouldLazyLoadMetadata
is true and we have not loaded any Metadata block yet.
This commit implements all-or-nothing loading of Metadata. If there is a
request to load any metadata block, we will load all deferred metadata blocks.
We make sure the deferred metadata blocks are loaded before we materialize any
function or a module.
The default value of the added parameter ShouldLazyLoadMetadata for
getLazyBitcodeModule is false, so the default behavior stays the same.
We only set the parameter to true when creating LTOModule in local contexts.
These can only really be used for parsing symbols, so it's unnecessary to ever
load the metadata blocks.
If we are going to enable lazy-loading of Metadata for other usages of
getLazyBitcodeModule, where deferred metadata blocks need to be loaded, we can
expose BitcodeReader::materializeMetadata to Module, similar to
Module::materialize.
rdar://19804575
llvm-svn: 232198
Like r230414, add bitcode support including backwards compatibility, for
an explicit type parameter to GEP.
At the suggestion of Duncan I tried coalescing the two older bitcodes into a
single new bitcode, though I did hit a wrinkle: I couldn't figure out how to
create an explicit abbreviation for a record with a variable number of
arguments (the indicies to the gep). This means the discriminator between
inbounds and non-inbounds gep is a full variable-length field I believe? Is my
understanding correct? Is there a way to create such an abbreviation? Should I
just use two bitcodes as before?
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D7736
llvm-svn: 230415
Summary:
I've taken my best guess at this, but I've cargo culted in places & so
explanations/corrections would be great.
This seems to pass all the tests (check-all, covering clang and llvm) so I
believe that pretty well exercises both the backwards compatibility and common
(same version) compatibility given the number of checked in bitcode files we
already have. Is that a reasonable approach to testing here? Would some more
explicit tests be desired?
1) is this the right way to do back-compat in this case (looking at the number
of entries in the bitcode record to disambiguate between the old schema and
the new?)
2) I don't quite understand the logarithm logic to choose the encoding type of
the type parameter in the abbreviation description, but I found another
instruction doing the same thing & it seems to work. Is that the right
approach?
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D7655
llvm-svn: 230414
While fuzzing LLVM bitcode files, I discovered that (1) the bitcode reader doesn't check that alignments are no larger than 2**29; (2) downstream code doesn't check the range; and (3) for values out of range, corresponding large memory requests (based on alignment size) will fail. This code fixes the bitcode reader to check for valid alignments, fixing this problem.
This CL fixes alignment value on global variables, functions, and instructions: alloca, load, load atomic, store, store atomic.
Patch by Karl Schimpf (kschimpf@google.com).
llvm-svn: 230180
When writing the bitcode serialization for the new debug info hierarchy,
I assumed two fields would never be null.
Drop that assumption, since it's brittle (and crashes the
`BitcodeWriter` if wrong), and is a check better left for the verifier
anyway. (No need for a bitcode upgrade here, since the new hierarchy is
still not in place.)
The fields in question are `MDCompileUnit::getFile()` and
`MDDerivedType::getBaseType()`, the latter of which isn't null in
test/Transforms/Mem2Reg/ConvertDebugInfo2.ll (see !14, a pointer to
nothing). While the testcase might have bitrotted, there's no reason
for the bitcode format to rely on non-null for metadata operands.
This also fixes a bug in `AsmWriter` where if the `file:` is null it
isn't emitted (caught by the double-round trip in the testcase I'm
adding) -- this is a required field in `LLParser`.
I'll circle back to ConvertDebugInfo2. Once the specialized nodes are
in place, I'll be trying to turn the debug info verifier back on by
default (in the newer module pass form committed r206300) and throwing
more logic in there. If the testcase has bitrotted (as opposed to me
not understanding the schema correctly) I'll fix it then.
llvm-svn: 229960
Follow-up to r229740, which removed `DITemplate*::getContext()` after my
upgrade script revealed that scopes are always `nullptr` for template
parameters. This is the other shoe: drop `scope:` from
`MDTemplateParameter` and its two subclasses. (Note: a bitcode upgrade
would be pointless, since the hierarchy hasn't been moved into place.)
llvm-svn: 229791
The metadata/value split introduced a major regression reading large
bitcode files that contain debug info (or other cyclic (non-self
reference) metadata graphs). For the first time in a while, I dropped
from libLTO.dylib down to `llvm-lto` with a non-trivial bitcode file
(~350MB), and I hit this when reading the result of ld64's `-save-temps`
in `llvm-lto`.
Here's pseudo-code for what was going on:
read-main-metadata-block:
for each md:
if has-fwd-ref: // Only true for cyclic graphs.
any-fwd-refs <- true
if any-fwd-refs:
foreach md:
resolve-cycles(md) // Handle cycles.
foreach function:
read-function-metadata-block: // Such as !alias, !loop
if any-fwd-refs:
foreach md: // (all metadata, not just this block)
resolve-cycles(md) // A no-op, but the loop is expensive!!
This commit resets the `AnyFwdRefs` flag to `false`. This on its own
was enough to change my Release+Asserts `llvm-lto` time for reading this
bitcode from over 20 minutes (I gave up on it) to 20 seconds. I've gone
further by tracking the min/max metadata forward-references in a
metadata block. This protects against a schema that has lots of
functions that each reference their own metadata cycle.
Unfortunately, this regression is in the 3.6 branch as well.
llvm-svn: 229421
Summary:
When creating {insert,extract}value instructions from a BitcodeReader, we
weren't verifying the fields were valid.
Bugs found with afl-fuzz
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7325
llvm-svn: 229345