This reverts commit r243888, recommitting r243824.
This broke the Windows build due to a difference in the C++ standard
library implementation. Using emplace/forward_as_tuple should ensure
there's no need to copy ValIDs.
llvm-svn: 243896
Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s.
The backend is liable to start relying on that (if it hasn't already),
so make uniquable `DICompileUnit`s illegal and automatically upgrade old
bitcode. This is a nice cleanup, since we can remove an unnecessary
`DenseSet` (and the associated uniquing info) from `LLVMContextImpl`.
Almost all the testcases were updated with this script:
git grep -e '= !DICompileUnit' -l -- test |
grep -v test/Bitcode |
xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,'
I imagine something similar should work for out-of-tree testcases.
llvm-svn: 243885
* generate function with string attribute using API,
* dump it in LL format,
* try to parse.
Add parser support for string attributes to fix the issue.
Reviewed By: reames, hfinkel
Differential Revision: http://reviews.llvm.org/D11058
llvm-svn: 243877
Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags,
using `DW_TAG_variable` in their place Stop exposing the `tag:` field at
all in the assembly format for `DILocalVariable`.
Most of the testcase updates were generated by the following sed script:
find test/ -name "*.ll" -o -name "*.mir" |
xargs grep -l 'DILocalVariable' |
xargs sed -i '' \
-e 's/tag: DW_TAG_arg_variable, //' \
-e 's/tag: DW_TAG_auto_variable, //'
There were only a handful of tests in `test/Assembly` that I needed to
update by hand.
(Note: a follow-up could change `DILocalVariable::DILocalVariable()` to
set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable`
(as appropriate), instead of having that logic magically in the backend
in `DbgVariable`. I've added a FIXME to that effect.)
llvm-svn: 243774
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support. Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.
Differential Revision: http://reviews.llvm.org/D11097
llvm-svn: 243766
When parsing calls to inline asm the pointee type (of the pointer type
representing the value type of the InlineAsm value) was used. To avoid
using it, use the ValID structure to ferry the FunctionType directly
through to the InlineAsm construction.
This is a bit of a workaround - alternatively the inline asm could
explicitly describe the type but that'd be verbose/redundant in the IR
and so long as the inline asm calls directly in the context of a call or
invoke, this should suffice.
llvm-svn: 243349
This commit extends the interface provided by the AsmParser library by adding a
function that allows the user to parse a standalone contant value.
This change is useful for MIR serialization, as it will allow the MIR Parser to
parse the constant values in a machine constant pool.
Reviewers: Duncan P. N. Exon Smith
Differential Revision: http://reviews.llvm.org/D10280
llvm-svn: 242579
This change adds new attribute called "argmemonly". Function marked with this attribute can only access memory through it's argument pointers. This attribute directly corresponds to the "OnlyAccessesArgumentPointees" ModRef behaviour in alias analysis.
Differential Revision: http://reviews.llvm.org/D10398
llvm-svn: 241979
FCmp behaves a lot like a floating-point binary operator in many ways,
and can benefit from fast-math information. Flags such as nsz and nnan
can affect if this fcmp (in combination with a select) can be treated
as a fminnum/fmaxnum operation.
This adds backwards-compatible bitcode support, IR parsing and writing,
LangRef changes and IRBuilder changes. I'll need to audit InstSimplify
and InstCombine in a followup to find places where flags should be
copied.
llvm-svn: 241901
Summary:
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support. Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.
Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11041
llvm-svn: 241888
The justification of this change is here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082989.html
According to the current GEP syntax, vector GEP requires that each index must be a vector with the same number of elements.
%A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
In this implementation I let each index be or vector or scalar. All vector indices must have the same number of elements. The scalar value will mean the splat vector value.
(1) %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
or
(2) %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
In all cases the %A type is <4 x i8*>
In the case (2) we add the same offset to all pointers.
The case (1) covers C[B[i]] case, when we have the same base C and different offsets B[i].
The documentation is updated.
http://reviews.llvm.org/D10496
llvm-svn: 241788
It is meant to be used to record modules @imported by the current
compile unit, so a debugger an import the same modules to replicate this
environment before dropping into the expression evaluator.
DIModule is a sibling to DINamespace and behaves quite similarly.
In addition to the name of the module it also records the module
configuration details that are necessary to uniquely identify the module.
This includes the configuration macros (e.g., -DNDEBUG), the include path
where the module.map file is to be found, and the isysroot.
The idea is that the backend will turn this into a DW_TAG_module.
http://reviews.llvm.org/D9614
rdar://problem/20965932
llvm-svn: 241017
This commit moves the APSInt initialization code that's used by
the LLLexer class into a new APSInt constructor that constructs
APSInts from strings.
This change is useful for MIR Serialization, as it would allow
the MILexer class to use the same APSInt initialization as
LLexer when parsing immediate machine operands.
llvm-svn: 240436
This commit creates a new structure called 'SlotMapping' in the AsmParser library.
This structure can be passed into the public parsing APIs from the AsmParser library
in order to extract the data structures that map from slot numbers to unnamed global
values and metadata nodes.
This change is useful for MIR Serialization, as the MIR Parser has to lookup the
unnamed global values and metadata nodes by their slot numbers.
Reviewers: Duncan P. N. Exon Smith
Differential Revision: http://reviews.llvm.org/D10551
llvm-svn: 240427
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
The personality routine currently lives in the LandingPadInst.
This isn't desirable because:
- All LandingPadInsts in the same function must have the same
personality routine. This means that each LandingPadInst beyond the
first has an operand which produces no additional information.
- There is ongoing work to introduce EH IR constructs other than
LandingPadInst. Moving the personality routine off of any one
particular Instruction and onto the parent function seems a lot better
than have N different places a personality function can sneak onto an
exceptional function.
Differential Revision: http://reviews.llvm.org/D10429
llvm-svn: 239940
If globals can be unnamed, there is no reason for aliases to be different.
The restriction was there since the original implementation in r36435. I
can only guess it was there because of the old bison parser for the old
alias syntax.
llvm-svn: 239921
`LLVM_ENABLE_MODULES` builds sometimes fail because `Intrinsics.td`
needs to regenerate `Instrinsics.h` before anyone can include anything
from the LLVM_IR module. Represent the dependency explicitly to prevent
that.
llvm-svn: 239796
This patch adds the safe stack instrumentation pass to LLVM, which separates
the program stack into a safe stack, which stores return addresses, register
spills, and local variables that are statically verified to be accessed
in a safe way, and the unsafe stack, which stores everything else. Such
separation makes it much harder for an attacker to corrupt objects on the
safe stack, including function pointers stored in spilled registers and
return addresses. You can find more information about the safe stack, as
well as other parts of or control-flow hijack protection technique in our
OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf)
and our project website (http://levee.epfl.ch).
The overhead of our implementation of the safe stack is very close to zero
(0.01% on the Phoronix benchmarks). This is lower than the overhead of
stack cookies, which are supported by LLVM and are commonly used today,
yet the security guarantees of the safe stack are strictly stronger than
stack cookies. In some cases, the safe stack improves performance due to
better cache locality.
Our current implementation of the safe stack is stable and robust, we
used it to recompile multiple projects on Linux including Chromium, and
we also recompiled the entire FreeBSD user-space system and more than 100
packages. We ran unit tests on the FreeBSD system and many of the packages
and observed no errors caused by the safe stack. The safe stack is also fully
binary compatible with non-instrumented code and can be applied to parts of
a program selectively.
This patch is our implementation of the safe stack on top of LLVM. The
patches make the following changes:
- Add the safestack function attribute, similar to the ssp, sspstrong and
sspreq attributes.
- Add the SafeStack instrumentation pass that applies the safe stack to all
functions that have the safestack attribute. This pass moves all unsafe local
variables to the unsafe stack with a separate stack pointer, whereas all
safe variables remain on the regular stack that is managed by LLVM as usual.
- Invoke the pass as the last stage before code generation (at the same time
the existing cookie-based stack protector pass is invoked).
- Add unit tests for the safe stack.
Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.
Differential Revision: http://reviews.llvm.org/D6094
llvm-svn: 239761
As a follow-up to r235955, actually support up to 65535 arguments in a
subprogram. r235955 missed assembly support, having only tested the new
limit via C++ unit tests. Code patch by Amjad Aboud.
llvm-svn: 238854
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.
Call sites were found with the ASTMatcher + some semi-automated cleanup.
memberCallExpr(
argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
hasArgument(0, bindTemporaryExpr(
hasType(recordDecl(hasNonTrivialDestructor())),
has(constructExpr()))),
unless(isInTemplateInstantiation()))
No functional change intended.
llvm-svn: 238602
so DWARF skeleton CUs can be expression in IR. A skeleton CU is a
(typically empty) DW_TAG_compile_unit that has a DW_AT_(GNU)_dwo_name and
a DW_AT_(GNU)_dwo_id attribute. It is used to refer to external debug info.
This is a prerequisite for clang module debugging as discussed in
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html.
In order to refer to external types stored in split DWARF (dwo) objects,
such as clang modules, we need to emit skeleton CUs, which identify the
dwarf object (i.e., the clang module) by filename (the SplitDebugFilename)
and a hash value, the dwo_id.
This patch only contains the IR changes. The idea is that a CUs with a
non-zero dwo_id field will be emitted together with a DW_AT_GNU_dwo_name
and DW_AT_GNU_dwo_id attribute.
http://reviews.llvm.org/D9488
rdar://problem/20091852
llvm-svn: 237949
This commit modifies the memory buffer creation in the AsmParser library so
that it requires a terminating null character. The LLLexer in the AsmParser
library checks for EOF only when it sees a null character, thus it would
be best to require it when creating a memory buffer so that the memory
buffer constructor can verify that a terminating null character is indeed
present.
Reviewers: Duncan P. N. Exon Smith, Matthias Braun
Differential Revision: http://reviews.llvm.org/D9883
llvm-svn: 237833
Many of the callers already have the pointer type anyway, and for the
couple of callers that don't it's pretty easy to call PointerType::get
on the pointee type and address space.
This avoids LLParser from using PointerType::getElementType when parsing
GlobalAliases from IR.
llvm-svn: 236160
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`. The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.
Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one. It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs. YMMV of
course.
Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py. I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three. It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).
Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.
llvm-svn: 236120
Add serialization support for function metadata attachments (added in
r235783). The syntax is:
define @foo() !attach !0 {
Metadata attachments are only allowed on functions with bodies. Since
they come before the `{`, they're not really part of the body; since
they require a body, they're not really part of the header. In
`LLParser` I gave them a separate function called from `ParseDefine()`,
`ParseOptionalFunctionMetadata()`.
In bitcode, I'm using the same `METADATA_ATTACHMENT` record used by
instructions. Instruction metadata attachments are included in a
special "attachment" block at the end of a `Function`. The attachment
records are laid out like this:
InstID (KindID MetadataID)+
Note that these records always have an odd number of fields. The new
code takes advantage of this to recognize function attachments (which
don't need an instruction ID):
(KindID MetadataID)+
This means we can use the same attachment block already used for
instructions.
This is part of PR23340.
llvm-svn: 235785
Remove unused `PFS` variable and take the `Instruction` by reference.
(Not really related to PR23340, but might as well clean this up while
I'm here.)
llvm-svn: 235782
Same as r235145 for the call instruction - the justification, tradeoffs,
etc are all the same. The conversion script worked the same without any
false negatives (after replacing 'call' with 'invoke').
llvm-svn: 235755
(reverted in r235533)
Original commit message:
"Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)"
The remapping done in ValueMapper for LTO was insufficient as the types
weren't correctly mapped (though I was using the post-mapped operands,
some of those operands might not have been mapped yet so the type
wouldn't be post-mapped yet). Instead use the pre-mapped type and
explicitly map all the types.
llvm-svn: 235651
This reverts commit r235458.
It looks like this might be breaking something LTO-ish. Looking into it
& will recommit with a fix/test case/etc once I've got more to go on.
llvm-svn: 235533
Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)
llvm-svn: 235458
See r230786 and r230794 for similar changes to gep and load
respectively.
Call is a bit different because it often doesn't have a single explicit
type - usually the type is deduced from the arguments, and just the
return type is explicit. In those cases there's no need to change the
IR.
When that's not the case, the IR usually contains the pointer type of
the first operand - but since typed pointers are going away, that
representation is insufficient so I'm just stripping the "pointerness"
of the explicit type away.
This does make the IR a bit weird - it /sort of/ reads like the type of
the first operand: "call void () %x(" but %x is actually of type "void
()*" and will eventually be just of type "ptr". But this seems not too
bad and I don't think it would benefit from repeating the type
("void (), void () * %x(" and then eventually "void (), ptr %x(") as has
been done with gep and load.
This also has a side benefit: since the explicit type is no longer a
pointer, there's no ambiguity between an explicit type and a function
that returns a function pointer. Previously this case needed an explicit
type (eg: a function returning a void() function was written as
"call void () () * @x(" rather than "call void () * @x(" because of the
ambiguity between a function returning a pointer to a void() function
and a function returning void).
No ambiguity means even function pointer return types can just be
written alone, without writing the whole function's type.
This leaves /only/ the varargs case where the explicit type is required.
Given the special type syntax in call instructions, the regex-fu used
for migration was a bit more involved in its own unique way (as every
one of these is) so here it is. Use it in conjunction with the apply.sh
script and associated find/xargs commands I've provided in rr230786 to
migrate your out of tree tests. Do let me know if any of this doesn't
cover your cases & we can iterate on a more general script/regexes to
help others with out of tree tests.
About 9 test cases couldn't be automatically migrated - half of those
were functions returning function pointers, where I just had to manually
delete the function argument types now that we didn't need an explicit
function type there. The other half were typedefs of function types used
in calls - just had to manually drop the * from those.
import fileinput
import sys
import re
pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
func_end = re.compile("(?:void.*|\)\s*)\*$")
def conv(match, line):
if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
return line
return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]
for line in sys.stdin:
sys.stdout.write(conv(re.search(pat, line), line))
llvm-svn: 235145
Summary:
If a pointer is marked as dereferenceable_or_null(N), LLVM assumes it
is either `null` or `dereferenceable(N)` or both. This change only
introduces the attribute and adds a token test case for the `llvm-as`
/ `llvm-dis`. It does not hook up other parts of the optimizer to
actually exploit the attribute -- those changes will come later.
For pointers in address space 0, `dereferenceable(N)` is now exactly
equivalent to `dereferenceable_or_null(N)` && `nonnull`. For other
address spaces, `dereferenceable(N)` is potentially weaker than
`dereferenceable_or_null(N)` && `nonnull` (since we could have a null
`dereferenceable(N)` pointer).
The motivating case for this change is Java (and other managed
languages), where pointers are either `null` or dereferenceable up to
some usually known-at-compile-time constant offset.
Reviewers: rafael, hfinkel
Reviewed By: hfinkel
Subscribers: nicholas, llvm-commits
Differential Revision: http://reviews.llvm.org/D8650
llvm-svn: 235132