This patch implements a minimum spanning tree (MST) based instrumentation for
PGO. The use of MST guarantees minimum number of CFG edges getting
instrumented. An addition optimization is to instrument the less executed
edges to further reduce the instrumentation overhead. The patch contains both the
instrumentation and the use of the profile to set the branch weights.
Differential Revision: http://reviews.llvm.org/D12781
llvm-svn: 254021
We had two code paths. One would create names like "foo.1" and the other
names like "foo1".
For globals it is important to use "foo.1" to help C++ name demangling.
For locals there is no strong reason to go one way or the other so I
kept the most common mangling (foo1).
llvm-svn: 253804
Summary:
Several fixes to the handling of bitcode files without function summary
sections so that they are skipped during ThinLTO processing in llvm-lto
and the gold plugin when appropriate instead of aborting.
1 Don't assert when trying to add a FunctionInfo that doesn't have
a summary attached.
2 Skip FunctionInfo structures that don't have attached function summary
sections when trying to create the combined function summary.
3 In both llvm-lto and gold-plugin, check whether a bitcode file has
a function summary section before trying to parse the index, and skip
the bitcode file if it does not.
4 Fix hasFunctionSummaryInMemBuffer in BitcodeReader, which had a bug
where we returned to early while looking for the summary section.
Also added llvm-lto and gold-plugin based tests for cases where we
don't have function summaries in the bitcode file. I verified that
either the first couple fixes described above are enough to avoid the
crashes, or fixes 1,3,4. But have combined them all here for added
robustness.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D14903
llvm-svn: 253796
Terrifyingly, one of them is a mishandling of floating point vectors
in Constant::isZero(). How exactly this issue survived this long
is beyond me.
llvm-svn: 253655
The masked intrinsics support all integer and floating point data types. I added the pointer type to this list.
Added tests for CodeGen and for Loop Vectorizer.
Updated the Language Reference.
Differential Revision: http://reviews.llvm.org/D14150
llvm-svn: 253544
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe. For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)
For out of tree owners, I was able to strip alignment from calls using sed by replacing:
(call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
$1i1 false)
and similarly for memmove and memcpy.
I then added back in alignment to test cases which needed it.
A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.
In IRBuilder itself, a new argument was added. Instead of calling:
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)
There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool. This is to prevent isVolatile here from passing its default
parameter to the source alignment.
Note, changes in future can now be made to codegen. I didn't change anything here, but this
change should enable better memcpy code sequences.
Reviewed by Hal Finkel.
llvm-svn: 253511
Summary:
This change teaches LLVM's inliner to track and suitably adjust
deoptimization state (tracked via deoptimization operand bundles) as it
inlines through call sites. The operation is described in more detail
in the LangRef changes.
Reviewers: reames, majnemer, chandlerc, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14552
llvm-svn: 253438
The way prelink used to work was
* The compiler decides if a given section only has relocations that
are know to point to the same DSO. If so, it names it
.data.rel.ro.local<something>.
* The static linker puts all of these together.
* The prelinker program assigns addresses to each library and resolves
the local relocations.
There are many problems with this:
* It is incompatible with address space randomization.
* The information passed by the compiler is redundant. The linker
knows if a given relocation is in the same DSO or not. If could sort
by that if so desired.
* There are newer ways of speeding up DSO (gnu hash for example).
* Even if we want to implement this again in the compiler, the previous
implementation is pretty broken. It talks about relocations that are
"resolved by the static linker". If they are resolved, there are none
left for the prelinker. What one needs to track is if an expression
will require only dynamic relocations that point to the same DSO.
At this point it looks like the prelinker is an historical curiosity.
For example, fedora has retired it because it failed to build for two
releases
(http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f)
This patch removes support for it. That is, it stops printing the
".local" sections.
llvm-svn: 253280
Summary: Since we're passing references to dbg.value as pointers,
we need to have the frontend properly declare their sizes and
alignments (as it already does for regular pointers) in preparation
for my upcoming patch to have the verifer check that the sizes agree.
Also augment the backend logic that skips actually emitting this
information into DWARF such that it also handles reference types.
Reviewers: aprantl, dexonsmith, dblaikie
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14275
llvm-svn: 253186
Summary: The Old personality function gets copied over, but the
Materializer didn't have a chance to inspect it (e.g. to fix up
references to the correct module for the target function).
Also add a verifier check that makes sure the personality routine
is in the same module as the function whose personality it is.
Reviewers: majnemer
Subscribers: jevinskie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14474
llvm-svn: 253183
This reapplies r252949. I've changed the type of FuncName to be
std::string instead of StringRef in emitFnAttrCompatCheck.
Original commit message for r252949:
Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.
This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.
rdar://problem/19836465
llvm-svn: 252990
rules using table-gen. NFC.
This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.
rdar://problem/19836465
llvm-svn: 252949
When working with tokens, it is often the case that one has instructions
which consume a token and produce a new token. Currently, we have no
mechanism to represent an initial token state.
Instead, we can create a notional "empty token" by inventing a new
constant which captures the semantics we would like. This new constant
is called ConstantTokenNone and is written textually as "token none".
Differential Revision: http://reviews.llvm.org/D14581
llvm-svn: 252811
Summary:
This change introduces the notion of "deoptimization" operand bundles.
LLVM can recognize and optimize these in more precise ways than it can a
generic "unknown" operand bundles.
The current form of this special recognition / optimization is an enum
entry in LLVMContext, a LangRef blurb and a verifier rule. Over time we
will teach LLVM to do more aggressive optimization around deoptimization
operand bundles, exploiting known facts about kinds of state
deoptimization operand bundles are allowed to track.
Reviewers: reames, majnemer, chandlerc, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14551
llvm-svn: 252806
This is a step towards consolidating some of the information regarding
attributes in a single place.
This patch moves the enum attributes in Attributes.h to the table-gen
file. Additionally, it adds definitions of target independent string
attributes that will be used in follow-up commits by the inliner to
check attribute compatibility.
rdar://problem/19836465
llvm-svn: 252796
This was an omission in the patch that landed initial support for
operand bundles. So far we haven't hit this, but we will once the
inliner is able to inline calls to functions that contain calls with
operand bundles.
llvm-svn: 252645
This marker prevents optimization passes from adding 'tail' or
'musttail' markers to a call. Is is used to prevent tail call
optimization from being performed on the call.
rdar://problem/22667622
Differential Revision: http://reviews.llvm.org/D12923
llvm-svn: 252368
This attribute allows the compiler to assume that the function never recurses into itself, either directly or indirectly (transitively). This can be used among other things to demote global variables to locals.
llvm-svn: 252282
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.
For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.
This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.
Since this is an IR change, a bitcode upgrade has been provided.
Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.
Differential Revision: http://reviews.llvm.org/D14265
llvm-svn: 252219
Summary:
Data operands of a call or invoke consist of the call arguments, and
the bundle operands associated with the `call` (or `invoke`)
instruction. The motivation for this change is that we'd like to be
able to query "argument attributes" like `readonly` and `nocapture`
for bundle operands naturally.
This change also provides a conservative "implementation" for these
attributes for any bundle operand, and an extension point for future
work.
Reviewers: chandlerc, majnemer, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14305
llvm-svn: 252077
Summary:
This is intended to make a later change simpler.
Note: adding this bounds checking required fixing `X86FastISel`. As
far I can tell I've preserved original behavior but a careful review
will be appreciated.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14304
llvm-svn: 252073
XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) )
This patch adds tablegen pattern matching for this instruction.
Differential Revision: http://reviews.llvm.org/D8841
llvm-svn: 251975
Summary:
An unsigned comparision is equivalent to is corresponding signed version
if both the operands being compared are positive. Teach SCEV to use
this fact when profitable.
Reviewers: atrick, hfinkel, reames, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13687
llvm-svn: 251051
Summary: This will be used in a future change to ScalarEvolution.
Reviewers: hfinkel, reames, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13612
llvm-svn: 250975
Summary:
This makes attribute accessors on `CallInst` and `InvokeInst` do the
(conservatively) right thing. This essentially involves, in some
cases, *not* falling back querying the attributes on the called
`llvm::Function` when operand bundles are present.
Attributes locally present on the `CallInst` or `InvokeInst` will still
override operand bundle semantics. The LangRef has been amended to
reflect this. Note: this change does not do anything prevent
`-function-attrs` from inferring `CallSite` local attributes after
inspecting the called function -- that will be done as a separate
change.
I've used `-adce` and `-early-cse` to test these changes. There is
nothing special about these passes (and they did not require any
changes) except that they seemed be the easiest way to write the tests.
This change does not add deal with `argmemonly`. That's a later change
because alias analysis requires a related fix before `argmemonly` can be
tested.
Reviewers: reames, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13961
llvm-svn: 250973
Stop converting implicitly between iterators and pointers/references in
lib/IR. For convenience, I've added a `getIterator()` accessor to
`ilist_node` so that callers don't need to know how to spell the
iterator class (i.e., they can use `X.getIterator()` instead of
`Function::iterator(X)`).
I'll eventually disallow these implicit conversions entirely, but
there's a lot of code, so it doesn't make sense to do it all in one
patch. One library or so at a time.
Why? To root out cases of `getNextNode()` and `getPrevNode()` being
used in iterator logic. The design of `ilist` makes that invalid when
the current node could be at the back of the list, but it happens to
"work" right now because of a bug where those functions never return
`nullptr` if you're using a half-node sentinel. Before I can fix the
function, I have to remove uses of it that rely on it misbehaving.
(Maybe the function should just be deleted anyway? But I don't want
deleting it -- potentially a huge project -- to block fixing
ilist/iplist.)
llvm-svn: 249782
This is to enable me to address review for D13491 -- `Flags` is a
bitfield of `StatepointFlags`, not an individual item out of the enum,
so it should be represented as an `uint32_t`.
llvm-svn: 249778
Create `SymbolTableList`, a wrapper around `iplist` for lists that
automatically manage a symbol table. This commit reduces a ton of code
duplication between the six traits classes that were used previously.
As a drive by, reduce the number of template parameters from 2 to 1 by
using a SymbolTableListParentType metafunction (I originally had this as
a separate commit, but it touched most of the same lines so I squashed
them).
I'm in the process of trying to remove the UB in `createSentinel()` (see
the FIXMEs I added for `ilist_embedded_sentinel_traits` and
`ilist_half_embedded_sentinel_traits`). My eventual goal is to separate
the list logic into a base class layer that knows nothing about (and
isn't templated on) the downcasted nodes -- removing the need to invoke
UB -- but for now I'm just trying to get a handle on all the current use
cases (and cleaning things up as I see them).
Besides these six SymbolTable lists, there are two others that use the
addNode/removeNode/transferNodes() hooks: the `MachineInstruction` and
`MachineBasicBlock` lists. Ideally there'll be a way to factor these
hooks out of the low-level API entirely, but I'm not quite there yet.
llvm-svn: 249602
Summary:
This adds some more routines to `IRBuilder` around creating calls and
invokes to `gc.statepoint`. These will be used later.
Reviewers: reames, swaroop.sridhar
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13371
llvm-svn: 249596
No classes are specializing the symbol table traits, so no need to look
through a typedef for class API. Make a few more functions private
since only SymbolTableListTraits should be using them.
llvm-svn: 249476
The only specializations of `getSymTab()` were identical to the default
defined in `SymbolTableListTraits::getSymTab()`. Remove the
specializations, and stop treating it like a configuration point. Just
to be sure no one else accesses this, make it private.
llvm-svn: 249469
Factor out some common code used to get+set function prefix/prologue
data. This may come in handy if we ever decide to store personality
functions in the same way we store prefix/prologue data.
Differential Revision: http://reviews.llvm.org/D13120
Reviewed-by: bogner
llvm-svn: 249460
Summary:
The bitcode format is described in this document:
https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view
For more info on ThinLTO see:
https://sites.google.com/site/llvmthinlto
The first customer is ThinLTO, however the data structures are designed
and named more generally based on prior feedback. There are a few
comments regarding how certain interfaces are used by ThinLTO, and the
options added here to gold currently have ThinLTO-specific names as the
behavior they provoke is currently ThinLTO-specific.
This patch includes support for generating per-module function indexes,
the combined index file via the gold plugin, and several tests
(more are included with the associated clang patch D11908).
Reviewers: dexonsmith, davidxl, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13107
llvm-svn: 249270
This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234],
vst[234]lane ARM neon intrinsics and associates an address space with the
pointer that these intrinsics take. This changes, e.g.,
<2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32)
to
<2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32)
This change ensures that address spaces are fully taken into account in the ARM
target during lowering of interleaved loads and stores.
Differential Revision: http://reviews.llvm.org/D12985
llvm-svn: 248887
HHVM calling convention, hhvmcc, is used by HHVM JIT for
functions in translated cache. We currently support LLVM back end to
generate code for X86-64 and may support other architectures in the
future.
In HHVM calling convention any GP register could be used to pass and
return values, with the exception of R12 which is reserved for
thread-local area and is callee-saved. Other than R12, we always
pass RBX and RBP as args, which are our virtual machine's stack pointer
and frame pointer respectively.
When we enter translation cache via hhvmcc function, we expect
the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed
to standard ABI alignment. This affects stack object alignment and stack
adjustments for function calls.
One extra calling convention, hhvm_ccc, is used to call C++ helpers from
HHVM's translation cache. It is almost identical to standard C calling
convention with an exception of first argument which is passed in RBP
(before we use RDI, RSI, etc.)
Differential Revision: http://reviews.llvm.org/D12681
llvm-svn: 248832
When llvm declarations have argument names, it's helpful to actually
print those names when debugging. Arguably, it'd be nice to print them
all the time, but that would mean the IR we output wouldn't round trip
through bitcode, which doesn't store the names.
Make the varous print() methods in AsmWriter optionally print "for
debug" and set that flag in the dump() methods. The only thing this
does differently for now is print the argument names in declarations.
llvm-svn: 248692
Summary:
This also adds the first set of tests for operand bundles.
The optimizer has not been audited to ensure that it does the right
thing with operand bundles.
Depends on D12456.
Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner
Subscribers: maksfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D12457
llvm-svn: 248551
Summary:
This change teaches `CallInst`s and `InvokeInst`s to maintain a set of
operand bundles as part of its operands. `CallInst`s and `InvokeInst`s
with operand bundles co-allocate some space before their `Use` array to
hold meta information about which of its operands are part of an operand
bundle.
The strings corresponding to the bundle tags are interned into
`LLVMContextImpl::BundleTagCache`
This change does not include any parsing / bitcode support. That's the
next change.
Depends on D12455.
Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner
Subscribers: MatzeB, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12456
llvm-svn: 248527
Summary:
With this change, subclasses of `llvm::User` will be able to co-allocate
a variable number of bytes (called a "descriptor") with the `llvm::User`
instance. The co-allocated descriptor can later be accessed using
`llvm::User::getDescriptor`. This will be used in later changes to
implement operand bundles.
This change steals one bit from `NumUserOperands`, but given that it is
still 28 bits wide I don't think this will be a practical issue.
This change does not allow allocating hung off uses with descriptors.
This only for simplicity, not for any fundamental reason; and we can
easily add this functionality later if needed.
Reviewers: reames, chandlerc, dexonsmith, kmod, majnemer, pete, JosephTremoulet
Subscribers: pete, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12455
llvm-svn: 248453
Patch by: simoncook
Unlike BitCasts, AddrSpaceCasts do not always produce an output the same size as its input, which was previously assumed. This fixes cases where two address spaces do not have the same size pointer, as an assertion failure would occur when trying to prove deferenceability. LoopUnswitch is used in the particular test, but LICM also exhibits the same problem.
Differential Revision: http://reviews.llvm.org/D13008
llvm-svn: 248422
This patches removes the x86.sse41.pmovsx* intrinsics, provides a suitable upgrade path and updates relevant tests to sign extend a subvector instead.
LLVM counterpart to D12835
Differential Revision: http://reviews.llvm.org/D13002
llvm-svn: 248368
Because mod is always exact, this function should have never taken a rounding mode argument. The actual implementation still has issues, which I'll look at resolving in a subsequent patch.
llvm-svn: 248195
This is needed by all GlobalObjects (GlobalAlias, Function,
GlobalVariable), see the GlobalObject::getValueType which is used in
many places. If at some point that can be removed, then we can remove
this member.
llvm-svn: 247621
This was a flawed change - it just caused the getElementType call to be
deferred until later, when we really need to remove it. Now that the IR
for GlobalAliases has been updated, the root cause is addressed that way
instead and this change is no longer needed (and in fact gets in the way
- because we want to pass the pointee type directly down further).
Follow up patches to push this through GlobalValue, bitcode format, etc,
will come along soon.
This reverts commit 236160.
llvm-svn: 247585
Summary: This fixes a variety of typos in docs, code and headers.
Subscribers: jholewinski, sanjoy, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12626
llvm-svn: 247495
The rest of the EH pads are fine, since they have at most one label and
take fewer operands for the personality.
Old catchpad vs. new:
%5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)] to label %__except.ret.10 unwind label %catchendblock.9
-----
%5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)]
to label %__except.ret.10 unwind label %catchendblock.9
llvm-svn: 247433
This brings a warning.
cl : Command line warning D9035: option 'Og-' has been deprecated and will be removed in a future release
We should resolve PR11951 to remove this tweak.
llvm-svn: 247427
Fix embarrassing bugs I introduced to the `SlotTracker` in or around
r235785. I had us iterating through every instruction in a function
(and hitting a map in the LLVMContext) for every basic block in the
function.
While there, completely avoid the call to
`SlotTracker::processFunctionMetadata()` from
`SlotTracker::processFunction()` if we've speculatively done this
already in `SlotTracker::processModule()` by checking
`ShouldInitializeAllMetadata` (this wasn't an algorithmic problem, but
it's touching the same line of code).
Fixes PR24699.
llvm-svn: 247372
The exact semantics of 'catchpad' are really in the hands of the
personality routine so we shouldn't assume that they have no side
effects.
llvm-svn: 247322
don't correctly implement the scoping rules of C++11 range based for
loops. This kind of aliasing isn't a good idea anyways (and wasn't
really intended).
llvm-svn: 247241
manager to avoid a slow linear scan of every immutable pass and on every
attempt to find an analysis pass.
This speeds up 'check-llvm' on an unoptimized build for me by 15%, YMMV.
It should also help (a tiny bit) other folks that are really
bottlenecked on repeated runs of tiny pass pipelines across small IR
files.
llvm-svn: 247240
Currently this hits an assert that extload should
always be supported, which assumes integer extloads.
This moves a hack out of SI's argument lowering and
is covered by existing tests.
llvm-svn: 247113
Summary:
Fixes bug 24646. Previous code was not checking if an index into a vector
was valid, resulting in a SEGV. Fixed by assuming the construct can't
be parsed when given this input.
Reformat and add test.
Differential Revision: http://reviews.llvm.org/D12539
llvm-svn: 246774
Fixes bug 24646. Previous code was not checking if an index into a vector
was valid, resulting in a SEGV. Fixed by assuming the construct can't
be parsed when given this input.
llvm-svn: 246773
Summary:
This intrinsic can be used to extract a pointer to the exception caught by
a given catchpad. Its argument has token type and must be a `catchpad`.
Also clarify ExtendingLLVM documentation regarding overloaded intrinsics.
Reviewers: majnemer, andrew.w.kaylor, sanjoy, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12533
llvm-svn: 246752
Summary:
Add a `cleanupendpad` instruction, used to mark exceptional exits out of
cleanups (for languages/targets that can abort a cleanup with another
exception). The `cleanupendpad` instruction is similar to the `catchendpad`
instruction in that it is an EH pad which is the target of unwind edges in
the handler and which itself has an unwind edge to the next EH action.
The `cleanupendpad` instruction, similar to `cleanupret` has a `cleanuppad`
argument indicating which cleanup it exits. The unwind successors of a
`cleanuppad`'s `cleanupendpad`s must agree with each other and with its
`cleanupret`s.
Update WinEHPrepare (and docs/tests) to accomodate `cleanupendpad`.
Reviewers: rnk, andrew.w.kaylor, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12433
llvm-svn: 246751
We used to accept (and even test, and generate) 16-byte alignment
for 32-byte nontemporal stores, but they require 32-byte alignment,
per SDM. Found by inspection.
Instead of hardcoding 16 in the patfrag, check for natural alignment.
Also fix the autoupgrade and the various tests.
Also, use explicit -mattr instead of -mcpu: I stared at the output
several minutes wondering why I get 2x movntps for the unaligned
case (which is the ideal output, but needs some work: see FIXME),
until I remembered corei7-avx implies +slow-unaligned-mem-32.
llvm-svn: 246733
Function::print isn't interestingly different from Value::print. Just
let the only caller (in PrintCallGraphPass) call the Value version.
llvm-svn: 246720
This patch defines 'unpredictable' metadata. This metadata can be used to signal to the optimizer
or backend that a branch or switch is unpredictable, and therefore, it's probably better to not
split a compound predicate into multiple branches such as in CodeGenPrepare::splitBranchCondition().
This was discussed in:
https://llvm.org/bugs/show_bug.cgi?id=23827
Dependent patches to alter codegen and expose this in clang to follow.
Differential Revision; http://reviews.llvm.org/D12341
llvm-svn: 246688
Summary:
Add the necessary plumbing so that llvm_token_ty can be used as an
argument/return type in intrinsic definitions and correspondingly require
TokenTy in function types. TokenTy is an opaque type that has no target
lowering, but can be used in machine-independent intrinsics. It is
required for the upcoming llvm.eh.padparam intrinsic.
Reviewers: majnemer, rnk
Subscribers: stoklund, llvm-commits
Differential Revision: http://reviews.llvm.org/D12532
llvm-svn: 246651
This would have suppressed bug 24578, about use-after-
destroy on User and MDNode. Rolled back suppression for
the sake of code cleanliness, in preferance for bug
tracking to keep track of this issue.
This reverts commit 6ff2baabc4625d5b0a8dccf76aa0f72d930ea6c0.
llvm-svn: 246484
This fixes PR24621 and matches what we do for `DILocation`. Although
the limit seems somewhat artificial, there are places in the backend
that also assume 16-bit columns, so we may as well just be consistent
about the limits.
llvm-svn: 246349
Add `Function::setSubprogram()` and `Function::getSubprogram()`,
convenience methods to forward to `setMetadata()` and `getMetadata()`,
respectively, and deal in `DISubprogram` instead of `MDNode`.
Also add a verifier check to enforce that `!dbg` attachments are always
subprograms.
Originally (when I had the llvm-dev discussion back in April) I thought
I'd store a pointer directly on `llvm::Function` for these attachments
-- we frequently have debug info, and that's much cheaper than using map
in the context if there are no other function-level attachments -- but
for now I'm just using the generic infrastructure. Let's add the extra
complexity only if this shows up in a profile.
llvm-svn: 246339
Currently the DWARF backend requires that subprograms have a type, and
the type is ignored if it has an empty type array. The long term
direction here -- see PR23079 -- is instead to skip the type entirely if
there's no valid type.
It turns out we have cases in tree of missing types on subprograms, but
since they're not referenced by compile units, the backend never crashes
on them. One option would be to add a Verifier check that subprograms
have types, and fix the bitrot. However, this is a fair bit of churn
(20-30 testcases) that would be reversed anyway by PR23079.
I found this inconsistency because of a WIP patch and upgrade script for
PR23367 that started crashing on test/DebugInfo/2010-10-01-crash.ll.
This commit updates the testcase to reference the subprogram from the
compile unit, and fixes the resulting crash (in line with the direction
of PR23079). This also updates `DIBuilder` to stop assuming a non-null
pointer for the subroutine types.
llvm-svn: 246333
As a follow-up to r246098, require `DISubprogram` definitions
(`isDefinition: true`) to be 'distinct'. Specifically, add an assembler
check, a verifier check, and bitcode upgrading logic to combat testcase
bitrot after the `DIBuilder` change.
While working on the testcases, I realized that
test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore. Its
purpose was to check for a corner case in PR22792 where two subprogram
definitions match exactly and share the same metadata node. The new
verifier check, requiring that subprogram definitions are 'distinct',
precludes that possibility.
I updated almost all the IR with the following script:
git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' |
grep -v test/Bitcode |
xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/'
Likely some variant of would work for out-of-tree testcases.
llvm-svn: 246327
Change `DIBuilder` always to produce 'distinct' nodes when creating
`DISubprogram` definitions. I measured a ~5% memory improvement in the
link step (of ld64) when using `-flto -g`.
`DISubprogram`s are used in two ways in the debug info graph.
Some are definitions, point at actual functions, and can't really be
shared between compile units. With full debug info, these point down at
their variables, forming uniquing cycles. These uniquing cycles are
expensive to link between modules, since all unique nodes that reference
them transitively need to be duplicated (see commit message for r244181
for more details).
Others are declarations, primarily used for member functions in the type
hierarchy. Definitions never show up there; instead, a definition
points at its corresponding declaration node.
I started by making all subprograms 'distinct'. However, that was too
big a hammer: memory usage *increased* ~5% (net increase vs. this patch
of ~10%) because the 'distinct' declarations undermine LTO type
uniquing. This is a targeted fix for the definitions (where uniquing is
an observable problem).
A couple of notes:
- There's an accompanying commit to update IRGen testcases in clang.
- ^ That's what I'm using to test this commit.
- In a follow-up, I'll change the verifier to require 'distinct' on
definitions and add an upgrade to `BitcodeReader`.
llvm-svn: 246098
Summary:
WinEHPrepare is going to require that cleanuppad and catchpad produce values
of token type which are consumed by any cleanupret or catchret exiting the
pad. This change updates the signatures of those operators to require/enforce
that the type produced by the pads is token type and that the rets have an
appropriate argument.
The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and
similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that
restriction, this change adds a notion of an operator constraint to both
LLParser and BitcodeReader, allowing appropriate sentinels to be constructed
for forward references and appropriate error messages to be emitted for
illegal inputs.
Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad
predecessor must have no other predecessors; this ensures that WinEHPrepare
will see the expected linear relationship between sibling catches on the
same try.
Lastly, remove some superfluous/vestigial casts from instruction operand
setters operating on BasicBlocks.
Reviewers: rnk, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12108
llvm-svn: 245797
There was already a good error path for this. Added a test for it & made
a minor code change to ensure the error path was actually reached,
rather than crashing before we got that far.
llvm-svn: 245795
Gets a bit tricky in the ValueMapper, of course - not sure if we should
just expose a list of explicit types for each Value so that the
ValueMapper can be neutral to these special cases (it's OK for things
like load, where the explicit type is the result type - but when that's
not the case, it means plumbing through another "special" type... )
llvm-svn: 245728
David Majnemer (the original author) believes this to be an impossible
condition to reach anyway, and no test cases cover this so we'll go with
that.
llvm-svn: 245712
and make it always preserve debug locations, since all callers wanted this
behavior anyway.
This is addressing a post-commit review feedback for r245589.
NFC (inside the LLVM tree).
llvm-svn: 245622
Since r245605, the clang headers don't use these anymore.
r245165 updated some of the tests already; update the others, add
an autoupgrade, remove the intrinsics, and cleanup the definitions.
Differential Revision: http://reviews.llvm.org/D10555
llvm-svn: 245606
Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all
metadata in KnownSet, but the condition for DebugLocs was inverted.
Most users of dropUnknownMetadata() actually worked around this by not
adding LLVMContext::MD_dbg to their list of KnowIDs.
This is now made explicit.
llvm-svn: 245589
without *requiring* it.
This allows a pass indicate that it will use an analysis if available
(through getAnalysisIfAvailable). When the pass manager knows this, it
will refrain from deleting that analysis if it can. Naturally, it will
still get invalidated at the correct time. These passes are not
considered when scheduling the pass pipeline, so typically they will
require manual scheduling, but this may also allow passes with
getAnalysisIfAvailable to find the analysis more often if nothing after
them requires that analysis and it wasn't invalidated.
I don't have a particular use case with the current passes, but with my
new structure for alias analyses, this will be very useful. We want to
allow people to customize the set of AAs available by scheduling
additional passes. These's aren't ever *required* for obvious reasons.
So we need some way to mark in the legacy pass manager that they will
still be used if available.
This is essentially how analysis groups already work. But this makes the
feature generally available and more explicit. It should allow the AA
change to not impact how people trigger a custom alias analysis being
available at a certain point in compilation.
Differential Revision: http://reviews.llvm.org/D12114
llvm-svn: 245409
Some personality routines require funclet exit points to be clearly
marked, this is done by producing a token at the funclet pad and
consuming it at the corresponding ret instruction. CleanupReturnInst
already had a spot for this operand but CatchReturnInst did not.
Other personality routines don't need to use this which is why it has
been made optional.
llvm-svn: 245149
This introduces the basic functionality to support "token types".
The motivation stems from the need to perform operations on a Value
whose provenance cannot be obscured.
There are several applications for such a type but my immediate
motivation stems from WinEH. Our personality routine enforces a
single-entry - single-exit regime for cleanups. After several rounds of
optimizations, we may be left with a terminator whose "cleanup-entry
block" is not entirely clear because control flow has merged two
cleanups together. We have experimented with using labels as operands
inside of instructions which are not terminators to indicate where we
came from but found that LLVM does not expect such exotic uses of
BasicBlocks.
Instead, we can use this new type to clearly associate the "entry point"
and "exit point" of our cleanup. This is done by having the cleanuppad
yield a Token and consuming it at the cleanupret.
The token type makes it impossible to obscure or otherwise hide the
Value, making it trivial to track the relationship between the two
points.
What is the burden to the optimizer? Well, it turns out we have already
paid down this cost by accepting that there are certain calls that we
are not permitted to duplicate, optimizations have to watch out for
such instructions anyway. There are additional places in the optimizer
that we will probably have to update but early examination has given me
the impression that this will not be heroic.
Differential Revision: http://reviews.llvm.org/D11861
llvm-svn: 245029
This patch and a relatec clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwasyPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>’. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints.
llvm-svn: 244555
This patch moves checking the threshold of runtime pointer checks to the vectorization requirements (late diagnostics) and emits a diagnostic that infroms the user the loop would be vectorized if not for exceeding the pointer-check threshold. Clang will also append the options that can be used to allow vectorization.
llvm-svn: 244523
This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options would that allow floating-point commutativity. Specifically those are enableing fast-math or specifying a loop hint.
llvm-svn: 244489
After r244074, we now have a successors() method to iterate over
all the successors of a TerminatorInst. This commit changes a bunch
of eligible loops to use it.
llvm-svn: 244260
This is intended to help support the idiom of a class that has some
other objects (or multiple arrays of different types of objects)
appended on the end, which is used quite heavily in clang.
Differential Revision: http://reviews.llvm.org/D11272
llvm-svn: 244164
Summary:
Emit both DWARF and CodeView if "CodeView" and "Dwarf Version" module
flags are set.
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11756
llvm-svn: 244158
As documented in the LLVM Coding Standards, indeed MSVC incorrectly asserts
on this in Debug mode. This happens when building clang with Visual C++ and
-triple i686-pc-windows-gnu on these clang regression tests:
clang/test/CodeGen/2011-03-08-ZeroFieldUnionInitializer.c
clang/test/CodeGen/empty-union-init.c
llvm-svn: 243996
This change was done as an audit and is by inspection. The new EH
system is still very much a work in progress. NFC for the landingpad
case.
llvm-svn: 243965
Summary: This patch adds enum value for an existing metadata type -- make.implicit. Using preassigned enum will be helpful to get compile time type checking and avoid string construction and comparison. The patch also changes uses of make.implicit from string metadata to enum metadata. There is no functionality change.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11698
llvm-svn: 243954
contained types into the space when we have no contained types. This
fixes the UB stemming from a call to memcpy with a null pointer. This
also reduces the calls to allocate because this actually happens in
a notable client - Clang.
Found by UBSan.
llvm-svn: 243944
Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s.
The backend is liable to start relying on that (if it hasn't already),
so make uniquable `DICompileUnit`s illegal and automatically upgrade old
bitcode. This is a nice cleanup, since we can remove an unnecessary
`DenseSet` (and the associated uniquing info) from `LLVMContextImpl`.
Almost all the testcases were updated with this script:
git grep -e '= !DICompileUnit' -l -- test |
grep -v test/Bitcode |
xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,'
I imagine something similar should work for out-of-tree testcases.
llvm-svn: 243885
Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags,
using `DW_TAG_variable` in their place Stop exposing the `tag:` field at
all in the assembly format for `DILocalVariable`.
Most of the testcase updates were generated by the following sed script:
find test/ -name "*.ll" -o -name "*.mir" |
xargs grep -l 'DILocalVariable' |
xargs sed -i '' \
-e 's/tag: DW_TAG_arg_variable, //' \
-e 's/tag: DW_TAG_auto_variable, //'
There were only a handful of tests in `test/Assembly` that I needed to
update by hand.
(Note: a follow-up could change `DILocalVariable::DILocalVariable()` to
set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable`
(as appropriate), instead of having that logic magically in the backend
in `DbgVariable`. I've added a FIXME to that effect.)
llvm-svn: 243774
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support. Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.
Differential Revision: http://reviews.llvm.org/D11097
llvm-svn: 243766
Replace the general `createLocalVariable()` with two more specific
functions: `createParameterVariable()` and `createAutoVariable()`, and
rewrite the documentation.
Besides cleaning up the API, this avoids exposing the fake DWARF tags
`DW_TAG_arg_variable` and `DW_TAG_auto_variable` to frontends, and is
preparation for removing them completely.
llvm-svn: 243764
Summary:
As added initially, statepoints required their call targets to be a
constant pointer null if ``numPatchBytes`` was non-zero. This turns out
to be a problem ergonomically, since there is no way to mark patchable
statepoints as calling a (readable) symbolic value.
This change remove the restriction of requiring ``null`` call targets
for patchable statepoints, and changes PlaceSafepoints to maintain the
symbolic call target through its transformation.
Reviewers: reames, swaroop.sridhar
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11550
llvm-svn: 243502
As a stop-gap, retrieving the InlineAsm's function type was done via the
pointee type of its (pointer) Value type.
Instead, pass down and store the FunctionType in the InlineAsm object.
The only wrinkle with this is the ConstantUniqueMap, which then needs to
ferry the FunctionType down through the InlineAsmKeyType. This could be
done a bit differently if the ConstantInfo trait were broadened a bit to
provide an extension point for access to the TypeClass object from the
ValType objects, so that the ConstantUniqueMap<InlineAsm> would then be
keyed on FunctionTypes instead of PointerTypes that point to
FunctionTypes.
This drops the number of IR tests that don't roundtrip through bitcode*
without calling PointerType::getElementType from 416 to 8 (out of
10733). 3 of those crash when roundtripping at ToT anyway.
* modulo various unavoidable uses of pointer types when validating IR
(for now) and in the way globals are parsed, unfortunately. These
cases will either go away (because such validation will no longer be
necessary or possible when pointee types are opaque), or have to be
made simultaneously with the removal of pointee types.
llvm-svn: 243356
This commit publicly exposes the method 'getLocalSlot' in the
'ModuleSlotTracker' class.
This change is useful for MIR serialization, to serialize the unnamed basic
block and unnamed alloca references.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243336
This reverts commit r243135.
Feedback from Craig Topper and David Blaikie was that we don't put const on Type as it has no mutable state.
llvm-svn: 243283
Add a verifier check that `DILocalVariable`s of tag
`DW_TAG_arg_variable` always have a non-zero 'arg:' field, and those of
tag `DW_TAG_auto_variable` always have a zero 'arg:' field. These are
the only configurations that are properly understood by the backend.
(Also, fix the bad examples in LangRef and test/Assembler, and fix the
bug in Kaleidoscope Ch8.)
A large number of testcases seem to have bitrotted their way forward
from some ancient version of the debug info hierarchy that didn't have
`arg:` parameters. If you have out-of-tree testcases that start failing
in the verifier and you don't care enough to get the `arg:` right, you
may have some luck just calling:
sed -e 's/, arg: 0/, arg: 1/'
or some such, but I hand-updated the ones in tree.
llvm-svn: 243183
Instead of the pattern
for (auto I = x.rbegin(), E = x.end(); I != E; ++I)
we can use make_range to construct the reverse range and iterate using
that instead.
llvm-svn: 243163
Remove unnecessary and confusing common base class for `DICompositeType`
and `DISubroutineType`.
While at a high-level `DISubroutineType` is a sort of composite of other
types, it has no shared code paths, and its fields are completely
disjoint. This relationship was left over from the old debug info
hierarchy.
llvm-svn: 243160
Handle `DISubroutineType` up-front rather than as part of a branch for
`DICompositeTypeBase`. The only shared code path was looking through
the base type, but `DISubroutineType` can never have a base type.
This also removes the last use of `DICompositeTypeBase`, since we can
strengthen the cast to `DICompositeType`.
llvm-svn: 243159
Remove an unnecessary (and confusing) common subclass for
`DIDerivedType` and `DICompositeType`. These classes aren't really
related, and even in the old debug info hierarchy, there was a
long-standing FIXME to separate them.
llvm-svn: 243152
We really only want to check this for unions and classes (all the other
tags have been ruled out), so simplify the check and move it to the
right place.
llvm-svn: 243150
Remove unnecessary references to `DW_TAG_subroutine_type` in
`visitDICompositeType()` and `visitDIDerivedTypeBase()`, since
`visitDISubroutineType()` doesn't call either of those (and shouldn't,
since subroutine types are really quite special).
llvm-svn: 243149
The surrounding code proves in both cases that these must be
`DIDerivedType` if they're `DIDerivedTypeBase`, so strengthen the
`dyn_cast`s to the more specific type.
llvm-svn: 243143
Almost all methods in DataLayout took mutable pointers but didn't need to.
These were only accessing constant methods of the types, or using the Type*
to key a map. Neither of these needs a mutable pointer.
llvm-svn: 243135
This commit extracts the code that prints out a name of an LLVM value without a
prefix from a function 'PrintLLVMName' into a publicly accessible function named
'printLLVMNameWithoutPrefix'.
This change would be useful for MIR serialization, as it would allow the MIR
printer to reuse this function to print out the names of the external symbol
machine operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242803
Revert the changes to the C API LLVMBuildLandingPad that were part of
the personality function move. We now set the personality on the parent
function when the C API attempts to construct a landingpad with a
personality.
This reverts commit r240010.
llvm-svn: 242372
This is a necessary prerequisite for bootstrapping the emission
of debug info inside modules.
- Adds a FlagExternalTypeRef to DICompositeType.
External types must have a unique identifier.
- External type references are emitted using a forward declaration
with a DW_AT_signature([DW_FORM_ref_sig8]) based on the UID.
http://reviews.llvm.org/D9612
llvm-svn: 242302
Summary:
The capability was lost with D10429 where the personality function was set at function level rather than landing pad level. Now there is no way to get/set the personality function from the C API. That is a problem.
Note that the whole thing could be avoided by improving the C API testing, as started by D10725
Reviewers: chandlerc, bogner, majnemer, andrew.w.kaylor, rafael, rnk, axw
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D10946
llvm-svn: 242104
This change adds new attribute called "argmemonly". Function marked with this attribute can only access memory through it's argument pointers. This attribute directly corresponds to the "OnlyAccessesArgumentPointees" ModRef behaviour in alias analysis.
Differential Revision: http://reviews.llvm.org/D10398
llvm-svn: 241979
Summary:
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support. Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.
Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11041
llvm-svn: 241888
The justification of this change is here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082989.html
According to the current GEP syntax, vector GEP requires that each index must be a vector with the same number of elements.
%A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
In this implementation I let each index be or vector or scalar. All vector indices must have the same number of elements. The scalar value will mean the splat vector value.
(1) %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
or
(2) %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
In all cases the %A type is <4 x i8*>
In the case (2) we add the same offset to all pointers.
The case (1) covers C[B[i]] case, when we have the same base C and different offsets B[i].
The documentation is updated.
http://reviews.llvm.org/D10496
llvm-svn: 241788
Summary:
Initially, these intrinsics seemed like part of a family of "frame"
related intrinsics, but now I think that's more confusing than helpful.
Initially, the LangRef specified that this would create a new kind of
allocation that would be allocated at a fixed offset from the frame
pointer (EBP/RBP). We ended up dropping that design, and leaving the
stack frame layout alone.
These intrinsics are really about sharing local stack allocations, not
frame pointers. I intend to go further and add an `llvm.localaddress()`
intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being
used to address locals, which should not be confused with the frame
pointer.
Naming suggestions at this point are welcome, I'm happy to re-run sed.
Reviewers: majnemer, nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11011
llvm-svn: 241633
This reverts commit r241602. We had a latent bug in SCCP where we would
make a basic block empty and then proceed to ask questions about it's
terminator.
llvm-svn: 241616
getFirstNonPHI's documentation states that it returns null if there is
no non-PHI instruction. However, it instead returns a pointer to the
end iterator. The implementation of getFirstNonPHI claims that
dereferencing the iterator will result in an assertion failure but this
doesn't occur. Instead, machinery like getFirstInsertionPt will attempt
to isa<> this invalid memory which results in unpredictable behavior.
Instead, make getFirst* return null if no such instruction exists.
llvm-svn: 241570
Summary:
Looking at r241279, I noticed that UpgradedIntrinsics only gets written
to in the following code:
if (UpgradeIntrinsicFunction(&F, NewFn))
UpgradedIntrinsics[&F] = NewFn;
Looking through UpgradeIntrinsicFunction, we always return false OR
NewFn will be set to a different function from our source.
This patch pulls the F != NewFn into UpgradeIntrinsicFunction as an
assert, and removes the check from callers of UpgradeIntrinsicFunction.
Reviewers: rafael, chandlerc
Subscribers: llvm-commits-list
Differential Revision: http://reviews.llvm.org/D10915
llvm-svn: 241369
It is meant to be used to record modules @imported by the current
compile unit, so a debugger an import the same modules to replicate this
environment before dropping into the expression evaluator.
DIModule is a sibling to DINamespace and behaves quite similarly.
In addition to the name of the module it also records the module
configuration details that are necessary to uniquely identify the module.
This includes the configuration macros (e.g., -DNDEBUG), the include path
where the module.map file is to be found, and the isysroot.
The idea is that the backend will turn this into a DW_TAG_module.
http://reviews.llvm.org/D9614
rdar://problem/20965932
llvm-svn: 241017
Allow callers of `Value::print()` and `Metadata::print()` to pass in a
`ModuleSlotTracker`. This allows them to pay only once for calculating
module-level slots (such as Metadata).
This is related to PR23865, where there was a huge cost for
`MachineFunction::print()`. Although I don't have a *particular* user
in mind for this new code, I have hit big slowdowns before when running
`opt -debug`, and I think this will be useful. Going forward, if
someone hits a big slowdown with `print()` statements, they can create a
`ModuleSlotTracker` and send it through. Similarly, adding support to
`Value::dump()` and `Metadata::dump()` should be trivial.
I added unit tests to be sure the `print()` functions actually behave
the same way with and without the slot tracker.
llvm-svn: 240867
For another 1% speedup on the testcase in PR23865, push the
`ModuleSlotTracker` through to metadata-related printing in
`MachineBasicBlock::print()`.
llvm-svn: 240848
Expose enough of the IR-level `SlotTracker` so that
`MachineFunction::print()` can use a single one for printing
`BasicBlock`s. Next step would be to lift this through a few more APIs
so that we can make other print methods faster.
Fixes PR23865, changing the runtime of `llc -print-machineinstrs` from
many minutes (killed after 3 minutes, but it wasn't very close) to
13 seconds for a 502185 line dump.
llvm-svn: 240842
We support invoking a subset of llvm's intrinsics, but the verifier didn't account for this. We had previously added a special case to verify invokes of statepoints. By generalizing the code in terms of CallSite, we can verify invokes of other intrinsics as well. Interestingly, this found one test case which was invalid.
Note: I'm deliberately leaving the naming change from CI to CS to a follow up change. That will happen shortly, I just wanted to reduce the diff to make it clear what was happening with this one.
Differential Revision: http://reviews.llvm.org/D10118
llvm-svn: 240836