With commit r219944, InstCombine can now turn a sqrtl into a llvm.fabs.f64.
The call graph edge originally representing the call to sqrtl becomes invalid.
This patch modifies CGPassManager::RefreshCallGraph() to remove the invalid
call graph edge, which can triggers an assert in
CallGraphNode::addCalledFunction().
Phabricator Review: http://reviews.llvm.org/D7705
Patch by Lawrence Hu <lawrence@codeaurora.org>.
llvm-svn: 234902
if ((a & 0x1) == 0x1) {
..
}
In this case we don't actually have any branch probability information and
should not assume to have any. LLVM transforms this into:
%and = and i32 %a, 1
%tobool = icmp eq i32 %and, 0
So, in this case, the result of a bitwise and is compared against 0,
but nevertheless, we should not assume to have probability
information.
llvm-svn: 234898
The ARMv8 ARMARM states that for these instructions in A64 state:
"Unspecified bits in "imm5" are ignored but should be set to zero by an assembler.", (imm4 for INS).
Make the disassembler accept any encoding with these ignored bits set to 1.
llvm-svn: 234896
This pass will always try to insert llvm.SI.ifbreak intrinsics
in the same block that its conditional value is computed in. This is
a problem when conditions for breaks or continue are computed outside
of the loop, because the llvm.SI.ifbreak intrinsic ends up being inserted
outside of the loop.
This patch fixes this problem by inserting the llvm.SI.ifbreak
intrinsics in the loop header when the condition is computed outside
the loop.
llvm-svn: 234891
Summary:
Without this check the following case failed:
Skip a SubBlock which is not a MODULE_BLOCK_ID nor a BLOCKINFO_BLOCK_ID
Got to end of file
TheModule would still be == nullptr, and we would subsequentially fail
when materializing the Module (assert at the start of
BitcodeReader::MaterializeModule).
Bug found with AFL.
Reviewers: dexonsmith, rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9014
llvm-svn: 234887
Some targets (ie. Mips) have additional rules for ordering the relocation
table entries. Allow them to override generic sortRelocs(), which sorts
entries by Offset.
Then override this function for Mips, to emit HI16 and GOT16 relocations
against the local symbol in pair with the corresponding LO16 relocation.
Patch by Vladimir Stefanovic.
Differential Revision: http://reviews.llvm.org/D7414
llvm-svn: 234883
TargetRegisterInfo::getRegPressureLimit has a note that it is an old
model that relies on manually entered classes. Using the newer model of
register pressure sets seems more appropriate. We might eventually even
switch to lib/CodeGen/RegisterPressure.cpp, but we should probably do
incremental changes here.
Using the newer model also makes it easier to take regmasks into account
which is necessary to fix llvm.org/PR23143. I am currently also
preparing a patch for that, but would like to do this switch
independently.
Review: http://reviews.llvm.org/D8986
llvm-svn: 234880
The testcase that is included in the patch caused a crash when doing DebugInfoFinder::processModule
on the module due to DCT->getElements() returning nullptr in DebugInfoFinder::processType.
By doing "DCT->getElements()" instead of "DCT->getElements()->operands()" one gets a DIArray
instead of a raw MDTuple. The former has code to handle null as a 0-element array and
therefore avoids the crash.
Differential Revision: http://reviews.llvm.org/D9008
llvm-svn: 234875
Summary:
This transformation reassociates a n-ary add so that the add can partially reuse
existing instructions. For example, this pass can simplify
void foo(int a, int b) {
bar(a + b);
bar((a + 2) + b);
}
to
void foo(int a, int b) {
int t = a + b;
bar(t);
bar(t + 2);
}
saving one add instruction.
Fixes PR22357 (https://llvm.org/bugs/show_bug.cgi?id=22357).
Test Plan: nary-add.ll
Reviewers: broune, dberlin, hfinkel, meheff, sanjoy, atrick
Reviewed By: sanjoy, atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8950
llvm-svn: 234855
Change `DICompileUnit::replaceSubprograms()` and
`DICompileUnit::replaceGlobalVariables()` to match the `MDCompileUnit`
equivalents that they're wrapping.
llvm-svn: 234852
When a loadable (.so or .dylib) pass is built with assertions enabled and
loaded into the 'opt' tool, we need to ensure that the extra symbols that such
passes depend on are linked into the tool.
llvm-svn: 234851
Gut the `DIDescriptor` wrappers around `MDLocalScope` subclasses. Note
that `DILexicalBlock` wraps `MDLexicalBlockBase`, not `MDLexicalBlock`.
llvm-svn: 234850
Bring function documentation for ScalarEvolutionExpander up to code by
not repeating the function name in the comment documenting
functionality. Reflow the edited comments where needed.
llvm-svn: 234847
Summary:
Runtime unrolling of loops needs to emit an expression to compute the
loop's runtime trip-count. Avoid runtime unrolling if this computation
will be expensive.
Depends on D8993.
Reviewers: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8994
llvm-svn: 234846
Summary:
Teach `isHighCostExpansion` to consider divisions by power-of-two
constants as cheap and add a test case. This change is needed for a new
user of `isHighCostExpansion` that will be added in a subsequent change.
Depends on D8995.
Reviewers: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8993
llvm-svn: 234845
Summary:
Move isHighCostExpansion from IndVarSimplify to SCEVExpander. This
exposed function will be used in a subsequent change.
Reviewers: bogner, atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8995
llvm-svn: 234844
Add a few functions from `DILexicalBlock` to `MDLexicalBlockBase`,
leaving `DILexicalBlock` a simple wrapper.
IMO, the new functions (`getLine()` and `getColumn()`) don't really
belong in the base class, but to simplify transitioning old code it
seems like the right incremental step. I've explicitly deleted them in
`MDLexicalBlockFile`, and eventually the callers should be updated to
downcast to `MDLexicalBlock` directly and the forwarding functions
removed.
llvm-svn: 234842
Gut all the non-pointer API from the variable wrappers, except an
implicit conversion from `DIGlobalVariable` to `DIDescriptor`. Note
that if you're updating out-of-tree code, `DIVariable` wraps
`MDLocalVariable` (`MDVariable` is a common base class shared with
`MDGlobalVariable`).
llvm-svn: 234840
Summary:
This is the first in a series of patches to eventually add support for TLS relocations to RuntimeDyld. This patch resolves an issue in the current GOT handling, where GOT entries would be reused between object files, which leads to the same situation that necessitates the GOT in the first place, i.e. that the 32-bit offset can not cover all of the address space. Thus this patch makes the GOT object-file-local.
Unfortunately, this still isn't quite enough, because the MemoryManager does not yet guarantee that sections are allocated sufficiently close to each other, even if they belong to the same object file. To address this concern, this patch also adds a small API abstraction on top of the GOT allocation mechanism that will allow (temporarily, until the MemoryManager is improved) using the stub mechanism instead of allocating a different section. The actual switch from separate section to stub mechanism will be part of a follow-on commit, so that it can be easily reverted independently at the appropriate time.
Test Plan: Includes a test case where the GOT of two object files is artificially forced to be apart by several GB.
Reviewers: lhames
Reviewed By: lhames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8813
llvm-svn: 234839
This is along the same lines as r234832, but for `DILocation`. Clean
out all accessors from `DILocation`. Any callers should be using
`MDLocation` directly (e.g., via `operator->()`).
llvm-svn: 234835
Fix oversight in -analyze output. PtrRtCheck contains the pointers that
need to be checked against each other and not whether memchecks are
necessary.
For instance in the testcase PtrRtCheck has four elements but all
no-alias so no checking is necessary.
llvm-svn: 234833
Completely gut `DIExpression`, turning it into a simple wrapper around
`MDExpression *`. There are two bits of magic left:
- It's constructed from `const MDExpression*` but convertible to
`MDExpression*`.
- It's default-constructed to `nullptr`.
Otherwise, it should behave quite like a raw pointer. Once I've done
the same to the rest of the `DIDescriptor` subclasses, I'll come back to
delete them entirely (and update call sites as necessary to deal with
the missing magic).
llvm-svn: 234832
Before we had real liveness, we needed to track every value that base pointer
insertion code created because these now might be live. We now just rerun
the data flow liveness algorithm (which is actually faster!) and no longer
need the associated code.
llvm-svn: 234827
As documented in PR23200 (and the FIXMEs I've added to the code here),
this logic is fairly broken: it modifies the `LLVMContext` in a way that
affects other modules and cannot be serialized to assembly/bitcode. For
now, move it over to `MDLocation::computeNewDiscriminators()` anyway.
llvm-svn: 234825
I don't see a reason to add the `copyWithNewScope()` API over to
`MDLocation` -- it seems to be a holdover from when creating locations
required knowing details of operand layout -- so change
`AddDiscriminators` to call `MDLocation::get()` directly. Should be no
functionality change here.
llvm-svn: 234824
There's only one user of the various `DIObjCProperty::is*Property()`
accessors -- `DwarfUnit::constructTypeDIE()` -- and it's just using the
reverse logic to reconstruct the bitfield. Drop this API and simplify
the only caller.
llvm-svn: 234818
This keeps the program and JIT output in sync, enabling FileCheck to test the
order of target program and JIT events.
In particular we can now test that main is not compiled until after the global
constructor has run.
llvm-svn: 234815
These accessors in `DIDerivedType` should only be called when `DbgNode`
really is a `MDDerivedType`, not just a `MDDerivedTypeBase`. Assume
that it is.
llvm-svn: 234812
Combine something like:
(v8i8 concat_vectors (v2i8 bitcast (i16)) x4)
into:
(v8i8 (bitcast (v4i16 BUILD_VECTOR (i16) x4)))
If any of the scalars are floating point, use that throughout.
Differential Revision: http://reviews.llvm.org/D8948
llvm-svn: 234809
This is almost NFC, but I'm removing some assertions against `nullptr`.
The assertions aren't worth all that much since we'll typically get
segfaults at the same site (and I imagine ASan catches this sort of
thing), and besides: the whole idea is to replace the `DIDescriptor`
hierarchy with raw pointers to the new one.
llvm-svn: 234802
Instead of calling the somewhat confusingly-named
`DIVariable::isInlinedFnArgument()`, do the check directly here.
There's possibly a small functionality change here: instead of
`dyn_cast<>`'ing `DV->getScope()` to `MDSubprogram`, I'm looking up the
scope chain for the actual subprogram. I suspect that this is a no-op
for function arguments so in practise there isn't a real difference.
I've also added a `FIXME` to check the `inlinedAt:` chain instead, since
I wonder if that would be more reliable than the
`MDSubprogram::describes()` function.
Since this was the only user of `DIVariable::isInlinedFnArgument()`,
delete it.
llvm-svn: 234799
`DIGlobalVariable::getGlobal()` isn't really helpful, it just does a
`dyn_cast_or_null<>`. Simplify its only user by doing the cast directly
and delete the code.
llvm-svn: 234796
The only difference between the two is a `dyn_cast<>` to
`GlobalVariable`. If optimizations have left anything behind when a
global gets replaced, then it doesn't seem like the debug info is dead.
I can't seem to find an optimization that would leave behind a
non-`GlobalVariable` without nulling the reference entirely, so I
haven't added a testcase (but I'll be deleting `getGlobal()` in a future
commit).
llvm-svn: 234792
Use early-return style that's preferred in LLVM and updating the naming in places I touched with other changes in the last few days. Hopefully, NFC.
llvm-svn: 234785
We use dummy calls to adjust the liveness of values over statepoints in the midst of the insertion. If there are no values which need held live, there's no point in actually inserting the holder.
llvm-svn: 234779
I don't really like this function at all -- I think it should be as
simple as `return getFunction() == F` -- but for now this seems like the
best we can do.
llvm-svn: 234778
This reverts commit r234717, reapplying r234698 (in spirit).
As described in r234717, the original `Verifier` check had a
use-after-free. Instead of storing pointers to "interesting" debug info
intrinsics whose bit piece expressions should be verified once we have
typerefs, do a second traversal. I've added a testcase to catch the
`llc` crasher.
Original commit message:
Verifier: Check for incompatible bit piece expressions
Convert an assertion into a `Verifier` check. Bit piece expressions
must fit inside the variable, and mustn't be the entire variable.
Catching this in the verifier will help us find bugs sooner, and makes
`DIVariable::getSizeInBits()` dead code.
llvm-svn: 234776
Since we're restructuring the CFG, we also need to make sure to update the analsis passes. While I'm touching the code, I dedicided to restructure it a bit. The code involved here was very confusing. This change moves the normalization to essentially being a pre-pass before the main insertion work and updates a few comments to actually say what is happening and *why*.
The restructuring should be covered by existing tests. I couldn't easily see how to create a test for the invalidation bug. Suggestions welcome.
llvm-svn: 234769
Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)"
Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO"
Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB"
Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll
on hexagon, nvptx, and r600. Revert while I investigate.
llvm-svn: 234768
This is related to the issues addressed in 234651. These assertions check the properties ensured by that change at the place of use. Note that a similiar property is checked in checkBasicSSA, but without the reachability constraint. Technically, the liveness would be correct to include unreachable values, but this would be problematic for actual relocation.
llvm-svn: 234766
In case of different types used for the condition of the selects the
select(select) -> select(and) normalisation cannot be performed.
See also: http://reviews.llvm.org/D7622
llvm-svn: 234763
The check in question is attempting to help find cases where we haven't relocated a pointer at a safepoint we should have. It does this by coercing the value to null at any safepoint which doesn't relocate it.
Unfortunately, this turns out to be rather expensive in terms of memory usage and time. The number of stores inserted can grow with O(number of values x number of statepoints). On at least one example I looked at, over half of peak memory usage was coming from this check.
With this change, the check is no longer enabled by default in Asserts builds. It is enabled for expensive asserts builds and has a command line option to enable it in both Asserts and non-Asserts builds.
llvm-svn: 234761
v2: tighten the sub64 tests
v3: rename to CARRY/BORROW
v4: fixup test cmdline
add known bits computation
use sign extend instead of sub 0,x
better add test
v5: remove redundant break
move lowering to separate functions
fix comments
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewers: arsenm
llvm-svn: 234759
Summary:
The document is still incomplete in some degrees, but updated to reflect the
latest changes. Anyway we can detail it if any one think it is not enough. For
the sake of it, some useful examples are listed below:
Refer to r113618 "Add X86 MMX type to bitcode and Type" for how to add a new
type.
> One notable change from then is only one thing that ``lib/VMCore`` is renamed
to ``lib/IR``.
Refer to r194760 "Add addrspacecast instruction" for how to add a new
instruction.
Patch by Chilledheart (rwindz0@gmail.com).
Reviewed By: echristo
Differential Revision: http://reviews.llvm.org/D8897
llvm-svn: 234757
Fill in the TODO in CodeGenPrepare::OptimizeCallInst so that global
variables that are passed to memory intrinsics are aligned in the same
way that allocas are.
Differential Revision: http://reviews.llvm.org/D8421
llvm-svn: 234735
Original message.
Have one raw_fd_ostream constructor forward to the other.
This fixes some odd behaviour differences between the two. In particular,
the version that takes a FD no longer unconditionally sets stdout to binary.
llvm-svn: 234734
This reverts commit r234698.
This caused a use-after-free: `QueuedBitPieceExpressions` holds onto
references to `DbgInfoIntrinsic`s and references them past where they're
deleted (this is because the verifier is run as a function pass, and
then `verifyTypeRefs()` is called during `doFinalization()`).
I'll include a reduced crasher for `llc` when I recommit the check.
llvm-svn: 234717
Summary:
When instruction bundling is enabled and the -mc-relax-all flag is
set, we can write bundle padding directly into fragments and avoid
creating large number of fragments significantly reducing LLVM MC
memory usage.
Test Plan: Regression test attached
Reviewers: eliben
Subscribers: jfb, mseaborn
Differential Revision: http://reviews.llvm.org/D8072
llvm-svn: 234714
When I fixed these a couple of days ago to iterate over all loops, not just
depth == 1 loops, I inadvertently made it such that we'd only look at the first
top-level loop. Make sure that we really look at all of them.
llvm-svn: 234705
Clean up a predicate I added in r229731, fix the relevant comment and
add a test case. The earlier version is confusing to read and was also
buggy (probably not a coincidence) till Alexey fixed it in r233881.
llvm-svn: 234701
Change `MDSubprogram::getFunction()` and
`MDGlobalVariable::getConstant()` to return a `Constant`. Previously,
both returned `ConstantAsMetadata`.
llvm-svn: 234699
Convert an assertion into a `Verifier` check. Bit piece expressions
must fit inside the variable, and mustn't be the entire variable.
Catching this in the verifier will help us find bugs sooner, and makes
`DIVariable::getSizeInBits()` dead code.
llvm-svn: 234698
r234696 replaced the only use of `DIDescriptor::replaceAllUsesWith()`
with `DIBuilder::replaceTemporary()` (added in r234695). Delete the
dead code.
llvm-svn: 234697
Add `DIBuilder::replaceTemporary()` as a replacement for
`DIDescriptor::replaceAllUsesWith()`. I'll update clang to use the new
method, and then come back to delete the original.
This method dispatches to `replaceAllUsesWith()` or
`replaceWithUniqued()`, depending on whether the replacement is actually
a different node from the original.
llvm-svn: 234695
Continue gutting the `DIDescriptor` hierarchy. In this case, move the
guts of `DIScope::getName()` and `DIScope::getContext()` to
`MDScope::getName()` and `MDScope::getScope()`.
llvm-svn: 234691
As it turns out, even though these are part of ISA 2.06, the P7 does not
support them (or, at least, not any P7s we're tested so far).
llvm-svn: 234686
This patch corresponds to review:
http://reviews.llvm.org/D8928
It adds direct move instructions to/from VSX registers to GPR's. These are
exploited for FP <-> INT conversions.
llvm-svn: 234682
The patch is generated using clang-tidy misc-use-override check.
This command was used:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
-checks='-*,misc-use-override' -header-filter='llvm|clang' \
-j=32 -fix -format
http://reviews.llvm.org/D8925
llvm-svn: 234679
Rewrite `DILocation::atSameLineAs()` as `MDLocation::canDiscriminate()`
with a doxygen comment explaining its purpose. I've added a few FIXMEs
where I think this check is too weak; fixing that is tracked by PR23199.
llvm-svn: 234674
Add forwarding `getFilename()` and `getDirectory()` accessors to nodes
in the new hierarchy that define a `getFile()`. Use that to
re-implement existing functionality in the `DIDescriptor` hierarchy.
llvm-svn: 234671
This pass had the same problem as the data-prefetching pass: it was only
checking for depth == 1 loops in practice. Fix that, add some debugging
statements, and make sure that, when we grab an AddRec, it is for the loop we
expect.
llvm-svn: 234670
Currently, there's a single flag, checked by the pass itself.
It can't force-enable the pass (and is on by default), because it
might not even have been created, as that's the targets decision.
Instead, have separate explicit flags, so that the decision is
consistently made in the target.
Keep the flag as a last-resort "force-disable GlobalMerge" for now,
for backwards compatibility.
llvm-svn: 234666
The spilled registers are pristine and thus, correctly handled by
the register scavenger and so on, but the liveness information is
strictly speaking wrong at this point.
Fix that.
llvm-svn: 234664
This allows winehprepare to build sensible llvm.eh.actions calls for SEH
finally blocks. The pattern matching in this change is brittle and
should be replaced with something more robust soon. In the meantime,
this will let us write the code that produces __C_specific_handler xdata
tables, which we need regardless of how we decide to get finally blocks
through EH preparation.
llvm-svn: 234663
Previously, isMask_N returned false for 0 but isShiftedMask_N returned true.
Almost all uses are for pattern matching bitfield operations in the backends,
and expect false (this was discovered because of AArch64's copy of this logic).
Unfortunately, I couldn't put together a small non-fragile test for this. The
nature of the bitfield operations means that this edge case is only really
triggered for nodes like "(and N, 0)", which the DAG combiner is usually very
good at folding away before they get to this stage.
rdar://20501377
llvm-svn: 234659
When rewriting statepoints to make relocations explicit, we need to have a conservative but consistent notion of where a particular pointer is live at a particular site. The old code just used dominance, which is correct, but decidedly more conservative then it needed to be. This patch implements a simple dataflow algorithm that's run one per function (well, twice counting fixup after base pointer insertion). There's still lots of room to make this faster, but it's fast enough for all practical purposes today.
Differential Revision: http://reviews.llvm.org/D8674
llvm-svn: 234657
r234638 chained another transform below which was tripping over the
deleted instruction. Use after free found by asan in many regression
tests.
llvm-svn: 234654
After submitting 234651, I noticed I hadn't responded to a review comment by mjacob. This patch addresses that comment and fixes a Release only build problem due to an unused variable.
llvm-svn: 234653
Two related small changes:
Various dominance based queries about liveness can get confused if we're talking about unreachable blocks. To avoid reasoning about such cases, just remove them before rewriting statepoints.
Remove single entry phis (likely left behind by LCSSA) to reduce the number of live values.
Both of these are motivated by http://reviews.llvm.org/D8674 which will be submitted shortly.
Differential Revision: http://reviews.llvm.org/D8675
llvm-svn: 234651
This patch adds limited support for inserting explicit relocations when there's a vector of pointers live over the statepoint. This doesn't handle the case where the vector contains a mix of base and non-base pointers; that's future work.
The current implementation just scalarizes the vector over the gc.statepoint before doing the explicit rewrite. An alternate approach would be to plumb the vector all the way though the backend lowering, but doing that appears challenging. In particular, the size of the indirect spill slot is currently assumed to be sizeof(pointer) throughout the backend.
In practice, this is enough to allow running the SLP and Loop vectorizers before RewriteStatepointsForGC.
Differential Revision: http://reviews.llvm.org/D8671
llvm-svn: 234647
Summary:
This change moves creating calls to `llvm.uadd.with.overflow` from
InstCombine to CodeGenPrep. Combining overflow check patterns into
calls to the said intrinsic in InstCombine inhibits optimization because
it introduces an intrinsic call that not all other transforms and
analyses understand.
Depends on D8888.
Reviewers: majnemer, atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8889
llvm-svn: 234638
Stop leaking temporary nodes from `DIBuilder::createCompileUnit()`.
`replaceAllUsesWith()` doesn't delete the nodes, so we need to delete
them "manually" (well, `TempMDTuple` does that for us).
Similarly, stop leaking the temporary nodes used for variables of
subprograms.
llvm-svn: 234617
This fixes some odd behavior differences between the two. In particular,
the version that takes a FD no longer unconditionally sets stdout to binary.
llvm-svn: 234615
Previously we would always report success, which is pretty bogus.
I'm too lazy to write a test where rename will portably fail on all
platforms. I'm just trying to fix breakage introduced by r234597, which
happened to tickle this.
llvm-svn: 234611
WinEH currently turns invokes into calls. Long term, we will reconsider
this, but for now, make sure we remap the operands and clone the
successors of the new terminator.
llvm-svn: 234608
Iterating over loops from the LoopInfo instance only provides top-level loops.
We need to search the whole tree of loops to find the inner ones.
llvm-svn: 234603
CallSite roughly behaves as a common base CallInst and InvokeInst. Bring
the behavior closer to that model by making upcasts explicit. Downcasts
remain implicit and work as before.
Following dyn_cast as a mental model checking whether a Value *V isa
CallSite now looks like this:
if (auto CS = CallSite(V)) // think dyn_cast
instead of:
if (CallSite CS = V)
This is an extra token but I think it is slightly clearer. Making the
ctor explicit has the advantage of not accidentally creating nullptr
CallSites, e.g. when you pass a Value * to a function taking a CallSite
argument.
llvm-svn: 234601
Using SchedAliases is convenient and works well for latency and resource
lookup for instructions. However, this creates an entry in
AArch64WriteLatencyTable with a WriteResourceID of 0, breaking any
SchedReadAdvance since the lookup will fail.
http://reviews.llvm.org/D8043
Patch by Dave Estes <cestes@codeaurora.org>!
llvm-svn: 234594
Cache NumEntries locally, it's only used in an assert and using the member
variable prevents the compiler from eliminating the tombstone check for types
with trivial destructors. No functionality change intended.
llvm-svn: 234589
Summary:
Some optimizations such as jump threading and loop unswitching can negatively
affect performance when applied to divergent branches. The divergence analysis
added in this patch conservatively estimates which branches in a GPU program
can diverge. This information can then help LLVM to run certain optimizations
selectively.
Test Plan: test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll
Reviewers: resistor, hfinkel, eliben, meheff, jholewinski
Subscribers: broune, bjarke.roune, madhur13490, tstellarAMD, dberlin, echristo, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8576
llvm-svn: 234567
The IPToState table must be emitted after we have generated labels for
all functions in the table. Don't rely on the order of the list of
globals. Instead, utilize WinEHFuncInfo to tell us how many catch
handlers we expect to outline. Once we know we've visited all the catch
handlers, emit the cppxdata.
llvm-svn: 234566
When we have an instruction for this (and, thus, don't generate a runtime
call), we need to custom type legalize this (in a trivial way, just as we do
for fp_to_sint).
Fixes PR23173.
llvm-svn: 234561
For the most common ones (such as fadd), we already did the promotion.
Do the same thing for all the others.
Currently, we'll just crash/assert on all these operations, as
there's no hardware or libcall support whatsoever.
f16 (half) is specified as an interchange - not arithmetic - format,
and is expected to be promoted to single-precision for arithmetic
operations.
While there, teach the legalizer about promoting some of the (mostly
floating-point) operations that we never needed before.
Differential Revision: http://reviews.llvm.org/D8648
See related discussion on the thread for: http://reviews.llvm.org/D8755
llvm-svn: 234550
This is the patch corresponding to review:
http://reviews.llvm.org/D8406
It adds some missing instructions from ISA 2.06 to the PPC back end.
llvm-svn: 234546
formatted_raw_ostream is a wrapper over another stream to add column and line
number tracking.
It is used only for asm printing.
This patch moves the its creation down to where we know we are printing
assembly. This has the following advantages:
* Simpler lifetime management: std::unique_ptr
* We don't compute column and line number of object files :-)
llvm-svn: 234535
We already do:
concat_vectors(scalar, undef) -> scalar_to_vector(scalar)
When the scalar is legal.
When it's not, but is a truncated legal scalar, we can also do:
concat_vectors(trunc(scalar), undef) -> scalar_to_vector(scalar)
Which is equivalent, since the upper lanes are undef anyway.
While there, teach the combine to look at more than 2 operands.
Differential Revision: http://reviews.llvm.org/D8883
llvm-svn: 234530
The integer extend optimization tries to fold the extend into the load
instruction. This requires us to identify if the extend has already been
emitted or not and act accordingly on it.
The check that was originally performed for this was not sufficient. Besides
checking the ValueMap for a mapped register we also need to check if the
virtual register has already an associated machine instruction that defines it.
This fixes rdar://problem/20470788.
llvm-svn: 234529
Using this instead of
namespace llvm {
func...
}
Has the advantage that the build fails with a compiler error if it gets out
of sync with the .h file.
llvm-svn: 234515
Pull the `-preserve-*-use-list-order` flags out of "experimental" mode,
and preserve use-list order by default when serializing to bitcode.
llvm-svn: 234510
Revert "Add classof implementations to the raw_ostream classes."
Revert "Use the cast machinery to remove dummy uses of formatted_raw_ostream."
The underlying issue can be fixed without classof.
llvm-svn: 234495
Currently, llvm (backend) doesn't know cortex-r4, even though it is the
default target for armv7r. Using "--target=armv7r-arm-none-eabi" provokes
'cortex-r4' is not a recognized processor for this target' by llvm.
This patch adds support for cortex-r4 and, very closely related, r4f.
llvm-svn: 234486
Summary:
Make the code more readable by fusing the for-loops together and explicitly checking for each register class.
Also, this version is more straightforward because it doesn't assume that FPU registers always come before CPU registers in the CalleeSavedInfo vector.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8033
llvm-svn: 234475
restrictions when choosing a type for small-memcpy inlining in
SelectionDAGBuilder.
This ensures that the loads and stores output for the memcpy won't be further
expanded during legalization, which would cause the total number of instructions
for the memcpy to exceed (often significantly) the inlining thresholds.
<rdar://problem/17829180>
llvm-svn: 234462
The input to compileOptimized is already optimized and internalized, so remove
internalize pass from compileOptimized.
rdar://20227235
llvm-svn: 234446
The bug manifests when there are two loads and two stores chained as follows in
a DAG,
(ld v3f32) -> (st f32) -> (ld v3f32) -> (st f32)
and the stores' values are extracted from the preceding vector loads.
MergeConsecutiveStores would replace the first store in the chain with the
merged vector store, which would create a cycle between the merged store node
and the last load node that appears in the chain.
This commits fixes the bug by replacing the last store in the chain instead.
rdar://problem/20275084
Differential Revision: http://reviews.llvm.org/D8849
llvm-svn: 234430
r234262 changed some code in DIBuilderBindings.cpp to use the unwrap function
to unwrap debug metadata. The problem with this is that unwrap asserts that
its argument is non-null, which is not what we want in a number of places
in DIBuilder where the argument is optional. This change makes certain
arguments optional by adding null checks in places where it is required,
fixing the llgo build.
llvm-svn: 234428
The code uses a priority queue and a worklist, which share the same
visited set, but the visited set is only updated when inserting into
the priority queue. Instead, switch to using separate visited sets
for the priority queue and worklist.
llvm-svn: 234425
(Re-apply r234361 with a fix and a testcase for PR23157)
Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.
Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).
In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.
When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).
The changes also adds support to query this property of the loop and
modify the vectorizer to use this.
Patch by Ashutosh Nema!
llvm-svn: 234424
Because -menable-no-nans causes fcmp conditions to be rewritten
without 'o' or 'u' the recognition code in needs to cope. Also
extended it to handle 'le' and 'ge.
Differential Revision: http://reviews.llvm.org/D8725
llvm-svn: 234421