Commit Graph

111397 Commits

Author SHA1 Message Date
Colin LeMahieu ff10c8c95c [Hexagon] Adding orand, bitsplit reg/reg, and modwrap instructions.
llvm-svn: 225197
2015-01-05 20:04:40 +00:00
Hal Finkel 49557f1b42 [PowerPC] Remove zexts after i32 ctlz
The 64-bit semantics of cntlzw are not special, the 32-bit population count is
stored as a 64-bit value in the range [0,32]. As a result, it is always zero
extended, and it can be added to the PPCISelDAGToDAG peephole optimization as a
frontier instruction for the removal of unnecessary zero extensions.

llvm-svn: 225192
2015-01-05 18:52:29 +00:00
Hal Finkel 4e2c78228a [PowerPC] Remove zexts after byte-swapping loads
lhbrx and lwbrx not only load their data with byte swapping, but also clear the
upper 32 bits (at least). As a result, they can be added to the PPCISelDAGToDAG
peephole optimization as frontier instructions for the removal of unnecessary
zero extensions.

llvm-svn: 225189
2015-01-05 18:09:06 +00:00
Colin LeMahieu 5e079577e1 [Hexagon] Adding round reg/imm and bitsplit instructions.
llvm-svn: 225188
2015-01-05 18:08:21 +00:00
Saleem Abdulrasool 150a1dc5c2 SymbolRewriter: use iplist::splice
The swap implementation for iplist is currently unsupported.  Simply splice the
old list into place, which achieves the same purpose.  This is needed in order
to thread the -frewrite-map-file frontend option correctly.  NFC.

llvm-svn: 225186
2015-01-05 17:56:32 +00:00
Saleem Abdulrasool d37ce30888 SymbolRewriter: 80-column
Wrap a couple of lines.  NFC.

llvm-svn: 225185
2015-01-05 17:56:29 +00:00
Ahmed Bougacha d54c448d34 [AArch64] Improve codegen of store lane instructions by avoiding GPR usage.
We used to generate code similar to:

  umov.b        w8, v0[2]
  strb  w8, [x0, x1]

because the STR*ro* patterns were preferred to ST1*.
Instead, we can avoid going through GPRs, and generate:

  add   x8, x0, x1
  st1.b { v0 }[2], [x8]

This patch increases the ST1* AddedComplexity to achieve that.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6202

llvm-svn: 225183
2015-01-05 17:10:26 +00:00
Ahmed Bougacha f964df3640 [AArch64] Improve codegen of store lane 0 instructions by directly storing the subregister.
For 0-lane stores, we used to generate code similar to:

  fmov w8, s0
  str w8, [x0, x1, lsl #2]

instead of:

  str s0, [x0, x1, lsl #2]

To correct that: for store lane 0 patterns, directly match to STR <subreg>0.

Byte-sized instructions don't have the special case for a 0 index,
because FPR8s are defined to have untyped content.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6772

llvm-svn: 225181
2015-01-05 17:02:28 +00:00
NAKAMURA Takumi f199f4c593 llvm/test/lit.cfg: have_ld_plugin_support(): Use decode() for stdout.
llvm-svn: 225171
2015-01-05 14:18:04 +00:00
Karthik Bhat 93f27ce886 Select lower fsub,fabs pattern to fabd on AArch64
This patch lowers patterns such as-
  fsub   v0.4s, v0.4s, v1.4s
  fabs   v0.4s, v0.4s
to
  fabd  v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6791
llvm-svn: 225169
2015-01-05 13:57:59 +00:00
Charlie Turner 6632d1f67e Parse Tag_compatibility correctly.
Tag_compatibility takes two arguments, but before this patch it would
erroneously accept just one, it now produces an error in that case.

Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d
llvm-svn: 225167
2015-01-05 13:26:37 +00:00
Charlie Turner 8b2caa458f Emit the build attribute Tag_conformance.
Claim conformance to version 2.09 of the ARM ABI.

This build attribute must be emitted first amongst the build attributes when
written to an object file. This is to simplify conformance detection by
consumers.

Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e
llvm-svn: 225166
2015-01-05 13:12:17 +00:00
Karthik Bhat 8ec742c2f9 Select lower sub,abs pattern to sabd on AArch64
This patch lowers patterns such as-
  sub	v0.4s, v0.4s, v1.4s
  abs	v0.4s, v0.4s
to
  sabd	v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6781
llvm-svn: 225165
2015-01-05 13:11:07 +00:00
Michael Kuperstein 6ae456b0d7 Fix broken test from r225159.
llvm-svn: 225164
2015-01-05 12:34:01 +00:00
Chandler Carruth 539dc4b9d5 [PM] Don't run the machinery of invalidating all the analysis passes
when all are being preserved.

We want to short-circuit this for a couple of reasons. One, I don't
really want passes to grow a dependency on actually receiving their
invalidate call when they've been preserved. I'm thinking about removing
this entirely. But more importantly, preserving everything is likely to
be the common case in a lot of scenarios, and it would be really good to
bypass all of the invalidation and preservation machinery there.
Avoiding calling N opaque functions to try to invalidate things that are
by definition still valid seems important. =]

This wasn't really inpsired by much other than seeing the spam in the
logging for analyses, but it seems better ot get it checked in rather
than forgetting about it.

llvm-svn: 225163
2015-01-05 12:32:11 +00:00
Chandler Carruth e5e8fb3bf6 [PM] Add names and debug logging for analysis passes to the new pass
manager.

This starts to allow us to test analyses more easily, but it's really
only the beginning. Some of the code here is still untestable without
manual changes to create analysis passes, but I wanted to factor it into
a small of chunks as possible.

Next up in order to be able to test things are, in no particular order:
- No-op analyses passes so we don't have to use real ones to exercise
  the pass maneger itself.
- Automatic way of generating dummy passes that require an analysis be
  run, including a variant that calls a 'print' method on a pass to make
  it even easier to print out the results of an analysis.
- Dummy passes that invalidate all analyses for their IR unit so we can
  test invalidation and re-runs.
- Automatic way to print each analysis pass as it is re-run.
- Automatic but optional verification of analysis passes everywhere
  possible.

I'm not claiming I'll get to all of these immediately, but that's what
is in the pipeline at some stage. I'm fleshing out exactly what I need
and what to prioritize by working on converting analyses and then trying
to test the conversion. =]

llvm-svn: 225162
2015-01-05 12:21:44 +00:00
Craig Topper d3c02f177a Replace several 'assert(false' with 'llvm_unreachable' or fold a condition into the assert.
llvm-svn: 225160
2015-01-05 10:15:49 +00:00
Jiangning Liu 40c1b35292 Fixed a bug in memory dependence checking module of loop vectorization. The following loop should not be vectorized with current algorithm.
{code}
// loop body
   ... = a[i]          (1)
    ... = a[i+1]       (2)
 .......
a[i+1] = ....          (3)
   a[i] = ...          (4)
{code}

The algorithm tries to collect memory access candidates from AliasSetTracker, and then check memory dependences one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked 'write' entry in Accesses if only 'write' exists. This is incorrect and the consequence is it ignored all read access, and finally some RAW and WAR dependence are missed.

For the case given above, if we ignore two reads, the dependence between (1) and (3) would not be able to be captured, and finally this loop will be incorrectly vectorized.

The fix simply inserts a new loop to find all entries in Accesses. Since it will skip most of all other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile-time visibly.

llvm-svn: 225159
2015-01-05 10:08:58 +00:00
Michael Gottesman 9e674942e8 Convert SmallMapVector from a class to a struct.
llvm-svn: 225158
2015-01-05 08:55:19 +00:00
Craig Topper dc2fc8035d [X86] Remove the predicates from the register forms of the 2-byte inc and dec instructions. Remove the 32-bit mode only versions that existed for the disassembler. Move the patterns out of the instructions so they can still be qualified with predicates.
llvm-svn: 225157
2015-01-05 08:19:12 +00:00
Craig Topper 3dcdde2e92 [X86] Simplify code a little by just summing flags instead of conditionally incrementing. NFC
llvm-svn: 225156
2015-01-05 08:19:10 +00:00
Craig Topper 859677edef [X86] Remove unnecessary redeclaration of a variable with the same assignment as the beginning of the function. NFC.
llvm-svn: 225155
2015-01-05 08:19:07 +00:00
Craig Topper 9cf67c6f57 [X86] Remove a strange fixme referring to a hack that doesn't seem to exist since the code is in a comment. Can't figure out what the body of the 'if' was supposed to be anyway.
llvm-svn: 225154
2015-01-05 08:19:05 +00:00
Craig Topper e9e081cd29 [x86] Reduce text duplication for similar operand class declarations in tablegen instruction info. No functional change intended.
llvm-svn: 225153
2015-01-05 08:19:03 +00:00
Craig Topper 21f984966d [X86] Fix the immediate size to match the address size in the operand types for the move to/from absolute memory instructions.
llvm-svn: 225152
2015-01-05 08:18:59 +00:00
Craig Topper 62c0525b11 [X86] Remove unused operand type from disassembler handling. NFC
llvm-svn: 225151
2015-01-05 08:18:52 +00:00
Hal Finkel 9bb61de1be [PowerPC] Enable speculation of cttz/ctlz
PPC has an instruction for ctlz with defined zero behavior, and our lowering of
cttz (provided by DAGCombine) is also efficient and branchless, so speculating
these makes sense.

llvm-svn: 225150
2015-01-05 05:24:42 +00:00
Chandler Carruth 73b0164fe5 [SROA] Apply a somewhat heavy and unpleasant hammer to fix PR22093, an
assert out of the new pre-splitting in SROA.

This fix makes the code do what was originally intended -- when we have
a store of a load both dealing in the same alloca, we force them to both
be pre-split with identical offsets. This is really quite hard to do
because we can keep discovering problems as we go along. We have to
track every load over the current alloca which for any resaon becomes
invalid for pre-splitting, and go back to remove all stores of those
loads. I've included a couple of test cases derived from PR22093 that
cover the different ways this can happen. While that PR only really
triggered the first of these two, its the same fundamental issue.

The other challenge here is documented in a FIXME now. We end up being
quite a bit more aggressive for pre-splitting when loads and stores
don't refer to the same alloca. This aggressiveness comes at the cost of
introducing potentially redundant loads. It isn't clear that this is the
right balance. It might be considerably better to require that we only
do pre-splitting when we can presplit every load and store involved in
the entire operation. That would give more consistent if conservative
results. Unfortunately, it requires a non-trivial change to the actual
pre-splitting operation in order to correctly handle cases where we end
up pre-splitting stores out-of-order. And it isn't 100% clear that this
is the right direction, although I'm starting to suspect that it is.

llvm-svn: 225149
2015-01-05 04:17:53 +00:00
Hal Finkel 5dd8278f3f [LangRef] Correct a typo
llvm-svn: 225148
2015-01-05 04:05:21 +00:00
Hal Finkel 2f61879ff4 [PowerPC] Materialize i64 constants using rotation with masking
r225135 added the ability to materialize i64 constants using rotations in order
to reduce the instruction count. Sometimes we can use a rotation only with some
extra masking, so that we take advantage of the fact that generating a bunch of
extra higher-order 1 bits is easy using li/lis.

llvm-svn: 225147
2015-01-05 03:41:38 +00:00
Chandler Carruth cd17f8fc68 [PM] Cleanup a place where I forgot to update the header guards when
renaming a file from AssumptionTracker.h to AssumptionCache.h.

Thanks to Philip Reames for noticing and pointing it out in code review!

llvm-svn: 225146
2015-01-05 03:03:31 +00:00
Chandler Carruth d174ce4ad1 [PM] Switch the new pass manager to use a reference-based API for IR
units.

This was debated back and forth a bunch, but using references is now
clearly cleaner. Of all the code written using pointers thus far, in
only one place did it really make more sense to have a pointer. In most
cases, this just removes immediate dereferencing from the code. I think
it is much better to get errors on null IR units earlier, potentially
at compile time, than to delay it.

Most notably, the legacy pass manager uses references for its routines
and so as more and more code works with both, the use of pointers was
likely to become really annoying. I noticed this when I ported the
domtree analysis over and wrote the entire thing with references only to
have it fail to compile. =/ It seemed better to switch now than to
delay. We can, of course, revisit this is we learn that references are
really problematic in the API.

llvm-svn: 225145
2015-01-05 02:47:05 +00:00
Chandler Carruth 9c31db4f94 [PM] Wire up support for explicitly running the verifier pass.
The required functionality has been there for some time, but I never
managed to actually wire it into the command line registry of passes.
Let's do that.

llvm-svn: 225144
2015-01-05 00:08:53 +00:00
Chandler Carruth 75c11b88af [PM] Cleanup a const_cast and other machinery left over in this code
from before I removed thet non-const use of the function.

The unused variable that held the const_cast was already kindly removed
by Michael.

llvm-svn: 225143
2015-01-04 23:13:57 +00:00
Simon Pilgrim b65a6ee831 [X86][SSE] Added vector packing test for pr12412
llvm-svn: 225138
2015-01-04 19:08:03 +00:00
Simon Pilgrim a1540c11ec [X86][SSE] Added vector integer truncation tests - based off pr15524
llvm-svn: 225137
2015-01-04 17:52:00 +00:00
Hal Finkel 241ba79f95 [PowerPC] Materialize i64 constants using rotation
Materializing full 64-bit constants on PPC64 can be expensive, requiring up to
5 instructions depending on the locations of the non-zero bits. Sometimes
materializing a rotated constant, and then applying the inverse rotation, requires
fewer instructions than the direct method. If so, do that instead.

In r225132, I added support for forming constants using bit inversion. In
effect, this reverts that commit and replaces it with rotation support. The bit
inversion is useful for turning constants that are mostly ones into ones that
are mostly zeros (thus enabling a more-efficient shift-based materialization),
but the same effect can be obtained by using negative constants and a rotate,
and that is at least as efficient, if not more.

llvm-svn: 225135
2015-01-04 15:43:55 +00:00
Michael Kuperstein 58c3f6cc31 Fix unused variable warning for non-asserts builds. NFC.
llvm-svn: 225133
2015-01-04 13:35:44 +00:00
Hal Finkel ca6375fb75 [PowerPC] Materialize i64 constants using bit inversion
Materializing full 64-bit constants on PPC64 can be expensive, requiring up to
5 instructions depending on the locations of the non-zero bits. Sometimes
materializing the bit-reversed constant, and then flipping the bits, requires
fewer instructions than the direct method. If so, do that instead.

llvm-svn: 225132
2015-01-04 12:35:03 +00:00
Chandler Carruth 66b3130cda [PM] Split the AssumptionTracker immutable pass into two separate APIs:
a cache of assumptions for a single function, and an immutable pass that
manages those caches.

The motivation for this change is two fold. Immutable analyses are
really hacks around the current pass manager design and don't exist in
the new design. This is usually OK, but it requires that the core logic
of an immutable pass be reasonably partitioned off from the pass logic.
This change does precisely that. As a consequence it also paves the way
for the *many* utility functions that deal in the assumptions to live in
both pass manager worlds by creating an separate non-pass object with
its own independent API that they all rely on. Now, the only bits of the
system that deal with the actual pass mechanics are those that actually
need to deal with the pass mechanics.

Once this separation is made, several simplifications become pretty
obvious in the assumption cache itself. Rather than using a set and
callback value handles, it can just be a vector of weak value handles.
The callers can easily skip the handles that are null, and eventually we
can wrap all of this up behind a filter iterator.

For now, this adds boiler plate to the various passes, but this kind of
boiler plate will end up making it possible to port these passes to the
new pass manager, and so it will end up factored away pretty reasonably.

llvm-svn: 225131
2015-01-04 12:03:27 +00:00
David Majnemer 087dc8b831 InstCombine: match can find ConstantExprs, don't assume we have a Value
We assumed the output of a match was a Value, this would cause us to
assert because we would fail a cast<>.  Instead, use a helper in the
Operator family to hide the distinction between Value and Constant.

This fixes PR22087.

llvm-svn: 225127
2015-01-04 07:36:02 +00:00
David Majnemer 6ee8d17bc6 ValueTracking: ComputeNumSignBits should tolerate misshapen phi nodes
PHI nodes can have zero operands in the middle of a transform.  It is
expected that utilities in Analysis don't freak out when this happens.

Note that it is considered invalid to allow these misshapen phi nodes to
make it to another pass.

This fixes PR22086.

llvm-svn: 225126
2015-01-04 07:06:53 +00:00
Lang Hames 12b12e800b [APFloat][ADT] Fix sign handling logic for FMA results that truncate to zero.
This patch adds a check for underflow when truncating results back to lower
precision at the end of an FMA. The additional sign handling logic in
APFloat::fusedMultiplyAdd should only be performed when the result of the
addition step of the FMA (in full precision) is exactly zero, not when the
result underflows to zero.

Unit tests for this case and related signed zero FMA results are included.

Fixes <rdar://problem/18925551>.

llvm-svn: 225123
2015-01-04 01:20:55 +00:00
Saleem Abdulrasool ddd926441e llvm-readobj: add support to dump COFF export tables
This enhances llvm-readobj to print out the COFF export table, similar to the
-coff-import option.  This is useful for testing in lld.

llvm-svn: 225120
2015-01-03 21:35:09 +00:00
Saleem Abdulrasool 67f729933f ARM: permit tail calls to weak externals on COFF
Weak externals are resolved statically, so we can actually generate the tail
call on PE/COFF targets without breaking the requirements.  It is questionable
whether we want to propagate the current behaviour for MachO as the requirements
are part of the ARM ELF specifications, and it seems that prior to the SVN
r215890, we would have tail'ed the call.  For now, be conservative and only
permit it on PE/COFF where the call will always be fully resolved.

llvm-svn: 225119
2015-01-03 21:35:00 +00:00
Hal Finkel 5772566ed6 [PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preference
The existing code provided for specifying a global loop alignment preference.
However, the preferred loop alignment might depend on the loop itself. For
recent POWER cores, loops between 5 and 8 instructions should have 32-byte
alignment (while the others are better with 16-byte alignment) so that the
entire loop will fit in one i-cache line.

To support this, getPrefLoopAlignment has been made virtual, and can be
provided with an optional MachineLoop* so the target can inspect the loop
before answering the query. The default behavior, as before, is to return the
value set with setPrefLoopAlignment. MachineBlockPlacement now queries the
target for each loop instead of only once per function. There should be no
functional change for other targets.

llvm-svn: 225117
2015-01-03 17:58:24 +00:00
Hal Finkel d73bfba7eb [PowerPC] Use 16-byte alignment for modern cores for functions/loops
Most modern PowerPC cores prefer that functions and loops start on
16-byte-aligned boundaries (*), so instruct block placement, etc. to make this
happen. The branch selector has also been adjusted so account for the extra
nops that might now be inserted before loop headers.

(*) Some cores actually prefer other alignments for small loops, but that will
    be addressed in a follow-up commit.

llvm-svn: 225115
2015-01-03 14:58:25 +00:00
Craig Topper 589ceee7f4 Minor cleanup to all the switches after MatchInstructionImpl in all the AsmParsers.
Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation.

llvm-svn: 225114
2015-01-03 08:16:34 +00:00
Craig Topper a5754e6e82 Fix some formatting in tablegen output.
llvm-svn: 225113
2015-01-03 08:16:29 +00:00
Craig Topper 8c714d10bd Replace some 'unreachable' comments with llvm_unreachable.
llvm-svn: 225112
2015-01-03 08:16:14 +00:00
David Majnemer 8df46c9289 ValueTracking: Make computeKnownBits for Arguments a little more clear
We would sometimes leave the out-param APInts untouched while going
through computeKnownBits.  While I don't know of a way to trigger a bug
involving this in practice, it goes against the overall design of
computeKnownBits.

Found via code inspection.

llvm-svn: 225109
2015-01-03 02:33:25 +00:00
Hal Finkel 4edc66b8de [PowerPC] Add support for the CMPB instruction
Newer POWER cores, and the A2, support the cmpb instruction. This instruction
compares its operands, treating each of the 8 bytes in the GPRs separately,
returning a 'mask' result of 0 (for false) or -1 (for true) in each byte.

Code generation support is added, in the form of a PPCISelDAGToDAG
DAG-preprocessing routine, that recognizes patterns close to what the
instruction computes (either exactly, or related by a constant masking
operation), and generates the cmpb instruction (along with any necessary
constant masking operation). This can be expanded if use cases arise.

llvm-svn: 225106
2015-01-03 01:16:37 +00:00
Kostya Serebryany d421db05bb [asan] simplify the tracing code, make it use the same guard variables as coverage
llvm-svn: 225103
2015-01-03 00:54:43 +00:00
Craig Topper ae8e1b3831 [X86] Disassembler support for move to/from %rax with a 32-bit memory offset is REX.W and AdSize prefix are both present.
llvm-svn: 225099
2015-01-03 00:00:20 +00:00
Craig Topper 017b830564 [X86] Use 32-bit sign extended immediate for 64-bit LOCK_ArithBinOp with sign extended immediate.
llvm-svn: 225098
2015-01-03 00:00:14 +00:00
Chandler Carruth d3e2b4c0ea [PM] Add proper documentation for the ModulePassManager and
FunctionPassManager. These never got documented, likely due to the
clutter of this header file. This fixes another problem people noticed
when they started trying to use the new pass manager.

I've also used this to document the aspirational constraints I would
like to hold passes to. I don't really have a better place to document
such things at this point, but eventually will probably create a proper
.rst file and page for the LLVM pass infrastructure that carries such
high-level concerns.

llvm-svn: 225097
2015-01-02 23:34:39 +00:00
Chandler Carruth 4664057108 [PM] Actually include the correct file name. Sorry for the breakage.
llvm-svn: 225096
2015-01-02 23:25:16 +00:00
Chandler Carruth 2ad2a8b943 [PM] Lift the majority of the template boilerplate used to implement the
concept-based polymorphism in the pass manager to a separate header.

I got feedback from someone reading the code and trying to use it that
this was really making it hard to dive in and start using these APIs and
that makes a lot of sense.

This only requires a moderate amount of gymnastics to separate in this
way, namely rinsing the PreservedAnalysis object through a template
argument in a few places so that it is dependent and we only examine it
on instantiation.

llvm-svn: 225094
2015-01-02 23:16:59 +00:00
Chandler Carruth ce08983110 [PM] Fix some formatting where clang-format has improved recently.
llvm-svn: 225092
2015-01-02 22:51:44 +00:00
Philip Reames dfc238b45f Reformat statepoint documentation and fix a couple of typos
Patch by Ramkumar Ramachandra <artagnon@gmail.com>.

llvm-svn: 225084
2015-01-02 19:46:49 +00:00
Andrea Di Biagio 6477847ef4 Improved comments. No functional change intended.
llvm-svn: 225080
2015-01-02 10:47:46 +00:00
Craig Topper 4e5ab81a12 [X86] Bring some better consistency to the naming of the move to/from %al/ax/eax/rax with memory offset.
llvm-svn: 225078
2015-01-02 07:36:23 +00:00
David Majnemer c8a576b5c0 InstCombine: Detect when llvm.umul.with.overflow always overflows
We know overflow always occurs if both ~LHSKnownZero * ~RHSKnownZero
and LHSKnownOne * RHSKnownOne overflow.

llvm-svn: 225077
2015-01-02 07:29:47 +00:00
David Majnemer 491331aca8 Analysis: Reformulate WillNotOverflowUnsignedMul for reusability
WillNotOverflowUnsignedMul's smarts will live in ValueTracking as
computeOverflowForUnsignedMul.  It now returns a tri-state result:
never overflows, always overflows and sometimes overflows.

llvm-svn: 225076
2015-01-02 07:29:43 +00:00
Craig Topper 055845f5cb [X86] Make the instructions that use AdSize16/32/64 co-exist together without using mode predicates.
This is necessary to allow the disassembler to be able to handle AdSize32 instructions in 64-bit mode when address size prefix is used.

Eventually we should probably also support 'addr32' and 'addr16' in the assembler to override the address size on some of these instructions. But for now we'll just use special operand types that will lookup the current mode size to select the right instruction.

llvm-svn: 225075
2015-01-02 07:02:25 +00:00
Chandler Carruth 24ac830d7c [SROA] Teach SROA to be more aggressive in splitting now that we have
a pre-splitting pass over loads and stores.

Historically, splitting could cause enough problems that I hamstrung the
entire process with a requirement that splittable integer loads and
stores must cover the entire alloca. All smaller loads and stores were
unsplittable to prevent chaos from ensuing. With the new pre-splitting
logic that does load/store pair splitting I introduced in r225061, we
can now very nicely handle arbitrarily splittable loads and stores. In
order to fully benefit from these smarts, we need to mark all of the
integer loads and stores as splittable.

However, we don't actually want to rewrite partitions with all integer
loads and stores marked as splittable. This will fail to extract scalar
integers from aggregates, which is kind of the point of SROA. =] In
order to resolve this, what we really want to do is only do
pre-splitting on the alloca slices with integer loads and stores fully
splittable. This allows us to uncover all non-integer uses of the alloca
that would benefit from a split in an integer load or store (and where
introducing the split is safe because it is just memory transfer from
a load to a store). Once done, we make all the non-whole-alloca integer
loads and stores unsplittable just as they have historically been,
repartition and rewrite.

The result is that when there are integer loads and stores anywhere
within an alloca (such as from a memcpy of a sub-object of a larger
object), we can split them up if there are non-integer components to the
aggregate hiding beneath. I've added the challenging test cases to
demonstrate how this is able to promote to scalars even a case where we
have even *partially* overlapping loads and stores.

This restores the single-store behavior for small arrays of i8s which is
really nice. I've restored both the little endian testing and big endian
testing for these exactly as they were prior to r225061. It also forced
me to be more aggressive in an alignment test to actually defeat SROA.
=] Without the added volatiles there, we actually split up the weird i16
loads and produce nice double allocas with better alignment.

This also uncovered a number of bugs where we failed to handle
splittable load and store slices which didn't have a begininng offset of
zero. Those fixes are included, and without them the existing test cases
explode in glorious fireworks. =]

I've kept support for leaving whole-alloca integer loads and stores as
splittable even for the purpose of rewriting, but I think that's likely
no longer needed. With the new pre-splitting, we might be able to remove
all the splitting support for loads and stores from the rewriter. Not
doing that in this patch to try to isolate any performance regressions
that causes in an easy to find and revert chunk.

llvm-svn: 225074
2015-01-02 03:55:54 +00:00
Chandler Carruth 5986b541d4 [SROA] Make the computation of adjusted pointers not leak GEP
instructions.

I noticed this when working on dialing up how aggressively we can
pre-split loads and stores. My test case wasn't passing because dead
GEPs into the allocas persisted when they were built by this routine.
This isn't terribly harmful, we still rewrote and promoted the alloca
and I can't conceive of how to cause this to happen in a case where we
will keep the exact same alloca but rewrite and promote the uses of it.
If that ever happened, we'd get an assert out of mem2reg.

So I don't have a direct test case yet, but the subsequent commit's test
case wouldn't pass without this. There are other problems fixed by this
patch that I spotted purely by inspection such as the fact that
getAdjustedPtr could have actually deleted dead base pointers. I don't
know how to get a base pointer to go into getAdjustedPtr today, so
I think this bug could never have manifested (and I certainly can't
write a test case for it) but, it wasn't the intent of the code. The
code really just wanted to GC the new instructions built. That can be
done more directly by comparing with the base pointer which is the only
non-new instruction that this code can return.

llvm-svn: 225073
2015-01-02 02:47:38 +00:00
Chandler Carruth e65ae89327 [SROA] Add a test case for r225068 / PR22080.
llvm-svn: 225070
2015-01-02 00:34:29 +00:00
Chandler Carruth 29c22fae46 [SROA] Fix the loop exit placement to be prior to indexing the splits
array. This prevents it from walking out of bounds on the splits array.

Bug found with the existing tests by ASan and by the MSVC debug build.

llvm-svn: 225069
2015-01-02 00:10:22 +00:00
Chandler Carruth c39eaa5041 [SROA] Fix two total think-os in r225061 that should have been caught on
a +asserts bootstrap, but my bootstrap had asserts off. Oops.

Anyways, in some places it is reasonable to cast (as a sanity check) the
pointer operand to a load or store to an instruction within SROA --
namely when the pointer operand is expected to be derived from an
alloca, and thus always an instruction. However, the pre-splitting code
also deals with loads and stores to non-alloca pointers and there we
need to just use the Value*. Nothing about the code relied on the
instruction cast, it was only there essentially as an invariant
assertion. Remove the two that don't actually hold.

This should fix the proximate issue in PR22080, but I'm also doing an
asserts bootstrap myself to see if there are other issues lurking.

I'll craft a reduced test case in a moment, but I wanted to get the tree
healthy as quickly as possible.

llvm-svn: 225068
2015-01-01 23:26:16 +00:00
Hal Finkel ddf8d7d155 [PowerPC] use UINT64_C instead of ul
Attempting to fix PR22078 (building on 32-bit systems) by replacing my careless
use of 1ul to be a uint64_t constant with UINT64_C(1).

llvm-svn: 225066
2015-01-01 19:33:59 +00:00
Michael Gottesman 40898a5a93 Revert "Just use a using directive in SmallMapVector instead of inheriting from MapVector itself."
This reverts commit r225059. I think MSVC 2012 has a problem with this. This is
an attempt to fix one of the MSVC 2012 bots.

llvm-svn: 225065
2015-01-01 13:54:05 +00:00
Chandler Carruth a1f4697dba Revert r225053: Add an ArrayRef upcasting constructor from ArrayRef<U*> -> ArrayRef<T*> where T is a base of U.
This appears to have broken at least the windows build bots due to
compile errors in the predicate that didn't simply supress the overload.
I'm not sure what the fix is, and the bots have been broken for a long
time now so I'm just reverting until Michael can figure out a fix.

llvm-svn: 225064
2015-01-01 13:01:25 +00:00
Chandler Carruth 6044c0bc78 [SROA] Switch to using a more direct debug logging technique in one part
of my new load and store splitting, and fix a bug where it logged
a totally irrelevant slice rather than the actual slice in question.

The logging here previously worked because we used to place new slices
onto the back of the core sequence, but that caused other problems.
I updated the actual code to store new slices in their own vector but
didn't update the logging. There isn't a good way to reuse the logging
any more, and frankly it wasn't needed. We can directly log this bit
more easily.

llvm-svn: 225063
2015-01-01 12:56:47 +00:00
Chandler Carruth 994cde8869 [SROA] Fix formatting with clang-format which I managed to fail to do
prior to committing r225061. Sorry for that.

llvm-svn: 225062
2015-01-01 12:01:03 +00:00
Chandler Carruth 0715cba02d [SROA] Teach SROA how to much more intelligently handle split loads and
stores.

When there are accesses to an entire alloca with an integer
load or store as well as accesses to small pieces of the alloca, SROA
splits up the large integer accesses. In order to do that, it uses bit
math to merge the small accesses into large integers. While this is
effective, it produces insane IR that can cause significant problems in
the rest of the optimizer:

- It can cause load and store mismatches with GVN on the non-alloca side
  where we end up loading an i64 (or some such) rather than loading
  specific elements that are stored.
- We can't always get rid of the integer bit math, which is why we can't
  always fix the loads and stores to work well with GVN.
- This is especially bad when we have operations that mix poorly with
  integer bit math such as floating point operations.
- It will block things like the vectorizer which might be able to handle
  the scalar stores that underly the aggregate.

At the same time, we can't just directly split up these loads and stores
in all cases. If there is actual integer arithmetic involved on the
values, then using integer bit math is actually the perfect lowering
because we can often combine it heavily with the surrounding math.

The solution this patch provides is to find places where SROA is
partitioning aggregates into small elements, and look for splittable
loads and stores that it can split all the way to some other adjacent
load and store. These are uniformly the cases where failing to split the
loads and stores hurts the optimizer that I have seen, and I've looked
extensively at the code produced both from more and less aggressive
approaches to this problem.

However, it is quite tricky to actually do this in SROA. We may have
loads and stores to the same alloca, or other complex patterns that are
hard to handle. This complexity leads to the somewhat subtle algorithm
implemented here. We have to do this entire process as a separate pass
over the partitioning of the alloca, and split up all of the loads prior
to splitting the stores so that we can handle safely the cases of
overlapping, including partially overlapping, loads and stores to the
same alloca. We also have to reconstitute the post-split slice
configuration so we can avoid iterating again over all the alloca uses
(the slow part of SROA). But we also have to ensure that when we split
up loads and stores to *other* allocas, we *do* re-iterate over them in
SROA to adapt to the more refined partitioning now required.

With this, I actually think we can fix a long-standing TODO in SROA
where I avoided splitting as many loads and stores as probably should be
splittable. This limitation historically mitigated the fallout of all
the bad things mentioned above. Now that we have more intelligent
handling, I plan to remove the FIXME and more aggressively mark integer
loads and stores as splittable. I'll do that in a follow-up patch to
help with bisecting any fallout.

The net result of this change should be more fine-grained and accurate
scalars being formed out of aggregates. At the very least, Clang now
generates perfect code for this high-level test case using
std::complex<float>:

  #include <complex>

  void g1(std::complex<float> &x, float a, float b) {
    x += std::complex<float>(a, b);
  }
  void g2(std::complex<float> &x, float a, float b) {
    x -= std::complex<float>(a, b);
  }

  void foo(const std::complex<float> &x, float a, float b,
           std::complex<float> &x1, std::complex<float> &x2) {
    std::complex<float> l1 = x;
    g1(l1, a, b);
    std::complex<float> l2 = x;
    g2(l2, a, b);
    x1 = l1;
    x2 = l2;
  }

This code isn't just hypothetical either. It was reduced out of the hot
inner loops of essentially every part of the Eigen math library when
using std::complex<float>. Those loops would consistently and
pervasively hop between the floating point unit and the integer unit due
to bit math extraction and insertion of floating point values that were
"stored" in a 64-bit integer register around the loop backedge.

So far, this change has passed a bootstrap and I have done some other
testing and so far, no issues. That doesn't mean there won't be though,
so I'll be prepared to help with any fallout. If you performance swings
in particular, please let me know. I'm very curious what all the impact
of this change will be. Stay tuned for the follow-up to also split more
integer loads and stores.

llvm-svn: 225061
2015-01-01 11:54:38 +00:00
Michael Gottesman 74ec802065 Just use a using directive in SmallMapVector instead of inheriting from MapVector itself.
llvm-svn: 225059
2015-01-01 08:05:41 +00:00
Hal Finkel c58ce4132a [PowerPC] Improve instruction selection bit-permuting operations (64-bit)
This is the second installment of improvements to instruction selection for "bit
permutation" instruction sequences. r224318 added logic for instruction
selection for 32-bit bit permutation sequences, and this adds lowering for
64-bit sequences. The 64-bit sequences are more complicated than the 32-bit
ones because:
  a) the 64-bit versions of the 32-bit rotate-and-mask instructions
     work by replicating the lower 32-bits of the value-to-be-rotated into the
     upper 32 bits -- and integrating this into the cost modeling for the various
     bit group operations is non-trivial
  b) unlike the 32-bit instructions in 32-bit mode, the rotate-and-mask instructions
     cannot, in one instruction, specify the
     mask starting index, the mask ending index, and the rotation factor. Also,
     forming arbitrary 64-bit constants is more complicated than in 32-bit mode
     because the number of instructions necessary is value dependent.

Plus, support for 'late masking' was added: it is sometimes more efficient to
treat the overall value as if it had no mandatory zero bits when planning the
bit-group insertions, and then mask them in at the very end. Unfortunately, as
the structure of the bit groups is different in the two cases, the more
feasible implementation technique was to generate both instruction sequences,
and then pick the shorter one.

And finally, we now generate reasonable code for i64 bswap:

        rldicl 5, 3, 16, 0
        rldicl 4, 3, 8, 0
        rldicl 6, 3, 24, 0
        rldimi 4, 5, 8, 48
        rldicl 5, 3, 32, 0
        rldimi 4, 6, 16, 40
        rldicl 6, 3, 48, 0
        rldimi 4, 5, 24, 32
        rldicl 5, 3, 56, 0
        rldimi 4, 6, 40, 16
        rldimi 4, 5, 48, 8
        rldimi 4, 3, 56, 0

vs. what we used to produce:

        li 4, 255
        rldicl 5, 3, 24, 40
        rldicl 6, 3, 40, 24
        rldicl 7, 3, 56, 8
        sldi 8, 3, 8
        sldi 10, 3, 24
        sldi 12, 3, 40
        rldicl 0, 3, 8, 56
        sldi 9, 4, 32
        sldi 11, 4, 40
        sldi 4, 4, 48
        andi. 5, 5, 65280
        andis. 6, 6, 255
        andis. 7, 7, 65280
        sldi 3, 3, 56
        and 8, 8, 9
        and 4, 12, 4
        and 9, 10, 11
        or 6, 7, 6
        or 5, 5, 0
        or 3, 3, 4
        or 7, 9, 8
        or 4, 6, 5
        or 3, 3, 7
        or 3, 3, 4

which is 12 instructions, instead of 25, and seems optimal (at least in terms
of code size).

llvm-svn: 225056
2015-01-01 02:53:29 +00:00
Michael Gottesman 4c638994ac Add 2x constructors for TinyPtrVector, one that takes in one elemenet and the other that takes in an ArrayRef<EltTy>
Currently one can only construct an empty TinyPtrVector. These are just missing
elements of the API.

llvm-svn: 225055
2014-12-31 23:33:24 +00:00
Michael Gottesman 64670671e6 Add a SmallMapVector class that is a MapVector with a Map of SmallDenseMap and a Vector of SmallVector.
llvm-svn: 225054
2014-12-31 23:33:21 +00:00
Michael Gottesman 77b1aa98ed Add an ArrayRef upcasting constructor from ArrayRef<U*> -> ArrayRef<T*> where T is a base of U.
llvm-svn: 225053
2014-12-31 23:33:18 +00:00
Sanjay Patel e68f71574f InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, X
Some day the backend may handle instruction-level fast math flags and make
this transform unnecessary, but it's still better practice to use the canonical
representation of fneg when possible (use a -0.0).

This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ).
See also http://reviews.llvm.org/D6723.

Differential Revision: http://reviews.llvm.org/D6731

llvm-svn: 225050
2014-12-31 22:14:05 +00:00
Rafael Espindola 54b435ec3c Add r224985 back with a fix.
The issues was that AArch64 has additional restrictions on when local
relocations can be used. We have to take those into consideration when
deciding to put a L symbol in the symbol table or not.

Original message:

Remove doesSectionRequireSymbols.

In an assembly expression like

bar:
.long L0 + 1

the intended semantics is that bar will contain a pointer one byte past L0.

In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.

The solution used in ELF to use relocation with symbols if there is a non-zero
addend.

In MachO before this patch we would just keep all symbols in some sections.

This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.

This patch implements the non-zero addend logic for MachO too.

llvm-svn: 225048
2014-12-31 17:19:34 +00:00
Colin LeMahieu 5691eb5ee7 Reverting 225045 and 225043 and XFAIL multiline.ll on hexagon
llvm-svn: 225047
2014-12-31 17:14:35 +00:00
Rafael Espindola 35e42db3ed Add a test for the recent compiler-rt build failure.
llvm-svn: 225046
2014-12-31 16:58:05 +00:00
Colin LeMahieu 79e8ebada2 [Hexagon] Removing assertion to appease buildbot until I can reproduce the problem
llvm-svn: 225045
2014-12-31 16:20:00 +00:00
Rafael Espindola d4da9040de Revert "Remove doesSectionRequireSymbols."
This reverts commit r224985.

I am investigating why it made an Apple bot unhappy.

llvm-svn: 225044
2014-12-31 16:06:48 +00:00
Colin LeMahieu 94272611ac [Hexagon] Changing an llvm_unreachable to an assertion and returning 0. Relocations aren't implemented yet but we don't need to abort for this in release builds.
llvm-svn: 225043
2014-12-31 15:57:38 +00:00
Craig Topper a7a8c4c09e [X86] Update disassembler tests for absolute move instructions to check the encodings. This provides testing for r225036. 64-bit mode is still broken.
llvm-svn: 225037
2014-12-31 07:24:23 +00:00
Craig Topper 99bcab7b85 [X86] Fix disassembly of absolute moves to work correctly in 16 and 32-bit modes with all 4 combinations of OpSize and AdSize prefixes being present or not.
llvm-svn: 225036
2014-12-31 07:07:31 +00:00
Craig Topper 6e518776e3 [x86] Simplify detection of jcxz/jecxz/jrcxz in disassembler.
llvm-svn: 225035
2014-12-31 07:07:11 +00:00
David Majnemer f89dc3edc9 InstCombine: try to transform A-B < 0 into A < B
We are allowed to move the 'B' to the right hand side if we an prove
there is no signed overflow and if the comparison itself is signed.

llvm-svn: 225034
2014-12-31 04:21:41 +00:00
Alexey Samsonov 553185ee4b Revert "merge consecutive stores of extracted vector elements"
This reverts commit r224611. This change causes crashes
in X86 DAG->DAG Instruction Selection.

llvm-svn: 225031
2014-12-31 00:40:28 +00:00
Colin LeMahieu bc405294f0 [Hexagon] Adding accumulating add/sub, doubleword logic-not variants, doubleword bitfield extract, word parity, accumulating multiplies with saturation.
llvm-svn: 225024
2014-12-31 00:08:34 +00:00
David Blaikie 3927b97151 Fix a test case to not depend on asm comment syntax, so as to be portable
Too many different comment characters - instead of trying to account for
them all, instead disable the comments and just check for end-of-line
instead.

llvm-svn: 225020
2014-12-30 23:33:55 +00:00
David Blaikie 7a3f48292c Generalize even further, for ARM comment syntax (@)
llvm-svn: 225019
2014-12-30 23:23:58 +00:00
Colin LeMahieu 8971e055ae [Hexagon] Adding double-logic on predicate instructions.
llvm-svn: 225018
2014-12-30 23:22:39 +00:00
David Blaikie 56605aff6b Generalize test case to handle different asm syntax (# or // comments)
llvm-svn: 225017
2014-12-30 23:21:57 +00:00
Colin LeMahieu 65f3e12ed1 [Hexagon] Adding newvalue compare and jumps.
llvm-svn: 225015
2014-12-30 23:04:21 +00:00
Peter Collingbourne 64faca9b84 RTDyldMemoryManager.cpp: Make the reference to __morestack weak.
This fixes the DSO build for now. Eventually we should develop some
other mechanism to make this work correctly with DSOs.

llvm-svn: 225014
2014-12-30 22:52:33 +00:00
David Blaikie aeaa5bf55e DebugInfo: Omit is_stmt from line table entries on the same line.
GCC does this for non-zero discriminators and since GCC doesn't produce
column info, that was the only place it comes up there. For LLVM, since
we can emit discriminators and/or column info, it makes more sense to
invert the condition and just test for changes in line number.

This should resolve at least some of the GDB 7.5 test suite failures
created by recent Clang changes that increase the location fidelity
(which, since Clang defaults to including column info on Linux by
default created a bunch of cases that confused GDB).

In theory we could do this better/differently by grouping actual source
statements together in a similar manner to the way lexical scopes are
handled but given that GDB isn't really in a position to consume that (&
users are probably somewhat used to different lines being different
'statements') this seems the safest and cheapest change. (I'm concerned
that doing this 'right' would bloat the debugloc data even further -
something Duncan's working hard to address)

llvm-svn: 225011
2014-12-30 22:47:13 +00:00
Colin LeMahieu 0cba5f1b43 [Hexagon] Adding postincrement register newvalue stores.
llvm-svn: 225010
2014-12-30 22:34:08 +00:00
Colin LeMahieu 9014890819 [Hexagon] Removing old newvalue store variants. Adding postincrement immediate newvalue stores.
llvm-svn: 225009
2014-12-30 22:28:31 +00:00
Zoran Jovanovic 10646918d1 [mips][microMIPS] Relocate with symbol for micromips symbols
Differential Revision: http://reviews.llvm.org/D6796

llvm-svn: 225008
2014-12-30 22:04:16 +00:00
Colin LeMahieu 820d5cb608 [Hexagon] Adding indexed store new-value variants.
llvm-svn: 225007
2014-12-30 22:00:26 +00:00
Colin LeMahieu 2bad4a7177 [Hexagon] Adding indexed store of immediates.
llvm-svn: 225006
2014-12-30 21:01:38 +00:00
Colin LeMahieu 94a498bf0e [Hexagon] Adding indexed stores.
llvm-svn: 225005
2014-12-30 20:42:23 +00:00
Peter Collingbourne 7ef497b1f5 x86_64: Fix calls to __morestack under the large code model.
Under the large code model, we cannot assume that __morestack lives within
2^31 bytes of the call site, so we cannot use pc-relative addressing. We
cannot perform the call via a temporary register, as the rax register may
be used to store the static chain, and all other suitable registers may be
either callee-save or used for parameter passing. We cannot use the stack
at this point either because __morestack manipulates the stack directly.

To avoid these issues, perform an indirect call via a read-only memory
location containing the address.

This solution is not perfect, as it assumes that the .rodata section
is laid out within 2^31 bytes of each function body, but this seems to
be sufficient for JIT.

Differential Revision: http://reviews.llvm.org/D6787

llvm-svn: 225003
2014-12-30 20:05:19 +00:00
Kostya Serebryany aa185bfc4b [asan] change _sanitizer_cov_module_init to accept int* instead of int**
llvm-svn: 224999
2014-12-30 19:29:28 +00:00
Michael Kuperstein c43b063358 [COFF] Don't try to add quotes to already quoted linker directives
If a linker directive is already quoted, don't try to quote it again, otherwise it creates a mess.
This pops up in places like:
#pragma comment(linker,"\"/foo bar'\"")

Differential Revision: http://reviews.llvm.org/D6792

llvm-svn: 224998
2014-12-30 19:23:48 +00:00
Colin LeMahieu 9161d47476 [Hexagon] Adding reg-reg indexed load forms.
llvm-svn: 224997
2014-12-30 18:58:47 +00:00
Peter Collingbourne c5de311417 The __morestack function is only available on i386 and x86_64 architectures.
llvm-svn: 224994
2014-12-30 18:22:06 +00:00
Peter Collingbourne 06e8543cd0 Make the __morestack function available to the JIT memory manager under Linux.
This function's implementation lives in libgcc, a static library, so we need
to expose it explicitly, like the other such functions.

Differential Revision: http://reviews.llvm.org/D6788

llvm-svn: 224993
2014-12-30 18:06:52 +00:00
Colin LeMahieu 82fb8cba16 [Hexagon] Dropping old combine instructions without encodings.
llvm-svn: 224992
2014-12-30 17:53:54 +00:00
Colin LeMahieu 377ac65340 [Hexagon] Adding compare byte/halfword reg-reg/reg-imm forms. Adding compare to general register reg-imm form.
llvm-svn: 224991
2014-12-30 17:39:24 +00:00
Colin LeMahieu d7a56fd9ff [Hexagon] Updating constant extender def, adding alu-not instructions, compare to general register, and inverted compares.
llvm-svn: 224989
2014-12-30 15:44:17 +00:00
Elena Demikhovsky 84d1997b95 Some code improvements in Masked Load/Store.
No functional changes.

llvm-svn: 224986
2014-12-30 14:28:14 +00:00
Rafael Espindola b22d5aa49a Remove doesSectionRequireSymbols.
In an assembly expression like

bar:
.long L0 + 1

the intended semantics is that bar will contain a pointer one byte past L0.

In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.

The solution used in ELF to use relocation with symbols if there is a non-zero
addend.

In MachO before this patch we would just keep all symbols in some sections.

This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.

This patch implements the non-zero addend logic for MachO too.

llvm-svn: 224985
2014-12-30 13:13:27 +00:00
Elena Demikhovsky 0705d0e8d4 reverted prev commit (it was a mistake)
llvm-svn: 224984
2014-12-30 10:17:21 +00:00
Elena Demikhovsky 74a35b4c5b v
llvm-svn: 224983
2014-12-30 10:13:38 +00:00
Philip Reames 4750dd18ea Add IRBuilder routines for gc.statepoints, gc.results, and gc.relocates
Nothing particularly interesting, just adding infrastructure for use by in tree users and out of tree users.

Note: These were extracted out of a working frontend, but they have not been well tested in isolation.

Differential Revision: http://reviews.llvm.org/D6807

llvm-svn: 224981
2014-12-30 05:55:58 +00:00
Rafael Espindola 2f471303c1 Simplify test a bit.
It looks like the original intent was to check which symbols were created.
With macho-dump the sections were being checked just to match which symbol
was in which section.

llvm-objdump prints the section a symbol is in.

llvm-svn: 224980
2014-12-30 05:09:17 +00:00
Peter Zotov ecb5c4f237 [OCaml] Fix bitrot in tests.
llvm-svn: 224979
2014-12-30 03:24:14 +00:00
Peter Zotov b45d5bd955 [lit] Make config.llvm_lib_dir available on cmake, too.
The OCaml tests require config.llvm_lib_dir to determine
the OCaml package search path.

llvm-svn: 224978
2014-12-30 03:24:11 +00:00
Peter Zotov 0bd44501c3 [OCaml] [cmake] Use LLVM_LIBRARY_DIR instead of LLVM_LIBRARY_OUTPUT_INTDIR.
The latter variable is internal.

Original patch by Ramkumar Ramachandra <artagnon@gmail.com>

llvm-svn: 224977
2014-12-30 03:24:07 +00:00
Craig Topper aa1c51ee01 Testcases for r224939.
llvm-svn: 224976
2014-12-30 02:35:56 +00:00
Rafael Espindola 81aef1bd52 Convert test to llvm-readobj. NFC.
llvm-svn: 224973
2014-12-30 01:34:06 +00:00
Philip Reames 5909ebcd6c Semantic tests for memory invalidation at statepoints
These are simply a collection of tests intended to show that information about the contents of gc references in the heap is lost at a statepoint. I've tried to write them so that they don't disallow correct transformations, while still being fairly easy to understand.

p.s. Ideas for additional tests are welcome.

Differential Revision: http://reviews.llvm.org/D6491

llvm-svn: 224971
2014-12-29 23:55:33 +00:00
Philip Reames 9db26ffc9a Carry facts about nullness and undef across GC relocation
This change implements four basic optimizations:

    If a relocated value isn't used, it doesn't need to be relocated.
    If the value being relocated is null, relocation doesn't change that. (Technically, this might be collector specific. I don't know of one which it doesn't work for though.)
    If the value being relocated is undef, the relocation is meaningless.
    If the value being relocated was known nonnull, the relocated pointer also isn't null. (Since it points to the same source language object.)

I outlined other planned work in comments.

Differential Revision: http://reviews.llvm.org/D6600

llvm-svn: 224968
2014-12-29 23:27:30 +00:00
Philip Reames b35f46ce06 Refine the notion of MayThrow in LICM to include a header specific version
In LICM, we have a check for an instruction which is guaranteed to execute and thus can't introduce any new faults if moved to the preheader. To handle a function which might unconditionally throw when first called, we check for any potentially throwing call in the loop and give up.

This is unfortunate when the potentially throwing condition is down a rare path. It prevents essentially all LICM of potentially faulting instructions where the faulting condition is checked outside the loop. It also greatly diminishes the utility of loop unswitching since control dependent instructions - which are now likely in the loops header block - will not be lifted by subsequent LICM runs.

define void @nothrow_header(i64 %x, i64 %y, i1 %cond) {
; CHECK-LABEL: nothrow_header
; CHECK-LABEL: entry
; CHECK: %div = udiv i64 %x, %y
; CHECK-LABEL: loop
; CHECK: call void @use(i64 %div)
entry:
  br label %loop
loop: ; preds = %entry, %for.inc
  %div = udiv i64 %x, %y
  br i1 %cond, label %loop-if, label %exit
loop-if:
  call void @use(i64 %div)
  br label %loop
exit:
  ret void
}

The current patch really only helps with non-memory instructions (i.e. divs, etc..) since the maythrow call down the rare path will be considered to alias an otherwise hoistable load.  The one exception is that it does kick in for loads which are known to be invariant without regard to other possible stores, i.e. those marked with either !invarant.load metadata of tbaa 'is constant memory' metadata.

Differential Revision: http://reviews.llvm.org/D6725

llvm-svn: 224965
2014-12-29 23:00:57 +00:00
Chandler Carruth cab512d629 [go] Teach the go cmake build functions to funnel the include directories down into the cgo-setup variables of llvm-go.
Summary:
This in turn allows us to use #includes with cgo that rely on CMake
provided include directories which is particularly useful for handling
generated headers that aren't reasonable to put in an "installable"
location.

Differential Revision: http://reviews.llvm.org/D6798

llvm-svn: 224962
2014-12-29 22:50:30 +00:00
Philip Reames 5ad26c353c Loading from null is valid outside of addrspace 0
This patches fixes a miscompile where we were assuming that loading from null is undefined and thus we could assume it doesn't happen.  This transform is perfectly legal in address space 0, but is not neccessarily legal in other address spaces.

We really should introduce a hook to control this property on a per target per address space basis.  We may be loosing valuable optimizations in some address spaces by being too conservative.

Original patch by Thomas P Raoux (submitted to llvm-commits), tests and formatting fixes by me.

llvm-svn: 224961
2014-12-29 22:46:21 +00:00
Rafael Espindola d351a18ebe Convert test to llvm-readobj. NFC.
llvm-svn: 224959
2014-12-29 22:14:35 +00:00
Colin LeMahieu 651b72095b [Hexagon] Adding allocframe, post-increment circular immediate stores, post-increment circular register stores, and bit reversed post-increment stores.
llvm-svn: 224957
2014-12-29 21:33:45 +00:00
Colin LeMahieu 488b6f7bbc [Hexagon] Fixing 224952 where an addressing mode update was missed.
llvm-svn: 224955
2014-12-29 21:18:02 +00:00
Alexey Samsonov 4c79845125 Remove unnecessary StringRef->std::string conversion.
llvm-svn: 224953
2014-12-29 20:59:02 +00:00
Colin LeMahieu bda31b42a0 [Hexagon] Adding post-increment register form stores and register-immediate form stores with tests.
llvm-svn: 224952
2014-12-29 20:44:51 +00:00
Colin LeMahieu 9a3cd3f58c [Hexagon] Replacing the remaining postincrement stores with versions that have encoding bits.
llvm-svn: 224951
2014-12-29 20:00:43 +00:00
Rafael Espindola 517b47232b Convert test to FileCheck. NFC.
llvm-svn: 224950
2014-12-29 19:50:32 +00:00
Colin LeMahieu 3d34afb32d [Hexagon] Renaming old multiclass for removal. Adding post-increment store classes and instruction defs.
llvm-svn: 224949
2014-12-29 19:42:14 +00:00
Chandler Carruth d86ab891c7 [py3] Teach the CMake build to reject Python versions older than 2.7.
Continue to require Python 2 however as recent experiments suggest
LLDB's build requires it.

llvm-svn: 224948
2014-12-29 19:36:05 +00:00
Craig Topper 31d6d9a0fb [X86] Fix some cases where some 8-bit instructions were marked as being convertible to three address instructions, but aren't really.
llvm-svn: 224940
2014-12-29 16:25:26 +00:00
Craig Topper 874a1966ae [X86] Add the 0x82 instructions to the disassebmler. They are identical in functionality to the 0x80 opcode instructions, but are not valid in 64-bit mode.
llvm-svn: 224939
2014-12-29 16:25:23 +00:00
Craig Topper c51b7993b8 [x86] Refactor some tablegen instruction info classes slightly to prepare for another change. NFC.
llvm-svn: 224938
2014-12-29 16:25:22 +00:00
Craig Topper 56c8e05c24 [x86] Remove unused classes from tablegen instruction info.
llvm-svn: 224937
2014-12-29 16:25:19 +00:00
Rafael Espindola 44eae72c40 Add segmented stack support for DragonFlyBSD.
Patch by Michael Neumann.

llvm-svn: 224936
2014-12-29 15:47:28 +00:00
Rafael Espindola bed67f3adc Refactor duplicated code.
No intended functionality change.

llvm-svn: 224935
2014-12-29 15:18:31 +00:00
Chandler Carruth 9db2b52468 [multilib] Add support to the autoconf build to substitute
a CLANG_LIBDIR_SUFFIX variable. This is necessary before I can add
support for using that variable to CMake and the C++ code in Clang, and
the autoconf build system does all substitutions in the LLVM tree.

As mentioned before, I'm not planning to add actual multilib support to
the autoconf build, just enough stubs for it to keep playing nicely with
the CMake build once that one has support.

llvm-svn: 224922
2014-12-29 11:58:17 +00:00
Chandler Carruth 7d58776fad [cmake] Teach the llvm-config program to respect LLVM_LIBDIR_SUFFIX.
For this to work, we have to encode it in the build variables and use it
from llvm-config.cpp. I've tried to do this reasonably cleanly, but the
code for llvm-config.cpp is pretty strange. However, with this,
llvm-config stops giving the wrong answer when using LLVM_LIBDIR_SUFFIX.

Note that the configure+make build just sets this to an empty string as
that build system has zero support for multilib of any form. I'm not
planning to add support there either, but this should leave a path for
anyone that wanted to.

llvm-svn: 224921
2014-12-29 11:16:25 +00:00
Chandler Carruth ab8df0b6c6 [cmake] Push LLVM_LIBDIR_SUFFIX through to the LLVMConfig.cmake file
that is used by other projects to build against LLVM. This will allow
subsequent patches to them to use LLVM_LIBDIR_SUFFIX, both when built as
part of the larger LLVM build an as part of a standalone build against
an installed set of LLVM libraries.

llvm-svn: 224920
2014-12-29 11:16:23 +00:00
Chandler Carruth a78e24e548 [cmake] Start making LLVM_LIBDIR_SUFFIX effective by adding it to
*numerous* places where it was missing in the CMake build. The primary
change here is that the suffix is now actually used for all of the lib
directories in the LLVM project's CMake. The various subprojects still
need similar treatment.

This is the first of a series of commits to try to make LLVM's cmake
effective in a multilib Linux installation. I don't think many people
are seriously using this variable so I'm hoping the fallout will be
minimal. A somewhat unfortunate consequence of the nature of these
commits is that until I land all of them, they will in part make the
brokenness of our multilib support more apparant. At the end, things
should actually work.

llvm-svn: 224919
2014-12-29 11:16:19 +00:00
Elena Demikhovsky e86c8c807f Fixed 2 minor typos in the documentation.
llvm-svn: 224917
2014-12-29 09:47:51 +00:00
NAKAMURA Takumi 4099328596 llvm/test/CodeGen/X86/fast-isel-call-bool.ll: Add explicit -mtriple=x86_64-unknown to satisfy x64.
llvm-svn: 224907
2014-12-28 23:37:11 +00:00
Keno Fischer fd22c6693b [X86][ISel] Fix a regression I introduced in r224884
The else case ResultReg was not checked for validity.
To my surprise, this case was not hit in any of the
existing test cases. This includes a new test cases
that tests this path.

Also drop the `target triple` declaration from the
original test as suggested by H.J. Lu, because
apparently with it the test won't be run on Linux

llvm-svn: 224901
2014-12-28 15:20:57 +00:00
Michael Kuperstein 683c3cde43 [X86] Add missing memory variants to AVX false dependency breaking
Adds missing memory instruction variants to AVX false dependency breaking handling. (SSE was handled in r224246)

Differential Revision: http://reviews.llvm.org/D6780

llvm-svn: 224900
2014-12-28 13:15:05 +00:00
Andrea Di Biagio 22ee3f63b9 [CodeGenPrepare] Teach when it is profitable to speculate calls to @llvm.cttz/ctlz.
If the control flow is modelling an if-statement where the only instruction in
the 'then' basic block (excluding the terminator) is a call to cttz/ctlz,
CodeGenPrepare can try to speculate the cttz/ctlz call and simplify the control
flow graph.

Example:
\code
entry:
  %cmp = icmp eq i64 %val, 0
  br i1 %cmp, label %end.bb, label %then.bb

then.bb:
  %c = tail call i64 @llvm.cttz.i64(i64 %val, i1 true)
  br label %end.bb

end.bb:
  %cond = phi i64 [ %c, %then.bb ], [ 64, %entry]
\code

In this example, basic block %then.bb is taken if value %val is not zero.
Also, the phi node in %end.bb would propagate the size-of in bits of %val
only if %val is equal to zero.

With this patch, CodeGenPrepare will try to hoist the call to cttz from %then.bb
into basic block %entry only if cttz is cheap to speculate for the target.

Added two new hooks in TargetLowering.h to let targets customize the behavior
(i.e. decide whether it is cheap or not to speculate calls to cttz/ctlz). The
two new methods are 'isCheapToSpeculateCtlz' and 'isCheapToSpeculateCttz'.
By default, both methods return 'false'.
On X86, method 'isCheapToSpeculateCtlz' returns true only if the target has
LZCNT. Method 'isCheapToSpeculateCttz' only returns true if the target has BMI.

Differential Revision: http://reviews.llvm.org/D6728

llvm-svn: 224899
2014-12-28 11:07:35 +00:00
Elena Demikhovsky 87700a734d Scalarizer for masked load and store intrinsics.
Masked vector intrinsics are a part of common LLVM IR, but they are really supported on AVX2 and AVX-512 targets. I added a code that translates masked intrinsic for all other targets. The masked vector intrinsic is converted to a chain of scalar operations inside conditional basic blocks.

http://reviews.llvm.org/D6436

llvm-svn: 224897
2014-12-28 08:54:45 +00:00
Craig Topper 6e3a582809 [x86] Prevent instruction selection of AVX512 cmp.ps/pd/ss/sd intrinsics with illegal immediates. Correctly this time. I did the wrong patterns the first time.
llvm-svn: 224891
2014-12-27 20:08:45 +00:00
David Majnemer d0bcef2040 PowerPC: CTR shouldn't fire if a TLS call is in the loop
Determining the address of a TLS variable results in a function call in
certain TLS models.  This means that a simple ICmpInst might actually
result in invalidating the CTR register.

In such cases, do not attempt to rely on the CTR register for loop
optimization purposes.

This fixes PR22034.

Differential Revision: http://reviews.llvm.org/D6786

llvm-svn: 224890
2014-12-27 19:45:38 +00:00
Aaron Ballman 4eb5c2e089 Fixing another -Wunused-variable warning, this time in release builds without asserts. NFC.
llvm-svn: 224889
2014-12-27 19:17:53 +00:00
Aaron Ballman b66d54c549 Removing a variable that is set but never used, to silence a -Wunused-but-set-variable warning; NFC.
llvm-svn: 224888
2014-12-27 19:01:19 +00:00
Craig Topper 1113fb343e [x86] Prevent instruction selection of AVX512 cmp.ps/pd/ss/sd intrinsics with illegal immediates. Forgot to do this when I did SSE/SSE2/AVX/AVX2.
llvm-svn: 224887
2014-12-27 18:51:06 +00:00
Craig Topper 53f75b9dc0 [x86] Assert on invalid immediates in the instruction printer for cmp.ps/pd/ss/sd instead of truncating the immediate. The assembly parser and instruction selection shouldn't generate invalid immediates.
llvm-svn: 224886
2014-12-27 18:11:00 +00:00
Craig Topper acc73445b7 [x86] Prevent llvm.x86.cmp.ps/pd/ss/sd from being selected with bad immediates. The frontend now checks this when the builtin is used. This will allow the instruction printer to not have to deal with invalid immediates on these instructions.
llvm-svn: 224885
2014-12-27 18:10:56 +00:00
Keno Fischer 8438b08663 [FastIsel][X86] Fix invalid register replacement for bool args
Summary:
Consider the following IR:

  %3 = load i8* undef
  %4 = trunc i8 %3 to i1
  %5 = call %jl_value_t.0* @foo(..., i1 %4, ...)
  ret %jl_value_t.0* %5

Bools (that are the result of direct truncs) are lowered as whatever
the argument to the trunc was and a "and 1", causing the part of the
MBB responsible for this argument to look something like this:

  %vreg8<def,tied1> = AND8ri %vreg7<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg8,%vreg7

Later, when the load is lowered, it will insert

  %vreg15<def> = MOV8rm %vreg14, 1, %noreg, 0, %noreg; mem:LD1[undef] GR8:%vreg15 GR64:%vreg14

but remember to (at the end of isel) replace vreg7 by vreg15. Now for
the bug. In fast isel lowering, we mistakenly mark vreg8 as the result
of the load instead of the trunc. This adds a fixup to have
vreg8 replaced by whatever the result of the load is as well, so
we end up with

  %vreg15<def,tied1> = AND8ri %vreg15<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg15

which is an SSA violation and causes problems later down the road.

This fixes PR21557.

Test Plan: Test test case from PR21557 is added to the test suite.

Reviewers: ributzka

Reviewed By: ributzka

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6245

llvm-svn: 224884
2014-12-27 13:10:15 +00:00
Rafael Espindola 47beb8ace0 Convert test to llvm-readobj. NFC.
llvm-svn: 224872
2014-12-26 22:47:39 +00:00
Colin LeMahieu 8233fb002d [Hexagon] Adding auto-incrementing loads with and without byte reversal.
llvm-svn: 224871
2014-12-26 21:09:25 +00:00
Colin LeMahieu 0a721cd4e1 [Hexagon] Adding locked loads.
llvm-svn: 224870
2014-12-26 20:42:27 +00:00
Colin LeMahieu ff370ed90e [Hexagon] Adding deallocframe and circular addressing loads.
llvm-svn: 224869
2014-12-26 20:30:58 +00:00
Colin LeMahieu c83cbbf6a1 [Hexagon] Adding remaining post-increment instruction variants. Removing unused classes.
llvm-svn: 224868
2014-12-26 19:31:46 +00:00
Colin LeMahieu fe9612e09d [Hexagon] Adding post-increment unsigned byte loads.
llvm-svn: 224867
2014-12-26 19:12:11 +00:00
Colin LeMahieu 96976a10a3 [Hexagon] Adding post-increment signed byte loads with tests.
llvm-svn: 224866
2014-12-26 18:57:13 +00:00
Rafael Espindola 1a7ef86add Use llvm-readobj. NFC.
llvm-svn: 224864
2014-12-26 18:22:05 +00:00
Craig Topper c4b12166f2 [X86] Add the debug registers DR8-DR15 so we can assemble and disassemble references to them.
llvm-svn: 224862
2014-12-26 18:20:05 +00:00
Craig Topper d5b39237a1 [X86] Don't fail disassembly if REX.R/REX.B is used on an MMX register. Similar fix to not fail to disassembler CR9-CR15 references.
llvm-svn: 224861
2014-12-26 18:19:44 +00:00
Timur Iskhodzhanov b6fa52f274 Band-aid fix for PR22032: don't emit DWARF debug info if AddressSanitizer is enabled on Windows
llvm-svn: 224860
2014-12-26 17:00:51 +00:00
Rafael Espindola 5d94634c13 No need to run llvm-as. NFC.
llvm-svn: 224859
2014-12-26 16:42:47 +00:00
David Majnemer b1296ec0fd InstCombine: Infer nuw for multiplies
A multiply cannot unsigned wrap if there are bitwidth, or more, leading
zero bits between the two operands.

llvm-svn: 224849
2014-12-26 09:50:35 +00:00
David Majnemer a55027f68a ValueTracking: Small cleanup in ComputeNumSignBits
Constant contains the isAllOnesValue and isNullValue predicates, not
ConstantInt.

llvm-svn: 224848
2014-12-26 09:20:17 +00:00
David Majnemer 54c2ca2539 InstCombe: Infer nsw for multiplies
We already utilize this logic for reducing overflow intrinsics, it makes
sense to reuse it for normal multiplies as well.

llvm-svn: 224847
2014-12-26 09:10:14 +00:00
Craig Topper ee9eef2fd8 Teach disassembler to handle illegal immediates on (v)cmpps/pd/ss/sd instructions. Instead of rejecting we'll just generate the _alt forms that don't try to alter the mnemonic. While I'm here, merge some common code in the Instruction printers for the condition code replacement and fix the mask on SSE to be 3-bits instead of 4.
llvm-svn: 224846
2014-12-26 06:36:28 +00:00
Craig Topper 2e44492b1d Use MCPhysReg for table of register encodings.
llvm-svn: 224845
2014-12-26 06:36:23 +00:00
Hal Finkel 0c505b08a5 [PowerPC] [FastISel] i1 constants must be zero extended
When materializing constant i1 values, they must be zero extended. We represent
i1 values as [0, 1], not [0, -1], in i32 registers. As it turns out, this code
path was dead for i1 values prior to r216006 (which is why this did not manifest in
miscompiles until recently).

Fixes -O0 self-hosting on PPC64/Linux.

llvm-svn: 224842
2014-12-25 23:08:25 +00:00
David Majnemer 25b383ac66 Silence GCC's -Wparentheses warning
No functionality change intended.

llvm-svn: 224833
2014-12-25 10:03:23 +00:00
Elena Demikhovsky 3d13f1c82c Documentation for Masked Load and Store intrinsics.
llvm-svn: 224832
2014-12-25 09:29:13 +00:00
Elena Demikhovsky fb81b93e17 Masked Load/Store - Changed the order of parameters in intrinsics.
No functional changes.
The documentation is coming.

llvm-svn: 224829
2014-12-25 07:49:20 +00:00
David Majnemer 2913eca4e2 CodeGen: Error on redefinitions instead of asserting
It's possible to have a prior definition of a symbol in module asm.
Raise an error instead of crashing.

llvm-svn: 224828
2014-12-24 23:06:55 +00:00
David Majnemer 8e92dfee20 CodeGen: Allow aliases to be overridden by variables
llvm-svn: 224827
2014-12-24 22:44:29 +00:00
Saleem Abdulrasool 747ec2dda3 MC: address some comments in deprecation checks
Bob Wilson pointed out the unnecessary checks that had been committed to the
instruction check predicates.  The check was meant to ensure that the check was
not accidentally applied to non-ARM instructions.  This is better served as an
assertion rather than a condition check.

llvm-svn: 224825
2014-12-24 18:40:42 +00:00
David Majnemer 58cb80c940 MC: Label definitions are permitted after .set directives
.set directives may be overridden by other .set directives as well as
label definitions.

This fixes PR22019.

llvm-svn: 224811
2014-12-24 10:27:50 +00:00
Saleem Abdulrasool 4d6ed7c778 IAS: correct debug line info for asm macros
Correct the line information generation for preprocessed assembly.  Although we
tracked the source information for the macro instantiation, we failed to account
for the fact that we were instantiating a macro, which is populated into a new
buffer and that the line information would be relative to the definition rather
than the actual instantiation location.  This could cause the line number
associated with the statement to be very high due to wrapping of the difference
calculated for the preprocessor line information emitted into the stream.
Properly calculate the line for the macro instantiation, referencing the line
where the macro is actually used as GCC/gas do.

The test case uses x86, though the same problem exists on any other target using
the LLVM IAS.

llvm-svn: 224810
2014-12-24 06:32:43 +00:00
Craig Topper b86338f7b2 [X86] Remove the single AdSize indicator and replace it with separate AdSize16/32/64 flags.
This removes a hardcoded list of instructions in the CodeEmitter. Eventually I intend to remove the predicates on the affected instructions since in any given mode two of them are valid if we supported addr32/addr16 prefixes in the assembler.

llvm-svn: 224809
2014-12-24 06:05:22 +00:00
David Majnemer 0fe246e079 MC: Don't emit .no_dead_strip on targets which don't support it
llvm-svn: 224808
2014-12-24 04:11:42 +00:00
Matthias Braun 51ca510094 LiveInterval: Remove accidentally committed debug code.
llvm-svn: 224807
2014-12-24 02:35:07 +00:00
Matthias Braun dbcca0dbb4 LiveInterval: Introduce createMainRangeFromSubranges().
This function constructs the main liverange by merging all subranges if
subregister liveness tracking is available. This should be slightly
faster to compute instead of performing the liveness calculation again
for the main range. More importantly it avoids cases where the main
liverange would cover positions where no subrange was live. These cases
happened for partial definitions where the actual defined part was dead
and only the undefined parts used later.

The register coalescing requires that every part covered by the main
live range has at least one subrange live.

I also expect this function to become usefull later for places where the
subranges are modified in a way that it is hard to correctly fix the
main liverange in the machine scheduler, we can simply reconstruct it
from subranges then.

llvm-svn: 224806
2014-12-24 02:11:51 +00:00
Matthias Braun 7030dda8d5 RegisterCoalescer: With subrange liveness there may be no RedefVNI for unused lanes.
llvm-svn: 224805
2014-12-24 02:11:48 +00:00
Matthias Braun 36768c684f LiveRangeEdit: Check for completely empy subranges after removing ValNos.
Completely empty subranges are not allowed and must be removed when
subreg liveness is enabled.

llvm-svn: 224804
2014-12-24 02:11:46 +00:00
Matthias Braun f603c88d13 LiveIntervalAnalysis: Fix performance bug that I introduced in r224663.
Without a reference the code did not remember when moving the iterators
of the subranges/registerunit ranges forward and instead would scan from
the beginning again at the next position.

llvm-svn: 224803
2014-12-24 02:11:43 +00:00
Peter Zotov 283e20219e [OCaml] PR21901: Update tests.
This finishes the fix partially applied by r224782.

llvm-svn: 224802
2014-12-24 01:58:45 +00:00
Peter Zotov af6535bf12 [OCaml] Expose Llvm_executionengine.get_{global_value,function}_address.
Patch by Ramkumar Ramachandra <artagnon@gmail.com>.

Also remove Llvm_executionengine.get_pointer_to_global, as it
is actually deprecated and didn't appear in a stable release.

llvm-svn: 224801
2014-12-24 01:52:51 +00:00