Commit Graph

22013 Commits

Author SHA1 Message Date
Tim Northover a4173715f7 ARM: fix folding of stack-adjustment (yet again).
When trying to eliminate an "sub sp, sp, #N" instruction by folding
it into an existing push/pop using dummy registers, we need to account
for the fact that this might affect precisely how "fp" gets set in the
prologue.

We were attempting this, but assuming that *whenever* we performed a
fold it would make a difference. This is false, for example, in:
    push {r4, r7, lr}
    add fp, sp, #4
    vpush {d8}
    sub sp, sp, #8

we can fold the "sub" into the "vpush", forming "vpush {d7, d8}".
However, in that case the "add fp" instruction mustn't change, which
we were getting wrong before.

Should fix PR18160.

llvm-svn: 196725
2013-12-08 15:56:50 +00:00
Michael Kuperstein e8cb16b58e Fixed CRLF
llvm-svn: 196719
2013-12-08 12:16:20 +00:00
Michael Kuperstein bb404ec3e5 Ensure bitcode encoding of visibility styles stays stable. Patch by Boaz Ouriel.
llvm-svn: 196718
2013-12-08 11:35:09 +00:00
Mark Seaborn 1b3dd3527e Fix inlining to not lose the "cleanup" clause from landingpads
This fixes PR17872.  This bug can lead to C++ destructors not being
called when they should be, when an exception is thrown.

llvm-svn: 196711
2013-12-08 00:51:21 +00:00
Mark Seaborn ef3dbb93ec Fix inlining to not produce duplicate landingpad clauses
Before this change, inlining one "invoke" into an outer "invoke" call
site can lead to the outer landingpad's catch/filter clauses being
copied multiple times into the resulting landingpad.  This happens:

 * when the inlined function contains multiple "resume" instructions,
   because forwardResume() copies the clauses but is called multiple
   times;

 * when the inlined function contains a "resume" and a "call", because
   HandleCallsInBlockInlinedThroughInvoke() copies the clauses but is
   redundant with forwardResume().

Fix this by deduplicating the code.

This problem doesn't lead to any incorrect execution; it's only
untidy.

This change will make fixing PR17872 a little easier.

llvm-svn: 196710
2013-12-08 00:50:58 +00:00
Renato Golin c6b580ac12 force vector width via cpu on vectorizer metadata enable
llvm-svn: 196669
2013-12-07 21:46:08 +00:00
NAKAMURA Takumi db47bcc63d Remove empty MCJIT/load-object-a.ll since r196641.
llvm-svn: 196645
2013-12-07 06:17:10 +00:00
Lang Hames 567befd88f Revert r196639 while I investigate a bot failure.
llvm-svn: 196641
2013-12-07 04:25:19 +00:00
Lang Hames a691358078 Add support for archives and object file caching under MCJIT.
Patch by Andy Kaylor, with minor edits to resolve merge conflicts.

llvm-svn: 196639
2013-12-07 03:05:51 +00:00
Matt Arsenault bbf18c6958 Fix assert with copy from global through addrspacecast
llvm-svn: 196638
2013-12-07 02:58:45 +00:00
Akira Hatanaka 42a91ef2ac [mips] Fix test case.
Indent the command lines to indicate they continue from previous lines. Also,
fix incorrect uses of CHECK-DAG and CHECK-NOT.

llvm-svn: 196636
2013-12-07 02:48:29 +00:00
Vincent Lejeune 92b0a64906 Add a RequireStructuredCFG Field to TargetMachine.
llvm-svn: 196634
2013-12-07 01:49:19 +00:00
Yuchen Wu a3bed4bd37 llvm-cov: Added test.h header to tests.
llvm-svn: 196632
2013-12-07 01:28:11 +00:00
Kaelyn Uhrain 4e8656077c Fix the segfault reported in PR 11990.
The sefault occurs due to an infinite loop when the verifier tries to
determine the size of a type of the form "%rt = type { %rt }" while
checking an alloca of the type.

llvm-svn: 196626
2013-12-07 00:13:34 +00:00
Duncan P. N. Exon Smith ce5f93efd5 Don't use isNullValue to evaluate ConstantExpr
ConstantExpr can evaluate to false even when isNullValue gives false.

Fixes PR18143.

llvm-svn: 196611
2013-12-06 21:48:36 +00:00
Yuchen Wu 0db73111a9 llvm-cov: Regenerated gcov files with r195513 changes.
llvm-svn: 196609
2013-12-06 21:33:50 +00:00
David Peixotto 2cdc56d26b Integrated assembler incorrectly lexes ARM-style comments
The integrated assembler fails to properly lex arm comments when
they are adjacent to an identifier in the input stream. The reason
is that the arm comment symbol '@' is also used as symbol variant in
other assembly languages so when lexing an identifier it allows the
'@' symbol as part of the identifier.

Example:
  $ cat comment.s
  foo:
    add r0, r0@got to parse this as a comment

  $ llvm-mc -triple armv7 comment.s
  comment.s:4:18: error: unexpected token in argument list
    add r0, r0@got to parse this as a comment
                   ^
This should be parsed as correctly as `add r0, r0`.

This commit modifes the assembly lexer to not include the '@' symbol
in identifiers when lexing for targets that use '@' for comments.

llvm-svn: 196607
2013-12-06 20:35:58 +00:00
David Blaikie 2666e24ca5 DebugInfo: Ensure unit IDs (for non-skeletal units) match thein index in the list
This simplifies reasoning about the code and enables simple navigation
from a skeleton to its full unit. (currently there are no type unit
skeletons, so the skeleton list doesn't have the same ID == index
property)

Eventually we should get rid of this ID and just store the labels we
need as the IDs are allowing this code to create difficult to
manage/understand associations (loops over non-skeletal units are
implicitly referencing their skeletal units during pub* emission, for
example). It may be necessary to have some kind of skeleton->full unit
association and a more direct pointer or similar device would be
preferable than an index.

llvm-svn: 196600
2013-12-06 19:38:46 +00:00
Weiming Zhao 43d8e6cb3b Bug 18149: [AArch32] VSel instructions has no ARMCC field
The current peephole optimizing for compare inst assumes an instr that
uses CPSR has an MO for ARM Cond code.However, for VSEL instructions
(vseqeq, vselgt, vselgt, vselvs), there is no such operand nor do
they support the modification of Cond Code.

llvm-svn: 196588
2013-12-06 17:56:48 +00:00
Cameron McInally e3cc4aacb9 Update AVX512 vector blend intrinsic names.
llvm-svn: 196581
2013-12-06 13:35:35 +00:00
Richard Sandiford 198ddf83c1 [SystemZ] Use LOAD AND TEST for comparisons with -0
...since it os equivalent to comparison with +0.

llvm-svn: 196580
2013-12-06 09:59:12 +00:00
Richard Sandiford 7b4118a0fc [SystemZ] Extend the use of C(L)GFR
instcombine prefers to put extended operands first, so this patch
handles that case for C(L)GFR.

llvm-svn: 196579
2013-12-06 09:56:50 +00:00
Richard Sandiford 48ef6abddc [SystemZ] Optimize selects between 0 and -1
Since z has no setcc instruction as such, the choice of setBooleanContents
is a bit arbitrary.  Currently it's set to ZeroOrOneBooleanContent,
so we produced a branch-free form when selecting between 0 and 1,
but not when selecting between 0 and -1.  This patch handles the latter
case too.

At some point I'd like to measure whether it's better to use conditional
moves for constant selects on z196, but that's future work.

llvm-svn: 196578
2013-12-06 09:53:09 +00:00
Kostya Serebryany 4fb7801b3f [asan] rewrite asan's stack frame layout
Summary:
Rewrite asan's stack frame layout.
First, most of the stack layout logic is moved into a separte file
to make it more testable and (potentially) useful for other projects.
Second, make the frames more compact by using adaptive redzones
(smaller for small objects, larger for large objects).
Third, try to minimized gaps due to large alignments (this is hypothetical since
today we don't see many stack vars aligned by more than 32).

The frames indeed become more compact, but I'll still need to run more benchmarks
before committing, but I am sking for review now to get early feedback.

This change will be accompanied by a trivial change in compiler-rt tests
to match the new frame sizes.

Reviewers: samsonov, dvyukov

Reviewed By: samsonov

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2324

llvm-svn: 196568
2013-12-06 09:00:17 +00:00
Juergen Ributzka e3cba95d5c [Stackmap] Update stackmap unit test to use AnyRegCC.
llvm-svn: 196552
2013-12-06 00:28:54 +00:00
Yi Jiang 01cfa94212 Apply transformation on OS X 10.9+ and iOS 7.0+: pow(10, x) ―> __exp10(x)
llvm-svn: 196544
2013-12-05 22:42:50 +00:00
Renato Golin e593fea5f7 Move test to X86 dir
Test is platform independent, but I don't want to force vector-width, or
that could spoil the pragma test.

llvm-svn: 196539
2013-12-05 21:45:39 +00:00
Renato Golin 729a3ae90a Add #pragma vectorize enable/disable to LLVM
The intended behaviour is to force vectorization on the presence
of the flag (either turn on or off), and to continue the behaviour
as expected in its absence. Tests were added to make sure the all
cases are covered in opt. No tests were added in other tools with
the assumption that they should use the PassManagerBuilder in the
same way.

This patch also removes the outdated -late-vectorize flag, which was
on by default and not helping much.

The pragma metadata is being attached to the same place as other loop
metadata, but nothing forbids one from attaching it to a function
(to enable #pragma optimize) or basic blocks (to hint the basic-block
vectorizers), etc. The logic should be the same all around.

Patches to Clang to produce the metadata will be produced after the
initial implementation is agreed upon and committed. Patches to other
vectorizers (such as SLP and BB) will be added once we're happy with
the pass manager changes.

llvm-svn: 196537
2013-12-05 21:20:02 +00:00
Yuchen Wu 9af3938b51 llvm-cov: Changed extension from .llcov to .gcov.
llvm-svn: 196530
2013-12-05 20:45:36 +00:00
Andrew Trick 880e573d98 MI-Sched: handle latency of in-order operations with the new machine model.
The per-operand machine model allows the target to define "unbuffered"
processor resources. This change is a quick, cheap way to model stalls
caused by the latency of operations that use such resources. This only
applies when the processor's micro-op buffer size is non-zero
(Out-of-Order). We can't precisely model in-order stalls during
out-of-order execution, but this is an easy and effective
heuristic. It benefits cortex-a9 scheduling when using the new
machine model, which is not yet on by default.

MI-Sched for armv7 was evaluated on Swift (and only not enabled because
of a performance bug related to predication). However, we never
evaluated Cortex-A9 performance on MI-Sched in its current form. This
change adds MI-Sched functionality to reach performance goals on
A9. The only remaining change is to allow MI-Sched to run as a PostRA
pass.

I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7:
-mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false

For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results:
(min run time over 2 runs, filtering tiny changes)

Speedups:
| Benchmarks/BenchmarkGame/recursive         |  52.39% |
| Benchmarks/VersaBench/beamformer           |  20.80% |
| Benchmarks/Misc/pi                         |  19.97% |
| Benchmarks/Misc/mandel-2                   |  19.95% |
| SPEC/CFP2000/188.ammp                      |  18.72% |
| Benchmarks/McCat/08-main/main              |  18.58% |
| Benchmarks/Misc-C++/Large/sphereflake      |  18.46% |
| Benchmarks/Olden/power                     |  17.11% |
| Benchmarks/Misc-C++/mandel-text            |  16.47% |
| Benchmarks/Misc/oourafft                   |  15.94% |
| Benchmarks/Misc/flops-7                    |  14.99% |
| Benchmarks/FreeBench/distray               |  14.26% |
| SPEC/CFP2006/470.lbm                       |  14.00% |
| mediabench/mpeg2/mpeg2dec/mpeg2decode      |  12.28% |
| Benchmarks/SmallPT/smallpt                 |  10.36% |
| Benchmarks/Misc-C++/Large/ray              |   8.97% |
| Benchmarks/Misc/fp-convert                 |   8.75% |
| Benchmarks/Olden/perimeter                 |   7.10% |
| Benchmarks/Bullet/bullet                   |   7.03% |
| Benchmarks/Misc/mandel                     |   6.75% |
| Benchmarks/Olden/voronoi                   |   6.26% |
| Benchmarks/Misc/flops-8                    |   5.77% |
| Benchmarks/Misc/matmul_f64_4x4             |   5.19% |
| Benchmarks/MiBench/security-rijndael       |   5.15% |
| Benchmarks/Misc/flops-6                    |   5.10% |
| Benchmarks/Olden/tsp                       |   4.46% |
| Benchmarks/MiBench/consumer-lame           |   4.28% |
| Benchmarks/Misc/flops-5                    |   4.27% |
| Benchmarks/mafft/pairlocalalign            |   4.19% |
| Benchmarks/Misc/himenobmtxpa               |   4.07% |
| Benchmarks/Misc/lowercase                  |   4.06% |
| SPEC/CFP2006/433.milc                      |   3.99% |
| Benchmarks/tramp3d-v4                      |   3.79% |
| Benchmarks/FreeBench/pifft                 |   3.66% |
| Benchmarks/Ptrdist/ks                      |   3.21% |
| Benchmarks/Adobe-C++/loop_unroll           |   3.12% |
| SPEC/CINT2000/175.vpr                      |   3.12% |
| Benchmarks/nbench                          |   2.98% |
| SPEC/CFP2000/183.equake                    |   2.91% |
| Benchmarks/Misc/perlin                     |   2.85% |
| Benchmarks/Misc/flops-1                    |   2.82% |
| Benchmarks/Misc-C++-EH/spirit              |   2.80% |
| Benchmarks/Misc/flops-2                    |   2.77% |
| Benchmarks/NPB-serial/is                   |   2.42% |
| Benchmarks/ASC_Sequoia/CrystalMk           |   2.33% |
| Benchmarks/BenchmarkGame/n-body            |   2.28% |
| Benchmarks/SciMark2-C/scimark2             |   2.27% |
| Benchmarks/Olden/bh                        |   2.03% |
| skidmarks10/skidmarks                      |   1.81% |
| Benchmarks/Misc/flops                      |   1.72% |

Slowdowns:
| Benchmarks/llubenchmark/llu                | -14.14% |
| Benchmarks/Polybench/stencils/seidel-2d    |  -5.67% |
| Benchmarks/Adobe-C++/functionobjects       |  -5.25% |
| Benchmarks/Misc-C++/oopack_v1p8            |  -5.00% |
| Benchmarks/Shootout/hash                   |  -2.35% |
| Benchmarks/Prolangs-C++/ocean              |  -2.01% |
| Benchmarks/Polybench/medley/floyd-warshall |  -1.98% |
| Polybench/linear-algebra/kernels/3mm       |  -1.95% |
| Benchmarks/McCat/09-vor/vor                |  -1.68% |

llvm-svn: 196516
2013-12-05 17:55:58 +00:00
Arnold Schwaighofer 7ee53cac80 SLPVectorizer: An in-tree vectorized entry cannot also be a scalar external use
We were creating external uses for scalar values in MustGather entries that also
had a ScalarToTreeEntry (they also are present in a vectorized tuple). This
meant we would keep a value 'alive' as a scalar and vectorized causing havoc.
This is not necessary because when we create a MustGather vector we explicitly
create external uses entries for the insertelement instructions of the
MustGather vector elements.

Fixes PR18129.

radar://15582184

llvm-svn: 196508
2013-12-05 15:14:40 +00:00
Kostya Serebryany 2460c3fc73 [tsan] fix PR18146: sometimes a variable written into vptr could have an integer type (after other optimizations)
llvm-svn: 196507
2013-12-05 15:03:02 +00:00
Justin Holewinski 4459717bab [NVPTX] Fix off-by-one error when creating the VT list for an SDNode
llvm-svn: 196503
2013-12-05 12:58:00 +00:00
Matheus Almeida a6beac1acc [mips] Small code generation improvement for conditional operator (select)
in case the operands are constants and its difference is |1|.
It should be possible in those cases to rematerialize the result using
MIPS's slt and similar instructions.

The small update to some of the tests in cmov.ll, sel1c.ll and sel2c.ll was needed
otherwise the optimization implemented in this patch would have been triggered
(difference between the operands was 1) and that would have changed the semantic
of the tests.

llvm-svn: 196498
2013-12-05 12:07:05 +00:00
Matheus Almeida 6b59c449d9 [mips][msa] Fix issue with immediate fields of LD/ST instructions
not being correctly encoded/decoded.
In more detail, immediate fields of LD/ST instructions should be
divided/multiplied by the size of the data format before encoding and
after decoding, respectively.

llvm-svn: 196494
2013-12-05 11:06:22 +00:00
Tim Northover e4def5e228 ARM: fix yet another stack-folding bug
We were trying to fold the stack adjustment into the wrong instruction in the
situation where the entire basic-block was epilogue code. Really, it can only
ever be valid to do the folding precisely where the "add sp, ..." would be
placed so there's no need for a separate iterator to track that.

Should fix PR18136.

llvm-svn: 196493
2013-12-05 11:02:02 +00:00
Alp Toker f907b891da Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

llvm-svn: 196471
2013-12-05 05:44:44 +00:00
Rafael Espindola 01d19d0299 Hide the stub created for MO_ExternalSymbol too.
given

declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
declare void @foo()
define void @bar() {
  call void @foo()
  call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false)
  ret void
}

We used to produce

L_foo$stub:
        .indirect_symbol        _foo
        .ascii  "\364\364\364\364\364"

_memset$stub:
        .indirect_symbol        _memset
        .ascii  "\364\364\364\364\364"

We not produce a private stub for memset too.

Stubs are not needed with recent linkers, but we still produce them for darwin8.

Thanks to David Fang for confirming that gcc used to do this too.

llvm-svn: 196468
2013-12-05 05:19:12 +00:00
Matt Arsenault 89cc49fe5d R600/SI: Add comments for number of used registers.
llvm-svn: 196467
2013-12-05 05:15:35 +00:00
NAKAMURA Takumi 57b20a7e54 Move llvm/test/MC/ELF/thumb-st_other.s to test/MC/ARM.
llvm-svn: 196457
2013-12-05 02:21:44 +00:00
Jiangning Liu 65d8e3422a For AArch64, add missing register cost calculation for big value types like v4i64 and v8i64.
llvm-svn: 196456
2013-12-05 02:12:01 +00:00
Cameron McInally 164097a6eb Add FileCheck statements for r196435.
llvm-svn: 196449
2013-12-05 01:20:36 +00:00
Eric Christopher c4dd56b9cf Make these two tests resilient in the face of compile unit size
changes.

llvm-svn: 196444
2013-12-05 01:00:12 +00:00
Logan Chien ee36595ce6 [mc] Fix ELF st_other flag.
ELF_Other_Weakref and ELF_Other_ThumbFunc seems to be LLVM
internal ELF symbol flags.  These should not be emitted to
object file.

This commit defines ELF_STO_Shift for the target-defined
flags for st_other, and increase the value of
ELF_Other_Shift to 16.

llvm-svn: 196440
2013-12-05 00:34:11 +00:00
Cameron McInally 30bbb214e5 Add AVX512 patterns for v16i32 broadcast and v2i64 zero extend load.
Patch by Aleksey Bader.

llvm-svn: 196435
2013-12-05 00:11:25 +00:00
Kevin Enderby 86496a45cb Fix a bug in darwin's 32-bit X86 handling of evaluating fixups.
Where it would use a scattered relocation entry but falls back to a
normal relocation entry because the FixupOffset is more than 24-bits.

The bug is in the X86MachObjectWriter::RecordScatteredRelocation() where
it changes reference parameter FixedValue but then returns false to indicate
it did not create a scattered relocation entry.  The fix is simply to save the
original value of the parameter FixedValue at the start of the method and
restore it if we are returning false in that case.

rdar://15526046

llvm-svn: 196432
2013-12-04 23:36:24 +00:00
David Peixotto 8ad70b3542 Add support for parsing ARM symbol variants on ELF targets
ARM symbol variants are written with parens instead of @ like this:

  .word __GLOBAL_I_a(target1)

This commit adds support for parsing these symbol variants in
expressions. We introduce a new flag to MCAsmInfo that indicates the
parser should use parens to parse the symbol variant. The expression
parser is modified to look for symbol variants using parens instead
of @ when the corresponding MCAsmInfo flag is true.

The MCAsmInfo parens flag is enabled only for ARM on ELF.

By adding this flag to MCAsmInfo, we are able to get rid of
redundant ARM-specific symbol variants and use the generic variants
instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new
UseParensForSymbolVariant attribute in MCAsmInfo to correctly print
the symbol variants for arm.

To achive this we need to keep a handle to the MCAsmInfo in the
MCSymbolRefExpr class that we can check when printing the symbol
variant.

Updated Tests:
  Changed case of symbol variant to match the generic kind.
  test/CodeGen/ARM/tls-models.ll
  test/CodeGen/ARM/tls1.ll
  test/CodeGen/ARM/tls2.ll
  test/CodeGen/Thumb2/tls1.ll
  test/CodeGen/Thumb2/tls2.ll

PR18080

llvm-svn: 196424
2013-12-04 22:43:20 +00:00
David Blaikie 6a439adf44 DebugInfo: Improve test to use llvm-dwarfdump
llvm-svn: 196396
2013-12-04 18:40:29 +00:00
David Blaikie 2deb6fd637 Test fix for r196394
llvm-svn: 196395
2013-12-04 18:34:28 +00:00
Cameron McInally cbb51dacfb Fix assembly syntax for AVX512 vector blend instructions.
llvm-svn: 196393
2013-12-04 18:05:36 +00:00
Cameron McInally c5f420e129 Suppress '(x < y) ? a : 0 -> (x < y) & a' transform on X86 architectures with dedicated mask registers.
Patch by Aleksey Bader.

llvm-svn: 196386
2013-12-04 14:52:33 +00:00
Daniel Jasper 87a24d5c27 Un-revert r196358: "llvm-cov: Added support for function checksums."
And add the proper fix.

llvm-svn: 196367
2013-12-04 08:57:17 +00:00
Daniel Jasper c176b5d1d6 Revert r196358: "llvm-cov: Added support for function checksums."
This currently breaks clang/test/CodeGen/code-coverage.c. The root cause
is that the newly introduced access to Funcs[j] is out of bounds.

llvm-svn: 196365
2013-12-04 08:23:33 +00:00
Kevin Qin afd095de8b [AArch64 Neon] Add ACLE intrinsic vceqz_f64.
llvm-svn: 196362
2013-12-04 08:02:34 +00:00
Kevin Qin f9832e8de7 [AArch64 NEON] Add missing compare intrinsics.
llvm-svn: 196360
2013-12-04 07:53:28 +00:00
Yuchen Wu 06655f3570 llvm-cov: Added support for function checksums.
The function checksums are hashed from the concatenation of the function
name and line number.

llvm-svn: 196358
2013-12-04 06:00:17 +00:00
Rafael Espindola 9201e6b535 Produce deterministic coff files.
llvm-svn: 196341
2013-12-04 02:02:55 +00:00
Rafael Espindola 757ae64731 Add -mcpu=core2 to all llc invocations in this test.
Should fix the atom buildbot.

llvm-svn: 196340
2013-12-04 01:25:24 +00:00
Juergen Ributzka c4c9b3717a [Stackmap] Specify the triple and cpu to fix the unit test.
llvm-svn: 196339
2013-12-04 01:02:37 +00:00
Juergen Ributzka 17e0d9ee6c [Stackmap] Emit multi-byte nops for X86.
llvm-svn: 196334
2013-12-04 00:39:08 +00:00
Reed Kotler 59975c2cba final patch for very long conditional branches for mips16 constant islands.
this completes the basic port of ARM constant islands to Mips16.
More testing, code review, cleanup is in order but basically everything
seems to be working. A bug in gas is preventing some of the runtime
testing but I hope to resolve this soon.

llvm-svn: 196331
2013-12-03 23:42:51 +00:00
NAKAMURA Takumi 303f0f5abd check-llvm: Ask llvm-config about assertion mode, instead of llc.
Add --assertion-mode to llvm-config. It emits ON or OFF according to NDEBUG.

llvm-svn: 196329
2013-12-03 23:22:25 +00:00
Rafael Espindola f4d34153f5 Use CHECK-LABEL to make this test more strict.
llvm-svn: 196321
2013-12-03 21:12:36 +00:00
Rafael Espindola 0a2baf8eaf Fix mingw32 thiscall + sret.
Unlike msvc, when handling a thiscall + sret gcc will
* Put the sret in %ecx
* Put the this pointer is (%esp)

This fixes, for example, calling stringstream::str.

llvm-svn: 196312
2013-12-03 20:51:23 +00:00
Yuchen Wu c8e0f81a37 llvm-cov: Another fix to llvm-cov test.
Copy all test files to temporary directory, not just test.* files. Tests
didn't fail because the missing files occurred in XFAILS.

llvm-svn: 196305
2013-12-03 19:05:03 +00:00
Yunzhong Gao 9163e8bce6 Teach the internalize pass to skip dllexported symbols because they could be
referenced in a way that even the linker does not see.

Differential Revision: http://llvm-reviews.chandlerc.com/D2280

llvm-svn: 196300
2013-12-03 18:05:14 +00:00
Arnold Schwaighofer 46db725a43 opt: Mirror vectorization presets of clang
clang enables vectorization at optimization levels > 1 and size level < 2. opt
should behave similarily.

Loop vectorization and SLP vectorization can be disabled with the flags
-disable-(loop/slp)-vectorization.

llvm-svn: 196294
2013-12-03 16:33:06 +00:00
Renato Golin 2b22f6ae96 Fix lit config for disabled MCJIT tests on ARM
Separating permanent from temporary targets, added the bug that
will fix the temporary (PR18057).

llvm-svn: 196274
2013-12-03 13:48:28 +00:00
James Molloy 8a25992f39 Addrspacecasts are no-ops on ARM.
Testcase added.

llvm-svn: 196269
2013-12-03 11:23:11 +00:00
Richard Sandiford ccc2a7c1a0 [SystemZ] Fix choice of known-zero mask in insertion optimization
The backend converts 64-bit ORs into subreg moves if the upper 32 bits
of one operand and the low 32 bits of the other are known to be zero.
It then tries to peel away redundant ANDs from the upper 32 bits.

Since AND masks are canonicalized to exclude known-zero bits,
the test ORs the mask and the known-zero bits together before
checking for redundancy.  The problem was that it was using the
wrong node when checking for known-zero bits, so could drop ANDs
that were still needed.

llvm-svn: 196267
2013-12-03 11:01:54 +00:00
Michael Liao 14b02848a3 Enhance the fix of PR17631
- The fix to PR17631 fixes part of the cases where 'vzeroupper' should
  not be issued before 'call' insn. There're other cases where helper
  calls will be inserted not limited to epilog. These helper calls do
  not follow the standard calling convention and won't clobber any YMM
  registers. (So far, all call conventions will clobber any or part of
  YMM registers.)
  This patch enhances the previous fix to cover more cases 'vzerosupper' should
  not be inserted by checking if that function call won't clobber any YMM
  registers and skipping it if so.

llvm-svn: 196261
2013-12-03 09:17:32 +00:00
Renato Golin 4281a1399a Disable Remote MCJIT tests on ARM
The communication protocol is unstable on ARM when compiled
with Clang, which is disrupting the self-hosting buildbots that
are going to be added this week. I'm working on a solution, but
remote MCJIT is not high-priority for ARM at the moment, so it
might take a while.

llvm-svn: 196257
2013-12-03 08:39:15 +00:00
Daniel Jasper cf765e35c2 Further fix to llvm-cov test.
It turns out that in some build systems, tests are executed in a
non-writable directory. Hopefully, this finally fixes the issue.

llvm-svn: 196256
2013-12-03 08:21:14 +00:00
Daniel Jasper 80b2ffb2c2 Fix llvm-cov test as suggested in r196228's post commit review.
llvm-svn: 196255
2013-12-03 07:56:23 +00:00
Daniel Jasper 2a2ed02069 Copy input files to test directory.
With r196184, llvm-cov creates a new file right next to the input file.
However, the Inputs-directory can't simply be assumed to be writable
under all build systems.

Also, this prevents a new source file from showing up in the source tree
if the test aborts before the call to "rm".

llvm-svn: 196228
2013-12-03 07:35:32 +00:00
Hao Liu dca64f4a20 [AArch64]Add missing floating point convert, round and misc intrinsics.
E.g. int64x1_t vcvt_s64_f64(float64x1_t a) -> FCVTZS Dd, Dn

llvm-svn: 196210
2013-12-03 06:06:55 +00:00
Hao Liu c250cbc095 AArch64: add missing ACLE intrinsics mapping to general arithmetic operation from VFP instructions.
E.g. float64x1_t vadd_f64(float64x1_t a, float64x1_t b) -> FADD Dd, Dn, Dm.

llvm-svn: 196208
2013-12-03 05:58:30 +00:00
NAKAMURA Takumi e15febf516 llvm-cov.test: Resurrect part of r194694 for win32 hosts.
llvm-svn: 196207
2013-12-03 05:40:25 +00:00
Hao Liu 21a461353a AArch64: Add missing scalar pair intrinsics.
E.g. "float32_t vaddv_f32(float32x2_t a)" to be matched into "faddp s0, v1.2s".

llvm-svn: 196198
2013-12-03 03:39:47 +00:00
NAKAMURA Takumi 9ae4da2e6d llvm/test/Transforms/SampleProfile/syntax.ll: Relax an expression, not to check locale-dependent message.
llvm-svn: 196195
2013-12-03 02:20:53 +00:00
Jiangning Liu 3a541d46a1 Add some missing pattern matches for AArch64 Neon intrinsics like vuqadd_s64 and friends.
llvm-svn: 196192
2013-12-03 01:33:52 +00:00
Jiangning Liu 94a7bb2130 Add some missing pattern matches for AArch64 Neon intrinsics like vmull_high_n_s16 and friends.
llvm-svn: 196190
2013-12-03 01:29:32 +00:00
Yuchen Wu 26326ad396 llvm-cov: Removed output to STDOUT/specified file.
Instead of asking the user to specify a single file to output coverage
info and defaulting to STDOUT, llvm-cov now creates files for each
source file with a naming system of: <source filename> + ".llcov".

This is what gcov does and although it can clutter the working directory
with numerous coverage files, it will be easier to hook the llvm-cov
output to tools which operate on this assumption (such as lcov).

llvm-svn: 196184
2013-12-03 00:57:11 +00:00
Manman Ren 8b4306ce05 Debug Info: drop debug info via upgrading path if version number does not match.
Add a helper function getDebugInfoVersionFromModule to return the debug info
version number for a module.

"Verifier/module-flags-1.ll" checks for verification errors.
It will seg fault when calling getDebugInfoVersionFromModule because of the
incorrect format for module flags in the testing case. We make
getModuleFlagsMetadata more robust by checking for error conditions.

PR17982

llvm-svn: 196158
2013-12-02 21:29:56 +00:00
Manman Ren e601b504b6 Update Ocaml/vmcore.ml to emit a "Debug Info Version" module flag.
llvm-svn: 196156
2013-12-02 21:25:56 +00:00
Chad Rosier 3106de3f9d [AArch64] Implemented vcopy_lane patterns using scalar DUP instruction.
Patch by Ana Pazos!

llvm-svn: 196151
2013-12-02 21:05:16 +00:00
David Blaikie 3c1d33241c DebugInfo: Type Units: Propagate the correct DW_AT_language into type units.
llvm-svn: 196130
2013-12-02 18:44:29 +00:00
Kay Tiong Khoo 5389f74655 Conservative fix for PR17827 - don't optimize a shift + and + compare sequence where the shift is logical unless the comparison is unsigned
llvm-svn: 196129
2013-12-02 18:43:59 +00:00
Vincent Lejeune 4b8d9e303c R600: Workaround for cayman loop bug
llvm-svn: 196121
2013-12-02 17:29:37 +00:00
Diego Novillo 21cb8d4dd5 Add tests for profile sample file parsing.
The profile file parser needed some tests for its parsing actions.
This adds tests for each of the error messages emitted by the parser.

llvm-svn: 196106
2013-12-02 15:12:50 +00:00
Rafael Espindola af7131d0a2 Output .eh_frames on COFF too now that the integrated as is used on mingw.
llvm-svn: 196104
2013-12-02 14:59:34 +00:00
Tim Northover dee8604caf ARM: decide whether to use movw/movt based on "minsize" attribute.
llvm-svn: 196102
2013-12-02 14:46:26 +00:00
Robert Lytton 7fbca3cce0 XCore target: Make handling of large frames not dependent upon an FP.
eliminateFrameIndex() has been reworked to handle both small & large frames
with either a FP or SP.
An additional Slot is required for Scavenging spills when not using FP for large frames.
Reworked the handling of Register Scavenging.

Whether we are using an FP or not, whether it is a large frame or not,
and whether we are using a large code model or not are now independent.

llvm-svn: 196091
2013-12-02 11:05:28 +00:00
Tim Northover 72360d201c ARM: add pseudo-instructions for lit-pool global materialisation
These are used by MachO only at the moment, and (much like the existing
MOVW/MOVT set) work around the fact that the labels used in the actual
instructions often contain PC-dependent components, which means that repeatedly
materialising the same global can't be CSEed.

With small modifications, it could be adapted to how ELF finds the address of
_GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there.

llvm-svn: 196090
2013-12-02 10:35:41 +00:00
Robert Lytton d3ffa66c6c XCore target: fix large code model 'select' indirect address handling.
llvm-svn: 196088
2013-12-02 10:18:37 +00:00
Robert Lytton ff38d37c77 XCore target: Add large code model
When using large code model:
Global objects larger than 'CodeModelLargeSize' bytes are placed in sections named with a trailing ".large"
The folded global address of such objects are lowered into the const pool.

During inspection it was noted that LowerConstantPool() was using a default offset of zero.
A fix was made, but due to only offsets of zero being generated, testing only verifies the change is not detrimental.

Correct the flags emitted for explicitly specified sections.

We assume the size of the object queried by getSectionForConstant() is never greater than CodeModelLargeSize.
To handle greater than CodeModelLargeSize, changes to AsmPrinter would be required.

llvm-svn: 196087
2013-12-02 10:18:31 +00:00
Robert Lytton 7f4d1c9d11 XCore target: extend tests in preparation
llvm-svn: 196086
2013-12-02 10:18:24 +00:00
Robert Lytton 0abd2c96b5 XCore target: Fix eliminateFrameIndex() to handle large frames
Large frame offsets are loaded from the ConstantPool.
Where possible, offsets are encoded using the smaller MKMSK instruction.
Large frame offsets can only be used when there is a frame-pointer.

llvm-svn: 196085
2013-12-02 10:18:19 +00:00
Robert Lytton a9f984fb76 XCore target: Enable frames larger than 65535 to be lowered
llvm-svn: 196084
2013-12-02 10:18:14 +00:00
Kostya Serebryany 08b9cf56be [tsan] fix instrumentation of vector vptr updates (https://code.google.com/p/thread-sanitizer/issues/detail?id=43)
llvm-svn: 196079
2013-12-02 08:07:15 +00:00
Alp Toker 43d937fc3e Rename test with misspelt filename
llvm-svn: 196064
2013-12-02 04:31:36 +00:00
Rafael Espindola aaaf02206a Also test the created stubs on 32 bits.
llvm-svn: 196052
2013-12-01 21:24:30 +00:00
Andrew Trick ca45c817c3 Add -mcpu to stackmap.ll
llvm-svn: 196051
2013-12-01 18:17:05 +00:00
Tim Northover 45479dcf49 ARM: fix bug in -Oz stack adjustment folding
Previously, we clobbered callee-saved registers when folding an "add
sp, #N" into a "pop {rD, ...}" instruction. This change checks whether
a register we're going to add to the "pop" could actually be live
outside the function before doing so and should fix the issue.

This should fix PR18081.

llvm-svn: 196046
2013-12-01 14:16:24 +00:00
Michael Kuperstein 1627eeba71 Ensure bitcode encoding of linkage types stays stable. Patch by Boaz Ouriel
llvm-svn: 196042
2013-12-01 10:16:35 +00:00
Hal Finkel ca93e47258 Convert a PPC test from grep to FileCheck
Convert this test to FileCheck, and improve it to check for the instructions it
is trying to exclude instead of checking for register use (especially because
grepping for r1 can be thrown off, for example, by a use of r12).

llvm-svn: 195979
2013-11-30 20:04:33 +00:00
Hal Finkel 2651f97333 Desensitize a couple of PPC regression tests
Use CHECK-DAG to make these regression tests more resilient against changes in
instruction scheduling.

llvm-svn: 195978
2013-11-30 19:52:28 +00:00
Hal Finkel 2b655bb228 Update the cpu specified on some PPC regression tests
Some of these tests did not specify a cpu but were also sensitive to
instruction scheduling and/or register assignment choices. A few others
similarly-sensitive tests specified a cpu (often the POWER7), and while the P7
currently uses the default model for PPC64, this will soon change. For those
tests which should not really be cpu-dependent anyway, the cpu is set to the
generic 'ppc64'.

llvm-svn: 195977
2013-11-30 19:39:27 +00:00
Zoran Jovanovic 472486714e Test case for issue with microMIPS long branch.
llvm-svn: 195976
2013-11-30 19:13:15 +00:00
Daniel Sanders 7fd68d6018 [mips][msa] MSA loads and stores have a 10-bit offset. Account for this when lowering FrameIndex.
This prevents the compiler from emitting invalid ld.[bhwd]'s and st.[bhwd]'s
when the stack frame is between 512 and 32,768 bytes in size.

llvm-svn: 195973
2013-11-30 13:47:57 +00:00
Juergen Ributzka 5b6234dc4a Force CPU type to unbreak unit tests on Haswell machines.
llvm-svn: 195971
2013-11-30 03:07:16 +00:00
Reed Kotler ad450f239f Part 1 of 3 patches that completes very long conditional branches
in constant islands for Mips16. We introdcuce JalB16 as a synomnym
for Jal16. It makes it easier to read and is also necessary because
Jal16 is a call instruction but JalB16 is being used as a branch.
Various parts of LLVM will not work properly even in this late stage of
the backend if we use what was declared as a call instruction to function
as a branch. For one, basic block labels may not get emitted in some
situations. 

llvm-svn: 195968
2013-11-29 22:32:56 +00:00
Zoran Jovanovic 1bc3cce040 Revert revision 195965.
llvm-svn: 195967
2013-11-29 22:10:02 +00:00
Petar Jovanovic e3e940d887 mips: XFAIL llvm-cov test
XFAIL llvm-cov.test for MIPS until big-endian issues are fixed for llvm-cov.
The test does pass on MIPS little-endian.

llvm-svn: 195966
2013-11-29 21:59:09 +00:00
Zoran Jovanovic ff2a40ce4d Fixed issue with microMIPS long branch.
llvm-svn: 195965
2013-11-29 21:41:24 +00:00
Hao Liu ba38eee8ac AArch64: The pattern match should check the range of the immediate value.
Or we can generate some illegal instructions.
E.g. shrn2 v0.4s, v1.2d, #35. The legal range should be in [1, 16].

llvm-svn: 195941
2013-11-29 02:11:22 +00:00
Jiangning Liu f7b4c7c2ce Add missing test case for bsl_f64 support of AArch64 NEON.
llvm-svn: 195939
2013-11-29 01:38:08 +00:00
Kevin Qin 337cfcc83c [AArch64 NEON]Fix a assertion failure when disassemble SHLL instruction.
llvm-svn: 195936
2013-11-29 01:29:16 +00:00
Stephen Canon c454964c47 Rein in overzealous InstCombine of fptrunc(OP(fpextend, fpextend)).
llvm-svn: 195934
2013-11-28 21:38:05 +00:00
Hao Liu f9f468abee AArch64: Fix a bug about disassembling post-index load single element to 4 vectors
llvm-svn: 195903
2013-11-28 01:07:45 +00:00
Reed Kotler 0d409e2dfe Check in conditional branches for constant islands. Still need to finish
conditional branches for very large targets. That will be the next small
patch. Everything now should in principle work as good (functionality
wise) as without constant islands so we decided at Mips/Imagination to
make constant islands the default for Mips16 now so that it will get
excercised a lot and this port is still experimentatl though hopefully soon
we will change the status. Some more cleanup and code review is in order
but things are converging fast.

llvm-svn: 195902
2013-11-28 00:56:37 +00:00
David Blaikie bc7e0d43bf DebugInfo: Do not include variables only referenced by templates in aranges.
ARanges included even extern variables referenced by pointer non-type
template parameters even though that variable isn't part of this
compilation unit.

llvm-svn: 195895
2013-11-27 23:53:52 +00:00
Akira Hatanaka 168d4e5b20 [mips] Implement the following optimizations using dominance information to
make PIC calls a little more efficient:

1. Remove instructions setting up $gp if it is known that a function has been
   called at least once.
2. Save the address of a called function in a register instead of loading
   it from the GOT at every call site.

llvm-svn: 195892
2013-11-27 23:38:42 +00:00
Tom Stellard 175e7a8c97 R600: Expand vector FABS
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195881
2013-11-27 21:23:39 +00:00
Tom Stellard c149dc02d3 R600/SI: Implement spilling of SGPRs v5
SGPRs are spilled into VGPRs using the {READ,WRITE}LANE_B32 instructions.

v2:
  - Fix encoding of Lane Mask
  - Use correct register flags, so we don't overwrite the low dword
    when restoring multi-dword registers.

v3:
  - Register spilling seems to hang the GPU, so replace all shaders
    that need spilling with a dummy shader.

v4:
  - Fix *LANE definitions
  - Change destination reg class for 32-bit SMRD instructions

v5:
  - Remove small optimization that was crashing Serious Sam 3.

https://bugs.freedesktop.org/show_bug.cgi?id=68224
https://bugs.freedesktop.org/show_bug.cgi?id=71285

NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195880
2013-11-27 21:23:35 +00:00
Tom Stellard 859199dad8 R600/SI: Use SGPR_32 register class for 32-bit SMRD outputs
Writing to the M0 register from an SMRD instruction hangs the GPU, so
we need to use the SGPR_32 register class, which does not include M0.

NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195879
2013-11-27 21:23:29 +00:00
Tom Stellard 4d566b2edf R600: Add support for ISD::FROUND
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195878
2013-11-27 21:23:20 +00:00
Rafael Espindola 745bd85c6a Use FileCheck and expand the test a bit.
In particular, check the name of the symbol we are putting in the constant pool.

llvm-svn: 195865
2013-11-27 19:22:14 +00:00
Rafael Espindola 3c8e147a6b Use the same tls section name as msvc.
We currently error in clang with:
"error: thread-local storage is unsupported for the current target", but we
can start to get the llvm level ready.

When compiling

template<typename T>
struct foo {
  static __declspec(thread) int bar;
};
template<typename T>
__declspec(therad) int foo<T>::bar;
template struct foo<int>;

msvc produces

SECTION HEADER #3
   .tls$ name
       0 physical address
       0 virtual address
       4 size of raw data
     12F file pointer to raw data (0000012F to 00000132)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0301040 flags
         Initialized Data
         COMDAT; sym= "public: static int foo<int>::bar" (?bar@?$foo@H@@2HA)
         4 byte align
         Read Write

gcc produces a ".data$__emutls_v.<symbol>" for the testcase with
__declspec(thread) replaced with thread_local.

llvm-svn: 195849
2013-11-27 15:52:11 +00:00
Jiangning Liu 97aa8cf8b7 Fix the AArch64 NEON bug exposed by checking constant integer argument range of ACLE intrinsics.
llvm-svn: 195843
2013-11-27 14:02:25 +00:00
Rafael Espindola 09cf06c75e Cleanup and test X86AsmPrinter::printPCRelImm.
It is only used for asm printing.

On X86 we put basic block addresses on register before passing them to inline
asm, so the MO_MachineBasicBlock case was dead.

MO_ExternalSymbol was dead since any symbol being passed to inline asm
is represented as MO_GlobalAddress.

The MO_GlobalAddress and MO_Register cases were not tested.

llvm-svn: 195824
2013-11-27 06:53:13 +00:00
Chad Rosier 75290c6307 [AArch64] Add support for NEON scalar floating-point absolute difference.
llvm-svn: 195803
2013-11-27 01:45:58 +00:00
Rafael Espindola 2d30ae2be9 Use simple section names for COMDAT sections on COFF.
With this patch we use simple names for COMDAT sections (like .text or .bss).
This matches the MSVC behavior.

When merging it is the COMDAT symbol that is used to decide if two sections
should be merged, so there is no point in building a fancy name.

This survived a bootstrap on mingw32.

llvm-svn: 195798
2013-11-27 01:18:37 +00:00
Nadav Rotem b0082d246a PR1860 - We can't save a list of ExtractElement instructions to CSE because some of these instructions
may be removed and optimized in future iterations. Instead we save a list of basic blocks that we need to CSE.

llvm-svn: 195791
2013-11-26 22:24:25 +00:00
Chad Rosier 9653d5c989 [AArch64] Add support for NEON scalar floating-point to integer convert
instructions.

llvm-svn: 195788
2013-11-26 22:17:37 +00:00
Arnold Schwaighofer a2c8e008d2 LoopVectorizer: Truncate i64 trip counts of i32 phis if necessary
In signed arithmetic we could end up with an i64 trip count for an i32 phi.
Because it is signed arithmetic we know that this is only defined if the i32
does not wrap. It is therefore safe to truncate the i64 trip count to a i32
value.

Fixes PR18049.

llvm-svn: 195787
2013-11-26 22:11:23 +00:00
Reed Kotler 3aeb1d0857 Fix a bug related to constant islands for Mips16 and mips16/32 dual mode.
The determination of when we are doing constant pools was being made too
early in the asm printer.

llvm-svn: 195781
2013-11-26 20:38:40 +00:00
Michael Liao d617a3015d Fix PR18054
- Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG
  lowering where we need to check whether x is a vector type (in-reg
  type) of i8, i16 or i32; otherwise, that optimization is not valid.

llvm-svn: 195779
2013-11-26 20:31:31 +00:00
David Blaikie fd1eff5a0a DwarfDebug: Include type units in accelerator tables.
Since type units aren't in the CUMap, use the DwarfUnits list to iterate
over units for tasks such as accelerator table building.

llvm-svn: 195776
2013-11-26 19:14:34 +00:00
Nadav Rotem f9f8482e3a PR18060 - When we RAUW values with ExtractElement instructions in some cases
we generate PHI nodes with multiple entries from the same basic block but
with different values. Enabling CSE on ExtractElement instructions make sure
that all of the RAUWed instructions are the same.

llvm-svn: 195773
2013-11-26 17:29:19 +00:00
Stepan Dyatkovskiy abb8505dc5 PR17925 bugfix.
Short description.

This issue is about case of treating pointers as integers.
We treat pointers as different if they references different address space.
At the same time, we treat pointers equal to integers (with machine address
width). It was a point of false-positive. Consider next case on 32bit machine:

void foo0(i32 addrespace(1)* %p)
void foo1(i32 addrespace(2)* %p)
void foo2(i32 %p)

foo0 != foo1, while
foo1 == foo2 and foo0 == foo2.

As you can see it breaks transitivity. That means that result depends on order
of how functions are presented in module. Next order causes merging of foo0
and foo1: foo2, foo0, foo1
First foo0 will be merged with foo2, foo0 will be erased. Second foo1 will be
merged with foo2.
Depending on order, things could be merged we don't expect to.

The fix:
Forbid to treat any pointer as integer, except for those, who belong to address space 0.

llvm-svn: 195769
2013-11-26 16:11:03 +00:00
Tim Northover fa36dfeeca Darwin-ARM: use movw/movt for static relocations
llvm-svn: 195759
2013-11-26 12:45:05 +00:00
Richard Sandiford dd7dd930d1 [SystemZ] Fix incorrect use of RISBG for a zero-extended right shift
We would wrongly transform the testcase into the equivalent of an AND with 1.
The problem was that, when testing whether the shifted-in bits of the right
shift were significant, we used the width of the final zero-extended result
rather than the width of the shifted value.

llvm-svn: 195731
2013-11-26 10:53:16 +00:00
Kevin Qin 599c47d0de Refactored the implementation of AArch64 NEON instruction ZIP, UZP
and TRN.
Fix a bug when mixed use of vget_high_u8() and vuzp_u8().

llvm-svn: 195716
2013-11-26 03:26:47 +00:00
Kevin Qin 33ca18fdcf [AArch64]Implement 128 bit register copy with NEON.
llvm-svn: 195713
2013-11-26 02:33:42 +00:00
Andrew Trick 391dbadb51 StackMap: Implement support for DirectMemRefOp.
A Direct stack map location records the address of frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to a stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address if an alloca, while IndirectMemRefOp
handle register spills.

For example:

entry:
  %a = alloca i64...
  llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)

Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.

llvm-svn: 195712
2013-11-26 02:03:25 +00:00
David Blaikie a357aa03af DebugInfo: Update test case due to dumper improvements in r195698
The dumper was only dumping one pubtypes set and it was /always/ dumping
one pubtypes set even when there were zero sets. Now that the dumper
correctly dumps zero, one, or many sets, we can update this test case to
test for the absolute absence of a set rather than a bogus/accidental
zero-valued set.

llvm-svn: 195706
2013-11-26 01:11:02 +00:00
David Blaikie 8a263cbc99 DebugInfo: Avoid emitting pubtype entries for type DIEs that just indirect to a type unit.
llvm-svn: 195698
2013-11-26 00:22:37 +00:00
Cameron McInally c592e5251c Add an intrinsic for the SSE2 PAUSE instruction.
llvm-svn: 195697
2013-11-26 00:20:43 +00:00
Chandler Carruth ec1fb5c705 Add the test case that I missed when committing r195528. Doh!
llvm-svn: 195691
2013-11-25 22:24:27 +00:00