Commit Graph

19468 Commits

Author SHA1 Message Date
Arnold Schwaighofer 2e7a922a15 LoopVectorize: Handle loops with multiple forward inductions
We used to give up if we saw two integer inductions. After this patch, we base
further induction variables on the chosen one like we do in the reverse
induction and pointer induction case.

Fixes PR15720.

radar://13851975

llvm-svn: 181746
2013-05-14 00:21:18 +00:00
Michael Gottesman a76143eeee [objc-arc-opts] In the presense of an alloca unconditionally remove RR pairs if and only if we are both KnownSafeBU/KnownSafeTD rather than just either or.
In the presense of a block being initialized, the frontend will emit the
objc_retain on the original pointer and the release on the pointer loaded from
the alloca. The optimizer will through the provenance analysis realize that the
two are related (albiet different), but since we only require KnownSafe in one
direction, will match the inner retain on the original pointer with the guard
release on the original pointer. This is fixed by ensuring that in the presense
of allocas we only unconditionally remove pointers if both our retain and our
release are KnownSafe (i.e. we are KnownSafe in both directions) since we must
deal with the possibility that the frontend will emit what (to the optimizer)
appears to be unbalanced retain/releases.

An example of the miscompile is:

  %A = alloca
  retain(%x)
  retain(%x) <--- Inner Retain
  store %x, %A
  %y = load %A
  ... DO STUFF ...
  release(%y)
  call void @use(%x)
  release(%x) <--- Guarding Release

getting optimized to:

  %A = alloca
  retain(%x)
  store %x, %A
  %y = load %A
  ... DO STUFF ...
  release(%y)
  call void @use(%x)

rdar://13750319

llvm-svn: 181743
2013-05-13 23:49:42 +00:00
Jack Carter f5f48d8ff7 Mips assembler: Assembler macro ADDIU $rs,imm
This patch adds alias for addiu instruction which enables following syntax:

    addiu $rs,imm

The macro is translated as:

    addiu $rs,$rs,imm


Contributer: Vladimir Medic
llvm-svn: 181729
2013-05-13 20:26:46 +00:00
Bill Schmidt 22d40dcfe9 PPC64: Constant initializers with dynamic relocations go in .data.rel.ro.
This fixes warning messages observed in the oggenc application test in
projects/test-suite.  Special handling is needed for the 64-bit
PowerPC SVR4 ABI when a constant is initialized with a pointer to a
function in a shared library.  Because a function address is
implemented as the address of a function descriptor, the use of copy
relocations can lead to problems with initialization.  GNU ld
therefore replaces copy relocations with dynamic relocations to be
resolved by the dynamic linker.  This means the constant cannot reside
in the read-only data section, but instead belongs in .data.rel.ro,
which is designed for constants containing dynamic relocations.

The implementation creates a class PPC64LinuxTargetObjectFile
inheriting from TargetLoweringObjectFileELF, which behaves like its
parent except to place constants of this sort into .data.rel.ro.

The test case is reduced from the oggenc application.

llvm-svn: 181723
2013-05-13 19:34:37 +00:00
Akira Hatanaka 9edae02db8 [mips] Add option -mno-ldc1-sdc1.
This option is used when the user wants to avoid emitting double precision FP
loads and stores. Double precision FP loads and stores are expanded to single
precision instructions after register allocation.

llvm-svn: 181718
2013-05-13 18:23:35 +00:00
Mihai Popa dc1764c5a4 The purpose of the patch is to fix the syntax of ARM mrc and mrc2 instructions when they are used to write to the APSR. In this case, the destination operand should be APSR_nzcv, and the encoding of the target should be 0b1111 (same as for PC). In pre-UAL syntax, this form used the PC register as a textual target. This is still allowed for backward compatibility.
llvm-svn: 181705
2013-05-13 14:10:04 +00:00
Lang Hames 67c09b3f88 Correctly preserve the input chain for potential tailcall nodes whose
return values are bitcasts.

The chain had previously been being clobbered with the entry node to
the dag, which sometimes caused other code in the function to be
erroneously deleted when tailcall optimization kicked in.

<rdar://problem/13827621>

llvm-svn: 181696
2013-05-13 10:21:19 +00:00
Hao Liu bc60196951 Fix PR15950 A bug in DAG Combiner about undef mask
llvm-svn: 181682
2013-05-13 02:07:05 +00:00
Rafael Espindola 65c016b106 XFAIL this test for mingw too.
llvm-svn: 181678
2013-05-13 00:18:24 +00:00
Nadav Rotem ce42cc6d4d SLPVectorizer: Fix a bug in the code that generates extracts for values with multiple users.
The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract.

llvm-svn: 181674
2013-05-12 22:58:45 +00:00
David Majnemer 6c30f49af3 InstCombine: Flip the order of two urem transforms
There are two transforms in visitUrem that conflict with each other.

*) One, if a divisor is a power of two, subtracts one from the divisor
   and turns it into a bitwise-and.
*) The other unwraps both operands if they are surrounded by zext
   instructions.

Flipping the order allows the subtraction to go beneath the sign
extension.

llvm-svn: 181668
2013-05-12 00:07:05 +00:00
Arnold Schwaighofer f2305e4467 LoopVectorize: Use the widest induction variable type
Use the widest induction type encountered for the cannonical induction variable.

We used to turn the following loop into an empty loop because we used i8 as
induction variable type and truncated 1024 to 0 as trip count.

int a[1024];
void fail() {
  int reverse_induction = 1023;
  unsigned char forward_induction = 0;
  while ((reverse_induction) >= 0) {
    forward_induction++;
    a[reverse_induction] = forward_induction;
    --reverse_induction;
  }
}

radar://13862901

llvm-svn: 181667
2013-05-11 23:04:28 +00:00
David Majnemer 470b077bca InstCombine: Turn urem to bitwise-and more often
Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively
fold away urem instructions.

llvm-svn: 181661
2013-05-11 09:01:28 +00:00
Reed Kotler 739d36a265 Add -mtriple=mipsel-linux-gnu to the test so that the compiler does
not think it can support small data sections.

llvm-svn: 181654
2013-05-11 01:02:20 +00:00
Nadav Rotem cdfb48d2fe SLPVectorizer: Add support for trees with external users.
For example:
bar() {
  int a = A[i];
  int b = A[i+1];
  B[i] = a;
  B[i+1] = b;
  foo(a);  <--- a is used outside the vectorized expression.
}

llvm-svn: 181648
2013-05-10 22:59:33 +00:00
Nadav Rotem 5ff6db1153 Add an additional testcase for PR15882.
llvm-svn: 181646
2013-05-10 22:55:44 +00:00
Reed Kotler 783c79446b Checkin in of first of several patches to finish implementation of
mips16/mips32 floating point interoperability. 

This patch fixes returns from mips16 functions so that if the function
was in fact called by a mips32 hard float routine, then values
that would have been returned in floating point registers are so returned.

Mips16 mode has no floating point instructions so there is no way to
load values into floating point registers.

This is needed when returning float, double, single complex, double complex
in the Mips ABI.

Helper functions in libc for mips16 are available to do this.

For efficiency purposes, these helper functions have a different calling
convention from normal Mips calls.

Registers v0,v1,a0,a1 are used to pass parameters instead of
a0,a1,a2,a3.

This is because v0,v1,a0,a1 are the natural registers used to return
floating point values in soft float. These values can then be moved
to the appropriate floating point registers with no extra cost.

The only register that is modified is ra in this call.

The helper functions make sure that the return values are in the floating
point registers that they would be in if soft float was not in effect
(which it is for mips16, though the soft float is implemented using a mips32
library that uses hard float).
 

llvm-svn: 181641
2013-05-10 22:25:39 +00:00
David Blaikie f95485a69e Give the test from r181632 a target triple.
llvm-svn: 181637
2013-05-10 22:14:39 +00:00
David Blaikie a1e813dcd4 PR14492: Debug Info: Support for values of non-integer non-type template parameters.
This is only tested for global variables at the moment (& includes tests
for the unnamed parameter case, since apparently this entire function
was completely untested previously)

llvm-svn: 181632
2013-05-10 21:52:07 +00:00
Jyotsna Verma 300f0b966c Hexagon: Fix switch cases in HexagonVLIWPacketizer.cpp.
llvm-svn: 181624
2013-05-10 20:27:34 +00:00
Chad Rosier c8569cba93 [ms-inline asm] Fix a crasher when we fail on a direct match.
The issue was that the MatchingInlineAsm and VariantID args to the
MatchInstructionImpl function weren't being set properly.  Specifically, when
parsing intel syntax, the parser thought it was parsing inline assembly in the
at&t dialect; that will never be the case.  

The crash was caused when the emitter tried to emit the instruction, but the
operands weren't set.  When parsing inline assembly we only set the opcode, not
the operands, which is used to lookup the instruction descriptor.
rdar://13854391 and PR15945

Also, this commit reverts r176036.  Now that we're correctly parsing the intel
syntax the pushad/popad don't match properly.  I've reimplemented that fix using
a MnemonicAlias.

llvm-svn: 181620
2013-05-10 18:24:17 +00:00
Benjamin Kramer 14e915f7b4 InstCombine: Don't claim to be able to evaluate any shl in a zexted type.
The shift amount may be larger than the type leading to undefined behavior.
Limit the transform to constant shift amounts. While there update the bits to
clear in the result which may enable additional optimizations.

PR15959.

llvm-svn: 181604
2013-05-10 16:26:37 +00:00
Logan Chien 4ea23b56c5 Implement AsmParser for ARM unwind directives.
This commit implements the AsmParser for fnstart, fnend,
cantunwind, personality, handlerdata, pad, setfp, save, and
vsave directives.

This commit fixes some minor issue in the ARMELFStreamer:

* The switch back to corresponding section after the .fnend
  directive.

* Emit the unwind opcode while processing .fnend directive
  if there is no .handlerdata directive.

* Emit the unwind opcode to .ARM.extab while processing
  .handlerdata even if .personality directive does not exist.

llvm-svn: 181603
2013-05-10 16:17:24 +00:00
Aaron Ballman e42ccf32cc XFAILing this test on Win32 to unbreak the build bots.
llvm-svn: 181600
2013-05-10 14:42:16 +00:00
Benjamin Kramer a5d59333b3 DAGCombiner: Generate a correct constant for vector types when folding (xor (and)) into (and (not)).
PR15948.

llvm-svn: 181597
2013-05-10 14:09:52 +00:00
Benjamin Kramer a6645e8b8f InstCombine: Verify the type before transforming uitofp into select.
PR15952.

llvm-svn: 181586
2013-05-10 09:16:52 +00:00
Tom Stellard 2b971eb0d0 R600: Remove AMDILPeeopholeOptimizer and replace optimizations with tablegen patterns
The BFE optimization was the only one we were actually using, and it was
emitting an intrinsic that we don't support.

https://bugs.freedesktop.org/show_bug.cgi?id=64201

Reviewed-by: Christian König <christian.koenig@amd.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181580
2013-05-10 02:09:45 +00:00
Tom Stellard 3a7c34c778 R600: Expand SUB for v2i32/v4i32
Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181579
2013-05-10 02:09:39 +00:00
Tom Stellard 3deddc5079 R600: Expand MUL for v4i32/v2i32
Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run.

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181578
2013-05-10 02:09:34 +00:00
Tom Stellard 7fb3963498 R600: Expand SRA for v4i32/v2i32
v2: Add v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181577
2013-05-10 02:09:29 +00:00
Tom Stellard a99c6ae47a R600: Expand vselect for v4i32 and v2i32
v2: Add vselect v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181576
2013-05-10 02:09:24 +00:00
Chad Rosier edb1dc8498 [x86AsmParser] It's valid to stop parsing an operand at an immediate.
rdar://13854369 and PR15944

llvm-svn: 181564
2013-05-09 23:48:53 +00:00
Owen Anderson 32baf99b1d Teach SelectionDAG to constant fold all-constant FMA nodes the same way that it constant folds FADD, FMUL, etc.
llvm-svn: 181555
2013-05-09 22:27:13 +00:00
Bill Wendling 07fe235e2b Generate a compact unwind encoding in the face of a stack alignment push.
We generate a `push' of a random register (%rax) if the stack needs to be
aligned by the size of that register. However, this could mess up compact unwind
generation. In particular, we want to still generate compact unwind in the
presence of this monstrosity.

Check if the push of of the %rax/%eax register. If it is and it's marked with
the `FrameSetup' flag, then we can generate a compact unwind encoding for the
function only if the push is the last FrameSetup instruction.

llvm-svn: 181540
2013-05-09 20:10:38 +00:00
Jyotsna Verma 978e972ff9 Hexagon: Use relation map for getMatchingCondBranchOpcode() and
getInvertedPredicatedOpcode() functions instead of switch cases.

llvm-svn: 181530
2013-05-09 18:25:44 +00:00
Rafael Espindola 007521673b Don't replace an alias in llvm.used with its target.
When we replace an internal alias with its target, be careful not to
replace the entry in llvm.used (and llvm.compiler_used).

llvm-svn: 181524
2013-05-09 17:22:59 +00:00
Richard Osborne 1333fa3d68 [XCore] Fix handling of functions where only the LR is spilled.
Previously we only checked if the LR required saving if the frame size was
non zero. However because the caller reserves 1 word for the callee to use
that doesn't count towards our frame size it is possible for the LR to need
saving and for the frame size to be 0.

We didn't hit when the LR needed saving because of a function calls because
the 1 word of stack we must allocate for our callee means the frame size
is always non zero in this case. However we can hit this case if the LR is
clobbered in inline asm.

llvm-svn: 181520
2013-05-09 16:43:42 +00:00
Benjamin Kramer 21b972ae94 InstCombine: Don't just copy known bits from the first operand of an srem.
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.

llvm-svn: 181518
2013-05-09 16:32:32 +00:00
Rafael Espindola 0d15f7313f Change getRelocationAdditionalInfo to be ELF only.
It was only implemented for ELF where it collected the Addend, so this
patch also renames it to getRelocationAddend.

llvm-svn: 181502
2013-05-09 03:39:05 +00:00
Eric Christopher f20ff979e9 Revert "Make sure debug info contains linkage names (DW_AT_MIPS_linkage_name)"
temporarily while investigating gdb.cp/templates.exp.

This reverts commit r181471.

llvm-svn: 181496
2013-05-09 00:42:33 +00:00
Arnold Schwaighofer 2e8c69cf97 LoopVectorizer: Don't assert on the absence of induction variables
A computable loop exit count does not imply the presence of an induction
variable. Scalar evolution can return a value for an infinite loop.

Fixes PR15926.

llvm-svn: 181495
2013-05-09 00:32:18 +00:00
Daniel Malea abdd7f197a Revert 181475 as the DebugIR tests are breaking (automake) buildbots that re-use build dirs
- the temporaries "-debug.ll" files generated by DebugIR pass are considered tests, even though they are not

llvm-svn: 181476
2013-05-08 21:55:31 +00:00
Eric Christopher 697fa1c8be Make sure debug info contains linkage names (DW_AT_MIPS_linkage_name)
for constructors and destructors since the original declaration given
by the AT_specification both won't and can't.

Patch by Yacine Belkadi, I've cleaned up the testcases.

llvm-svn: 181471
2013-05-08 21:23:22 +00:00
Daniel Malea 0233db70f0 DebugIR tests -- lit tests for the line number transform
- simple one-function case
- function-calling case
- external function calling case
- exception throwing case
- vector case

Note: these tests are somewhat coupled to the current format of debug metadata.
llvm-svn: 181469
2013-05-08 21:03:00 +00:00
Akira Hatanaka b4526ea132 [mips] Add instruction selection pattern for (seteq $LHS, 0).
llvm-svn: 181459
2013-05-08 19:38:04 +00:00
Ulrich Weigand 689e34a824 [PowerPC] Add ELF relocation tests
This patch extends test/MC/PowerPC/ppc64-fixups.s to not only check for
the correct fixup type in the --show-encoding output, but also runs the
generated object file through llvm-readobj -r and verifies that the
correct ELF relocation records were generated.

llvm-svn: 181453
2013-05-08 17:51:44 +00:00
Bill Schmidt 38b6cb51bc Fix handling of anonymous aggregate parameters for powerpc*-apple-darwin8.
This fixes bug 15821 similarly to the powerpc64-linux fix for bug 14779.

Patch by David Fang.

llvm-svn: 181449
2013-05-08 17:22:33 +00:00
Michel Danzer 7bbd7aa7fe R600/SI: Add lit tests for llvm.SI.imageload and llvm.SI.resinfo intrinsics
Adapted from the llvm.SI.sample test.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 181425
2013-05-08 13:07:29 +00:00
Hal Finkel 08e53ee551 PPCInstrInfo::optimizeCompareInstr should not optimize FP compares
The floating-point record forms on PPC don't set the condition register bits
based on a comparison with zero (like the integer record forms do), but rather
based on the exception status bits.

llvm-svn: 181423
2013-05-08 12:16:14 +00:00
Mihai Popa 1fb61c6eed This patch fixes two tests marked as XFAIL among the ARM assembler tests.
The reference encoding is correct, but written in the wrong byte order (these are Thumb tests, while the reference is in ARM byte order).

llvm-svn: 181420
2013-05-08 09:41:12 +00:00
Nick Lewycky 5fb1963f2a Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInst
by switching to a ValueMap. Patch by Andrea DiBiagio!

llvm-svn: 181397
2013-05-08 09:00:10 +00:00
David Majnemer 386ab7f872 DAGCombiner: Simplify inverted bit tests
Fold (xor (and x, y), y) -> (and (not x), y)

This removes an opportunity for a constant to appear twice.

llvm-svn: 181395
2013-05-08 06:44:42 +00:00
David Blaikie 3b6038b6f3 Debug Info: Support DW_TAG_imported_declaration
This provides basic functionality for imported declarations. For
subprograms and types some amount of lazy construction is supported (so
the definition of a function can proceed the using declaration), but it
still doesn't handle declared-but-not-defined functions (since we don't
generally emit function declarations).

Variable support is really rudimentary at the moment - simply looking up
the existing definition with no support for out of order (declaration,
imported_module, then definition).

llvm-svn: 181392
2013-05-08 06:01:41 +00:00
Arnold Schwaighofer 3610139ac5 LoopVectorizer: Improve reduction variable identification
The two nested loops were confusing and also conservative in identifying
reduction variables. This patch replaces them by a worklist based approach.

llvm-svn: 181369
2013-05-07 21:55:37 +00:00
Kevin Enderby ca08df756e Fix a bug in the MC asm parser evaluating expressions. It was treating:
A = 9
B = 3 * A - 2 * A + 1 as  B = 3 * A - (2 * A + 1)

rdar://13816516

llvm-svn: 181366
2013-05-07 21:40:58 +00:00
Jyotsna Verma 5eb598001c Hexagon: Fix Small Data support to handle -G 0 correctly.
llvm-svn: 181344
2013-05-07 19:53:00 +00:00
David Blaikie 6baa776173 Debug Info: Fix for break due to r181271
Apparently we didn't keep an association of Compile Unit metadata nodes
to DIEs so looking up that parental context failed & thus caused no
DW_TAG_imported_modules to be emitted at the CU scope. Fix this by
adding the mapping & sure up the test case to verify this.

llvm-svn: 181339
2013-05-07 17:57:13 +00:00
Jyotsna Verma 03c6ca905c Reverting r181331.
Missing file, HexagonSplitConst32AndConst64.cpp, from lib/Target/Hexagon/CMakeLists.txt.

llvm-svn: 181334
2013-05-07 17:12:35 +00:00
Jyotsna Verma 19f0b40dcf Hexagon: Fix Small Data support to handle -G 0 correctly.
llvm-svn: 181331
2013-05-07 16:42:15 +00:00
Arnold Schwaighofer e78b76fbed LoopVectorize: getConsecutiveVector must respect signed arithmetic
We were passing an i32 to ConstantInt::get where an i64 was needed and we must
also pass the sign if we pass negatives numbers. The start index passed to
getConsecutiveVector must also be signed.

Should fix PR15882.

llvm-svn: 181286
2013-05-07 04:37:05 +00:00
David Blaikie 684fc5331e DebugInfo: Support imported modules in lexical blocks
llvm-svn: 181271
2013-05-06 23:33:07 +00:00
David Majnemer 70f286d95f InstCombine: (X ^ signbit) + C -> X + (signbit ^ C)
llvm-svn: 181249
2013-05-06 21:21:31 +00:00
Bill Wendling 66bdab1322 Reduce attributes.
llvm-svn: 181245
2013-05-06 20:57:23 +00:00
Rafael Espindola fa4513a178 Split Alignment out of the Section Characteristics.
The alignment is just a byte in the middle of Characteristics, not an
independent flag. Making it an independent field in the yaml
representation makes it more yamlio friendly.

llvm-svn: 181243
2013-05-06 20:11:21 +00:00
Jean-Luc Duprat 1610d571db Test results verified using FileCheck rather than grep | count
llvm-svn: 181234
2013-05-06 18:45:16 +00:00
Andrew Trick 9c72b071fe Rotate multi-exit loops even if the latch was simplified.
Test case by Michele Scandale!

Fixes PR10293: Load not hoisted out of loop with multiple exits.

There are few regressions with this patch, now tracked by
rdar:13817079, and a roughly equal number of improvements. The
regressions are almost certainly back luck because LoopRotate has very
little idea of whether rotation is profitable. Doing better requires a
more comprehensive solution.

This checkin is a quick fix that lacks generality (PR10293 has
a counter-example). But it trivially fixes the case in PR10293 without
interfering with other cases, and it does satify the criteria that
LoopRotate is a loop canonicalization pass that should avoid
heuristics and special cases.

I can think of two approaches that would probably be better in
the long run. Ultimately they may both make sense.

(1) LoopRotate should check that the current header would make a good
loop guard, and that the loop does not already has a sufficient
guard. The artifical SimplifiedLoopLatch check would be unnecessary,
and the design would be more general and canonical. Two difficulties:

- We need a strong guarantee that we won't endlessly rotate, so the
  analysis would need to be precise in order to avoid the
  SimplifiedLoopLatch precondition.

- Analysis like this are usually based on SCEV, which we don't want to
  rely on.

(2) Rotate on-demand in late loop passes. This could even be done by
shoving the loop back on the queue after the optimization that needs
it. This could work well when we find LICM opportunities in
multi-branch loops. This requires some work, and it doesn't really
solve the problem of SCEV wanting a loop guard before the analysis.

llvm-svn: 181230
2013-05-06 17:58:18 +00:00
Tom Stellard 043de4c5af R600: Emit config values in register / value pairs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181228
2013-05-06 17:50:51 +00:00
Tom Stellard cfe2ef8fea R600: Stop emitting the instruction type byte before each instruction
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181225
2013-05-06 17:50:44 +00:00
Tom Stellard dbbcaf31b6 R600: Emit ISA for CALL_FS_* instructions
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181223
2013-05-06 17:50:26 +00:00
Ulrich Weigand e7c6dfeb4b [SystemZ] Update non-pic DWARF encodings
As pointed out by Rafael Espindola, we should match the DWARF encodings
produced by GCC in both pic and non-pic modes.  This was not the case
for the non-pic case.

This patch changes all DWARF encodings to DW_EH_PE_absptr for the
non-pic case, just like GCC does.  The test case is updated to check
for both variants.

llvm-svn: 181222
2013-05-06 17:28:30 +00:00
Adhemerval Zanella e8bd03da5c PowerPC: Fix unimplemented relocation on ppc64
This patch handles the R_PPC64_REL64 relocation type for powerpc64
for mcjit.

llvm-svn: 181220
2013-05-06 17:21:23 +00:00
Jean-Luc Duprat 4189ef456a Fix add4.ll test cmdline so that it passes
llvm-svn: 181219
2013-05-06 17:18:47 +00:00
Jean-Luc Duprat 3e4fc3ef24 Provide InstCombines for the following 3 cases:
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A

These come up in code that has been hand-optimized from a select to a linear blend, 
on platforms where that may have mattered. We want to undo such changes 
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B

llvm-svn: 181216
2013-05-06 16:55:50 +00:00
Tim Northover 7dbbc28f72 AArch64: use MCJIT by default and enable related tests.
This just enables some testing I'd missed after implementing MCJIT
support.

llvm-svn: 181215
2013-05-06 16:51:08 +00:00
Ulrich Weigand 80435baa14 [SystemZ] Set up JIT/MCJIT test cases
This patch adds the necessary configuration bits and #ifdef's to set up
the JIT/MCJIT test cases for SystemZ.  Like other recent targets, we do
fully support MCJIT, but do not support the old JIT at all.  Set up the
lit config files accordingly, and disable old-JIT unit tests.

Patch by Richard Sandiford.

llvm-svn: 181207
2013-05-06 16:21:50 +00:00
Ulrich Weigand 982a21510e [SystemZ] Add MC test cases
This adds all MC tests for the SystemZ target.

Patch by Richard Sandiford.

llvm-svn: 181206
2013-05-06 16:20:58 +00:00
Ulrich Weigand 08bd6154d9 [SystemZ] Add DebugInfo test cases
This adds all DebugInfo tests for the SystemZ target.

This version of the patch incorporates feedback from reviews by
Eric Christopher and Rafael Espindola.  Thanks to all reviewers!

Patch by Richard Sandiford.

llvm-svn: 181205
2013-05-06 16:18:29 +00:00
Ulrich Weigand 9e3577ff44 [SystemZ] Add CodeGen test cases
This adds all CodeGen tests for the SystemZ target.

This version of the patch incorporates feedback from a review by
Sean Silva.  Thanks to all reviewers!

Patch by Richard Sandiford.

llvm-svn: 181204
2013-05-06 16:17:29 +00:00
Rafael Espindola 57dc142d40 Free the exception object. Should fix the vg bots.
llvm-svn: 181195
2013-05-06 13:30:52 +00:00
Michael Kuperstein ac868757d0 Fix slightly too aggressive conact_vector optimization.
(Would sometimes optimize away conacts used to extend a vector with undef values)

llvm-svn: 181186
2013-05-06 08:06:13 +00:00
Bill Wendling b07a68ebb0 Add a testcase that checks that we generate functions with frame
pointers or not depending upon the function attributes.

llvm-svn: 181180
2013-05-06 05:45:57 +00:00
Rafael Espindola 072f4d9a1e XFAIL for cygwin.
Looks like symbol resolution is not working on cygwin, the test fails
because __gxx_personality_v0 is not found.

llvm-svn: 181179
2013-05-06 03:35:56 +00:00
Nadav Rotem c70ef4e93c Revert r164763 because it introduces new shuffles.
Thanks Nick Lewycky for pointing this out.

llvm-svn: 181177
2013-05-06 02:39:09 +00:00
Matt Arsenault c23753a53e Fix unchecked uses of DominatorTree in MemoryDependenceAnalysis.
Use unknown results for places where it would be needed

llvm-svn: 181176
2013-05-06 02:07:24 +00:00
Rafael Espindola c229a4fff4 Fix const merging when an alias of a const is llvm.used.
We used to disable constant merging not only if a constant is llvm.used, but
also if an alias of a constant is llvm.used. This change fixes that.

llvm-svn: 181175
2013-05-06 01:48:55 +00:00
Rafael Espindola f48af0ae1d This should also fail on ARM.
We currently have no way to register new eh frames on ARM.

llvm-svn: 181172
2013-05-05 22:42:34 +00:00
Rafael Espindola b32c880b31 Fix XFAIL line.
llvm-svn: 181171
2013-05-05 21:30:10 +00:00
Rafael Espindola e639744c4b XFAIL this on ppc64.
It looks like eh uses an unimplemented relocation on pp64

llvm-svn: 181169
2013-05-05 21:04:18 +00:00
Rafael Espindola fa5942bc2c Add EH support to the MCJIT.
This gets exception handling working on ELF and Macho (x86-64 at least).
Other than the EH frame registration, this patch also implements support
for GOT relocations which are used to locate the personality function on
MachO.

llvm-svn: 181167
2013-05-05 20:43:10 +00:00
Evan Cheng dc5436a3cb Test case for r181160 and r181161. rdar://13782395
llvm-svn: 181162
2013-05-05 18:07:15 +00:00
Richard Osborne 4498bd352f [XCore] Add LDAPB instructions.
With the change the disassembler now supports the XCore ISA in its
entirety.

llvm-svn: 181155
2013-05-05 13:36:53 +00:00
Richard Osborne 4d3514ee94 [XCore] Add BLRB instructions.
llvm-svn: 181152
2013-05-05 13:24:16 +00:00
Stepan Dyatkovskiy 8c02c98259 For ARM backend, fixed "byval" attribute support.
Now even the small structures could be passed within byval (small enough
to be stored in GPRs).
In regression tests next function prototypes are checked:

PR15293:
  %artz = type { i32 }
  define void @foo(%artz* byval %s)
  define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2)
foo: "s" stored in R0
foo2: "s" stored in R0, "s2" stored in R2.

Next AAPCS rules are checked:
5.5 Parameters Passing, C.4 and C.5,
"ParamSize" is parameter size in 32bit words:
-- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4.
   Parameter should be sent to the stack; NCRN := R4.
-- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4.
   Parameter stored in GPRs; NCRN += ParamSize.

llvm-svn: 181148
2013-05-05 07:48:36 +00:00
David Majnemer 66fb70de38 Remove a recently redundant transform from X86ISelLowering.
X86ISelLowering has support to treat:
(icmp ne (and (xor %flags, -1), (shl 1, flag)), 0)

as if it were actually:
(icmp eq (and %flags, (shl 1, flag)), 0)

However, r179386 has code at the InstCombine level to handle this.

llvm-svn: 181145
2013-05-05 02:00:10 +00:00
Arnold Schwaighofer d96e427eac LoopVectorize: Add support for floating point min/max reductions
Add support for min/max reductions when "no-nans-float-math" is enabled. This
allows us to assume we have ordered floating point math and treat ordered and
unordered predicates equally.

radar://13723044

llvm-svn: 181144
2013-05-05 01:54:48 +00:00
Arnold Schwaighofer a670a0a3aa LoopVectorize: We don't need an identity element for min/max reductions
We can just use the initial element that feeds the reduction.

  max(max(x, y), z) == max(max(x,y), max(x,z))

radar://13723044

llvm-svn: 181141
2013-05-05 01:54:42 +00:00
Tim Northover 7b55b97dba AArch64: enable MCJIT and tests now that everything passes.
This removes dire warnings about AArch64 being unsupported and enables
the tests when appropriate on this platform.

llvm-svn: 181135
2013-05-04 20:14:22 +00:00
Tim Northover 885698a25c AArch64: support literal pool access in large memory model.
llvm-svn: 181120
2013-05-04 16:54:07 +00:00
Tim Northover 8ff187df5f AArch64: support large code model for jump-tables
llvm-svn: 181119
2013-05-04 16:54:00 +00:00
Tim Northover 9fc1cddb21 AArch64: implement support for blockaddress in large code model
llvm-svn: 181118
2013-05-04 16:53:53 +00:00
Tim Northover 2dbef3452c AArch64: implement large code model access to global variables.
The MOVZ/MOVK instruction sequence may not be the most efficient (a
literal-pool load could be better) but adding that would require
reinstating the ConstantIslands pass.

For now the sequence is correct, and that's enough. Beware, as of
commit GNU ld does not appear to support the relocations needed for
this. Its primary purpose (for now) will be to support JITed code,
since in that case there is no guarantee of where your code will end
up in memory relative to external symbols it references.

llvm-svn: 181117
2013-05-04 16:53:46 +00:00
Tim Northover fee13d1e11 Allow host triple to be correctly overridden in CMake builds
The intended semantics mirror autoconf, where the user is able to
specify a host triple, but if it's left to the build system then
"config.guess" is invoked for the default.

This also renames the LLVM_HOSTTRIPLE define to LLVM_HOST_TRIPLE to
fit in with the style of the surrounding defines.

llvm-svn: 181112
2013-05-04 07:36:23 +00:00
Amara Emerson d9104c0359 Revert r181009.
llvm-svn: 181079
2013-05-03 23:57:17 +00:00
Reed Kotler 0f2b10eb0d Remove some uneeded pseudos in the presence of the naked function attribute.
llvm-svn: 181072
2013-05-03 23:17:24 +00:00
Amara Emerson b2a1cb87b1 Delete test instead.
llvm-svn: 181066
2013-05-03 22:39:03 +00:00
Amara Emerson 55be0c840e Temporarily disable failing test.
llvm-svn: 181062
2013-05-03 22:27:48 +00:00
Ulrich Weigand 2c3a219b76 [PowerPC] Parse platform-specifc variant kinds in AsmParser
This patch adds support for PowerPC platform-specific variant
kinds in MCSymbolRefExpr::getVariantKindForName, and also
adds a test case to verify they are translated to the appropriate
fixup type.

llvm-svn: 181053
2013-05-03 19:52:35 +00:00
Ulrich Weigand 300b6875fb [PowerPC] Add some Book II instructions to AsmParser
This patch adds a couple of Book II instructions (isync, icbi) to the
PowerPC assembler parser.  These are needed when bootstrapping clang
with the integrated assembler forced on, because they are used in
inline asm statements in the code base.

The test case adds the full list of Book II storage control instructions,
including associated extended mnemonics.  Again, those that are not yet
supported as marked as FIXME.

llvm-svn: 181052
2013-05-03 19:51:09 +00:00
Ulrich Weigand d839490f16 [PowerPC] Support extended mnemonics in AsmParser
This patch adds infrastructure to support extended mnemonics in the
PowerPC assembler parser.  It adds support specifically for those
extended mnemonics that LLVM will itself generate.

The test case lists *all* extended mnemonics according to the
PowerPC ISA v2.06 Book I, but marks those not yet supported
as FIXME.

llvm-svn: 181051
2013-05-03 19:50:27 +00:00
Ulrich Weigand 640192daa8 [PowerPC] Add assembler parser
This adds assembler parser support to the PowerPC back end.

The parser will run for any powerpc-*-* and powerpc64-*-* triples,
but was tested only on 64-bit Linux.  The supported syntax is
intended to be compatible with the GNU assembler.

The parser does not yet support all PowerPC instructions, but
it does support anything that is generated by LLVM itself.
There is no support for testing restricted instruction sets yet,
i.e. the parser will always accept any instructions it knows,
no matter what feature flags are given.

Instruction operands will be checked for validity and errors
generated.  (Error handling in general could still be improved.)

The patch adds a number of test cases to verify instruction
and operand encodings.  The tests currently cover all instructions
from the following PowerPC ISA v2.06 Book I facilities:
Branch, Fixed-point, Floating-Point, and Vector. 
Note that a number of these instructions are not yet supported
by the back end; they are marked with FIXME.

A number of follow-on check-ins will add extra features.  When
they are all included, LLVM passes all tests (including bootstrap)
when using clang -cc1as as the system assembler.

llvm-svn: 181050
2013-05-03 19:49:39 +00:00
Akira Hatanaka e86bd4f652 [mips] Split the DSP control register and define one register for each field of
its fields.

This removes false dependencies between DSP instructions which access different
fields of the the control register. Implicit register operands are added to
instructions RDDSP and WRDSP after instruction selection, depending on the
value of the mask operand.

llvm-svn: 181041
2013-05-03 18:37:49 +00:00
Nadav Rotem 4ce060b3da LoopVectorizer: Add support for if-conversion of PHINodes with 3+ incoming values.
By supporting the vectorization of PHINodes with more than two incoming values we can increase the complexity of nested if statements.

We can now vectorize this loop:

int foo(int *A, int *B, int n) {
  for (int i=0; i < n; i++) {
    int x = 9;
    if (A[i] > B[i]) {
      if (A[i] > 19) {
        x = 3;
      } else if (B[i] < 4 ) {
        x = 4;
      } else {
        x = 5;
      }
    }
    A[i] = x;
  }
}

llvm-svn: 181037
2013-05-03 17:42:55 +00:00
Tom Stellard 4489b85f2b R600: Expand vector or, shl, srl, and xor nodes
llvm-svn: 181035
2013-05-03 17:21:31 +00:00
Tom Stellard eac65dde30 R600: Add pattern for SHA-256 Ma function
This can be optimized using the BFI_INT instruction.

llvm-svn: 181033
2013-05-03 17:21:20 +00:00
Tobias Grosser a7ddc98206 RegionInfo: Do not crash if unreachable block is found
llvm-svn: 181025
2013-05-03 15:48:34 +00:00
Amara Emerson 2f54d9fe10 Add support for reading ARM ELF build attributes.
Build attribute sections can now be read if they exist via ELFObjectFile, and
the llvm-readobj tool has been extended with an option to dump this information
if requested. Regression tests are also included which exercise these features.

Also update the docs with a fixed ARM ABI link and a new link to the Addenda
which provides the build attributes specification.

llvm-svn: 181009
2013-05-03 11:36:35 +00:00
Akira Hatanaka 5705f546e5 [mips] Handle reading, writing or copying of ccond field of DSP control
register.

- Define pseudo instructions which store or load ccond field of the DSP
  control register.
- Emit the pseudos in MipsSEInstrInfo::storeRegToStack and loadRegFromStack.
- Expand the pseudos before callee-scan save.
- Emit instructions RDDSP or WRDSP to copy between ccond field and GPRs. 

llvm-svn: 180969
2013-05-02 23:07:05 +00:00
Vincent Lejeune ddd43383ef R600: Signed literals are 64bits wide
llvm-svn: 180960
2013-05-02 21:53:03 +00:00
Vincent Lejeune 2a44ae0053 R600: If previous bundle is dot4, PV valid chan is always X
llvm-svn: 180959
2013-05-02 21:52:55 +00:00
Vincent Lejeune 7cedb7161d R600: Add a test to check that use_kill is emitted
llvm-svn: 180958
2013-05-02 21:52:46 +00:00
Vincent Lejeune f97af796a9 R600: Prettier asmPrint of Alu
llvm-svn: 180956
2013-05-02 21:52:30 +00:00
Pranav Bhandarkar 7dda912cd7 Hexagon - Add peephole optimizations for zero extends.
* lib/Target/Hexagon/HexagonInstrInfo.td: Add patterns to combine a
	sequence of a pair of i32->i64 extensions followed by a "bitwise or"
	into COMBINE_rr.
	* lib/Target/Hexagon/HexagonPeephole.cpp: Copy propagate Rx in the
	instruction Rp = COMBINE_Ir_V4(0, Rx) to the uses of Rp:subreg_loreg.
	* test/CodeGen/Hexagon/union-1.ll: New test.
	* test/CodeGen/Hexagon/combine_ir.ll: Fix test.

llvm-svn: 180946
2013-05-02 20:22:51 +00:00
Manman Ren 16649b0107 TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180935
2013-05-02 18:11:35 +00:00
Michael Liao 126680ffa1 Rewrite X86 codegen regression test with FileCheck
llvm-svn: 180910
2013-05-02 06:20:42 +00:00
David Majnemer a18dfe6b96 Add a test for the foldSelectICmpAndOr fix committed in r180779.
This tests a case where C1 and C2 were the same but X and Y were different
widths.

llvm-svn: 180907
2013-05-02 02:44:23 +00:00
Michael Liao d2d42f1b2d Avoid generating tempfile(s) never used
As DejaGNU is deprecated, it seems pipe-jam issue doesn't exist any more.

llvm-svn: 180892
2013-05-01 22:46:50 +00:00
Bill Wendling 8f2e6feb8e Revert r180737. The companion patch was reverted, and this is not relevant right now.
llvm-svn: 180889
2013-05-01 22:32:08 +00:00
Nadav Rotem 1e211913b5 SROA: Generate selects instead of shuffles when blending values because this is the cannonical form.
Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often.

llvm-svn: 180875
2013-05-01 19:53:30 +00:00
Nadav Rotem e5a2dda372 Optimize away nop CONCAT_VECTOR nodes.
Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector.

rdar://13402653
PR15866

llvm-svn: 180871
2013-05-01 19:18:51 +00:00
Rafael Espindola 817c1d92b4 Put VMOVPQIto64rr in the VRPDI class.
Patch by Joshua Magee.

llvm-svn: 180842
2013-05-01 13:00:16 +00:00
Michael Liao f7f33ed31e Forget remove the tempfile argument
llvm-svn: 180838
2013-05-01 05:45:57 +00:00
Michael Liao bc793a775e More rewrites of x86 codegen regression tests with FileCheck
llvm-svn: 180837
2013-05-01 05:34:30 +00:00
Jim Grosbach d11584a7f7 Revert "InstCombine: Fold more shuffles of shuffles."
This reverts commit r180802

There's ongoing discussion about whether this is the right place to make
this transformation. Reverting for now while we figure it out.

llvm-svn: 180834
2013-05-01 00:25:27 +00:00
Akira Hatanaka 4254319ef9 [mips] Fix handling of instructions which copy to/from accumulator registers.
Expand copy instructions between two accumulator registers before callee-saved
scan is done. Handle copies between integer GPR and hi/lo registers in
MipsSEInstrInfo::copyPhysReg. Delete pseudo-copy instructions that are not
needed.

llvm-svn: 180827
2013-04-30 23:22:09 +00:00
Stephen Lin 699808ceb2 Only pass 'returned' to target-specific lowering code when the value of entire register is guaranteed to be preserved.
llvm-svn: 180825
2013-04-30 22:49:28 +00:00
Akira Hatanaka 68741cc38d [mips] Instruction selection patterns for DSP-ASE vector select and compare
instructions.

llvm-svn: 180820
2013-04-30 22:37:26 +00:00
Adrian Prantl a2888e71eb Temporarily revert "Change the informal convention of DBG_VALUE so that we can express a"
because it breaks some buildbots.

This reverts commit 180816.

llvm-svn: 180819
2013-04-30 22:35:14 +00:00
Adrian Prantl 9a576644e4 Change the informal convention of DBG_VALUE so that we can express a
register-indirect address with an offset of 0.
It used to be that a DBG_VALUE is a register-indirect value if the offset
(operand 1) is nonzero. The new convention is that a DBG_VALUE is
register-indirect if the first operand is a register and the second
operand is an immediate. For plain registers use the combination reg, reg.

rdar://problem/13658587

llvm-svn: 180816
2013-04-30 22:16:46 +00:00
Akira Hatanaka 433de170ee [mips] Test for r179873.
Patch by Zoran Jovanovic.

llvm-svn: 180804
2013-04-30 20:48:49 +00:00
Jim Grosbach 0b914fe839 InstCombine: Fold more shuffles of shuffles.
Always fold a shuffle-of-shuffle into a single shuffle when there's only one
input vector in the first place. Continue to be more conservative when there's
multiple inputs.

rdar://13402653
PR15866

llvm-svn: 180802
2013-04-30 20:43:52 +00:00
Hal Finkel 7153251ab5 LocalStackSlotAllocation improvements
First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created.

Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway.

Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll

Jim has okayed this off-list.

llvm-svn: 180799
2013-04-30 20:04:37 +00:00
Manman Ren 1a5ff287fd TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180796
2013-04-30 17:52:57 +00:00
Adrian Prantl 0941638a1b Set debug locations for branch instructions created during inlining, even
the inlined function has multiple returns.

rdar://problem/12415623

llvm-svn: 180793
2013-04-30 17:08:16 +00:00
Rafael Espindola 52501033d0 Fix Addend computation for non external relocations on Macho.
llvm-svn: 180790
2013-04-30 15:40:54 +00:00
Vincent Lejeune e69e26025e R600: fix loop-address.ll test
Texture cache is now used when shader type is not specified

llvm-svn: 180785
2013-04-30 12:47:56 +00:00
Mihai Popa af22d91af0 s tightens up the encoding description for ARM post-indexed ldr instructions. All instructions in this class have bit 4 cleared. It turns out that there is a test case for this, but it was marked XFAIL.
llvm-svn: 180778
2013-04-30 09:00:12 +00:00
David Majnemer 8d048d0482 Fix "Combine bit test + conditional or into simple math"
This fixes the optimization introduced in r179748 and reverted in r179750.

While the optimization was sound, it did not properly respect differences in
bit-width.

llvm-svn: 180777
2013-04-30 08:57:58 +00:00
Michael Liao c9ae780b5b Rewrite X86 codegen regression test with FileCheck
llvm-svn: 180776
2013-04-30 07:51:08 +00:00
Rafael Espindola d00c2765aa Collect the Addend for external relocs.
This fixes 2013-04-04-RelocAddend.ll. We don't have a testcase for non external
relocs with an Addend. I will try to write one.

llvm-svn: 180767
2013-04-30 01:29:57 +00:00
Vincent Lejeune 3abdbf1cad R600: use native for alu
llvm-svn: 180761
2013-04-30 00:14:38 +00:00
Vincent Lejeune c299164284 R600: Add FetchInst bit to instruction defs to denote vertex/tex instructions
v2[Vincent Lejeune]: Split FetchInst into usesTextureCache/usesVertexCache

llvm-svn: 180755
2013-04-30 00:13:39 +00:00
Michael Liao db6c6ea21c Rewrite test in FileCheck instead of grep in X86 codegen
llvm-svn: 180754
2013-04-30 00:13:38 +00:00
Manman Ren f0499ba991 TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180745
2013-04-29 22:58:55 +00:00
Bill Wendling 39033855c3 Duplicate a testcase.
llvm-svn: 180744
2013-04-29 22:42:47 +00:00
Manman Ren 662ece49de TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180743
2013-04-29 22:42:01 +00:00
Michael Liao c83b3e79fc Rewrite some tests with FileCHeck in X86 codegen
- Revise previous patches of the same purpose by fixing
  *) grep <PA> | not grep <PB> semantically is not the same as
     CHECK: <PA>{{^<PB>.*$}} as the former will check all occurrences of <PA>
     while the later only check the first match. As the result, CHECK needs
     putting in all place where <PA> occurs.
  *) grep <PA> | count <N> needs a final CHECK-NOT of the same pattern.
     (As 'CHECK-<N>' is proposed for discussion, converting 'grep | count <N>'
      where N > 1 is postponed.)

llvm-svn: 180742
2013-04-29 22:41:29 +00:00
Adrian Prantl 3e1758c045 Improve documentation.
llvm-svn: 180738
2013-04-29 22:25:52 +00:00
Rafael Espindola e4dd2e0132 Add getSymbolAlignment to the ObjectFile interface.
For regular object files this is only meaningful for common symbols. An object
file format with direct support for atoms should be able to provide alignment
information for all symbols.

This replaces getCommonSymbolAlignment and fixes
test-common-symbols-alignment.ll on darwin. This also includes a fix to
MachOObjectFile::getSymbolFlags. It was marking undefined symbols as common
(already tested by existing mcjit tests now that it is used).

llvm-svn: 180736
2013-04-29 22:24:22 +00:00
Tom Stellard 119ad03c67 R600: Use correct CF_END instruction on Northern Island GPUs
llvm-svn: 180735
2013-04-29 22:23:58 +00:00
Tom Stellard 8367067e02 R600: Fix encoding of CF_END_{EG, R600} instructions
The EOP bit was not being encoded.

llvm-svn: 180734
2013-04-29 22:23:54 +00:00
Arnold Schwaighofer 474df6d3ed SimplifyCFG: If convert single conditional stores
This resurrects r179957, but adds code that makes sure we don't touch
atomic/volatile stores:

This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:

 a[i] =
 may-alias with a[i] load
 if (cond)
   a[i] = Y

into an unconditional store.

 a[i] = X
 may-alias with a[i] load
 tmp = cond ? Y : X;
 a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case where the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

llvm-svn: 180731
2013-04-29 21:28:24 +00:00
Rafael Espindola 29cb481ba0 Disable the MCJIT tests on 32 bit darwin.
I recently enabled them on 32 and 64 bit darwin, but it looks like 32 bit is
still fairly broken.

llvm-svn: 180730
2013-04-29 21:09:32 +00:00
Rafael Espindola f1f1c626e7 Propagate relocation info to resolveRelocation.
This gets most of the MCJITs tests passing with MachO.

llvm-svn: 180716
2013-04-29 17:24:34 +00:00
Michael Gottesman 214ca90f8e [objc-arc] Apply the RV optimization to retains next to calls in ObjCARCContract instead of ObjCARCOpts.
Turning retains into retainRV calls disrupts the data flow analysis in
ObjCARCOpts. Thus we move it as late as we can by moving it into
ObjCARCContract.

We leave in the conversion from retainRV -> retain in ObjCARCOpt since
it enables the dataflow analysis.

rdar://10813093

llvm-svn: 180698
2013-04-29 06:53:53 +00:00
Shuxin Yang 04a4fd43aa Fix a XOR reassociation bug.
When Reassociator optimize "(x | C1)" ^ "(X & C2)", it may swap the two
subexpressions, however, it forgot to swap cached constants (of C1 and C2)
accordingly.

rdar://13739160

llvm-svn: 180676
2013-04-27 18:02:12 +00:00
Tim Northover 72e122607f AArch64: convert MC-layer test to .s file
The CodeGen aspects of this test are already covered by cfi-frame.ll;
making it an assembly file reduces the risk of incidental changes
affecting the test.

llvm-svn: 180671
2013-04-27 11:56:14 +00:00
Michael Gottesman b33b6cb84a [objc-arc] Test cleanups.
Mainly adding paranoid checks for the closing brace of a function to
help with FileCheck error readability. Also some other minor changes.

No actual CHECK changes.

llvm-svn: 180668
2013-04-27 05:25:54 +00:00
Eric Christopher 203e12bf9e Use the target triple from the target machine rather than the module
to determine whether or not we're on a darwin platform for debug code
emitting.

Solves the problem of a module with no triple on the command line
and no triple in the module using non-gdb ok features on darwin. Fix
up the member-pointers test to check the correct things for cross
platform (DW_FORM_flag is a good prefix).

Unfortunately no testcase because I have no ideas how to test something
without a triple and without a triple in the module yet check
precisely on two platforms. Ideas welcome.

llvm-svn: 180660
2013-04-27 01:07:52 +00:00
Eric Christopher b2a602d730 Move the XFAIL out of the middle of a comment.
llvm-svn: 180659
2013-04-27 01:07:22 +00:00
Rafael Espindola 1357ab74e5 Make all darwin ppc stubs local.
This fixes pr15763.
Patch by David Fang.

llvm-svn: 180657
2013-04-27 00:43:16 +00:00
Manman Ren 5c37106d65 Struct-path aware TBAA: change the format of TBAAStructType node.
We switch the order of offset and field type to make TBAAStructType node
(name, parent node, offset) similar to scalar TBAA node (name, parent node).
TypeIsImmutable is added to TBAAStructTag node.

llvm-svn: 180654
2013-04-27 00:26:11 +00:00
Benjamin Kramer 5259bbde82 Make CHECK lines a bit less strict so they also match code generated for win64.
Hopefully brings the windows buildbots back to life.

llvm-svn: 180630
2013-04-26 21:04:21 +00:00
Nadav Rotem be0e89d9e8 Teach the interpreter to handle vector compares and additional vector arithmetic operations.
Patch by Yuri Veselov.

llvm-svn: 180626
2013-04-26 20:19:41 +00:00
Tom Stellard 456adc6c4e R600: Initialize AMDGPUMachineFunction::ShaderType to ShaderType::COMPUTE
We need to intialize this to something and since clang does not set
the shader type attribute and clang is used only for compute shaders,
initializing it to COMPUTE seems like the best choice.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 180620
2013-04-26 18:32:24 +00:00
Adrian Prantl a8aa97d310 cleanup testcase some more
rdar://problem/13056109

llvm-svn: 180619
2013-04-26 18:10:54 +00:00
Quentin Colombet a83d5e9f91 ARM: Fix encoding of hint instruction for Thumb.
"hint" space for Thumb actually overlaps the encoding space of the CPS
instruction. In actuality, hints can be defined as CPS instructions where imod
and M bits are all nil.

Handle decoding of permitted nop-compatible hints (i.e. nop, yield, wfi, wfe,
sev) in DecodeT2CPSInstruction.

This commit adds a proper diagnostic message for Imm0_4 and updates all tests.

Patch by Mihail Popa <Mihail.Popa@arm.com>.

llvm-svn: 180617
2013-04-26 17:54:54 +00:00
Rafael Espindola 37212578ee Add missing ':'.
llvm-svn: 180616
2013-04-26 17:54:46 +00:00
Adrian Prantl 29b9de7bf1 Bugfix for the debug intrinsic handling in InstCombiner:
Since we can't guarantee that the original dbg.declare instrinsic
is removed by LowerDbgDeclare(), we need to make sure that we are
not inserting the same dbg.value intrinsic over and over.
This removes tons of redundant DIEs when compiling optimized code.

rdar://problem/13056109

llvm-svn: 180615
2013-04-26 17:48:33 +00:00
Benjamin Kramer ae81474a38 ARM/NEON: Pattern match vector integer abs to vabs.
llvm-svn: 180604
2013-04-26 15:00:57 +00:00
Benjamin Kramer aec90531f9 X86: Now that we have a canonical form for vector integer abs, match it into pabs.
llvm-svn: 180600
2013-04-26 12:05:21 +00:00
Benjamin Kramer d56ffc709d DAGCombiner: Canonicalize vector integer abs in the same way we do it for scalars.
This already helps SSE2 x86 a lot because it lacks an efficient way to
represent a vector select. The long term goal is to enable the backend to match
a canonicalized pattern into a single instruction (e.g. vabs or pabs).

llvm-svn: 180597
2013-04-26 09:19:19 +00:00
Nadav Rotem 13306816fc LoopVectorizer: Calculate the number of pointers to disambiguate at runtime based on the numbers of reads and writes.
llvm-svn: 180593
2013-04-26 05:08:59 +00:00
Jack Carter c15c1d245b Mips assembler: .set reorder support
Mips have delayslots for certain instructions 
like jumps and branches. These are instructions 
that follow the branch or jump and are executed
before the jump or branch is completed.

Early Mips compilers could not cope with delayslots
and left them up to the assembler. The assembler would
fill the delayslots with the appropriate instruction,
usually just a nop to allow correct runtime behavior.

The default behavior for this is set with .set reorder.
To tell the assembler that you don't want it to mess with
the delayslot one used .set noreorder.

For backwards compatibility we need to support
.set reorder and have it be the default behavior in the 
assembler.

Our support for it is to insert a NOP directly after an
instruction with a delayslot when in .set reorder mode.

Contributer: Vladimir Medic
llvm-svn: 180584
2013-04-25 23:31:35 +00:00
Michael Liao 0b707eb85e Remove SMLoc paired with CHECK-NOT patterns. Not functionality change.
Pattern has source location by itself. After adding a trivial method to
retrieve it, it's unnecessary to pair a source location for CHECK-NOT patterns.
One thing revised after this is the diagnostic info is more accurate by
pointing to the start of the CHECK-NOT pattern instead of the end of the
CHECK-NOT pattern. E.g. diagnostic message previously looks like

    <stdin>:1:1: error: CHECK-NOT: string occurred!
    test
    ^
    test.txt:1:16: note: CHECK-NOT: pattern specified here
    CHECK-NOT: test
                   ^

is changed to

    <stdin>:1:1: error: CHECK-NOT: string occurred!
    test
    ^
    test.txt:1:12: note: CHECK-NOT: pattern specified here
    CHECK-NOT: test
               ^

llvm-svn: 180578
2013-04-25 21:31:34 +00:00
Arnold Schwaighofer 9881dcf2f2 ARM cost model: Integer div and rem is lowered to a function call
Reflect this in the cost model. I observed this in MiBench/consumer-lame.

radar://13354716

llvm-svn: 180576
2013-04-25 21:16:18 +00:00
Preston Gurd 8b7ab4ba2b This patch adds the X86FixupLEAs pass, which will reduce instruction
latency for certain models of the Intel Atom family, by converting
instructions into their equivalent LEA instructions, when it is both
useful and possible to do so.

llvm-svn: 180573
2013-04-25 20:29:37 +00:00
Nadav Rotem f43cbeee15 LoopVectorizer: No need to generate pointer disambiguation checks between readonly pointers.
llvm-svn: 180570
2013-04-25 19:55:03 +00:00
Reid Kleckner d973ca3c51 [mc-coff] Forward Linker Option flags into the .drectve section
Summary:
This is modelled on the Mach-O linker options implementation and should
support a Clang implementation of #pragma comment(lib/linker).

Reviewers: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D724

llvm-svn: 180569
2013-04-25 19:34:41 +00:00
Rafael Espindola b770f897ee Fix section relocation for SECTIONREL32 with immediate offset.
Patch by Kai Nacke. This matches the gnu as output.

llvm-svn: 180568
2013-04-25 19:27:05 +00:00
Chad Rosier 8180db1f03 [inline asm] Add a test case for r180226. The specific issue is that the inline
assembly is requesting a 64-bit register, which is invalid for i386.
rdar://13731657

llvm-svn: 180445
2013-04-25 17:10:21 +00:00
Rafael Espindola 1e48387962 Clarify getRelocationAddress x getRelocationOffset a bit.
getRelocationAddress is for dynamic libraries and executables,
getRelocationOffset for relocatable objects.

Mark the getRelocationAddress of COFF and MachO as not implemented yet. Add a
test of ELF's. llvm-readobj -r now prints the same values as readelf -r.

llvm-svn: 180259
2013-04-25 12:28:45 +00:00
Silviu Baranga 4ad2bc5963 Fix constant folding for one lane vector types. Constant folding one lane vector types not returns a vector instead of a scalar.
llvm-svn: 180254
2013-04-25 09:32:33 +00:00
Akira Hatanaka 8aba50fd39 Test case for r180241.
llvm-svn: 180246
2013-04-25 02:22:07 +00:00
Akira Hatanaka 714a8f62db Test case for r180238.
llvm-svn: 180245
2013-04-25 02:21:09 +00:00
Tom Stellard 34e4068d05 R600: Use SHT_PROGBITS for the .AMDGPU.config section
The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
will not parse sections that are marked SHT_NULL.

llvm-svn: 180230
2013-04-24 23:56:14 +00:00
Jack Carter a2015328e8 Mips assembler: Add 64 bit testing for JAL
Contributer: Vladimir Medic
llvm-svn: 180220
2013-04-24 21:52:42 +00:00
Rafael Espindola 75c3036d4b Use pointers to iterate over symbols.
While here, don't report a dummy symbol for relocations that don't have symbols.
We used to says such relocations were for the first defined symbol, but now we
return end_symbols(). The llvm-readobj output change agrees with otool.

llvm-svn: 180214
2013-04-24 19:47:55 +00:00
Arnold Schwaighofer a6578f7056 LoopVectorize: Scalarize padded types
This patch disables memory-instruction vectorization for types that need padding
bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on
x86_64. Because the load/store vectorization is performed by the bit casting to
a packed vector, which has incompatible memory layout due to the lack of padding
bytes, the present vectorizer produces inconsistent result for memory
instructions of those types.
This patch checks an equality of the AllocSize of a scalar type and allocated
size for each vector element, to ensure that there is no padding bytes and the
array can be read/written using vector operations.

Patch by Daisuke Takahashi!

Fixes PR15758.

llvm-svn: 180196
2013-04-24 16:16:01 +00:00
Arnold Schwaighofer 23a0589bce LoopVectorizer: Bail out if we don't have datalayout we need it
llvm-svn: 180195
2013-04-24 16:15:58 +00:00
Andrew Trick 85a1d4cbc0 MI Sched: eliminate local vreg copies.
For now, we just reschedule instructions that use the copied vregs and
let regalloc elliminate it. I would really like to eliminate the
copies on-the-fly during scheduling, but we need a complete
implementation of repairIntervalsInRange() first.

The general strategy is for the register coalescer to eliminate as
many global copies as possible and shrink live ranges to be
extended-basic-block local. The coalescer should not have to worry
about resolving local copies (e.g. it shouldn't attemp to reorder
instructions). The scheduler is a much better place to deal with local
interference. The coalescer side of this equation needs work.

llvm-svn: 180193
2013-04-24 15:54:43 +00:00
Adrian Prantl 1d5f8f93f8 Cleanup testcase and ensure we actually exercise the inliner.
rdar://problem/12415623

llvm-svn: 180168
2013-04-24 01:44:15 +00:00
Jyotsna Verma af2359b98c Hexagon: Use multiclass for combine and STri[bhwd]_shl_V4 instructions.
llvm-svn: 180145
2013-04-23 21:17:40 +00:00
Adrian Prantl 15db52bf6d Make sure the instruction right after an inlined function has a
debug location. This solves a problem where range of an inlined
subroutine is emitted wrongly.
Patch by Manman Ren.

Fixes rdar://problem/12415623

llvm-svn: 180140
2013-04-23 19:56:03 +00:00
Stephen Lin 8118e0b588 Add more tests for r179925 to verify correct handling of signext/zeroext; strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change)
llvm-svn: 180138
2013-04-23 19:42:25 +00:00
Rafael Espindola 7f08d1b9a8 Fix typo.
llvm-svn: 180137
2013-04-23 19:39:34 +00:00
Jyotsna Verma 89c84821ea Hexagon: Remove assembler mapped instruction definitions.
llvm-svn: 180133
2013-04-23 19:15:55 +00:00
Vincent Lejeune 117f075f6e R600: Use .AMDGPU.config section to emit stacksize
llvm-svn: 180124
2013-04-23 17:34:12 +00:00
Vincent Lejeune b6bfe85a07 R600: Add CF_END
llvm-svn: 180123
2013-04-23 17:34:00 +00:00
Nadav Rotem 71c9d6d333 LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make sure that the order in which the elements are scalarized is the same as the original order.
This fixes a miscompilation in FreeBSD's regex library.

llvm-svn: 180121
2013-04-23 17:12:42 +00:00
Jyotsna Verma a696239bec Hexagon: Remove duplicate instructions to handle global/immediate values
for absolute/absolute-set addressing modes.

llvm-svn: 180120
2013-04-23 17:11:46 +00:00
Pekka Jaaskelainen d3c90e132a Call the potentially costly isAnnotatedParallel() only once.
Made the uniform write test's checks a bit stricter.

llvm-svn: 180119
2013-04-23 16:44:43 +00:00
Rafael Espindola b716e622ae Write relocations in yaml2obj.
llvm-svn: 180115
2013-04-23 15:53:02 +00:00
Rafael Espindola 70e94800e5 Move test from grep to FileCheck.
llvm-svn: 180092
2013-04-23 12:03:27 +00:00
Alexey Samsonov 068fc8ae6e Use zlib to uncompress debug sections in DWARF parser.
This makes llvm-dwarfdump and llvm-symbolizer understand
debug info sections compressed by ld.gold linker.

llvm-svn: 180088
2013-04-23 10:17:34 +00:00
Pekka Jaaskelainen 6f2f66b63f Refuse to (even try to) vectorize loops which have uniform writes,
even if erroneously annotated with the parallel loop metadata.

Fixes Bug 15794: 
"Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata"

llvm-svn: 180081
2013-04-23 08:08:51 +00:00
Chad Rosier 53e5768351 Add test case for PR15779, which has previously been fixed.
llvm-svn: 180058
2013-04-22 22:30:01 +00:00
Anat Shemer 10260a75e3 Changed back (relative to commit 179786) the operations executed when extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users.
llvm-svn: 180045
2013-04-22 20:51:10 +00:00
Akira Hatanaka 0d6964cf4a [mips] In performDSPShiftCombine, check that all elements in the vector are
shifted by the same amount and the shift amount is smaller than the element
size.

llvm-svn: 180039
2013-04-22 19:58:23 +00:00
Peter Collingbourne 8988687d6b COFF: Fix weak external aliases.
Differential Revision: http://llvm-reviews.chandlerc.com/D700

llvm-svn: 180034
2013-04-22 18:48:56 +00:00
Stephen Lin 2ec1b100a4 Extra paranoid test for r179925 (verify that tail calls are not generated to 'this'-returning constructors of objects with different 'this' pointers than the caller)
llvm-svn: 180032
2013-04-22 17:23:49 +00:00
Rafael Espindola 8bd2c228f8 Also verify llvm.compiler_used.
llvm-svn: 180020
2013-04-22 15:16:51 +00:00
Rafael Espindola 74f2e46eef Clarify that llvm.used can contain aliases.
Also add a check for llvm.used in the verifier and simplify clients now that
they can assume they have a ConstantArray.

llvm-svn: 180019
2013-04-22 14:58:02 +00:00
Stepan Dyatkovskiy f80f9513ce Fix for 5.5 Parameter Passing --> Stage C:
-- C.4 and C.5 statements, when NSAA is not equal to SP.
 -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a
    variadic procedure.

Before this patch "NSAA != 0" means "don't use GPRs anymore ". But there are
some exceptions in AAPCS.
1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated
   CPRCs would be sent to stack, while non CPRCs may be still allocated in GRPs.
2. Check that for VA functions all params uses GPRs and then stack.
   No exceptions, no CPRCs here.

llvm-svn: 180011
2013-04-22 13:06:52 +00:00
Eric Christopher f565498668 Add .ll as a valid test suffix for Object, this allows .ll -> object
and then dumping as tests.

llvm-svn: 180010
2013-04-22 10:45:06 +00:00
Arnaud A. de Grandmaison e206e6e80a Cleanup: test source files do not need to be executable
llvm-svn: 180003
2013-04-22 08:02:43 +00:00
David Blaikie f55abeaf4c Revert "Revert "PR14606: debug info imported_module support""
This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll

I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even
though the debug info was clearly invalid on all of them, but this ought to fix
it.

llvm-svn: 179996
2013-04-22 06:12:31 +00:00
Jim Grosbach 563983c8a3 Legalize vector truncates by parts rather than just splitting.
Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result time, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.

Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again.  For example on ARM,
To perform a "%res = v8i8 trunc v8i32 %in" we transform to:
  %inlo = v4i32 extract_subvector %in, 0
  %inhi = v4i32 extract_subvector %in, 4
  %lo16 = v4i16 trunc v4i32 %inlo
  %hi16 = v4i16 trunc v4i32 %inhi
  %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
  %res = v8i8 trunc v8i16 %in16

This allows instruction selection to generate three VMOVN instructions
instead of a sequences of moves, stores and loads.

Update the ARMTargetTransformInfo to take this improved legalization
into account.

Consider the simplified IR:

define <16 x i8> @test1(<16 x i32>* %ap) {
  %a = load <16 x i32>* %ap
  %tmp = trunc <16 x i32> %a to <16 x i8>
  ret <16 x i8> %tmp
}

define <8 x i8> @test2(<8 x i32>* %ap) {
  %a = load <8 x i32>* %ap
  %tmp = trunc <8 x i32> %a to <8 x i8>
  ret <8 x i8> %tmp
}

Previously, we would generate the truly hideous:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #20
	bic	sp, sp, #7
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d24, d25}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	vld1.64	{d18, d19}, [r2:128]
	add	r1, r0, #16
	vmovn.i32	d22, q8
	vld1.64	{d16, d17}, [r1:128]
	vmovn.i32	d20, q9
	vmovn.i32	d18, q12
	vmov.u16	r0, d22[3]
	strb	r0, [sp, #15]
	vmov.u16	r0, d22[2]
	strb	r0, [sp, #14]
	vmov.u16	r0, d22[1]
	strb	r0, [sp, #13]
	vmov.u16	r0, d22[0]
	vmovn.i32	d16, q8
	strb	r0, [sp, #12]
	vmov.u16	r0, d20[3]
	strb	r0, [sp, #11]
	vmov.u16	r0, d20[2]
	strb	r0, [sp, #10]
	vmov.u16	r0, d20[1]
	strb	r0, [sp, #9]
	vmov.u16	r0, d20[0]
	strb	r0, [sp, #8]
	vmov.u16	r0, d18[3]
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	vldmia	sp, {d16, d17}
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	mov	sp, r7
	pop	{r7}
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #12
	bic	sp, sp, #7
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d20, d21}, [r0:128]
	vmovn.i32	d18, q8
	vmov.u16	r0, d18[3]
	vmovn.i32	d16, q10
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	ldm	sp, {r0, r1}
	mov	sp, r7
	pop	{r7}
	bx	lr

Now, however, we generate the much more straightforward:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d20, d21}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	add	r1, r0, #16
	vld1.64	{d18, d19}, [r2:128]
	vld1.64	{d22, d23}, [r1:128]
	vmovn.i32	d17, q8
	vmovn.i32	d16, q9
	vmovn.i32	d18, q10
	vmovn.i32	d19, q11
	vmovn.i16	d17, q8
	vmovn.i16	d16, q9
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d18, d19}, [r0:128]
	vmovn.i32	d16, q8
	vmovn.i32	d17, q9
	vmovn.i16	d16, q8
	vmov	r0, r1, d16
	bx	lr

llvm-svn: 179989
2013-04-21 23:47:41 +00:00
Jim Grosbach fb08e55cc1 ARM: Split out cost model vcvt testcases.
They had a separate RUN line already, so may as well be in a separate file.

llvm-svn: 179988
2013-04-21 23:47:37 +00:00
Jakob Stoklund Olesen 84ebe25db7 Passing arguments to varags functions under the SPARC v9 ABI.
Arguments after the fixed arguments never use the floating point
registers.

llvm-svn: 179987
2013-04-21 21:36:49 +00:00
Jakob Stoklund Olesen 65d3287282 Fix the SETHIimm pattern for 64-bit code.
Don't ignore the high 32 bits of the immediate.

llvm-svn: 179985
2013-04-21 21:18:03 +00:00
Benjamin Kramer 0212dc27ed SROA: Don't crash on a select with two identical operands.
This is an edge case that can happen if we modify a chain of multiple selects.
Update all operands in that case and remove the assert. PR15805.

llvm-svn: 179982
2013-04-21 17:48:39 +00:00
Arnold Schwaighofer 6eb32b31bd Revert "SimplifyCFG: If convert single conditional stores"
There is the temptation to make this tranform dependent on target information as
it is not going to be beneficial on all (sub)targets. Therefore, we should
probably do this in MI Early-Ifconversion.

This reverts commit r179957. Original commit message:

"SimplifyCFG: If convert single conditional stores

This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:

a[i] =
may-alias with a[i] load
if (cond)
    a[i] = Y
into an unconditional store.

a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case were the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

I am going to watch performance numbers across the builtbots and will revert
this if anything unexpected comes up."

llvm-svn: 179980
2013-04-21 13:09:04 +00:00
Tim Northover 4a58db65a5 ARM: fix part of test which actually needed an asserts build
This should fix a buildbot failure that occurred after r179977.

llvm-svn: 179978
2013-04-21 12:20:19 +00:00
Tim Northover 798697d662 ARM: Use ldrd/strd to spill 64-bit pairs when available.
This allows common sp-offsets to be part of the instruction and is
probably faster on modern CPUs too.

llvm-svn: 179977
2013-04-21 11:57:07 +00:00
Nadav Rotem c57af326a4 SLPVectorize: Add support for vectorization of casts.
llvm-svn: 179975
2013-04-21 08:05:59 +00:00
Michael Gottesman d5b701faf1 [objc-arc] Cleaned up tail-call-invariant-enforcement.ll.
Specifically:

1. Added checks that unwind is being properly added to various instructions.
2. Fixed the declaration/calling of objc_release to have a return type of void.
3. Moved all checks to precede the functions and added checks to ensure that the
checks would only match inside the specific function that we are attempting to
check.

llvm-svn: 179973
2013-04-21 02:59:44 +00:00
Michael Gottesman 77aa946321 [objc-arc] Check that objc-arc-expand properly handles all strictly forwarding calls and does not touch calls which are not strictly forwarding (i.e. objc_retainBlock).
llvm-svn: 179972
2013-04-21 01:57:46 +00:00
Michael Gottesman 524052fec1 [objc-arc] Renamed the test file clang-arc-used-intrinsic-removed-if-isolated.ll -> intrinsic-use-isolated.ll to match the other test file intrinsic-use.ll.
llvm-svn: 179971
2013-04-21 01:42:24 +00:00
Bill Wendling eaff0ce3a4 Remove tbaa metadata.
llvm-svn: 179970
2013-04-21 01:38:25 +00:00
Jakob Stoklund Olesen a41f91ea8e Compile varargs functions for SPARCv9.
With a little help from the frontend, it looks like the standard va_*
intrinsics can do the job.

Also clean up an old bitcast hack in LowerVAARG that dealt with
unaligned double loads. Load SDNodes can specify an alignment now.

Still missing: Calling varargs functions with float arguments.

llvm-svn: 179961
2013-04-20 22:49:16 +00:00
Nadav Rotem 8aca44a623 Fix PR15800. Do not try to vectorize vectors and structs.
llvm-svn: 179960
2013-04-20 22:29:43 +00:00
Arnold Schwaighofer 3546ccf465 SimplifyCFG: If convert single conditional stores
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:

 a[i] =
 may-alias with a[i] load
 if (cond)
   a[i] = Y

into an unconditional store.

 a[i] = X
 may-alias with a[i] load
 tmp = cond ? Y : X;
 a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case were the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

I am going to watch performance numbers across the builtbots and will revert
this if anything unexpected comes up.

llvm-svn: 179957
2013-04-20 21:42:09 +00:00
Tim Northover d9d4211fe2 ARM: don't add FrameIndex offset for LDMIA (has no immediate)
Previously, when spilling 64-bit paired registers, an LDMIA with both
a FrameIndex and an offset was produced. This kind of instruction
shouldn't exist, and the extra operand was being confused with the
predicate, causing aborts later on.

This removes the invalid 0-offset from the instruction being
produced.

llvm-svn: 179956
2013-04-20 19:31:00 +00:00
Nuno Lopes 36e827602a recommit tests
llvm-svn: 179955
2013-04-20 17:39:52 +00:00
Stephen Lin 8fccb8a772 Minor renaming of tests (for consistency with an in-development patch)
llvm-svn: 179954
2013-04-20 16:21:26 +00:00
Benjamin Kramer 5bd25f3786 Don't litter .s files in test directory.
llvm-svn: 179937
2013-04-20 10:43:40 +00:00
Nadav Rotem 83c7c41bc2 SLPVectorizer: Improve the cost model for loop invariant broadcast values.
llvm-svn: 179930
2013-04-20 06:13:47 +00:00
Stephen Lin b8bd232a3d Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter).
llvm-svn: 179925
2013-04-20 05:14:40 +00:00
Stephen Lin ffc445492c Allow tail call opportunity detection through nested and/or multiple iterations of extractelement/insertelement indirection
llvm-svn: 179924
2013-04-20 04:27:51 +00:00
Akira Hatanaka 1ebb2a1c56 [mips] Instruction selection patterns for DSP-ASE vector shifts.
llvm-svn: 179906
2013-04-19 23:21:32 +00:00