10451 Commits

Author SHA1 Message Date
Daniel Dunbar 828984ff4e MC/AsmParser: Add .macros_{off,on} support, not that makes sense since we don't
support macros.

llvm-svn: 108649
2010-07-18 18:38:02 +00:00
Owen Anderson 41670a11a8 Add a testcase for r108639.
llvm-svn: 108640
2010-07-18 08:57:19 +00:00
Owen Anderson 7d2818b073 Another attempt at getting the clang self-host to like my instcombine patch.
llvm-svn: 108614
2010-07-17 06:56:35 +00:00
Jim Grosbach b97e2bbe32 Add combiner patterns to more effectively utilize the BFI (bitfield insert)
instruction for non-constant operands. This includes the case referenced
in the README.txt regarding a bitfield copy.

llvm-svn: 108608
2010-07-17 03:30:54 +00:00
Eli Friedman ceb16a5ce9 Test for ELF .size directive.
llvm-svn: 108607
2010-07-17 03:15:24 +00:00
Jim Grosbach 11013eda5a Add basic support to code-gen the ARM/Thumb2 bit-field insert (BFI) instruction
and a combine pattern to use it for setting a bit-field to a constant
value. More to come for non-constant stores.

llvm-svn: 108570
2010-07-16 23:05:05 +00:00
Bill Wendling bf8370ff36 Consider this function:
void foo() { __builtin_unreachable(); }

It will output the following on Darwin X86:

_func1:
Leh_func_begin0:
        pushq %rbp
Ltmp0:
        movq %rsp, %rbp
Ltmp1:
Leh_func_end0:

This prolog adds a new Call Frame Information (CFI) row to the FDE with an
address that is not within the address range of the code it describes -- part is
equal to the end of the function -- and therefore results in an invalid EH
frame. If we emit a nop in this situation, then the CFI row is now within the
address range.

llvm-svn: 108568
2010-07-16 22:51:10 +00:00
Jakob Stoklund Olesen c30b4ddc58 Remove the X86::FP_REG_KILL pseudo-instruction and the X86FloatingPointRegKill
pass that inserted it.

It is no longer necessary to limit the live ranges of FP registers to a single
basic block.

llvm-svn: 108536
2010-07-16 17:41:44 +00:00
Benjamin Kramer 50729ad717 Feed the right output into FileCheck.
llvm-svn: 108523
2010-07-16 10:58:02 +00:00
Nick Lewycky 375efe3157 Arrays and vectors with different numbers of elements are not equivalent.
llvm-svn: 108517
2010-07-16 06:31:12 +00:00
Tobias Grosser 3d84c9c793 LoopSimplify does not update domfrontier correctly.
This fixes PR7649.

llvm-svn: 108513
2010-07-16 05:59:45 +00:00
Jakob Stoklund Olesen 37c42a3d02 Remove many calls to TII::isMoveInstr. Targets should be producing COPY anyway.
TII::isMoveInstr is going to be completely removed.

llvm-svn: 108507
2010-07-16 04:45:42 +00:00
Jakob Stoklund Olesen b1671271ab Add forgotten test case.
llvm-svn: 108506
2010-07-16 04:45:35 +00:00
Dan Gohman 103c4ebea5 Use the source-order scheduler instead of the "fast" scheduler at -O0,
because it's more likely to keep debug line information in its original
order.

llvm-svn: 108496
2010-07-16 02:01:19 +00:00
Eric Christopher 15a81cddb4 Also revert 108422, it's causing some test failures.
Working on testcases for Owen.

llvm-svn: 108494
2010-07-16 01:36:12 +00:00
Dan Gohman c6eefe4d4e Fix this test.
llvm-svn: 108491
2010-07-16 01:28:45 +00:00
Dale Johannesen bfd4fd7bb7 The SelectionDAGBuilder's handling of debug info, on rare
occasions, caused code to be generated in a different order.
All cases I've seen involved float softening in the type
legalizer, and this could perhaps be fixed there, but
it's better not to generate things differently in the first
place.  7797940 (6/29/2010..7/15/2010).

llvm-svn: 108484
2010-07-16 00:02:08 +00:00
Bill Wendling 4bda1c8e68 Revert. This isn't the correct way to go.
llvm-svn: 108478
2010-07-15 23:42:21 +00:00
Dan Gohman fbbdfcaea7 Fix the order that SCEVExpander considers add operands in so that
it doesn't miss an opportunity to form a GEP, regardless of the
relative loop depths of the operands. This fixes rdar://8197217.

llvm-svn: 108475
2010-07-15 23:38:13 +00:00
Bill Wendling 973dc3b1d8 Handle code gen for the unreachable instruction if it's the only instruction in
the function. We'll just turn it into a "trap" instruction instead.

The problem with not handling this is that it might generate a prologue without
the equivalent epilogue to go with it:

$ cat t.ll
define void @foo() {
entry:
  unreachable
}
$ llc -o - t.ll -relocation-model=pic -disable-fp-elim -unwind-tables
        .section        __TEXT,__text,regular,pure_instructions
        .globl  _foo
        .align  4, 0x90
_foo:                                   ## @foo
Leh_func_begin0:
## BB#0:                                ## %entry
        pushq   %rbp
Ltmp0:
        movq    %rsp, %rbp
Ltmp1:
Leh_func_end0:
...

The unwind tables then have bad data in them causing all sorts of problems.

Fixes <rdar://problem/8096481>.

llvm-svn: 108473
2010-07-15 23:32:40 +00:00
Evan Cheng 55f0c6b9fc Split -enable-finite-only-fp-math to two options:
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetic arguments and results can never be NaN.

llvm-svn: 108465
2010-07-15 22:07:12 +00:00
Chris Lattner 60b131654b fix the definitions of ConstTextCoalSection/ConstDataCoalSection
to keep "Text" in sync with the "pure instructions" section attribute.
Lack of this attribute was preventing the assembler from emitting
multibyte noops instructions for templates (and inlines, and other
coalesced stuff) and was causing the assembler to mismatch .o files.

This fixes rdar://8018335

llvm-svn: 108461
2010-07-15 21:22:00 +00:00
Devang Patel df09db62e2 Fix crash reported in PR7653.
llvm-svn: 108441
2010-07-15 18:45:27 +00:00
Dan Gohman 4afd412d6b Watch out for a constant offset cancelling out a base register, forming
a zero. This situation arises in Fortran code with induction variables
that start at 1 instead of 0. This fixes PR7651.

llvm-svn: 108424
2010-07-15 15:14:45 +00:00
Owen Anderson 7151dfd48a Reapply r108378, with bugfixes, testcase, and improved comment formatting.
This now passes LIT, nightly test, and llvm-gcc bootstrap on my machine.

llvm-svn: 108422
2010-07-15 15:00:23 +00:00
Chris Lattner 19eff2a9f6 Fix PR7647, handling the case when 'To' ends up being
mutated by recursive simplification.  This also enhances
ReplaceAndSimplifyAllUses to actually do a real RAUW
at the end of it, which updates any value handles
pointing to "From" to start pointing to "To".  This
seems useful for debug info and random other VH users.

llvm-svn: 108415
2010-07-15 06:36:08 +00:00
Chris Lattner e985a63bbf see comment.
llvm-svn: 108409
2010-07-15 05:17:36 +00:00
Eric Christopher 25e72a8920 Temporarily disable this test.
llvm-svn: 108371
2010-07-14 23:12:58 +00:00
Devang Patel 29168baf4b Make it a .ll test case.
llvm-svn: 108370
2010-07-14 23:12:52 +00:00
Eric Christopher e34b383e71 Add a testcase for the vla and stack realignment warning.
llvm-svn: 108365
2010-07-14 22:26:35 +00:00
Dale Johannesen 6fe8c37a01 Tests for llvm-gcc commit 108360.
llvm-svn: 108362
2010-07-14 21:22:35 +00:00
Jim Grosbach a90af1ba38 Improve 64-bit subtraction of immediates when parts of the immediate can fit
in the literal field of an instruction. E.g.,
long long foo(long long a) {
  return a - 734439407618LL;
}

rdar://7038284

llvm-svn: 108339
2010-07-14 17:45:16 +00:00
Dan Gohman 042523340b Delete fast-isel's trivial load optimization; it breaks debugging because
it can look past points where a debugger might modify user variables.

llvm-svn: 108336
2010-07-14 17:25:37 +00:00
Bob Wilson bb57896f8e Fix test to appease the buildbots.
llvm-svn: 108334
2010-07-14 16:43:47 +00:00
Evan Cheng a8e8874552 Fix for PR7193 was overly conservative. The only case where sibcall callee
address cannot be allocated a register is in 32-bit mode where the first
three arguments are marked inreg. In that case EAX, EDX, and ECX will be
used for argument passing.

This fixes PR7610.

llvm-svn: 108327
2010-07-14 06:44:01 +00:00
Bob Wilson bad47f62f6 Add support for NEON VMVN immediate instructions.
llvm-svn: 108324
2010-07-14 06:31:50 +00:00
Chris Lattner ec0e7b1643 revert r108320, I see the failures now...
llvm-svn: 108322
2010-07-14 06:16:35 +00:00
Chris Lattner 658680b2f5 reapply benjamin's instcombine patch, I don't see anything wrong with it and can't repro any problems with a manual self-host.
llvm-svn: 108320
2010-07-14 05:59:13 +00:00
Evan Cheng c893115312 Re-enable the test with fix.
llvm-svn: 108319
2010-07-14 05:49:23 +00:00
Chris Lattner 711338fb04 temporarily disable the test to fix buildbots.
llvm-svn: 108310
2010-07-14 02:21:59 +00:00
Evan Cheng d542414945 Teach ProcessImplicitDefs to transform more COPY instructions into IMPLICIT_DEF (and subsequently eliminate them). This allows machine LICM to hoist IMPLICIT_DEF's. PR7620.
llvm-svn: 108304
2010-07-14 01:22:19 +00:00
Bob Wilson 103a0dcfe1 Add an ARM-specific DAG combining to avoid redundant VDUPLANE nodes.
Radar 7373643.

llvm-svn: 108303
2010-07-14 01:22:12 +00:00
Bruno Cardoso Lopes 6c6c14a55c Add AVX 256-bit compare instructions and a bunch of testcases
llvm-svn: 108286
2010-07-13 22:06:38 +00:00
Bob Wilson a3f1901531 Use a target-specific VMOVIMM DAG node instead of BUILD_VECTOR to represent
NEON VMOV-immediate instructions.  This simplifies some things.

llvm-svn: 108275
2010-07-13 21:16:48 +00:00
Bruno Cardoso Lopes fd8bfcd6e1 AVX 256-bit conversion instructions
Add the x86 VEX_L form to handle special cases where VEX_L must be set.

llvm-svn: 108274
2010-07-13 21:07:28 +00:00
Dale Johannesen caca5488dc In inline asm treat indirect 'X' constraint as 'm'.
This may not be right in all cases, but it's better
than asserting which it was doing before.  PR 7528.

llvm-svn: 108268
2010-07-13 20:17:05 +00:00
Dan Gohman afd69cf5b7 Add support for empty named metadata too. This isn't particularly
useful, but it is nice for consistency.

llvm-svn: 108262
2010-07-13 19:42:44 +00:00
Dan Gohman 1e0213a758 Add support for empty metadata nodes: !{}.
llvm-svn: 108259
2010-07-13 19:33:27 +00:00
Evan Cheng 0cc4ad983d Extend the r107852 optimization, which turns some fp compares into code sequences using only i32 operations. It now optimizes some f64 compares when fp compare is exceptionally slow (e.g. cortex-a8). It also catches comparisons against 0.0.
llvm-svn: 108258
2010-07-13 19:27:42 +00:00
Evan Cheng f43961007c -enable-unsafe-fp-math should not imply -enable-finite-only-fp-math.
llvm-svn: 108254
2010-07-13 18:46:14 +00:00
Dale Johannesen f241d4626c Fix PR number.
llvm-svn: 108251
2010-07-13 18:14:47 +00:00
Duncan Sands f88a284579 Handle the case of a tail recursion in which the tail call is followed
by a return that returns a constant, while elsewhere in the function
another return instruction returns a different constant.  This is a
special case of accumulator recursion, so just generalize the existing
logic a bit.

llvm-svn: 108241
2010-07-13 15:41:41 +00:00
Chris Lattner 55595fb291 my work on adding segment registers to LEA missed the
disassembler.  Remove some code from the disassembler to
compensate, unbreaking disassembly of lea's.

llvm-svn: 108226
2010-07-13 04:23:55 +00:00
Bruno Cardoso Lopes dff283e146 Add AVX 256-bit packed logical forms
llvm-svn: 108224
2010-07-13 02:38:35 +00:00
Bruno Cardoso Lopes 36b32aeaa5 Add AVX 256-bit unop arithmetic instructions
llvm-svn: 108223
2010-07-13 01:53:31 +00:00
Bruno Cardoso Lopes 8e67a0482e Add AVX 256 binary arithmetic instructions
llvm-svn: 108207
2010-07-12 23:04:15 +00:00
Dan Gohman 51e6d9bbf6 Apply the SSE dependence idiom for SSE unary operations to
SD instructions too, in addition to SS instructions. And
add a comment about it.

llvm-svn: 108191
2010-07-12 20:46:04 +00:00
Bruno Cardoso Lopes f9bcaad76d Add AVX 256-bit MOVMSK forms
llvm-svn: 108184
2010-07-12 20:06:32 +00:00
Daniel Dunbar d388c93f87 MC/AsmParser: Move .tbss and .zerofill parsing to Darwin specific parser.
llvm-svn: 108180
2010-07-12 19:37:35 +00:00
Daniel Dunbar 63a379dd5c MC/AsmParser: Move .desc parsing to Darwin specific parser.
llvm-svn: 108179
2010-07-12 19:22:53 +00:00
Daniel Dunbar ae9da1481a MC/AsmParser: Move some misc. Darwin directive handling to DarwinAsmParser.
llvm-svn: 108174
2010-07-12 18:49:22 +00:00
Dan Gohman c128e70ff2 Add a lint check for mismatched return types, inspired by PR6944.
llvm-svn: 108162
2010-07-12 18:02:04 +00:00
Benjamin Kramer 8f36402ac2 Nope, still breaks the release selfhost bots :(
llvm-svn: 108153
2010-07-12 16:38:48 +00:00
Benjamin Kramer 07b695e052 Reapply the "or" half of r108136, which seems to be less problematic.
llvm-svn: 108152
2010-07-12 16:15:48 +00:00
Benjamin Kramer c719e8ae9e Revert r108141 again, sigh.
llvm-svn: 108148
2010-07-12 14:42:04 +00:00
Benjamin Kramer f578c36035 Reapply 108136 with an ugly pasto fixed.
llvm-svn: 108141
2010-07-12 13:44:00 +00:00
Benjamin Kramer 9675e759cf Revert r108136 until I figure out why it broke selfhost.
llvm-svn: 108139
2010-07-12 12:35:49 +00:00
Benjamin Kramer 35473faa50 instcombine: fold (x & y) | (~x & z) and (x & y) ^ (~x & z) into ((y ^ z) & x) ^ z which is one instruction shorter. (PR6773)
before:
  %and = and i32 %y, %x
  %neg = xor i32 %x, -1
  %and4 = and i32 %z, %neg
  %xor = xor i32 %and4, %and

after:
  %xor1 = xor i32 %z, %y
  %and2 = and i32 %xor1, %x
  %xor = xor i32 %and2, %z

llvm-svn: 108136
2010-07-12 11:54:45 +00:00
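
The identity behind r108136 can be checked exhaustively on small bit widths. The snippet below is only a sketch in Python, not the instcombine code itself; the function names and the 8-bit mask are assumptions made for illustration. Because every operation is bitwise, agreement on a few bits per operand implies agreement at any width.

```python
from itertools import product

MASK = 0xFF  # illustrative width; Python ints are unbounded, so mask results


def select_or(x, y, z):
    # Original form: take bits of y where x is set, bits of z elsewhere.
    return ((x & y) | (~x & z)) & MASK


def select_xor(x, y, z):
    # Same selection written with xor; the two terms never share a set bit.
    return ((x & y) ^ (~x & z)) & MASK


def folded(x, y, z):
    # Folded form from the commit message: ((y ^ z) & x) ^ z.
    return (((y ^ z) & x) ^ z) & MASK


def check_fold(bits=4):
    # Compare all three forms over every combination of `bits`-wide operands.
    values = range(1 << bits)
    return all(
        select_or(x, y, z) == select_xor(x, y, z) == folded(x, y, z)
        for x, y, z in product(values, repeat=3)
    )
```

The folded form needs three operations (xor, and, xor) where the original needs four (and, not, and, or/xor), which matches the "one instruction shorter" claim.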
Chris Lattner 25eea4db66 fix PR7311 by avoiding breaking casts when a bitcast from scalar->vector
is involved.

llvm-svn: 108117
2010-07-12 01:19:22 +00:00
Chris Lattner bbc25ff5cc if jump threading is able to infer interesting values on both
the LHS and RHS of an and/or instruction, don't multiply add
known predecessor values.  This fixes the crash on testcase
from PR7498

llvm-svn: 108114
2010-07-12 00:47:34 +00:00
Chris Lattner fd4a09fc0a fix PR7429, a crash turning a load from a string into a float.
llvm-svn: 108113
2010-07-12 00:22:51 +00:00
Chris Lattner f8feba368c convert to filecheck
llvm-svn: 108112
2010-07-12 00:21:10 +00:00
Chris Lattner 9338b0a1e2 merge two tests.
llvm-svn: 108111
2010-07-12 00:19:47 +00:00
Jakob Stoklund Olesen c4227f1362 Remove TargetInstrInfo::copyRegToReg entirely.
Targets must now implement TargetInstrInfo::copyPhysReg instead. There is no
longer a default implementation forwarding to copyRegToReg.

llvm-svn: 108095
2010-07-11 17:01:17 +00:00
Rafael Espindola a76eccf815 Fix va_arg for doubles. With this patch VAARG nodes always contain the
correct alignment information, which simplifies ExpandRes_VAARG a bit.

The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:

* The 's' in target data: If this is set to the minimal alignment of any
  argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
  example.
* The getTransientStackAlignment method. It is possible for an architecture to
  have arguments less aligned than the alignment we maintain for the stack pointer.

llvm-svn: 108072
2010-07-11 04:01:49 +00:00
Dan Gohman 79be2b9be5 Fix this test.
llvm-svn: 108059
2010-07-10 22:42:12 +00:00
Jakob Stoklund Olesen c4b3bcc051 FileCheckize inline asm FP stack tests
llvm-svn: 108046
2010-07-10 16:30:25 +00:00
Dan Gohman 30933b3bdb Add an explicit triple to make this test behave consistently.
llvm-svn: 108041
2010-07-10 09:01:35 +00:00
Dan Gohman 367b65b56e Fix this XTARGET so that this doesn't XPASS on non-darwin hosts.
llvm-svn: 108040
2010-07-10 09:01:03 +00:00
Dan Gohman d7b5ce3312 Reapply bottom-up fast-isel, with several fixes for x86-32:
- Check getBytesToPopOnReturn().
- Eschew ST0 and ST1 for return values.
- Fix the PIC base register initialization so that it doesn't ever
  fail to end up at the top of the entry block.

llvm-svn: 108039
2010-07-10 09:00:22 +00:00
Bruno Cardoso Lopes 2419606bfb Add AVX 256-bit packed MOVNT variants
llvm-svn: 108021
2010-07-09 21:42:42 +00:00
Bruno Cardoso Lopes 6bc772eec7 Add AVX 256-bit unpack and interleave
llvm-svn: 108017
2010-07-09 21:20:35 +00:00
Jakob Stoklund Olesen 51702ec46b Fix a few tests
llvm-svn: 108011
2010-07-09 20:43:09 +00:00
Jim Grosbach 2a5725b1a3 In the presence of variable sized objects, allocate an emergency spill slot.
rdar://8131327

llvm-svn: 108008
2010-07-09 20:27:06 +00:00
Dan Gohman ea9ae3e6ed Add a target triple.
llvm-svn: 108003
2010-07-09 19:17:36 +00:00
Dan Gohman 7929c448fc Fix MachineLICM to actually visit inner loops.
llvm-svn: 108001
2010-07-09 18:49:45 +00:00
Bruno Cardoso Lopes 792e906bef Start the support for AVX instructions with 256-bit %ymm registers. A couple of
notes:
- The instructions are being added with dummy placeholder patterns using some 256
  specifiers, this is not meant to work now, but since there are some multiclasses
  generic enough to accept them,  when we go for codegen, the stuff will be already
  there.
- Add VEX encoding bits to support YMM
- Add MOVUPS and MOVAPS in the first round
- Use "Y" as suffix for those Instructions: MOVUPSYrr, ...
- All AVX instructions in X86InstrSSE.td will move soon to a new X86InstrAVX
  file.

llvm-svn: 107996
2010-07-09 18:27:43 +00:00
Bob Wilson 6586e9b203 --- Reverse-merging r107947 into '.':
U    utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U    test/CodeGen/X86/fast-isel.ll
U    test/CodeGen/X86/fast-isel-loads.ll
U    include/llvm/Target/TargetLowering.h
U    include/llvm/Support/PassNameParser.h
U    include/llvm/CodeGen/FunctionLoweringInfo.h
U    include/llvm/CodeGen/CallingConvLower.h
U    include/llvm/CodeGen/FastISel.h
U    include/llvm/CodeGen/SelectionDAGISel.h
U    lib/CodeGen/LLVMTargetMachine.cpp
U    lib/CodeGen/CallingConvLower.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U    lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U    lib/CodeGen/SelectionDAG/FastISel.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U    lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U    lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U    lib/CodeGen/SelectionDAG/TargetLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.h
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86ISelLowering.h

llvm-svn: 107987
2010-07-09 16:37:18 +00:00
Jakob Stoklund Olesen a57965827f Fix test to be less sensitive of regalloc accidents
llvm-svn: 107951
2010-07-09 01:32:11 +00:00
Bob Wilson 88a4e6dc0e Print "dregpair" NEON operands with a space between them, for readability and
consistency with other instructions that have lists of register operands.

llvm-svn: 107944
2010-07-09 00:47:20 +00:00
Dan Gohman 0b5aa1cdd3 Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting
a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL.

llvm-svn: 107943
2010-07-09 00:39:23 +00:00
Bob Wilson 21eed476e8 Reenable DAG combining for vector shuffles. It looks like it was temporarily
disabled and then never turned back on again.  Adjust some tests, one because
this change avoids an unnecessary instruction, and the other to make it
continue testing what it was intended to test.

llvm-svn: 107941
2010-07-09 00:38:12 +00:00
Bill Wendling a992445ff2 Extension of r107506. Make sure that we don't mark a function as having a call
if the inline ASM doesn't need a stack frame.

llvm-svn: 107922
2010-07-08 22:38:02 +00:00
Chris Lattner 9f034c1e5d Rework segment prefix emission code to handle segments
in memory operands at the same time as hard coded segments.
This fixes problems where we'd emit the segment override after
the REX prefix on instructions like:
mov %gs:(%rdi), %rax

This fixes rdar://8127102.  I have several cleanup patches coming
next.

llvm-svn: 107917
2010-07-08 22:28:12 +00:00
Stuart Hastings aa246f5687 Test case for r107843. Radar 8152866.
llvm-svn: 107907
2010-07-08 20:31:05 +00:00
Evan Cheng 0f54854a1d Check for FiniteOnlyFPMath as well.
llvm-svn: 107904
2010-07-08 20:12:24 +00:00
Benjamin Kramer 2321e6a4d4 Teach instcombine to transform
(X >s -1) ? C1 : C2 and (X <s  0) ? C2 : C1
into ((X >>s 31) & (C2 - C1)) + C1, avoiding the conditional.

This optimization could be extended to take non-const C1 and C2 but we better
stay conservative to avoid code size bloat for now.

for
int sel(int n) {
     return n >= 0 ? 60 : 100;
}

we now generate
  sarl  $31, %edi
  andl  $40, %edi
  leal  60(%rdi), %eax

instead of
  testl %edi, %edi
  movl  $60, %ecx
  movl  $100, %eax
  cmovnsl %ecx, %eax

llvm-svn: 107866
2010-07-08 11:39:10 +00:00
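
The select-to-arithmetic rewrite in r107866 can be modelled in a few lines. This is a sketch under stated assumptions, not the instcombine implementation: the function names are hypothetical, and `n` is assumed to lie in the signed 32-bit range so that Python's arithmetic right shift by 31 yields 0 or -1, as `sarl $31` does.

```python
def sel_branchy(n, c1=60, c2=100):
    # Source-level form: (n >s -1) ? c1 : c2
    return c1 if n >= 0 else c2


def sel_branchless(n, c1=60, c2=100):
    # n >> 31 is 0 for non-negative n and -1 (all ones) for negative n,
    # assuming n fits in a signed 32-bit integer.
    sign = n >> 31
    # The mask keeps (c2 - c1) only when n is negative; adding c1 then
    # gives c1 for n >= 0 and c2 for n < 0.
    return (sign & (c2 - c1)) + c1
```

With the constants from the commit message, `c2 - c1` is 40, which is the `andl $40` in the generated code, and the final `leal 60(%rdi)` is the `+ c1` step.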
Eric Christopher e796253217 A slight reworking of the custom patterns for x86-64 tpoff codegen and
correct the testcase for valid assembly.

Needs more tests.

llvm-svn: 107860
2010-07-08 07:36:46 +00:00
Evan Cheng be1f7a931e r107852 is only safe with -enable-unsafe-fp-math to account for +0.0 == -0.0.
llvm-svn: 107856
2010-07-08 06:01:49 +00:00
Evan Cheng 25f9364cbd Optimize some vfp comparisons to integer ones. This patch implements the simplest case when the following conditions are met:
1. The arguments are f32.
2. The arguments are loads and they have no uses other than the comparison.
3. The comparison code is EQ or NE.

e.g.
        vldr.32 s0, [r1]
        vldr.32 s1, [r0]
        vcmpe.f32       s1, s0
        vmrs    apsr_nzcv, fpscr
        beq     LBB0_2
=>
        ldr     r1, [r1]
        ldr     r0, [r0]
        cmp     r0, r1
        beq     LBB0_2

More complicated cases will be implemented in subsequent patches.

llvm-svn: 107852
2010-07-08 02:08:50 +00:00
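
Why r107852 needed the later restriction in r107856 can be seen by comparing f32 values through their bit patterns. The sketch below is illustrative only; the helper names are assumptions, and Python's `struct` module stands in for the integer loads. Bitwise equality agrees with floating-point equality for ordinary values but differs for signed zeros (noted in r107856) and for NaN.

```python
import struct


def f32_bits(value):
    # Reinterpret a value as its IEEE-754 single-precision bit pattern.
    return struct.unpack("<I", struct.pack("<f", value))[0]


def int_eq(a, b):
    # Integer comparison of the raw bits, as the ldr/cmp sequence does.
    return f32_bits(a) == f32_bits(b)


def float_eq(a, b):
    # Ordinary floating-point comparison, as vcmpe.f32 does.
    return a == b
```

For example, `float_eq(0.0, -0.0)` is true while `int_eq(0.0, -0.0)` is false, because the two zeros differ in the sign bit.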
Dale Johannesen e2289285ae Changes to ARM tail calls, mostly cosmetic.
Add explicit testcases for tail calls within the same module.
Duplicate some code to humor those who think .w doesn't apply on ARM.
Leave this disabled on Thumb1, and add some comments explaining why it's hard
and won't gain much.

llvm-svn: 107851
2010-07-08 01:18:23 +00:00
Dan Gohman e75704369d Revert 107840 107839 107813 107804 107800 107797 107791.
Debug info intrinsics win for now.

llvm-svn: 107850
2010-07-08 01:00:56 +00:00
Chris Lattner efa3c824cc Fix the second half of PR7437: scalarrepl wasn't preserving
address spaces when SRoA'ing memcpy's.

llvm-svn: 107846
2010-07-08 00:27:05 +00:00
Chris Lattner ac5881295c Implement the major chunk of PR7195: support for 'callw'
in the integrated assembler.  Still some discussion to be
done.

llvm-svn: 107825
2010-07-07 22:27:31 +00:00
Bruno Cardoso Lopes 6c61451011 Add more assembly opcodes for SSE compare instructions
llvm-svn: 107823
2010-07-07 22:24:03 +00:00
Jakob Stoklund Olesen ddaf0099a5 Allow copies between GR8_ABCD_L and GR8_ABCD_H.
This fixes PR7540.

llvm-svn: 107809
2010-07-07 20:33:27 +00:00
Dan Gohman e7ccc51cc1 Implement bottom-up fast-isel. This has the advantage of not requiring
a separate DCE pass over MachineInstrs.

llvm-svn: 107804
2010-07-07 19:20:32 +00:00
Dan Gohman 2d4d01d0de Add X86FastISel support for return statements. This entails refactoring
a bunch of stuff, to allow the target-independent calling convention
logic to be employed.

llvm-svn: 107800
2010-07-07 18:32:53 +00:00
Bruno Cardoso Lopes fd8060335b Add AVX AES instructions
llvm-svn: 107798
2010-07-07 18:24:20 +00:00
Dan Gohman 00ef93258a Remove interprocedural-basic-aa and associated code. The AliasAnalysis
interface needs implementations to be consistent, so any code which
wants to support different semantics must use a different interface.
It's not currently worthwhile to add a new interface for this new
concept.

Document that AliasAnalysis doesn't support cross-function queries.

llvm-svn: 107776
2010-07-07 14:27:09 +00:00
Bruno Cardoso Lopes 6d122aef97 Add AVX SSE4.2 instructions
llvm-svn: 107752
2010-07-07 03:39:29 +00:00
Bruno Cardoso Lopes 8f5472a8e8 Add AVX SSE4.1 insertps, ptest and movntdqa instructions
llvm-svn: 107747
2010-07-07 01:14:56 +00:00
Bruno Cardoso Lopes 6430c7350d Add AVX SSE4.1 extractps and pinsr instructions
llvm-svn: 107746
2010-07-07 01:01:13 +00:00
Bruno Cardoso Lopes f3116ebe96 Add AVX SSE4.1 Extract Integer instructions
llvm-svn: 107740
2010-07-07 00:07:24 +00:00
Dale Johannesen ce65663330 Accept RIP-relative symbols with 'i' constraint, and
print the (%rip) only if the 'a' modifier is present.
PR 7528.

llvm-svn: 107727
2010-07-06 23:27:00 +00:00
Bruno Cardoso Lopes 1f9ad516c6 Add the rest of AVX SSE4.1 packed move with sign/zero extend instructions
llvm-svn: 107723
2010-07-06 23:15:17 +00:00
Dale Johannesen 6f01541ae6 Make test not hang waiting for input.
llvm-svn: 107721
2010-07-06 23:06:58 +00:00
Bruno Cardoso Lopes 35702d27c4 Add part of AVX SSE4.1 packed move with sign/zero extend instructions
llvm-svn: 107720
2010-07-06 23:01:41 +00:00
Bruno Cardoso Lopes e2bd058d32 Add AVX vblendvpd, vblendvps and vpblendvb instructions
Update VEX encoding to support those new instructions

llvm-svn: 107715
2010-07-06 22:36:24 +00:00
Jakob Stoklund Olesen a64c0a3d22 Be more forgiving when calculating alias interference for physreg coalescing.
It is OK for an alias live range to overlap if there is a copy to or from the
physical register. CoalescerPair can work out if the copy is coalescable
independently of the alias.

This means that we can join with the actual destination interval instead of
using the getOrigDstReg() hack. It is no longer necessary to merge clobber
ranges into subregisters.

llvm-svn: 107695
2010-07-06 20:31:51 +00:00
Devang Patel 23a7593534 Fix PR7545 crash.
llvm-svn: 107678
2010-07-06 18:18:32 +00:00
Rafael Espindola 7c510aa7bc Don't create neon moves in CopyRegToReg. NEONMoveFixPass will do the conversion
if profitable.

llvm-svn: 107673
2010-07-06 16:24:34 +00:00
Eric Christopher 8f06b4a294 Remove mistakenly added test.
llvm-svn: 107641
2010-07-06 05:20:13 +00:00
Eric Christopher 2ad0c779c3 Fix up -fstack-protector on linux to use the segment
registers.  Split out testcases per architecture and os
now.

Patch from Nelson Elhage.

llvm-svn: 107640
2010-07-06 05:18:56 +00:00
Chris Lattner 60db4557cd another v2f32 case, in this case showing poor codegen.
llvm-svn: 107614
2010-07-05 05:52:56 +00:00
Chris Lattner 431e81f2fb fix test on non-x86 hosts.
llvm-svn: 107608
2010-07-05 03:56:55 +00:00
Chris Lattner 45cc4d74a3 Just rip v2f32 support completely out of the X86 backend. In
the example in the testcase, we now generate:

_test1:                                 ## @test1
	movss	4(%esp), %xmm0
	addss	8(%esp), %xmm0
	movl	12(%esp), %eax
	movss	%xmm0, (%eax)
	ret

instead of:

_test1:                                                     ## @test1
	subl	$20, %esp
	movl	24(%esp), %eax
	movq	%mm0, (%esp)
	movq	%mm0, 8(%esp)
	movss	(%esp), %xmm0
	addss	12(%esp), %xmm0
	movss	%xmm0, (%eax)
	addl	$20, %esp
	ret

v2f32 support did not work reliably because most of the X86
backend didn't know it was legal.  It was apparently only added
to support returning source-level v2f32 values in MMX registers
in x86-32 mode.  If ABI compatibility is important on this
GCC-extended-vector type for some reason, then the frontend
should generate IR that returns v2i32 instead of v2f32.  However,
we generally don't try very hard to be abi compatible on gcc
extended vectors. 

llvm-svn: 107601
2010-07-04 23:07:25 +00:00
Chris Lattner 681b926d54 fix PR7518 - terrible codegen of <2 x float>, by only marking
v2f32 as legal in 32-bit mode.  It is just as terrible there,
but I just care about x86-64 and no one claims it is valuable
in 64-bit mode.

llvm-svn: 107600
2010-07-04 22:57:10 +00:00
Bruno Cardoso Lopes ca99012ac0 Add AVX SSE4.1 blend, mpsadbw and vdp
llvm-svn: 107560
2010-07-03 01:37:03 +00:00
Bruno Cardoso Lopes bc75502f09 Add AVX SSE4.1 binop (some forms of packed max,min,mul,pack,cmp) instructions
llvm-svn: 107558
2010-07-03 01:15:47 +00:00
Bruno Cardoso Lopes fc9cdc4d61 Add AVX SSE4.1 Horizontal Minimum and Position instruction
llvm-svn: 107552
2010-07-03 00:49:21 +00:00
Bruno Cardoso Lopes 621c85b038 Add AVX SSE4.1 round instructions
llvm-svn: 107549
2010-07-03 00:37:44 +00:00
Bruno Cardoso Lopes c7111fd355 - Add support for the rest of AVX SSE3 instructions
- Fix VEX prefix to be emitted with 3 bytes whenever VEX_5M
represents a REX equivalent two byte leading opcode

llvm-svn: 107523
2010-07-02 22:06:54 +00:00
Evan Cheng 0ce84486c3 - Two-address pass should not assume unfolding is always successful.
- X86 unfolding should check if the instruction being unfolded has memoperands.
  If there are no memoperands, then it must assume conservative alignment. If this
  would introduce an expensive sse unaligned load / store, then unfoldMemoryOperand
  etc. should not unfold the instruction.

llvm-svn: 107509
2010-07-02 20:36:18 +00:00
Dale Johannesen 4d887f7ca7 Propagate the AlignStack bit in InlineAsm's to the
PrologEpilog code, and use it to determine whether
the asm forces stack alignment or not.  gcc consistently
does not do this for GCC-style asms; Apple gcc inconsistently
sometimes does it for asm blocks.  There is no
convenient place to put a bit in either the SDNode or
the MachineInstr form, so I've added an extra operand
to each; unlovely, but it does allow for expansion for
more bits, should we need it.  PR 5125.  Some
existing testcases are affected.
The operand lists of the SDNode and MachineInstr forms
are indexed with awesome mnemonics, like "2"; I may
fix this someday, but not now.  I'm not making it any
worse.  If anyone is inspired I think you can find all
the right places from this patch.

llvm-svn: 107506
2010-07-02 20:16:09 +00:00
Bob Wilson 771d04b969 Fix incorrect asm-printing of some NEON immediates. Fix weak testcase so
that it checks the immediate values, not just the instruction opcodes.
Radar 8110263.

llvm-svn: 107487
2010-07-02 17:23:44 +00:00
Dale Johannesen 744c74c444 Prevent test from hanging waiting for input.
llvm-svn: 107446
2010-07-01 22:57:11 +00:00
Bob Wilson 8a99b730a9 ARM function alignments were off by a power of two. svn 83242 changed
getFunctionAlignment and the corresponding use of that value in the ARM
asm printer, but now we're using the standard asm printer.  The result of
this was that function alignments were dropped completely for Thumb functions.
Radar 8143571.

llvm-svn: 107435
2010-07-01 22:26:26 +00:00
Bill Wendling 03bcd6ecc8 Implement the "linker_private_weak" linkage type. This will be used for
Objective-C metadata types which should be marked as "weak", but which the
linker will remove upon final linkage. However, this linkage isn't specific to
Objective-C.

For example, the "objc_msgSend_fixup_alloc" symbol is defined like this:

      .globl l_objc_msgSend_fixup_alloc
      .weak_definition l_objc_msgSend_fixup_alloc
      .section __DATA, __objc_msgrefs, coalesced
      .align 3
l_objc_msgSend_fixup_alloc:
       .quad   _objc_msgSend_fixup
       .quad   L_OBJC_METH_VAR_NAME_1

This is different from the "linker_private" linkage type, because it can't have
the metadata defined with ".weak_definition".

Currently only supported on Darwin platforms.

llvm-svn: 107433
2010-07-01 21:55:59 +00:00
Dan Gohman 84f90a387d Remove context sensitivity concerns from interprocedural-basic-aa, and
make it more aggressive in cases where both pointers are known to live
in the same function.

llvm-svn: 107420
2010-07-01 20:08:40 +00:00
Devang Patel 2b434e12cd Debugging information is encoded in llvm IR using metadata. This is designed
in such a way that debug info for symbols is preserved even if symbols are
optimized away by the optimizer.

Add new special pass to remove debug info for such symbols.

llvm-svn: 107416
2010-07-01 19:49:20 +00:00
Bruno Cardoso Lopes 5e88700f28 Move SSE3 Move patterns to a more appropriate section
Add AVX SSE3 packed horizontal add & sub instructions

llvm-svn: 107405
2010-07-01 17:35:02 +00:00
Bruno Cardoso Lopes 886ee33a38 Add AVX SSE3 packed addsub instructions
llvm-svn: 107404
2010-07-01 17:08:18 +00:00
Dan Gohman d2965c10a1 Temporarily disable on-demand fast-isel.
llvm-svn: 107393
2010-07-01 12:15:30 +00:00
Dan Gohman aef3d140b7 Teach fast-isel to avoid loading a value from memory when it's already
available in a register. This is pretty primitive, but it reduces the
number of instructions in common testcases by 4%.

llvm-svn: 107380
2010-07-01 03:49:38 +00:00
Dan Gohman 722f5fc567 Enable on-demand fast-isel.
llvm-svn: 107377
2010-07-01 02:58:57 +00:00
Bruno Cardoso Lopes a7a0c83563 Add AVX SSE3 replicate and convert instructions
llvm-svn: 107375
2010-07-01 02:33:39 +00:00
Dan Gohman 7937d5606d Teach X86FastISel to fold constant offsets and scaled indices in
the same address.

llvm-svn: 107373
2010-07-01 02:27:15 +00:00
Bruno Cardoso Lopes 05166740eb - Add AVX SSE2 Move doubleword and quadword instructions.
- Add encode bits for VEX_W
- All 128-bit SSE 1 & SSE2 instructions that are described
  in the .td file now have an AVX encoded form already working.

llvm-svn: 107365
2010-07-01 01:20:06 +00:00
Mikhail Glushenkov 0354891d98 Test for the -filelist fix.
llvm-svn: 107363
2010-07-01 01:00:37 +00:00