Commit Graph

3436 Commits

Author SHA1 Message Date
Dan Gohman 0818684a70 DAGCombine (shl (anyext x, c)) to (anyext (shl x, c)) if the high bits
are not demanded. This often allows the anyext to be folded away.

llvm-svn: 109242
2010-07-23 18:03:30 +00:00
Eric Christopher 9a77382685 Custom lower the memory barrier instructions and add support
for lowering without sse2.  Add a couple of new testcases.

Fixes a few libgomp tests and latent bugs.  Remove a few todos.

llvm-svn: 109078
2010-07-22 02:48:34 +00:00
Evan Cheng 285903853f More register pressure aware scheduling work.
llvm-svn: 109064
2010-07-21 23:53:58 +00:00
Eric Christopher 84bdfd80df Baby steps towards ARM fast-isel.
llvm-svn: 109047
2010-07-21 22:26:11 +00:00
Rafael Espindola 4277e14dc4 Fix calling convention on ARM if vfp2+ is enabled.
llvm-svn: 109009
2010-07-21 11:38:30 +00:00
Dan Gohman 625fd2292d Fix SCEV denormalization of expressions where the exit value from
one loop is involved in the increment of an addrec for another
loop. This fixes rdar://8168938.

llvm-svn: 108863
2010-07-20 17:06:20 +00:00
Jim Grosbach badf087e45 update tests for smarter BIC usage
llvm-svn: 108846
2010-07-20 16:16:48 +00:00
Duncan Sands 2e839de377 The same problem was being tracked in PR7652.
llvm-svn: 108843
2010-07-20 15:52:32 +00:00
Bruno Cardoso Lopes 160695fecb Fix PR7174, a couple o Mips fixes:
- Fix a typo for PIC check during jmp table lowering
- Also fix the "first jump table basic block is not
considered only reachable by fall through" problem, use this
ad-hoc solution until I come up with something better.

Patch by stetorvs@gmail.com

llvm-svn: 108820
2010-07-20 08:37:04 +00:00
Bruno Cardoso Lopes ea7863647b Fix Mips PR7473. Patch by stetorvs@gmail.com
llvm-svn: 108816
2010-07-20 07:58:51 +00:00
Dan Gohman b5e918dc05 After a custom inserter, in a block which has constant instructions,
update the current basic block in addition to the current insert
position, so that they remain consistent. This fixes rdar://8204072.

llvm-svn: 108765
2010-07-19 22:48:56 +00:00
Owen Anderson 9c271e2835 Remove r108639 now that it is handled by InstCombine instead.
llvm-svn: 108688
2010-07-19 08:10:24 +00:00
Owen Anderson 41670a11a8 Add a testcase for r108639.
llvm-svn: 108640
2010-07-18 08:57:19 +00:00
Jim Grosbach b97e2bbe32 Add combiner patterns to more effectively utilize the BFI (bitfield insert)
instruction for non-constant operands. This includes the case referenced
in the README.txt regarding a bitfield copy.

llvm-svn: 108608
2010-07-17 03:30:54 +00:00
Jim Grosbach 11013eda5a Add basic support to code-gen the ARM/Thumb2 bit-field insert (BFI) instruction
and a combine pattern to use it for setting a bit-field to a constant
value. More to come for non-constant stores.

llvm-svn: 108570
2010-07-16 23:05:05 +00:00
Bill Wendling bf8370ff36 Consider this function:
void foo() { __builtin_unreachable(); }

It will output the following on Darwin X86:

_func1:
Leh_func_begin0:
        pushq %rbp
Ltmp0:
        movq %rsp, %rbp
Ltmp1:
Leh_func_end0:

This prolog adds a new Call Frame Information (CFI) row to the FDE with an
address that is not within the address range of the code it describes -- part is
equal to the end of the function -- and therefore results in an invalid EH
frame. If we emit a nop in this situation, then the CFI row is now within the
address range.

llvm-svn: 108568
2010-07-16 22:51:10 +00:00
Jakob Stoklund Olesen c30b4ddc58 Remove the X86::FP_REG_KILL pseudo-instruction and the X86FloatingPointRegKill
pass that inserted it.

It is no longer necessary to limit the live ranges of FP registers to a single
basic block.

llvm-svn: 108536
2010-07-16 17:41:44 +00:00
Benjamin Kramer 50729ad717 Feed the right output into FileCheck.
llvm-svn: 108523
2010-07-16 10:58:02 +00:00
Jakob Stoklund Olesen 37c42a3d02 Remove many calls to TII::isMoveInstr. Targets should be producing COPY anyway.
TII::isMoveInstr is going tobe completely removed.

llvm-svn: 108507
2010-07-16 04:45:42 +00:00
Jakob Stoklund Olesen b1671271ab Add forgotten test case.
llvm-svn: 108506
2010-07-16 04:45:35 +00:00
Dan Gohman 103c4ebea5 Use the source-order scheduler instead of the "fast" scheduler at -O0,
because it's more likely to keep debug line information in its original
order.

llvm-svn: 108496
2010-07-16 02:01:19 +00:00
Dale Johannesen bfd4fd7bb7 The SelectionDAGBuilder's handling of debug info, on rare
occasions, caused code to be generated in a different order.
All cases I've seen involved float softening in the type
legalizer, and this could be perhaps be fixed there, but
it's better not to generate things differently in the first
place.  7797940 (6/29/2010..7/15/2010).

llvm-svn: 108484
2010-07-16 00:02:08 +00:00
Bill Wendling 4bda1c8e68 Revert. This isn't the correct way to go.
llvm-svn: 108478
2010-07-15 23:42:21 +00:00
Bill Wendling 973dc3b1d8 Handle code gen for the unreachable instruction if it's the only instruction in
the function. We'll just turn it into a "trap" instruction instead.

The problem with not handling this is that it might generate a prologue without
the equivalent epilogue to go with it:

$ cat t.ll
define void @foo() {
entry:
  unreachable
}
$ llc -o - t.ll -relocation-model=pic -disable-fp-elim -unwind-tables
        .section        __TEXT,__text,regular,pure_instructions
        .globl  _foo
        .align  4, 0x90
_foo:                                   ## @foo
Leh_func_begin0:
## BB#0:                                ## %entry
        pushq   %rbp
Ltmp0:
        movq    %rsp, %rbp
Ltmp1:
Leh_func_end0:
...

The unwind tables then have bad data in them causing all sorts of problems.

Fixes <rdar://problem/8096481>.

llvm-svn: 108473
2010-07-15 23:32:40 +00:00
Evan Cheng 55f0c6b9fc Split -enable-finite-only-fp-math to two options:
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.

llvm-svn: 108465
2010-07-15 22:07:12 +00:00
Chris Lattner 60b131654b fix the definitions of ConstTextCoalSection/ConstDataCoalSection
to keep "Text" in sync with the "pure instructions" section attribute.
Lack of this attribute was preventing the assembler from emitting
multibyte noops instructions for templates (and inlines, and other
coalesced stuff) and was causing the assembler to mismatch .o files.

This fixes rdar://8018335

llvm-svn: 108461
2010-07-15 21:22:00 +00:00
Devang Patel df09db62e2 Fix crash reported in PR7653.
llvm-svn: 108441
2010-07-15 18:45:27 +00:00
Dan Gohman 4afd412d6b Watch out for a constant offset cancelling out a base register, forming
a zero. This situation arrises in Fortran code with induction variables
that start at 1 instead of 0. This fixes PR7651.

llvm-svn: 108424
2010-07-15 15:14:45 +00:00
Devang Patel 29168baf4b Make it a .ll test case.
llvm-svn: 108370
2010-07-14 23:12:52 +00:00
Jim Grosbach a90af1ba38 Improve 64-subtraction of immediates when parts of the immediate can fit
in the literal field of an instruction. E.g.,
long long foo(long long a) {
  return a - 734439407618LL;
}

rdar://7038284

llvm-svn: 108339
2010-07-14 17:45:16 +00:00
Dan Gohman 042523340b Delete fast-isel's trivial load optimization; it breaks debugging because
it can look past points where a debugger might modify user variables.

llvm-svn: 108336
2010-07-14 17:25:37 +00:00
Bob Wilson bb57896f8e Fix test to appease the buildbots.
llvm-svn: 108334
2010-07-14 16:43:47 +00:00
Evan Cheng a8e8874552 Fix for PR7193 was overly conservative. The only case where sibcall callee
address cannot be allocated a register is in 32-bit mode where the first
three arguments are marked inreg. In that case EAX, EDX, and ECX will be
used for argument passing.

This fixes PR7610.

llvm-svn: 108327
2010-07-14 06:44:01 +00:00
Bob Wilson bad47f62f6 Add support for NEON VMVN immediate instructions.
llvm-svn: 108324
2010-07-14 06:31:50 +00:00
Evan Cheng c893115312 Re-enable the test with fix.
llvm-svn: 108319
2010-07-14 05:49:23 +00:00
Chris Lattner 711338fb04 temporarily disable to test to fix buildbots.
llvm-svn: 108310
2010-07-14 02:21:59 +00:00
Evan Cheng d542414945 Teach ProcessImplicitDefs to transform more COPY instructions into IMPLICIT_DEF (and subsequently eliminate them). This allows machine LICM to hoist IMPLICIT_DEF's. PR7620.
llvm-svn: 108304
2010-07-14 01:22:19 +00:00
Bob Wilson 103a0dcfe1 Add an ARM-specific DAG combining to avoid redundant VDUPLANE nodes.
Radar 7373643.

llvm-svn: 108303
2010-07-14 01:22:12 +00:00
Bob Wilson a3f1901531 Use a target-specific VMOVIMM DAG node instead of BUILD_VECTOR to represent
NEON VMOV-immediate instructions.  This simplifies some things.

llvm-svn: 108275
2010-07-13 21:16:48 +00:00
Dale Johannesen caca5488dc In inline asm treat indirect 'X' constraint as 'm'.
This may not be right in all cases, but it's better
than asserting which it was doing before.  PR 7528.

llvm-svn: 108268
2010-07-13 20:17:05 +00:00
Evan Cheng 0cc4ad983d Extend the r107852 optimization which turns some fp compare to code sequence using only i32 operations. It now optimize some f64 compares when fp compare is exceptionally slow (e.g. cortex-a8). It also catches comparison against 0.0.
llvm-svn: 108258
2010-07-13 19:27:42 +00:00
Evan Cheng f43961007c -enable-unsafe-fp-math should not imply -enable-finite-only-fp-math.
llvm-svn: 108254
2010-07-13 18:46:14 +00:00
Dale Johannesen f241d4626c Fix PR number.
llvm-svn: 108251
2010-07-13 18:14:47 +00:00
Dan Gohman 51e6d9bbf6 Apply the SSE dependence idiom for SSE unary operations to
SD instructions too, in addition to SS instructions. And
add a comment about it.

llvm-svn: 108191
2010-07-12 20:46:04 +00:00
Jakob Stoklund Olesen c4227f1362 Remove TargetInstrInfo::copyRegToReg entirely.
Targets must now implement TargetInstrInfo::copyPhysReg instead. There is no
longer a default implementation forwarding to copyRegToReg.

llvm-svn: 108095
2010-07-11 17:01:17 +00:00
Rafael Espindola a76eccf815 Fix va_arg for doubles. With this patch VAARG nodes always contain the
correct alignment information, which simplifies ExpandRes_VAARG a bit.

The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:

* The 's' in target data: If this is set to the minimal alignment of any
  argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
  example.
* The getTransientStackAlignment method. It is possible for an architecture to
  have argument less aligned than what we maintain the stack pointer.

llvm-svn: 108072
2010-07-11 04:01:49 +00:00
Dan Gohman 79be2b9be5 Fix this test.
llvm-svn: 108059
2010-07-10 22:42:12 +00:00
Jakob Stoklund Olesen c4b3bcc051 FileCheckize inline asm FP stack tests
llvm-svn: 108046
2010-07-10 16:30:25 +00:00
Dan Gohman d7b5ce3312 Reapply bottom-up fast-isel, with several fixes for x86-32:
- Check getBytesToPopOnReturn().
 - Eschew ST0 and ST1 for return values.
 - Fix the PIC base register initialization so that it doesn't ever
   fail to end up the top of the entry block.

llvm-svn: 108039
2010-07-10 09:00:22 +00:00
Jakob Stoklund Olesen 51702ec46b Fix a few tests
llvm-svn: 108011
2010-07-09 20:43:09 +00:00