Commit Graph

2828 Commits

Author SHA1 Message Date
Chandler Carruth f0e2e37518 FileCheck-ize and modernize the RUN line.
llvm-svn: 134342
2011-07-02 20:42:44 +00:00
Chandler Carruth abd8a8f3ae FileCheck-ize, tightening checks and avoiding a temporary file.
llvm-svn: 134341
2011-07-02 20:42:42 +00:00
Chandler Carruth 6e4d90f7c6 FileCheck-ize, tightening checks and avoiding a temporary file.
llvm-svn: 134340
2011-07-02 20:42:39 +00:00
Chandler Carruth ff0e32536e FileCheck-ize
llvm-svn: 134339
2011-07-02 20:42:36 +00:00
Chandler Carruth d954bb7ebb FileCheck-ize
llvm-svn: 134338
2011-07-02 20:42:33 +00:00
Chandler Carruth 362bff3bd3 FileCheck-ize a test, avoiding a temporary file.
llvm-svn: 134337
2011-07-02 20:42:31 +00:00
Chandler Carruth f2a29b726f FileCheck-ize and simplify this test.
llvm-svn: 134336
2011-07-02 20:42:28 +00:00
Chandler Carruth 7e44b420e1 FileCheck-ize
llvm-svn: 134335
2011-07-02 20:42:25 +00:00
Chandler Carruth 144cf1a974 FileCheck-ize another codegen test.
llvm-svn: 134334
2011-07-02 20:42:22 +00:00
Chandler Carruth 1d815f5373 Partially FileCheck-ize a test to remove a weird quoting situation.
llvm-svn: 134333
2011-07-02 20:42:20 +00:00
Chandler Carruth 8a3f20abac FileCheck-ize another test, and upgrade its syntax a bit.
llvm-svn: 134332
2011-07-02 20:42:17 +00:00
Chandler Carruth 38d367e473 FileCheck-ize another codegen test, tightening it up.
llvm-svn: 134331
2011-07-02 20:42:14 +00:00
Chandler Carruth bf252382b8 FileCheck-ize another test, making it much more precise for testing the
individual cases, while hard coding less about registers in use.

llvm-svn: 134330
2011-07-02 20:42:11 +00:00
Chandler Carruth 308b3b66b1 FileCheck-ize another test. This one is more clear and runs fewer
commands as a result.

llvm-svn: 134329
2011-07-02 20:42:08 +00:00
Chandler Carruth 334faf8f1a FileCheck-ize a test, no functionality changed.
llvm-svn: 134328
2011-07-02 20:42:06 +00:00
Jakob Stoklund Olesen 54f7c59c1a Better diagnostics when inline asm fails to allocate.
asm.c:2:7: error: ran out of registers during register allocation
  asm(""::"r"(0), "r"(1), "r"(2), "r"(3), "r"(4), "r"(5), "r"(6), "r"(7), "r"(8), "r"(9));
        ^

llvm-svn: 134310
2011-07-02 07:17:37 +00:00
Eric Christopher 2eca9d5ddf Be less specific about register allocation ordering.
llvm-svn: 134308
2011-07-02 04:06:41 +00:00
Eric Christopher a8a56f7e5c TargetConstant immediates won't be placed into registers so tighten
up the valid constant check earlier.

rdar://9692967

llvm-svn: 134286
2011-07-01 23:04:38 +00:00
Dan Gohman a293f24a0d Teach IVUsers to stop at non-affine expressions unless they are both
outside the loop and reducible.

This more completely hides them from LSR, which isn't usually able to
do anything meaningful with non-affine expressions anyway, and this
consequently hides them from SCEVExpander, which is acutely unprepared
for non-affine expressions.

Replace test/CodeGen/X86/lsr-nonaffine.ll with a new test that tests
the new behavior.

This works around the bug in PR10117 / rdar://problem/9633149, and is
generally an improvement besides.

llvm-svn: 134268
2011-07-01 22:05:19 +00:00
Jakob Stoklund Olesen d0e2352b65 Fix a problem with fast-isel return values introduced in r134018.
We would put the return value from long double functions in the wrong
register.

This fixes gcc.c-torture/execute/conversion.c

llvm-svn: 134205
2011-06-30 23:42:18 +00:00
Eric Christopher c932173773 Fix a small thinko for constant i64 lock/orq optimization where we
we didn't have an opcode for 64-bit constant or expressions.

Fixes rdar://9692967

llvm-svn: 134121
2011-06-30 00:48:30 +00:00
Devang Patel 0eada03216 Revert r133953 for now.
llvm-svn: 134116
2011-06-29 23:50:13 +00:00
Benjamin Kramer 8665f8d916 Revert a part of r126557 which could create unschedulable DAGs.
llvm-svn: 134067
2011-06-29 13:47:25 +00:00
Jakob Stoklund Olesen 7297e7e223 Clean up the handling of the x87 fp stack to make it more robust.
Drop the FpMov instructions, use plain COPY instead.

Drop the FpSET/GET instruction for accessing fixed stack positions.
Instead use normal COPY to/from ST registers around inline assembly, and
provide a single new FpPOP_RETVAL instruction that can access the return
value(s) from a call. This is still necessary since you cannot tell from
the CALL instruction alone if it returns anything on the FP stack. Teach
fast isel to use this.

This provides a much more robust way of handling fixed stack registers -
we can tolerate arbitrary FP stack instructions inserted around calls
and inline assembly. Live range splitting could sometimes break x87 code
by inserting spill code in unfortunate places.

As a bonus we handle floating point inline assembly correctly now.

llvm-svn: 134018
2011-06-28 18:32:28 +00:00
Jakob Stoklund Olesen 74dd400410 FileCheckize a couple of tests.
Also and add a test for popping dead return values and avoid testing the
spill precision.

llvm-svn: 133997
2011-06-28 06:25:03 +00:00
Chandler Carruth e2a1b16963 FileCheck-ize a test that had the strangest TCL quote I've seen yet: an
opening single quote with no closing single quote, and with {} quotes
"inside" of it. This broke some of our tools that scrape test cases.

Also, while here, make the test actually assert what the comment says it
asserts. This was essentially authored by Nick Lewycky, and merely typed
in by myself. Let me know if this is still missing the mark, but the
previous test only succeeded due to the improper quoting preventing
*anything* from matching the grep -- it had a '4(%...)' sequence in the
output!

llvm-svn: 133980
2011-06-28 02:03:10 +00:00
Evan Cheng b7d00313dc Remove the experimental (and unused) pre-ra splitting pass. Greedy regalloc can split live ranges.
llvm-svn: 133962
2011-06-27 23:40:45 +00:00
Devang Patel 4dc034df1d During bottom up fast-isel, instructions emitted to materalize registers are at top of basic block and do not have debug location. This may misguide debugger while entering the basic block and sometimes debugger provides semi useful view of current location to developer by picking up previous known location as current location. Assign a sensible location to the first instruction in a basic block, if it does not have one location derived from source file, so that debugger can provide meaningful user experience to developers in edge cases.
llvm-svn: 133953
2011-06-27 22:32:04 +00:00
Jakob Stoklund Olesen f68976071d Move all inline-asm-fpstack tests to a single file.
Also fix some of the tests that were actually testing wrong behavior -
An input operand in {st} is only popped by the inline asm when {st} is
also in the clobber list.

The original bug reports all had ~{st} clobbers as they should.

llvm-svn: 133916
2011-06-27 17:27:37 +00:00
Chad Rosier f3e11190f3 Test case for r133858 (tail call optimize in the presence of byval).
llvm-svn: 133863
2011-06-25 02:44:56 +00:00
Devang Patel f071d72c44 Handle debug info for i128 constants.
llvm-svn: 133821
2011-06-24 20:46:11 +00:00
Andrew Trick 67ff0718a4 lit support for REQUIRES: asserts.
Take #2. Don't piggyback on the existing config.build_mode. Instead,
define a new lit feature for each build feature we need (currently
just "asserts"). Teach both autoconf'd and cmake'd Makefiles to define
this feature within test/lit.site.cfg. This doesn't require any lit
harness changes and should be more robust across build systems.

llvm-svn: 133664
2011-06-22 23:23:19 +00:00
Rafael Espindola 2496c1f1f8 Reenable tail duplication of bb with just an unconditional jump, but
don't remove blocks that have their address taken.

llvm-svn: 133659
2011-06-22 22:31:57 +00:00
Bob Wilson 646dd0f4d1 Revert r133452: "Emit movq for 64-bit register to XMM register moves..."
This is breaking compiler-rt and llvm-gcc builds on MacOSX when not using
the integrated assembler.

llvm-svn: 133524
2011-06-21 17:35:13 +00:00
Evan Cheng 4c0bd9629d Teach dag combine to match halfword byteswap patterns.
1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8)
   => (bswap x) >> 16
2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8))
   => (rotl (bswap x) 16)

This allows us to eliminate most of the def : Pat patterns for ARM rev16
revsh instructions. It catches many more cases for ARM and x86.

rdar://9609108

llvm-svn: 133503
2011-06-21 06:01:08 +00:00
Nick Lewycky c7df192279 Emit movq for 64-bit register to XMM register moves, but continue to accept
movd when assembling.

llvm-svn: 133452
2011-06-20 18:33:26 +00:00
Nadav Rotem d34ce4344b Fix PromoteIntRes_TRUNCATE: Add support for cases where the
source vector type is to be split while the target vector is to be promoted.
(eg: <4 x i64> -> <4 x i8> )

llvm-svn: 133424
2011-06-20 07:15:58 +00:00
Benjamin Kramer f3c6d6def5 Update test.
llvm-svn: 133390
2011-06-19 12:14:34 +00:00
Nadav Rotem 6d0036e259 Reduce the runtime of the test. Keep only the interesting cases.
llvm-svn: 133381
2011-06-19 08:12:43 +00:00
Chris Lattner 8936d2bfbc Remove support for parsing the "type i32" syntax for defining a numbered
top level type without a specified number.  This syntax isn't documented
and blocks forward progress.

llvm-svn: 133371
2011-06-19 00:03:46 +00:00
Chris Lattner 80ed9dc9e5 rip out a ton of intrinsic modernization logic from AutoUpgrade.cpp, which is
for pre-2.9 bitcode files.  We keep x86 unaligned loads, movnt, crc32, and the
target indep prefetch change.

As usual, updating the testsuite is a PITA.

llvm-svn: 133337
2011-06-18 06:05:24 +00:00
Galina Kistanova 197ea52d4b Moved to the right place.
llvm-svn: 133324
2011-06-18 00:59:37 +00:00
Eric Christopher e4a1266a9a Fix UMULO support for 2x register width to allow the full
range without a libcall to a new mulo<mode> libcall
that we'd have to create.

Finishes the rest of rdar://9090077 and rdar://9210061

llvm-svn: 133318
2011-06-18 00:09:57 +00:00
Nadav Rotem ea7822685a Fix a bug in the type-lowering of integer-promoted elements. Add a check that
the newly created simple type is valid before checking its legality.
Re-commit the test file.

llvm-svn: 133291
2011-06-17 20:54:12 +00:00
Eric Christopher 5bbb2bdb46 Lower multiply with overflow checking to __mulo<mode>
calls if we haven't been able to lower them any
other way.

Fixes rdar://9090077 and rdar://9210061

llvm-svn: 133288
2011-06-17 20:41:29 +00:00
Galina Kistanova ac6bc75030 est 2008-06-04-indirectmem.ll is X86-specific. Move to X86 folder.
llvm-svn: 133275
2011-06-17 18:26:23 +00:00
Chris Lattner 6bc5c89093 Stop accepting and ignoring attributes in function types. Attributes are applied
to functions and call/invokes, not to types.

llvm-svn: 133266
2011-06-17 17:37:13 +00:00
Chris Lattner 5756c16cdf make the asmparser reject function and type redefinitions. 'Merging' hasn't been
needed since llvm-gcc 3.4 days.

llvm-svn: 133248
2011-06-17 07:06:44 +00:00
Chris Lattner 59345c8b65 remove asmparser support for the old getresult instruction, which has been subsumed by extractvalue.
llvm-svn: 133247
2011-06-17 06:57:15 +00:00
Chris Lattner 33de427cd6 remove parser support for the obsolete "multiple return values" syntax, which
was replaced with return of a "first class aggregate".

llvm-svn: 133245
2011-06-17 06:49:41 +00:00
Chris Lattner def1949c00 Remove support for using "foo" as symbols instead of %"foo". This is ancient
syntax and has been long obsolete.  As usual, updating the tests is the nasty
part of this.

llvm-svn: 133242
2011-06-17 06:36:20 +00:00
Chris Lattner b90ed2233c manually upgrade a bunch of tests to modern syntax, and remove some that
are either unreduced or only test old syntax.

llvm-svn: 133228
2011-06-17 03:14:27 +00:00
Nick Lewycky 65c47187f2 There's no need to be so picky about the particular register.
llvm-svn: 133189
2011-06-16 21:00:00 +00:00
Bruno Cardoso Lopes bbf2ab990f Add AVX suport for fpextend.
Original patch by Syoyo Fujita with more comments by me.

llvm-svn: 133153
2011-06-16 07:03:21 +00:00
Nick Lewycky 27d604cbf3 Commit the right set of tests for r133124. Sorry 'bout that!
llvm-svn: 133133
2011-06-16 01:35:45 +00:00
Andrew Trick 41369d5e8a Reenabling this test with REQUIRES: Asserts
llvm-svn: 133132
2011-06-16 01:34:41 +00:00
Nick Lewycky 6d677cfdd8 Add a DAGCombine for (ext (binop (load x), cst)).
llvm-svn: 133124
2011-06-16 01:15:49 +00:00
John McCall 4b7a8d68ae Add a new function attribute, nonlazybind, which inhibits lazy-loading
optimizations when emitting calls to the function;  instead those calls may
use faster relocations which require the function to be immediately resolved
upon loading the dynamic object featuring the call.  This is useful when it
is known that the function will be called frequently and pervasively and
therefore there is no merit in delaying binding of the function.

Currently only implemented for x86-64, where it turns into a call through
the global offset table.

Patch by Dan Gohman, who assures me that he's going to add LangRef documentation
for this once it's committed.

llvm-svn: 133080
2011-06-15 20:36:13 +00:00
Andrew Trick 967d584a3a Disabling this test until I can figure out the right lit flags.
llvm-svn: 133068
2011-06-15 18:25:38 +00:00
Andrew Trick 3013b6ae4a Added -stress-sched flag in the Asserts build.
Added a test case for handling physreg aliases during pre-RA-sched.

llvm-svn: 133063
2011-06-15 17:16:12 +00:00
Chad Rosier 19a1f425a7 TargetLoweringOpt is a struct used by DAGCombine, not a pass.
llvm-svn: 133062
2011-06-15 16:48:02 +00:00
Nadav Rotem 24c6558865 This test was failing on X86 machines which do not have SSE4. Fixed the test by
specifying that the target CPU is corei7.

llvm-svn: 133053
2011-06-15 12:26:53 +00:00
Rafael Espindola 2efebb3610 Add triple.
llvm-svn: 133026
2011-06-14 23:47:36 +00:00
Chad Rosier 818e116723 When pattern matching during instruction selection make sure shl x,1 is not
converted to add x,x if x is a undef.  add undef, undef does not guarantee
that the resulting low order bit is zero.
Fixes <rdar://problem/9453156> and <rdar://problem/9487392>.

llvm-svn: 133022
2011-06-14 22:29:10 +00:00
Rafael Espindola 1bf96ac607 Check the llc output.
llvm-svn: 133021
2011-06-14 22:24:32 +00:00
Stuart Hastings f96281f4b7 Test case for x86 MMX inline asm. rdar://problem/8886707
llvm-svn: 133014
2011-06-14 21:51:38 +00:00
Rafael Espindola 761cee0785 Add a test for the recent regression.
llvm-svn: 133009
2011-06-14 20:38:50 +00:00
Dan Gohman 8355febbf4 This test is still failing. Delete the rest of it.
llvm-svn: 133001
2011-06-14 18:07:36 +00:00
Dan Gohman 92789eafe5 Revert r132991. This test is failing on the
llvm-gcc-x86_64-linux-selfhost buildbot and others.

llvm-svn: 133000
2011-06-14 18:03:11 +00:00
Rafael Espindola 3aeaf9e4c1 Add 132986 back, but avoid non-determinism if a bb address gets reused.
llvm-svn: 132995
2011-06-14 15:31:54 +00:00
Nadav Rotem 0e230bc7bb Add a testcase for #9623
llvm-svn: 132991
2011-06-14 13:23:10 +00:00
Rafael Espindola 06ba7a68de revert 132986 to see if the bots go green.
llvm-svn: 132988
2011-06-14 12:48:26 +00:00
Nadav Rotem a0da74677e This testcase cause a failure on some bots. Remove the failing test until
further investigation.

llvm-svn: 132986
2011-06-14 09:10:37 +00:00
Nadav Rotem 10193c830b Add a testcase for checking the integer-promotion of many different vector
types (with power of two types such as 8,16,32 .. 512).

Fix a bug in the integer promotion of bitcast nodes. Enable integer expanding
only if the target of the conversion is an integer (when the type action is
scalarize).

Add handling to the legalization of vector load/store in cases where the saved
vector is integer-promoted.

llvm-svn: 132985
2011-06-14 08:11:52 +00:00
Bruno Cardoso Lopes dc9ff3a4b1 Add one more argument to the prefetch intrinsic to indicate whether it's a data
or instruction cache access. Update the targets to match it and also teach
autoupgrade.

llvm-svn: 132976
2011-06-14 04:58:37 +00:00
Rafael Espindola da24f2f8e1 Make the threshold used by branch folding softer. Before we would get a
sharp all or nothing transition when one extra predecessor was added. Now
we still test first ones for merging.

llvm-svn: 132974
2011-06-14 04:41:17 +00:00
Bill Wendling e712449688 Heuristic: If the number of operands in the alias are more than the number of
operands in the aliasee, don't print the alias.

llvm-svn: 132963
2011-06-14 03:17:20 +00:00
Jakob Stoklund Olesen fb03a92c33 Be less aggressive about hinting in RAFast.
In particular, don't spill dirty registers only to satisfy a hint. It is
not worth it.

The attached test case provides an example where the fast allocator
would spill a register when other registers are available.

llvm-svn: 132900
2011-06-13 03:26:46 +00:00
Rafael Espindola 2f3c2fe7c5 Really fix the fall-through logic.
Add a triple to the tests.

llvm-svn: 132885
2011-06-12 05:57:01 +00:00
Rafael Espindola cb55e752ed Test for the previous commit.
llvm-svn: 132884
2011-06-12 05:35:39 +00:00
Eli Friedman cd2124a3f0 Add full x86 fast-isel support for memcpy and memset.
rdar://9431466

llvm-svn: 132864
2011-06-10 23:39:36 +00:00
Eli Friedman 82818e4d95 Add -mattr=+sse2 to make the buildbots happy.
llvm-svn: 132839
2011-06-10 08:26:26 +00:00
Chad Rosier 19d5253f60 Adding a test case for revision 132825.
llvm-svn: 132830
2011-06-10 02:44:19 +00:00
Eli Friedman e3944cd825 Add a simple test which makes sure folding immediate float zero to a memory operand works.
llvm-svn: 132824
2011-06-10 00:30:08 +00:00
Eli Friedman 1877ac9937 Change this DAGCombine to build AND of SHR instead of SHR of AND; this matches the ordering we prefer in instcombine. Part of rdar://9562809.
The potential DAGCombine which enforces this more generally messes up some other very fragile patterns, so I'm leaving that alone, at least for now.

llvm-svn: 132809
2011-06-09 22:14:44 +00:00
Eric Christopher cafa08cbf3 Recommit r132764 since it didn't cause the windows buildbot failures.
llvm-svn: 132776
2011-06-09 15:39:01 +00:00
Eric Christopher 76fd742d16 Temporarily revert 132764 to see if it fixes the Windows buildbot.
llvm-svn: 132771
2011-06-09 06:29:54 +00:00
Eric Christopher 11edab6a46 If the alignment of the byval argument is greater than the alignment
of the frame then increase the maximum alignment of the frame to
match.

Fixes PR6965

llvm-svn: 132764
2011-06-09 00:15:19 +00:00
Rafael Espindola c85e0d81e4 Fix a silly error I introduce in r131951.
Fixes PR10095.

llvm-svn: 132735
2011-06-07 23:26:45 +00:00
Stuart Hastings 7ae360f2e1 Tweak this test for ARM-hosted 'bot.
llvm-svn: 132711
2011-06-07 15:23:11 +00:00
Nadav Rotem d1e8f9a1e0 Move the legalizer tests to the X86 directory because the test uses the x86
codegen. Thanks Galina.

llvm-svn: 132706
2011-06-07 05:23:58 +00:00
Jakob Stoklund Olesen df476270eb Simplify local live range splitting's safeguard to fix PR10070.
When local live range splitting creates a live range with the same
number of instructions as the old range, mark it as RS_Local. When such
a range is seen again, require that it be split in a way that reduces
the number of instructions. That guarantees we are making progress while
still being able to perform 3 -> 2+3 splits as required by PR10070.

This also means that the PrevSlot map is no longer needed. This was also
used to estimate new spill weights, but that is no longer necessary
after slotIndexes::insertMachineInstrInMaps() got the extra Late
insertion argument.

llvm-svn: 132697
2011-06-06 23:55:20 +00:00
Stuart Hastings e0d3426e1a Followup to 132458, omit unnecessary stack copy when x87 input is a
load.  rdar://problem/6373334

llvm-svn: 132696
2011-06-06 23:15:58 +00:00
Stuart Hastings 2f7f64f9e1 Test case for PR10085.
llvm-svn: 132682
2011-06-06 20:03:22 +00:00
Eli Friedman bd375f1a3f PR10077: fix fast-isel of extractvalue of aggregate constants.
llvm-svn: 132676
2011-06-06 05:46:34 +00:00
Benjamin Kramer 59652d36a5 Harden tests for windows path separators.
llvm-svn: 132671
2011-06-05 18:20:05 +00:00
Jakob Stoklund Olesen 38080e8700 Fix a test that keeps breaking when allocation orders change.
Who said FileCheck couldn't handle arbitrarily complex conditions?

llvm-svn: 132654
2011-06-04 23:34:40 +00:00
Stuart Hastings be605494ac Reapply 132424 with fixes. This fixes PR10068.
rdar://problem/5993888

llvm-svn: 132606
2011-06-03 23:53:54 +00:00
Jakob Stoklund Olesen 496fa5556f Fix some tests that depend on register allocation.
llvm-svn: 132602
2011-06-03 22:45:21 +00:00
Rafael Espindola e37b939793 Add test for PR10068.
llvm-svn: 132482
2011-06-02 20:02:48 +00:00
Rafael Espindola aa318ae495 Revert 132424 to fix PR10068.
llvm-svn: 132479
2011-06-02 19:57:47 +00:00
Stuart Hastings 351422bdc8 Andy pointed out a dumb omission in this test case. Thanks Andy!
llvm-svn: 132477
2011-06-02 19:26:49 +00:00
Stuart Hastings e239a6920e Jakob pointed out a dumb omission in this test case. Thanks Jakob!
llvm-svn: 132472
2011-06-02 18:44:05 +00:00
Stuart Hastings 8d530ad22a Omit unnecessary stack copy when x87 input is a load.
rdar://problem/6373334

llvm-svn: 132458
2011-06-02 15:57:11 +00:00
Stuart Hastings 7f25c32d5b Tweak testcase for ARM bot. rdar://problem/5993888
llvm-svn: 132454
2011-06-02 05:05:39 +00:00
Devang Patel 324f843107 Do not drop constant values when a variable's content is described using .debug_loc entries.
llvm-svn: 132427
2011-06-01 22:03:25 +00:00
Stuart Hastings 7adc95f69e Recommit 132404 with fixes. rdar://problem/5993888
llvm-svn: 132424
2011-06-01 21:33:14 +00:00
Stuart Hastings aab130d995 Revert 132404 to appease a buildbot. rdar://problem/5993888
llvm-svn: 132419
2011-06-01 19:52:20 +00:00
Stuart Hastings 41b1aa466d Cleanup test case. rdar://problem/5660695
llvm-svn: 132408
2011-06-01 18:23:14 +00:00
Stuart Hastings 7b7c102f2c Add support for x86 CMPEQSS and friends. These instructions do a
floating-point comparison, generate a mask of 0s or 1s, and generally
DTRT with NaNs.  Only profitable when the user wants a materialized 0
or 1 at runtime.  rdar://problem/5993888

llvm-svn: 132404
2011-06-01 17:17:45 +00:00
Stuart Hastings 6f89e2ffaa A forthcoming SSE patch will break this test; since the test is also
valid for x87, re-target to x87.  rdar://problem/5993888

llvm-svn: 132401
2011-06-01 16:13:09 +00:00
Stuart Hastings 4d2fe66dc0 Test case for 132396. rdar://problem/5660695
llvm-svn: 132399
2011-06-01 15:50:29 +00:00
Rafael Espindola 08600bcf65 Use the dwarf->llvm mapping to print register names in the cfi
directives.

Fixes PR9826.

llvm-svn: 132317
2011-05-30 20:20:15 +00:00
Eli Friedman 873106a932 Force a triple to make this test pass on Darwin.
llvm-svn: 132228
2011-05-27 23:12:48 +00:00
Cameron Zwarich 75d99e4b70 Add a GR32_NOREX_NOSP register class and fix a bug where getMatchingSuperRegClass()
was saying that the matching superregister class of GR32_NOREX in GR64_NOREX_NOSP
is GR64_NOREX, which drops the NOSP constraint. This fixes PR10032.

llvm-svn: 132225
2011-05-27 22:26:04 +00:00
Rafael Espindola d23bfb8a7a Make size computation less brittle.
llvm-svn: 132222
2011-05-27 22:05:41 +00:00
Jakob Stoklund Olesen 63a9cef5c2 Delete a test that is no longer relevant.
According to PR2536, the old spiller had trouble with the IMPLICIT_DEF in this
code:

  %reg1028<def> = MOV16rm %reg0, 1, %reg0, <ga:g_5>, Mem:LD(2,2) [g_5 + 0]
  %reg1039<def> = IMPLICIT_DEF
  %reg1038<def> = INSERT_SUBREG %reg1039, %reg1028, 2
  %reg1025<def> = AND32ri %reg1038, 65534, %%EFLAGS<imp-def>

However, today we emit a zero-extending load instead:

  %vreg10<def> = MOVZX32rm16 %noreg, 1, %noreg, <ga:@g_5>, %noreg; %mem:LD2[@g_5] GR32:%vreg10
  %vreg0<def> = AND32ri %vreg10, 65534, %%EFLAGS<imp-def,dead>; %GR32:%vreg0,%vreg10

This makes the test pointless since it no longer creates the spiller hazard.

llvm-svn: 132210
2011-05-27 20:02:42 +00:00
Devang Patel 3c6aed2d98 Select DW_AT_const_value size based on variable size.
llvm-svn: 132193
2011-05-27 16:45:18 +00:00
Cameron Zwarich 34ef49dc74 Fix PR10029 - VerifyCoalescing failure on patterns_dfa.c of 445.gobmk.
llvm-svn: 132181
2011-05-27 05:04:51 +00:00
Chad Rosier b362884ca9 Renamed llvm.x86.sse42.crc32 intrinsics; crc64 doesn't exist.
crc32.[8|16|32] have been renamed to .crc32.32.[8|16|32] and
crc64.[8|16|32] have been renamed to .crc32.64.[8|64].

llvm-svn: 132163
2011-05-26 23:13:19 +00:00
Eli Friedman c48f7c212e Fix test on Windows.
llvm-svn: 132126
2011-05-26 18:00:32 +00:00
Stuart Hastings 493a12bf5e Reverting 132105: it broke some LLVM-GCC DejaGNU tests.
llvm-svn: 132108
2011-05-26 04:09:49 +00:00
Stuart Hastings 276f231c2f Correctly handle a one-word struct passed byval on x86_64.
rdar://problem/6920088

llvm-svn: 132105
2011-05-26 02:44:56 +00:00
Eli Friedman c70355195c Rewrite fast-isel integer cast handling to handle more cases, and to be simpler and more consistent.
The practical effects here are that x86-64 fast-isel can now handle trunc from i8 to i1, and ARM fast-isel can handle many more constructs involving integers narrower than 32 bits (including loads, stores, and many integer casts).

rdar://9437928 .

llvm-svn: 132099
2011-05-25 23:49:02 +00:00
Rafael Espindola fc9bae6f8b Replace the -unwind-tables option with a per function flag. This is more
LTO friendly as we can now correctly merge files compiled with or without
-fasynchronous-unwind-tables.

llvm-svn: 132033
2011-05-25 03:44:17 +00:00
Rafael Espindola 0f33be1b87 Fix the defaults for .eh_frame. We were marking it as writable.
llvm-svn: 131951
2011-05-24 02:50:20 +00:00
Evan Cheng 88f9137fd7 - Teach SelectionDAG::isKnownNeverZero to return true (op x, c) when c is
non-zero.
- Teach X86 cmov optimization to eliminate the cmov from ctlz, cttz extension
  when the source of X86ISD::BSR / X86ISD::BSF is proven to be non-zero.

rdar://9490949

llvm-svn: 131948
2011-05-24 01:48:22 +00:00
Dan Gohman 6c4a319088 When checking for signed multiplication overflow, watch out for INT_MIN and -1.
This fixes PR9845.

llvm-svn: 131919
2011-05-23 21:07:39 +00:00
Devang Patel 9987d3098b Test case for r131908.
llvm-svn: 131909
2011-05-23 17:49:29 +00:00
Devang Patel c4d9a84159 While replacing all uses of a SDValue with another value, do not forget to transfer SDDbgValue.
llvm-svn: 131907
2011-05-23 17:35:08 +00:00
Benjamin Kramer 2fd48f2730 Implement mulo x, 2 -> addo x, x in DAGCombiner.
llvm-svn: 131800
2011-05-21 18:31:55 +00:00
Benjamin Kramer e08fb1dce9 Merge and FileCheckize test cases.
llvm-svn: 131799
2011-05-21 18:31:48 +00:00
Eli Friedman 60afcc2a6f Add fast-isel support for byval calls on x86.
llvm-svn: 131764
2011-05-20 22:21:04 +00:00
Stuart Hastings 91f1d24736 Re-commit 131641 with fixes; de-pseudoize MOVSX16rr8 and friends.
rdar://problem/8614450

llvm-svn: 131746
2011-05-20 19:04:40 +00:00
Chad Rosier ad00f3d0b9 Fixed regression due to commit 131709, which disables vararg tail call optimizations on Win64
llvm-svn: 131740
2011-05-20 17:49:39 +00:00
Benjamin Kramer 0bf26746d9 Rename the "sandybridge" subtarget to "corei7-avx", for GCC compatibility.
llvm-svn: 131730
2011-05-20 15:11:26 +00:00
Cameron Zwarich e0a52df6e5 Fix PR9960 by teaching SimpleRegisterCoalescing::AdjustCopiesBackFrom() to preserve
the phikill flag.

llvm-svn: 131717
2011-05-20 03:54:04 +00:00
Chad Rosier 552f8c4819 Don't attempt to tail call optimize for Win64.
llvm-svn: 131709
2011-05-20 00:59:28 +00:00
Evan Cheng e8d2e9eb35 Revert r131664 and fix it in instcombine instead. rdar://9467055
llvm-svn: 131708
2011-05-20 00:54:37 +00:00
Eli Friedman 22da799428 Add fast-isel support for zeroext and signext ret instructions on x86.
llvm-svn: 131689
2011-05-19 22:16:13 +00:00
Eric Christopher 4014e5e208 Oddly people want to use the 'r' constraint for fp constants on x86.
Fixes rdar://9218925
Fixes PR9601

llvm-svn: 131682
2011-05-19 21:33:47 +00:00
Eli Friedman e53a77d3a6 Fix up this test to use explicit triples (Win64 passes a different number of arguments in registers).
llvm-svn: 131676
2011-05-19 21:13:08 +00:00
Evan Cheng 2b9bd38678 crc32 with 64-bit output zeros upper 32-bits. rdar://9467055
llvm-svn: 131664
2011-05-19 18:57:12 +00:00
Stuart Hastings ae012a7525 Move test to Transforms/InstCombine.
llvm-svn: 131634
2011-05-19 05:53:22 +00:00
Chad Rosier f4e832b14e Enables vararg functions that pass all arguments via registers to be optimized into tail-calls when possible.
llvm-svn: 131560
2011-05-18 19:59:50 +00:00
Stuart Hastings 51d696766c An imminent fix to the x86_64 byval logic will expose a flaw in the
x86_64 sibcall logic.  I've filed PR9943 for the sibcall problem, and
this patch alters the testcase to work around the flaw.  When PR9943
is fixed, this patch should be reverted.

llvm-svn: 131557
2011-05-18 19:19:17 +00:00
Eli Friedman 3f46c3e702 Force a triple on a couple of tests; we don't support fast-isel of ret on Win64.
llvm-svn: 131540
2011-05-18 17:16:37 +00:00
Stuart Hastings 38849debb5 Merge pmovzx test case into existing file.
llvm-svn: 131539
2011-05-18 17:02:04 +00:00
Eli Friedman 7d7ad8374f Make some of the fast-isel tests actually test fast-isel (and fix test failures).
llvm-svn: 131510
2011-05-18 00:00:10 +00:00
Stuart Hastings 5bd18b6638 X86 pmovsx/pmovzx ignore the upper half of their inputs.
rdar://problem/6945110

llvm-svn: 131493
2011-05-17 22:13:31 +00:00
Galina Kistanova dd45646a47 Move test for appropriate directory.
llvm-svn: 131477
2011-05-17 19:06:43 +00:00
Eli Friedman 7b27942fe7 Add x86 fast-isel for calls returning first-class aggregates. rdar://9435872.
This is r131438 with a couple small fixes.

llvm-svn: 131474
2011-05-17 18:29:03 +00:00
Eli Friedman 7335e8a720 Back out r131444 and r131438; they're breaking nightly tests. I'll look into
it more tomorrow.

llvm-svn: 131451
2011-05-17 02:36:59 +00:00
Eli Friedman e5f7f26df0 Fix test.
llvm-svn: 131444
2011-05-17 00:39:14 +00:00
Evan Cheng 54459240e3 Add target triple so test doesn't fail on Windows machines.
llvm-svn: 131439
2011-05-17 00:15:58 +00:00
Eli Friedman 83ba150f3a Add x86 fast-isel for calls returning first-class aggregates. rdar://9435872.
llvm-svn: 131438
2011-05-17 00:13:47 +00:00
Jakob Stoklund Olesen 4edf17d91f Teach LiveInterval::isZeroLength about null SlotIndexes.
When instructions are deleted, they leave tombstone SlotIndex entries.
The isZeroLength method should ignore these null indexes.

This causes RABasic to sometimes spill a callee-saved register in the
abi-isel.ll test, so don't run that test with -regalloc=basic.  Prioritizing
register allocation according to spill weight can cause more registers to be
used.

llvm-svn: 131436
2011-05-16 23:50:05 +00:00
Eli Friedman d4a3609d30 Remove dead code. Fix associated test to use FileCheck.
llvm-svn: 131424
2011-05-16 21:28:22 +00:00
Eli Friedman a4d4a0162d Make fast-isel work correctly s/uadd.with.overflow intrinsics.
llvm-svn: 131420
2011-05-16 21:06:17 +00:00
Eli Friedman 9ac944774f Basic fast-isel of extractvalue. Not too helpful on its own, given the IR clang generates for cases like this, but it should become more useful soon.
llvm-svn: 131417
2011-05-16 20:27:46 +00:00
Rafael Espindola df9db7ed92 Don't produce a vmovntdq if we don't have AVX support.
llvm-svn: 131330
2011-05-14 00:30:01 +00:00
Evan Cheng 43054e6159 Re-enable branchfolding common code hoisting optimization. Fixed a liveness test bug and also taught it to update liveins.
llvm-svn: 131241
2011-05-12 20:30:01 +00:00
Devang Patel 34a6620748 Identify end of prologue (and beginning of function body) using DW_LNS_set_prologue_end line table opcode.
llvm-svn: 131194
2011-05-11 19:22:19 +00:00
Nadav Rotem 8a7beb80f0 Fixes a bug in the DAGCombiner. LoadSDNodes have two values (data, chain).
If there is a store after the load node, then there is a chain, which means
that there is another user. Thus, asking hasOneUser would fail. Instead we
ask hasNUsesOfValue on the 'data' value.

llvm-svn: 131183
2011-05-11 14:40:50 +00:00
Nadav Rotem 8f971c27fb Add custom lowering of X86 vector SRA/SRL/SHL when the shift amount is a splat vector.
llvm-svn: 131179
2011-05-11 08:12:09 +00:00
Rafael Espindola 2a09d65979 Revert 131172 as it is causing clang to miscompile itself. I will try
to provide a reduced testcase.

llvm-svn: 131176
2011-05-11 03:27:17 +00:00
Evan Cheng 05fc35e275 Add a late optimization to BranchFolding that hoist common instruction sequences
at the start of basic blocks to their common predecessor. It's actually quite
common (e.g. about 50 times in JM/lencod) and has shown to be a nice code size
benefit. e.g.

        pushq   %rax
        testl   %edi, %edi
        jne     LBB0_2
## BB#1:
        xorb    %al, %al
        popq    %rdx
        ret
LBB0_2:
        xorb    %al, %al
        callq   _foo
        popq    %rdx
        ret

=>

        pushq   %rax
        xorb    %al, %al
        testl   %edi, %edi
        je      LBB0_2
## BB#1:
        callq   _foo
LBB0_2:
        popq    %rdx
        ret

rdar://9145558

llvm-svn: 131172
2011-05-11 01:03:01 +00:00
Benjamin Kramer d724a590e5 X86: Add a bunch of peeps for add and sub of SETB.
"b + ((a < b) ? 1 : 0)" compiles into
	cmpl	%esi, %edi
	adcl	$0, %esi
instead of
	cmpl	%esi, %edi
	sbbl	%eax, %eax
	andl	$1, %eax
	addl	%esi, %eax

This saves a register, a false dependency on %eax
(Intel's CPUs still don't ignore it) and it's shorter.

llvm-svn: 131070
2011-05-08 18:36:07 +00:00
Jakob Stoklund Olesen a5c889982a Emit a proper error message when register allocators run out of registers.
This can't be just an assertion, users can always write impossible inline
assembly. Such an assembly statement should be included in the error message.

llvm-svn: 131024
2011-05-06 21:58:30 +00:00
Eli Friedman 5401962643 Re-revert r130877; it's apparently causing a regression on 197.parser,
possibly related to cbnz formation.

llvm-svn: 130977
2011-05-06 05:23:07 +00:00
Rafael Espindola a4982bddf3 Don't produce a __debug_frame.
I tested both gdb on a bootstrapped clang and and the gdb testsuite on OS X (snow leopard)
and both are happy using __eh_frame.

llvm-svn: 130937
2011-05-05 18:43:39 +00:00
Eli Friedman 441a01a2b8 Avoid extra vreg copies for arguments passed in registers. Specifically, this can make MachineCSE more effective in some cases (especially in small functions). PR8361 / part of rdar://problem/8259436 .
llvm-svn: 130928
2011-05-05 16:53:34 +00:00
Jakob Stoklund Olesen 17d4f9bbcc Prepare remaining tests for -join-physreg going away.
llvm-svn: 130893
2011-05-04 23:54:59 +00:00
Jakob Stoklund Olesen 369bddf5ad Fix a batch of x86 tests to be coalescer independent.
Most of these tests require a single mov instruction that can come either before
or after a 2-addr instruction. -join-physregs changes the behavior, but the
results are equivalent.

llvm-svn: 130891
2011-05-04 23:54:51 +00:00
Eli Friedman 0fe4608af2 Re-commit r130862 with a minor change to avoid an iterator running off the edge in some cases.
Original message:

Teach MachineCSE how to do simple cross-block CSE involving physregs.  This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .

llvm-svn: 130877
2011-05-04 22:10:36 +00:00
Eli Friedman 3bd79ba856 Back out r130862; it appears to be breaking bootstrap.
llvm-svn: 130867
2011-05-04 20:48:42 +00:00
Eli Friedman a16fc2fec0 Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .
llvm-svn: 130862
2011-05-04 19:54:24 +00:00
Jakob Stoklund Olesen f1b401800a Don't depend on the physreg coalescing order.
llvm-svn: 130818
2011-05-04 01:01:47 +00:00
Bill Wendling db0996c822 Replace the "movnt" intrinsics with a native store + nontemporal metadata bit.
<rdar://problem/8460511>

llvm-svn: 130791
2011-05-03 21:11:17 +00:00
Rafael Espindola fc8223670a Add r130623 back now that ELF has been fixed to work with -fno-dwarf2-cfi-asm.
llvm-svn: 130658
2011-05-01 15:44:13 +00:00
Rafael Espindola 750cb61553 GCC uses a different encoding of pointers in the FDE when using
-fno-dwarf2-cfi-asm. Implement the same behavior.

llvm-svn: 130637
2011-05-01 04:49:54 +00:00
Rafael Espindola b7c2286055 Revert the previous patch while I figure out how to make llvm-gcc
less agressive about disabling cfi on linux :-(

llvm-svn: 130626
2011-04-30 23:03:44 +00:00
Rafael Espindola 5265bc483e Enable CFI on OS X.
Currently the output should be almost identical to the one produced by CodeGen
to make the transition easier.

The only two differences I know of are:

* Some files get an extra advance loc of size 0. This will be fixed when
relaxations are enabled.
* The optimization of declaring an EH symbol as an external variable is not
implemented. This is a subset of adding the nounwind attribute, so we if really
this at -O0 we should probably do it at the IL level.

llvm-svn: 130623
2011-04-30 22:29:54 +00:00
Jakob Stoklund Olesen f5eaa8dc62 Allow folded spills in test.
llvm-svn: 130599
2011-04-30 08:00:50 +00:00
Jakob Stoklund Olesen edfabc9aad Weekly fix of register allocation dependent unit tests.
llvm-svn: 130567
2011-04-30 01:37:52 +00:00
Rafael Espindola 697edc89a5 Change DwarfCFIException's member variables to track what it actually
emmits: .cfi_personality, .cfi_lsda and the moves.

llvm-svn: 130503
2011-04-29 14:48:51 +00:00
Eli Friedman 7cd5101ad3 fast-isel sret calls, try 2. We actually do need to do something on x86-32. rdar://problem/9303592 .
llvm-svn: 130429
2011-04-28 20:19:12 +00:00
Eli Friedman 3cf6d4032a Actually revert r130348 correctly.
llvm-svn: 130418
2011-04-28 18:20:24 +00:00
Eli Friedman d5a80ca3c8 Revert r130348; causing buildbot issues on x86-32.
llvm-svn: 130412
2011-04-28 18:06:10 +00:00
Devang Patel 3e021533cd Teach dwarf writer to handle complex address expression for .debug_loc entries.
This fixes clang generated blocks' variables' debug info.
Radar 9279956.

llvm-svn: 130373
2011-04-28 02:22:40 +00:00
Eli Friedman 33c133919a Fix a silly mistake in r130338.
llvm-svn: 130360
2011-04-28 00:42:03 +00:00
Eli Friedman 8bd572fc58 fast-isel sret. We actually don't need to do anything special on x86. :) rdar://problem/9303592 .
llvm-svn: 130348
2011-04-27 23:58:52 +00:00
Eli Friedman 406c471b69 Make the fast-isel code for literal 0.0 a bit shorter/faster, since 0.0 is common. rdar://problem/9303592 .
llvm-svn: 130338
2011-04-27 22:41:55 +00:00
Eli Friedman 0eea0293d9 Fix an edge case involving branches in fast-isel on x86.
rdar://problem/9303306 .

llvm-svn: 130272
2011-04-27 01:34:27 +00:00
Evan Cheng 1355bbdd11 Be careful about scheduling nodes above previous calls. It increase usages of
more callee-saved registers and introduce copies. Only allows it if scheduling
a node above calls would end up lessen register pressure.

Call operands also has added ABI restrictions for register allocation, so be
extra careful with hoisting them above calls.

rdar://9329627

llvm-svn: 130245
2011-04-26 21:31:35 +00:00
Benjamin Kramer 1d4c835089 Force a triple on this test to unbreak windows buildbots.
llvm-svn: 130226
2011-04-26 18:47:43 +00:00
Dan Gohman 7da91aee83 Fast-isel support for simple inline asms.
llvm-svn: 130205
2011-04-26 17:18:34 +00:00
Rafael Espindola 580eebaa20 Add test for PR9743.
llvm-svn: 130198
2011-04-26 14:17:42 +00:00
Devang Patel 734f2218ac A dbg.declare may not be in entry block, even if it is referring to an incoming argument. However, It is appropriate to emit DBG_VALUE referring to this incoming argument in entry block in MachineFunction.
llvm-svn: 130129
2011-04-25 16:33:52 +00:00
Benjamin Kramer ba446cc12a Make tests more useful.
lit needs a linter ...

llvm-svn: 130126
2011-04-25 10:12:01 +00:00
NAKAMURA Takumi 576273cf56 test/CodeGen/X86/shrink-compare.ll: Relax expressions for Win64.
llvm-svn: 130039
2011-04-23 00:15:45 +00:00
Chris Lattner 6d277517d1 Recommit the fix for rdar://9289512 with a couple tweaks to
fix bugs exposed by the gcc dejagnu testsuite:
1. The load may actually be used by a dead instruction, which
   would cause an assert.
2. The load may not be used by the current chain of instructions,
   and we could move it past a side-effecting instruction. Change
   how we process uses to define the problem away.

llvm-svn: 130018
2011-04-22 21:59:37 +00:00
Benjamin Kramer 341c11da3b DAGCombine: fold "(zext x) == C" into "x == (trunc C)" if the trunc is lossless.
On x86 this allows to fold a load into the cmp, greatly reducing register pressure.
  movzbl	(%rdi), %eax
  cmpl	$47, %eax
->
  cmpb	$47, (%rdi)

This shaves 8k off gcc.o on i386. I'll leave applying the patch in README.txt to Chris :)

llvm-svn: 130005
2011-04-22 18:47:44 +00:00
Benjamin Kramer 4c81624735 X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) & C1. (Part of PR5039)
This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is
uint64_t foo(uint64_t x) { return (x&1) << 42; }
which used to compile into bloated code:
	shlq	$42, %rdi               ## encoding: [0x48,0xc1,0xe7,0x2a]
	movabsq	$4398046511104, %rax    ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00]
	andq	%rdi, %rax              ## encoding: [0x48,0x21,0xf8]
	ret                             ## encoding: [0xc3]

with this patch we can fold the immediate into the and:
	andq	$1, %rdi                ## encoding: [0x48,0x83,0xe7,0x01]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	shlq	$42, %rax               ## encoding: [0x48,0xc1,0xe0,0x2a]
	ret                             ## encoding: [0xc3]

It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing
that without making this code even more complicated. See the TODOs in the code.

llvm-svn: 129990
2011-04-22 15:30:40 +00:00
Daniel Dunbar 6309828206 Revert r1296656, "Fix rdar://9289512 - not folding load into compare at -O0...",
which broke a couple GCC test suite tests at -O0.

llvm-svn: 129914
2011-04-21 16:14:46 +00:00
Daniel Dunbar ed3d5496dc llc: Eliminate a use of getDarwinMajorNumber().
- As before, there is a minor semantic change here (evidenced by the test
   change) for Darwin triples that have no version component. I debated changing
   the default behavior of isOSVersionLT, but decided it made more sense for
   triples to be explicit.

llvm-svn: 129805
2011-04-19 20:46:13 +00:00
Eli Friedman ee92a6b332 Add support for FastISel'ing varargs calls.
llvm-svn: 129765
2011-04-19 17:22:22 +00:00
Chris Lattner 91328b317b Implement support for x86 fastisel of small fixed-sized memcpys, which are generated
en-mass for C++ PODs.  On my c++ test file, this cuts the fast isel rejects by 10x 
and shrinks the generated .s file by 5%

llvm-svn: 129755
2011-04-19 05:52:03 +00:00
Chris Lattner 5f4b783426 Implement support for fast isel of calls of i1 arguments, even though they are illegal,
when they are a truncate from something else.  This eliminates fully half of all the 
fastisel rejections on a test c++ file I'm working with, which should make a substantial
improvement for -O0 compile of c++ code.

This fixed rdar://9297003 - fast isel bails out on all functions taking bools

llvm-svn: 129752
2011-04-19 05:09:50 +00:00
Chris Lattner d7f7c93914 Handle i1/i8/i16 constant integer arguments to calls by prepromoting them.
Before we would bail out on i1 arguments all together, now we just bail on
non-constant ones.  Also, we used to emit extraneous code.  e.g. test12 was:

	movb	$0, %al
	movzbl	%al, %edi
	callq	_test12

and test13 was:
	movb	$0, %al
	xorl	%edi, %edi
	movb	%al, 7(%rsp)
	callq	_test13f

Now we get:

	movl	$0, %edi
	callq	_test12
and:
	movl	$0, %edi
	callq	_test13f

llvm-svn: 129751
2011-04-19 04:42:38 +00:00
Chris Lattner c59290a34c be layout aware, to produce:
testb	$1, %al
	je	LBB0_2
## BB#1:                                ## %if.then
	movb	$0, %al

instead of:

	testb	$1, %al
	jne	LBB0_1
	jmp	LBB0_2
LBB0_1:                                 ## %if.then
	movb	$0, %al

how 'bout that.

llvm-svn: 129749
2011-04-19 04:26:32 +00:00
Chris Lattner 2c8a4c3b1b fix rdar://9297006 - fast isel bails out on trunc to i1 -> bools cry,
a common cause of fast isel rejects on c++ code.

llvm-svn: 129748
2011-04-19 04:22:17 +00:00
Chris Lattner 48f75ad678 while we're at it, handle 'sdiv exact' of a power of 2 also,
this fixes a few rejects on c++ iterator loops.

llvm-svn: 129694
2011-04-18 07:00:40 +00:00
Chris Lattner 562d6e82bd fix rdar://9297011 - udiv by power of two causing fast-isel rejects
llvm-svn: 129693
2011-04-18 06:55:51 +00:00
Chris Lattner 07add49a4b Implement major new fastisel functionality: the matcher can now handle immediates with
value constraints on them (when defined as ImmLeaf's).  This is particularly important
for X86-64, where almost all reg/imm instructions take a i64immSExt32 immediate operand,
which has a value constraint.  Before this patch we ended up iseling the examples into
such amazing code as:

	movabsq	$7, %rax
	imulq	%rax, %rdi
	movq	%rdi, %rax
	ret

now we produce:

	imulq	$7, %rdi, %rax
	ret

This dramatically shrinks the generated code at -O0 on x86-64.

llvm-svn: 129691
2011-04-18 06:22:33 +00:00
Chris Lattner 353fda159d relax this test to just check that the lock prefix is encoded properly,
and to not rely on the register allocator's arbitrary operand choices.

llvm-svn: 129690
2011-04-18 06:15:35 +00:00
Chris Lattner b53ccb8e36 1. merge fast-isel-shift-imm.ll into fast-isel-x86-64.ll
2. implement rdar://9289501 - fast isel should fold trivial multiplies to shifts
3. teach tblgen to handle shift immediates that are different sizes than the 
   shifted operands, eliminating some code from the X86 fast isel backend.
4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function
   instead of FastEmit_ri to simplify code.

llvm-svn: 129666
2011-04-17 20:23:29 +00:00
Chris Lattner eb729d48ff fix an x86 fast isel issue where we'd completely give up on folding an address
when we have a global variable base an an index.  Instead, just give up on
folding the global variable.

Before we'd geenrate:

_test:                                  ## @test
## BB#0:
	movq	_rtx_length@GOTPCREL(%rip), %rax
	leaq	(%rax), %rax
	addq	%rdi, %rax
	movzbl	(%rax), %eax
	ret

now we generate:

_test:                                  ## @test
## BB#0:
	movq	_rtx_length@GOTPCREL(%rip), %rax
	movzbl	(%rax,%rdi), %eax
	ret

The difference is even more significant when there is a scale
involved.

This fixes rdar://9289558 - total fail with addr mode formation at -O0/x86-64

llvm-svn: 129664
2011-04-17 17:47:38 +00:00
Chris Lattner 4832660b4d fix an oversight which caused us to compile the testcase (and other
less trivial things) into a dummy lea.  Before we generated:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	leaq	(%rax), %rax
	ret

now we produce:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	ret

This is part of rdar://9289558

llvm-svn: 129662
2011-04-17 17:12:08 +00:00
Chris Lattner 045c43855c Fix rdar://9289512 - not folding load into compare at -O0
The basic issue here is that bottom-up isel is matching the branch
and compare, and was failing to fold the load into the branch/compare
combo.  Fixing this (by allowing folding into any instruction of a
sequence that is selected) allows us to produce things like:


cmpb    $0, 52(%rax)
je      LBB4_2

instead of:

movb    52(%rax), %cl
cmpb    $0, %cl
je      LBB4_2

This makes the generated -O0 code run a bit faster, but also speeds up
compile time by putting less pressure on the register allocator and 
generating less code.

This was one of the biggest classes of missing load folding.  Implementing
this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm)
line count.

llvm-svn: 129656
2011-04-17 06:35:44 +00:00
Eli Friedman 55f7bf3289 Remove working entry from README.
llvm-svn: 129654
2011-04-17 02:36:27 +00:00
Chris Lattner fba7ca63cc fix rdar://9289583 - fast isel should handle non-canonical commutative binops
allowing us to fold the immediate into the 'and' in this case:

int test1(int i) {
  return 8&i;
}

llvm-svn: 129653
2011-04-17 01:16:47 +00:00
Eli Friedman 55b0acd624 PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext.
Returning a new node makes the code try to replace the old node, which
in the included testcase is killed by CSE.

llvm-svn: 129650
2011-04-16 23:25:34 +00:00
Rafael Espindola 9fef721830 Add this test back for Darwin.
llvm-svn: 129607
2011-04-15 21:06:27 +00:00
Rafael Espindola a01cdb0e37 Add 129518 back with a fix for when we are producing eh just because of debug info.
Change ELF systems to use CFI for producing the EH tables. This reduces the
size of the clang binary in Debug builds from 690MB to 679MB.

llvm-svn: 129571
2011-04-15 15:11:06 +00:00
NAKAMURA Takumi b5e3e9dd27 Revert r129518, "Change ELF systems to use CFI for producing the EH tables. This reduces the"
It broke several builds.

llvm-svn: 129557
2011-04-15 03:35:57 +00:00
Michael J. Spencer 30088ba110 Add 3DNow! intrinsics.
llvm-svn: 129551
2011-04-15 00:32:41 +00:00
Rafael Espindola aa2a7cd828 Change ELF systems to use CFI for producing the EH tables. This reduces the
size of the clang binary in Debug builds from 690MB to 679MB.

llvm-svn: 129518
2011-04-14 15:18:53 +00:00
Andrew Trick bfbd972b1f In the pre-RA scheduler, maintain cmp+br proximity.
This is done by pushing physical register definitions close to their
use, which happens to handle flag definitions if they're not glued to
the branch. This seems to be generally a good thing though, so I
didn't need to add a target hook yet.

The primary motivation is to generate code closer to what people
expect and rule out missed opportunity from enabling macro-op
fusion. As a side benefit, we get several 2-5% gains on x86
benchmarks. There is one regression:
SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is
an independent scheduler bug that will be tracked separately.
See rdar://problem/9283108.

Incidentally, pre-RA scheduling is only half the solution. Fixing the
later passes is tracked by:
<rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump

Fixes:
<rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion

llvm-svn: 129508
2011-04-14 05:15:06 +00:00
Bill Wendling 410ec4aad1 As Dan pointed out, movzbl, movsbl, and friends are nicer than their alias
(movzx/movsx) because they give more information. Revert that part of the patch.

llvm-svn: 129498
2011-04-14 01:46:37 +00:00
Bill Wendling 7e07d6fb69 Have the X86 back-end emit the alias instead of what's being aliased. In most
cases, it's much nicer and more informative reading the alias.

llvm-svn: 129497
2011-04-14 01:11:51 +00:00
Cameron Zwarich 9398197ef1 Fix a regression caused by r102515 where explicit alignment on globals is
ignored. There was a test to catch this, but it was just blindly updated in
a large change. This fixes another part of <rdar://problem/9275290>.

llvm-svn: 129466
2011-04-13 20:36:04 +00:00
Bill Wendling b902f1dd88 Reapply r129401 with patch for clang.
llvm-svn: 129419
2011-04-13 00:36:11 +00:00
Bill Wendling dbfde42468 Revert r129401 for now. Clang is using the old way of doing things.
llvm-svn: 129403
2011-04-12 22:59:27 +00:00
Bill Wendling 47c24875a1 Remove the unaligned load intrinsics in favor of using native unaligned loads.
Now that we have a first-class way to represent unaligned loads, the unaligned
load intrinsics are superfluous.

First part of <rdar://problem/8460511>.

llvm-svn: 129401
2011-04-12 22:46:31 +00:00
Chris Lattner 214f114aa7 look for the verboten argument slot access in any order, thanks to Frits
for pointing this out

llvm-svn: 129217
2011-04-09 17:00:34 +00:00
Chris Lattner 41c80e89f3 have dag combine zap "store undef", which can be formed during call lowering
with undef arguments.

llvm-svn: 129185
2011-04-09 02:32:02 +00:00
Jakob Stoklund Olesen 6aa0fbf4c0 Run LiveDebugVariables in RegAllocBasic and RegAllocGreedy.
llvm-svn: 128935
2011-04-05 21:40:37 +00:00
Jakob Stoklund Olesen e20fec7732 Fix one more batch of X86 tests to be register allocation dependent.
llvm-svn: 128919
2011-04-05 20:20:30 +00:00
Jakob Stoklund Olesen 18fd84c79a When dead code elimination removes all but one use, try to fold the single def into the remaining use.
Rematerialization can leave single-use loads behind that we might as well fold whenever possible.

llvm-svn: 128918
2011-04-05 20:20:26 +00:00
Jakob Stoklund Olesen 76ad3debab Ensure all defs referring to a virtual register are marked dead by addRegisterDead().
There can be multiple defs for a single virtual register when they are defining
sub-registers.

The missing <dead> flag was stopping the inline spiller from eliminating dead
code after rematerialization.

llvm-svn: 128888
2011-04-05 16:53:50 +00:00
Rafael Espindola 7dd4d6e2e8 Print visibility info for external variables.
llvm-svn: 128887
2011-04-05 15:51:32 +00:00
Jakob Stoklund Olesen bd09d45489 Fix register-dependent X86 tests.
llvm-svn: 128867
2011-04-05 00:32:44 +00:00
Jakob Stoklund Olesen 2e85396509 Allow coalescing with reserved physregs in certain cases:
When a virtual register has a single value that is defined as a copy of a
reserved register, permit that copy to be joined. These virtual register are
usually copies of the stack pointer:

  %vreg75<def> = COPY %ESP; GR32:%vreg75
  MOV32mr %vreg75, 1, %noreg, 0, %noreg, %vreg74<kill>
  MOV32mi %vreg75, 1, %noreg, 8, %noreg, 0
  MOV32mi %vreg75<kill>, 1, %noreg, 4, %noreg, 0
  CALLpcrel32 ...

Coalescing these virtual registers early decreases register pressure.
Previously, they were coalesced by RALinScan::attemptTrivialCoalescing after
register allocation was completed.

The lower register pressure causes the mcinst-lowering-cmp0.ll test case to fail
because it depends on linear scan spilling a particular register.

I am deleting 2008-08-05-SpillerBug.ll because it is counting the number of
instructions emitted, and its revision history shows the 'correct' count being
edited many times.

llvm-svn: 128845
2011-04-04 21:00:03 +00:00
Jakob Stoklund Olesen 9a78835414 Mark all uses as <undef> when joining a copy.
This way, shrinkToUses() will ignore the instruction that is about to be
deleted, and we avoid leaving invalid live ranges that SplitKit doesn't like.

Fix a misunderstanding in MachineVerifier about <def,undef> operands. The
<undef> flag is valid on def operands where it has the same meaning as <undef>
on a use operand. It only applies to sub-register defines which also read the
full register.

llvm-svn: 128642
2011-03-31 17:23:25 +00:00
Evan Cheng ee9d45dd55 Don't try to create zero-sized stack objects.
llvm-svn: 128586
2011-03-30 23:44:13 +00:00
Rafael Espindola 6b2fac21ca Reduce test case.
llvm-svn: 128445
2011-03-29 02:18:54 +00:00
Bill Wendling 96f962fdff In some cases, the "fail BB dominator" may be null after the BB was split (and
becomes reachable when before it wasn't). Check to make sure that it's not null
before trying to use it.

llvm-svn: 128434
2011-03-28 23:02:18 +00:00
Jakob Stoklund Olesen 9a624fa993 Collect and coalesce DBG_VALUE instructions before emitting the function.
Correctly terminate the range of register DBG_VALUEs when the register is
clobbered or when the basic block ends.

The code is now ready to deal with variables that are sometimes in a register
and sometimes on the stack. We just need to teach emitDebugLoc to say 'stack
slot'.

llvm-svn: 128327
2011-03-26 02:19:36 +00:00
Jakob Stoklund Olesen 1886a4c823 Emit less labels for debug info and stop emitting .loc directives for DBG_VALUEs.
The .dot directives don't need labels, that is a leftover from when we created
line number info manually.

Instructions following a DBG_VALUE can share its label since the DBG_VALUE
doesn't produce any code.

llvm-svn: 128284
2011-03-25 17:20:59 +00:00