Dan Gohman
56b6885104
When doing the very-late shift-and address-mode optimization,
...
create a new DAG node to represent the new shift to keep the
DAG consistent, even though it'll almost always be folded into
the address.
If a user of the resulting address has multiple uses, the
nodes may get revisited by a later MatchAddress call, in which
case DAG inconsistencies do matter.
This fixes PR2849.
llvm-svn: 57465
2008-10-13 20:52:04 +00:00
Evan Cheng
4c499c4fa6
Also update sub-register intervals after a trivial computation is rematt'ed for a copy instruction. PR2775.
...
llvm-svn: 57458
2008-10-13 18:35:52 +00:00
Evan Cheng
762f0f53ec
Add a test case for _Complex passed as a FCA.
...
llvm-svn: 57456
2008-10-13 18:13:07 +00:00
Chris Lattner
2753955fc0
Change CALLSEQ_BEGIN and CALLSEQ_END to take TargetConstant's as
...
parameters instead of raw Constants. This prevents the constants from
being selected by the isel pass, fixing PR2735.
llvm-svn: 57385
2008-10-11 22:08:30 +00:00
Dan Gohman
60ad173dfe
Remove -disable-fast-isel. Use cl::boolOrDefault with -fast-isel
...
instead.
So now: -fast-isel or -fast-isel=true enable fast-isel, and
-fast-isel=false disables it. Fast-isel is also on by default
with -fast, and off by default otherwise.
llvm-svn: 57270
2008-10-07 23:00:56 +00:00
Dan Gohman
b8118fd432
Add a testcase for i256 add. i256 isn't fully supported in
...
codegen right now, but add and subtract work.
llvm-svn: 57260
2008-10-07 20:39:12 +00:00
Anders Carlsson
1699ad9030
Certain patterns involving the "movss" instruction were marked as requiring SSE2, when in reality movss is an SSE1 instruction.
...
llvm-svn: 57246
2008-10-07 16:14:11 +00:00
Dale Johannesen
6c6729f3a8
Be more precise about which conversions of NaNs
...
are Inexact. (These are not Inexact as defined
by IEEE754, but that seems like a reasonable way
to abstract what happens: information is lost.)
llvm-svn: 57218
2008-10-06 22:59:10 +00:00
Evan Cheng
94d14f2d45
Fix PR2850 and PR2863. Only generate movddup for 128-bit SSE vector shuffles.
...
llvm-svn: 57210
2008-10-06 21:13:08 +00:00
Anton Korobeynikov
b52ef06c8c
Revert r56675 - it breaks unwinding runtime everywhere.
...
llvm-svn: 57048
2008-10-04 11:09:36 +00:00
Dan Gohman
78bb44fcd4
Fix a bug in the local allocator's liveness computation where it
...
was setting kill flags on tied uses in two-address instructions.
The kill flags were causing the allocator to think it could
allocate the use and its tied def in different registers.
llvm-svn: 57039
2008-10-04 00:31:14 +00:00
Dale Johannesen
867d549fce
Handle some 64-bit atomics on x86-32, some of the time.
...
llvm-svn: 56963
2008-10-02 18:53:47 +00:00
Dan Gohman
88536398ff
Fix a think-o in isSafeToMove. This fixes it from thinking that
...
volatile memory references are safe to move.
llvm-svn: 56948
2008-10-02 15:04:30 +00:00
Dan Gohman
dfc507d2b5
Disable fast-isel for this test, as it doesn't emit the same
...
number of instructions.
llvm-svn: 56940
2008-10-01 23:48:35 +00:00
Devang Patel
1b76f2c40b
Remove OptimizeForSize global. Use function attribute optsize.
...
llvm-svn: 56937
2008-10-01 23:18:38 +00:00
Dan Gohman
5c8c00af1f
Split this test and move it into target-specific directories.
...
This fixes failures on configurations that don't have one or the
other targets enabled.
llvm-svn: 56926
2008-10-01 19:46:30 +00:00
Dan Gohman
7354227de0
nounwind-ify this test.
...
llvm-svn: 56918
2008-10-01 15:07:14 +00:00
Bill Wendling
920f6d588e
Moved this option to the front-end.
...
llvm-svn: 56901
2008-10-01 01:02:18 +00:00
Dan Gohman
1df16dff64
Use explicit target-triples to unbreak this test on non-darwin systems.
...
llvm-svn: 56896
2008-10-01 00:25:38 +00:00
Bill Wendling
1782584f56
Just don't transform this memset into "bzero" if no-builtin is specified.
...
llvm-svn: 56888
2008-09-30 22:05:33 +00:00
Bill Wendling
e818bc159f
- Initialize "--no-builtin" to "false".
...
- Testcase for r56885.
llvm-svn: 56886
2008-09-30 21:40:30 +00:00
Evan Cheng
9156bd2f48
Re-apply 56835 along with header file changes.
...
llvm-svn: 56848
2008-09-30 15:44:16 +00:00
Duncan Sands
2b9adce1d0
Revert commit 56835 since it breaks the build.
...
"If a re-materializable instruction has a register
operand, the spiller will change the register operand's
spill weight to HUGE_VAL to avoid it being spilled.
However, if the operand is already in the queue ready
to be spilled, avoid re-materializing it".
llvm-svn: 56837
2008-09-30 10:00:30 +00:00
Evan Cheng
9469049f7d
If a re-materializable instruction has a register operand, the spiller will change the register operand's spill weight to HUGE_VAL to avoid it being spilled. However, if the operand is already in the queue ready to be spilled, avoid re-materializing it.
...
llvm-svn: 56835
2008-09-30 06:36:58 +00:00
Evan Cheng
82237f2f42
Fix PR2835. Do not change the width of a volatile load.
...
llvm-svn: 56792
2008-09-29 17:26:18 +00:00
Evan Cheng
3774b2f292
Re-apply 56683 with fixes.
...
llvm-svn: 56748
2008-09-27 01:56:22 +00:00
Evan Cheng
7d6fa97567
Implement "punpckldq %xmm0, $xmm0" as "pshufd $0x50, %xmm0, %xmm" unless optimizing for code size.
...
llvm-svn: 56711
2008-09-26 23:41:32 +00:00
Bill Wendling
c966a737c5
Temporarily reverting r56683. This is causing a failure during the build of llvm-gcc:
...
/Volumes/Gir/devel/llvm/clean/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Gir/devel/llvm/clean/llvm-gcc.obj/./gcc/ -B/Volumes/Gir/devel/llvm/clean/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Gir/devel/llvm/clean/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Gir/devel/llvm/clean/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Gir/devel/llvm/clean/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -mmacosx-version-min=10.4 -O2 -O2 -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -pipe -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Gir/devel/llvm/clean/llvm.obj/include -I/Volumes/Gir/devel/llvm/clean/llvm.src/include -fexceptions -fvisibility=hidden -DHIDE_EXPORTS -c ../../llvm-gcc.src/gcc/unwind-dw2-fde-darwin.c -o libgcc/./unwind-dw2-fde-darwin.o
Assertion failed: (TargetRegisterInfo::isVirtualRegister(regA) && TargetRegisterInfo::isVirtualRegister(regB) && "cannot update physical register live information"), function runOnMachineFunction, file /Volumes/Gir/devel/llvm/clean/llvm.src/lib/CodeGen/TwoAddressInstructionPass.cpp, line 311.
../../llvm-gcc.src/gcc/unwind-dw2.c:1527: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter > for instructions.
{standard input}:3521:non-relocatable subtraction expression, "_dwarf_reg_size_table" minus "L20$pb"
{standard input}:3521:symbol: "_dwarf_reg_size_table" can't be undefined in a subtraction expression
{standard input}:3520:non-relocatable subtraction expression, "_dwarf_reg_size_table" minus "L20$pb"
...
llvm-svn: 56703
2008-09-26 22:10:44 +00:00
Evan Cheng
d77cbe8947
Fix @llvm.frameaddress codegen. FP elimination optimization should be disabled when frame address is desired. Also add support for depth > 0.
...
llvm-svn: 56683
2008-09-26 19:48:35 +00:00
Evan Cheng
994dd0bbec
Avoid spilling EBP / RBP twice in the prologue.
...
llvm-svn: 56675
2008-09-26 19:14:21 +00:00
Evan Cheng
9dbe45c000
Prefer movlhps over punpcklqdq, etc. in more cases.
...
llvm-svn: 56627
2008-09-25 23:35:16 +00:00
Evan Cheng
74c9ed91b0
With sse3 and when the source is a load or has multiple uses, favors movddup over shuffp*, pshufd, etc. Without sse3 or when the source is from a register, make use of movlhps
...
llvm-svn: 56620
2008-09-25 20:50:48 +00:00
Dale Johannesen
c50ada2f56
Accept 'inreg' attribute on x86 functions as
...
meaning sse_regparm (i.e. float/double values go
in XMM0 instead of ST0). Update documentation
to reflect reality.
llvm-svn: 56619
2008-09-25 20:47:45 +00:00
Evan Cheng
f8ead16b50
Fix patterns for SSE4.1 move and sign extend instructions. Also add instructions which fold VZEXT_MOVL and VZEXT_LOAD.
...
llvm-svn: 56594
2008-09-24 23:27:55 +00:00
Dale Johannesen
86d421df23
Remove SelectionDag early allocation of registers
...
for earlyclobbers. Teach Local RA about earlyclobber,
and add some tests for it.
llvm-svn: 56592
2008-09-24 23:13:09 +00:00
Evan Cheng
e0add20c1b
Properly handle 'm' inline asm constraints. If a GV is being selected for the addressing mode, it requires the same logic for PIC relative addressing, etc.
...
llvm-svn: 56526
2008-09-24 00:05:32 +00:00
Evan Cheng
9e9426cb82
Support x86 specific inline asm modifier 'J'.
...
llvm-svn: 56483
2008-09-22 23:57:37 +00:00
Arnold Schwaighofer
796a271c5f
Change the calling convention used when tail call optimization is enabled from CC_X86_32_TailCall to CC_X86_32_FastCC.
...
llvm-svn: 56436
2008-09-22 14:50:07 +00:00
Evan Cheng
c042000649
Fix PR2808. When regalloc runs out of register, it spill a physical register around the live interval being allocated. Do not continue to try to spill another register, just grab the physical register and move on.
...
llvm-svn: 56381
2008-09-20 01:28:05 +00:00
Evan Cheng
65502487b7
Clean up the test.
...
llvm-svn: 56380
2008-09-20 01:26:27 +00:00
Evan Cheng
4730522235
No need to print function stubs for Mac OS X 10.5 and up. Linker will handle it.
...
llvm-svn: 56378
2008-09-20 00:13:45 +00:00
Dan Gohman
9801ba451a
Refactor X86SelectConstAddr, folding it into X86SelectAddress. This
...
results in better code for globals. Also, unbreak the local CSE for
GlobalValue stub loads.
llvm-svn: 56371
2008-09-19 22:16:54 +00:00
Evan Cheng
4c0197043c
Re-materalized definition instructions may be dead. Whack them.
...
llvm-svn: 56352
2008-09-19 17:38:47 +00:00
Dale Johannesen
f8610ebebc
Add a bit to mark operands of asm's that conflict
...
with an earlyclobber operand elsewhere. Propagate
this bit and the earlyclobber bit through SDISel.
Change linear-scan RA not to allocate regs in a way
that conflicts with an earlyclobber. See also comments.
llvm-svn: 56290
2008-09-17 21:13:11 +00:00
Dan Gohman
68e7735a38
Teach LSR to optimize away SMAX operations for tripcounts in common
...
cases. See the comment above OptimizeSMax for the full story, and
the testcase for an example. This cancels out a pessimization
commonly attributed to indvars, and will allow us to lift some of
the artificial throttles in indvars, rather than add new ones.
llvm-svn: 56230
2008-09-15 21:22:06 +00:00
Arnold Schwaighofer
33ad850d93
Add indirect tail call (function pointer) examples.
...
llvm-svn: 56127
2008-09-11 22:24:28 +00:00
Arnold Schwaighofer
dd45bc25ac
When tailcallopt is enabled all fastcc calls must have an aligned argument stack size. Add a test case.
...
llvm-svn: 56119
2008-09-11 20:28:43 +00:00
Evan Cheng
5456a37280
Fix PR2748. Avoid coalescing physical register with virtual register which would create illegal extract_subreg. e.g.
...
vr1024 = extract_subreg vr1025, 1
...
vr1024 = mov8rr AH
If vr1024 is coalesced with AH, the extract_subreg is now illegal since AH does not have a super-reg whose sub-register 1 is AH.
llvm-svn: 56118
2008-09-11 20:07:10 +00:00
Evan Cheng
4c9fbbb511
Fix PR2783 - coalescer bug. Missing a TargetRegisterInfo::isVirtualRegister check.
...
llvm-svn: 56112
2008-09-11 18:40:32 +00:00
Evan Cheng
b401449ceb
Propagate subreg index when promoting a load to a copy.
...
llvm-svn: 56085
2008-09-11 01:02:12 +00:00
Evan Cheng
710c3cf36a
Fix a fastcc + sret bug. If fastcc and sret, callee doesn't need to pop the hidden struct ptr; Re-enable fastcc.
...
llvm-svn: 56061
2008-09-10 18:25:29 +00:00
Evan Cheng
53b728c27c
Fix PR2757. Ignore liveinterval register allocation preference if the preference register is not in the right register class. This can happen due to sub-register coalescing.
...
llvm-svn: 56006
2008-09-09 20:22:01 +00:00
Evan Cheng
1e97901388
Fix a constant lowering bug. Now we can do load and store instructions with funky getelementptr embedded in the address operand.
...
llvm-svn: 55975
2008-09-09 01:26:59 +00:00
Anton Korobeynikov
ab4928a6e5
Reapply 55902: Add test for checking proper lowering of eh_return & unwind init intrinsics on 32bit x86 targets
...
llvm-svn: 55960
2008-09-08 21:14:36 +00:00
Anton Korobeynikov
35883b7eec
Reapply 55903: Testcase for 64-bit lowering of eh_return & unwind_init
...
llvm-svn: 55959
2008-09-08 21:14:19 +00:00
Bill Wendling
6da384d980
Remove these testcases associated with changes between r 55898 and r 55909.
...
llvm-svn: 55931
2008-09-08 18:00:39 +00:00
Bill Wendling
99b83712f3
Reverting r55898 to r55909. One of these patches was causing an ICE during the full bootstrap on Darwin:
...
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/bin/
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/lib/
-isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/include
-isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/sys-include
-O2 -O2 -g -O2 -DIN_GCC -W -Wall -Wwrite-strings
-Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition
-isystem ./include -fPIC -pipe -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2
-D__GCC_FLOAT_NOT_NEEDED -I. -I. -I../../llvm-gcc.src/gcc
-I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include
-I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include
-I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber
-I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include
-I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include
-DSHARED -m64 -DL_negdi2 -c ../../llvm-gcc.src/gcc/libgcc2.c -o
libgcc/x86_64/_negdi2_s.o
Assertion failed: (TargetRegisterInfo::isVirtualRegister(regA) &&
TargetRegisterInfo::isVirtualRegister(regB) && "cannot update physical
register live information"), function runOnMachineFunction, file
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/lib/CodeGen/TwoAddressInstructionPass.cpp,
line 311.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/bin/
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/lib/
-isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/include
-isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/sys-include
-O2 -O2 -g -O2 -DIN_GCC -W -Wall -Wwrite-strings
-Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition
-isystem ./include -fPIC -pipe -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2
-D__GCC_FLOAT_NOT_NEEDED -I. -I. -I../../llvm-gcc.src/gcc
-I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include
-I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include
-I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber
-I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include
-I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include
-DSHARED -m64 -DL_lshrdi3 -c ../../llvm-gcc.src/gcc/libgcc2.c -o
libgcc/x86_64/_lshrdi3_s.o
../../llvm-gcc.src/gcc/unwind-dw2.c:1527: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter > for instructions.
{standard input}:unknown:Undefined local symbol LBB21_11
{standard input}:unknown:Undefined local symbol LBB21_12
{standard input}:unknown:Undefined local symbol LBB21_13
{standard input}:unknown:Undefined local symbol LBB21_8
llvm-svn: 55928
2008-09-08 17:59:12 +00:00
Evan Cheng
d172048b59
Handle calls which produce i1 results: promote to i8 but and it with 1 to get the low bit.
...
llvm-svn: 55925
2008-09-08 17:15:42 +00:00
Anton Korobeynikov
f5e48008a4
Testcase for 64-bit lowering of eh_return & unwind_init
...
llvm-svn: 55903
2008-09-08 14:23:16 +00:00
Anton Korobeynikov
f0d1a0ccf0
Add test for checking proper lowering of eh_return & unwind init intrinsics on 32bit x86 targets
...
llvm-svn: 55902
2008-09-08 14:22:57 +00:00
Evan Cheng
ca14c07175
Correctly handle physical register inputs. They are not explicit input operands in the resulting machine instrs.
...
llvm-svn: 55893
2008-09-08 08:39:33 +00:00
Evan Cheng
b928669409
Handle x86 truncate to i8 with target hook for now.
...
llvm-svn: 55877
2008-09-07 08:47:42 +00:00
Eli Friedman
a9c52c8219
Fix for PR2687: Add patterns to match sint_to_fp and fp_to_sint for <2 x
...
i32>. This is a little messy, but it works.
We should really get rid of the intrinsics, though, since they map
perfectly well to standard LLVM instructions.
llvm-svn: 55864
2008-09-05 23:07:03 +00:00
Evan Cheng
d4e01dce74
Fix test.
...
llvm-svn: 55849
2008-09-05 20:04:37 +00:00
Evan Cheng
4f0d21592a
If SSE2 is available, x86 should pass first 3 f32/f64 arguments in XMM registers for fastcc calls.
...
llvm-svn: 55840
2008-09-05 17:24:07 +00:00
Evan Cheng
6c94b99c62
For whatever the reason, x86 CallingConv::Fast (i.e. fastcc) was not passing scalar arguments in registers. This patch defines a new fastcc CC which is slightly different from the FastCall CC. In addition to passing integer arguments in ECX and EDX, it also specify doubles are passed in 8-byte slots which are 8-byte aligned (instead of 4-byte aligned). This avoids a potential performance hazard where doubles span cacheline boundaries.
...
llvm-svn: 55807
2008-09-04 22:59:58 +00:00
Owen Anderson
b8c7ba228f
Fix the ordering of operands to the store (inverted relative to LLVM IR), and fix the testcase.
...
llvm-svn: 55777
2008-09-04 16:48:33 +00:00
Owen Anderson
4f948bd87a
Add a first attempt at implementing stores for X86 fast isel using target hooks.
...
Dan or Evan, please review.
llvm-svn: 55764
2008-09-04 07:08:58 +00:00
Evan Cheng
8d8f47d50b
Load from GV stub should be locally CSE'd.
...
llvm-svn: 55763
2008-09-04 06:18:33 +00:00
Evan Cheng
3152edf474
Remove code that pad number of bytes to pop for X86_FastCall CC. The code doesn't do the "aligning" for Cygwin, Mingw, and Windows. But aligning it on Darwin and Linux breaks gcc compatibility. That ruled out all the platforms we support!
...
llvm-svn: 55756
2008-09-04 01:04:15 +00:00
Evan Cheng
a41ee2974b
Add X86 target hook to implement load (even from GlobalAddress).
...
llvm-svn: 55693
2008-09-03 06:44:39 +00:00
Evan Cheng
a3771d5bd9
Re-apply 55467 with fix. If copy is being replaced by remat'ed def, transfer the implicit defs onto the remat'ed instruction.
...
llvm-svn: 55564
2008-08-30 09:09:33 +00:00
Evan Cheng
cfb7f3abdf
Transform (x << (y&31)) -> (x << y). This takes advantage of the fact x86 shift instructions 2nd operand (shift count) is limited to 0 to 31 (or 63 in the x86-64 case).
...
llvm-svn: 55558
2008-08-30 02:03:58 +00:00
Evan Cheng
3fddc7e906
Swap fp comparison operands and change predicate to allow load folding (safely this time).
...
llvm-svn: 55553
2008-08-29 23:22:12 +00:00
Evan Cheng
0b82607fa1
xfail this.
...
llvm-svn: 55550
2008-08-29 22:59:13 +00:00
Evan Cheng
960b17a3c2
Swap fp comparison operands and change predicate to allow load folding.
...
llvm-svn: 55521
2008-08-28 23:48:31 +00:00
Dan Gohman
f27e33baa7
Optimize DAGCombiner's worklist processing. Previously it started
...
its work by putting all nodes in the worklist, requiring a big
dynamic allocation. Now, DAGCombiner just iterates over the AllNodes
list and maintains a worklist for nodes that are newly created or
need to be revisited. This allows the worklist to stay small in most
cases, so it can be a SmallVector.
This has the side effect of making DAGCombine not miss a folding
opportunity in alloca-align-rounding.ll.
llvm-svn: 55498
2008-08-28 21:01:56 +00:00
Dan Gohman
04cf2e4540
Revert r55467; it causes regressions in UnitTests/Vector/divides,
...
Benchmarks/sim/sim, and others on x86-64.
llvm-svn: 55475
2008-08-28 17:22:54 +00:00
Evan Cheng
6975602024
If a copy isn't coalesced, but its src is defined by trivial computation. Re-materialize the src to replace the copy.
...
llvm-svn: 55467
2008-08-28 07:53:51 +00:00
Dale Johannesen
897b2380d8
This test crashes on non-x86 host; make SSE explicit.
...
Feel free to fix a better way!
llvm-svn: 55456
2008-08-28 01:51:09 +00:00
Dan Gohman
5ca269e684
Basic FastISel support for floating-point constants.
...
llvm-svn: 55401
2008-08-27 01:09:54 +00:00
Chris Lattner
09f8cef571
If an xmm register is referenced explicitly in an inline asm, make sure to
...
assign it to a version of the xmm register with the regclass that matches its
type. This fixes PR2715, a bug handling some crazy xpcom case in mozilla.
llvm-svn: 55358
2008-08-26 06:19:02 +00:00
Evan Cheng
f00f1e50b5
Try approach to moving call address load inside of callseq_start. Now it's done during the preprocess of x86 isel. callseq_start's chain is changed to load's chain node; while load's chain is the last of callseq_start or the loads or copytoreg nodes inserted to move arguments to the right spot.
...
llvm-svn: 55338
2008-08-25 21:27:18 +00:00
Owen Anderson
32635dbfb2
Add support for fast isel of (integer) immediate materialization pattens, and use them to support
...
bitcast of constants in fast isel.
llvm-svn: 55325
2008-08-25 20:20:32 +00:00
Evan Cheng
e414681352
Fix asm printing of MOVSDto64mr and MOV64toSDrm.
...
llvm-svn: 55300
2008-08-25 04:11:42 +00:00
Bill Wendling
934b374bc8
Fix this test. Don't null out the file, just XFAIL it until patch can be fixed.
...
llvm-svn: 55296
2008-08-24 21:48:46 +00:00
Bill Wendling
5b836c5f77
Temporarily reverting r55292. It's causing a bootstraping failure:
...
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc ... src/libiberty/make-temp-file.c -o make-temp-file.o
Assertion failed: (Node2Index[SU->NodeNum] > Node2Index[I->Dep->NodeNum] && "Wrong topological sorting"), function InitDAGTopologicalSorting, file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp, line 508.
../../../../llvm-gcc.src/libiberty/hashtab.c:955: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter > for instructions.
make[4]: *** [hashtab.o] Error 1
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [multi-do] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all-target-libiberty] Error 2
make: *** [all] Error 2
llvm-svn: 55295
2008-08-24 21:45:30 +00:00
Evan Cheng
8fa17424f7
Move callseq_start above the call address load to allow load to be folded into the call node.
...
llvm-svn: 55292
2008-08-24 19:19:55 +00:00
Anton Korobeynikov
55e9d7c178
Testcase for 64bit maskmovq
...
llvm-svn: 55239
2008-08-23 15:53:47 +00:00
Dale Johannesen
bb170bd08c
Test all currently supported atomic builtins on x86-{32,64}.
...
These just test that they go through the BE.
llvm-svn: 55208
2008-08-22 22:39:21 +00:00
Dan Gohman
49e19e906f
Factor out the predicate check code from DAGISelEmitter.cpp
...
and use it in FastISelEmitter.cpp, and make FastISel
subtarget aware. Among other things, this lets it work
properly on x86 targets that don't have SSE, where it
successfully selects x87 instructions.
llvm-svn: 55156
2008-08-22 00:20:26 +00:00
Bill Wendling
1a6e930ea4
Testcase for PR2585.
...
llvm-svn: 55151
2008-08-21 23:04:49 +00:00
Dan Gohman
46989c637d
Add -mattr=sse2 so this test doesn't fail on non-x86 hosts.
...
llvm-svn: 55145
2008-08-21 22:34:25 +00:00
Dale Johannesen
c360f3ba50
Make x86 and sse2 explicit for non-x86 hosts.
...
llvm-svn: 55141
2008-08-21 21:26:06 +00:00
Evan Cheng
9534ea03e8
Fix a number of byval / memcpy / memset related codegen issues.
...
1. x86-64 byval alignment should be max of 8 and alignment of type. Previously the code was not doing what the commit message was saying.
2. Do not use byte repeat move and store operations. These are slow.
llvm-svn: 55139
2008-08-21 21:00:15 +00:00
Dan Gohman
cdf1a276e3
getelementptr doesn't work on x86-64 yet, because it
...
has MOV64ri32 and no plain MOV64ri.
llvm-svn: 55126
2008-08-21 17:28:42 +00:00
Dan Gohman
efb7d2d03d
MVT::getMVT uses iPTR for pointer types, while we need the actual
...
intptr_t type in this case. FastISel can now select simple
getelementptr instructions.
llvm-svn: 55125
2008-08-21 17:25:26 +00:00
Dan Gohman
fe9056584b
Basic fast-isel support for instructions with constant int operands.
...
llvm-svn: 55099
2008-08-21 01:41:07 +00:00
Dan Gohman
eaef5f612a
Add a -march line for this test, and run it on x86-64 too for fun.
...
llvm-svn: 55030
2008-08-20 00:56:07 +00:00
Dan Gohman
b16a7783c5
Add FastISel support for floating-point operations.
...
llvm-svn: 55021
2008-08-20 00:23:20 +00:00
Dan Gohman
a3e4d5a5e1
Add FastISel support for several more binary operators.
...
llvm-svn: 55020
2008-08-20 00:11:48 +00:00
Bill Wendling
e79740851f
Add support for the __sync_sub_and_fetch atomics and friends for X86. The code
...
was already present, but not hooked up to anything.
llvm-svn: 55018
2008-08-19 23:09:18 +00:00
Dan Gohman
065e24709e
Fast-isel is now *minimally* functional. Add a testcase to
...
demonstrate the extent of its capabilities. Note that it
only attempts to operate on one of the blocks in this
testcase.
llvm-svn: 55016
2008-08-19 22:37:59 +00:00
Dale Johannesen
5afbf510aa
Add support for 8 and 16 bit forms of __sync
...
builtins on X86.
Change "lock" instructions to be on a separate line.
This is needed to work around a bug in the Darwin
assembler.
llvm-svn: 54999
2008-08-19 18:47:28 +00:00
Evan Cheng
ab35bfdf18
Fix a (u)comiss intrinsic lowering bug. It was using anyext which can return junk in higher bits. Patch by Nate Begeman.
...
llvm-svn: 54903
2008-08-17 19:22:34 +00:00
Dan Gohman
7e3c392248
Allow SelectionDAG to create EXTRACT_VECTOR_ELT nodes with
...
non-constant indices. Only a few of the peephole checks require
a constant index.
llvm-svn: 54764
2008-08-13 21:51:37 +00:00
Dan Gohman
c82ad79c64
Improve the grep commands for this test to be tolerant of ABI
...
differences, and to be more specific.
llvm-svn: 54648
2008-08-11 20:10:41 +00:00
Dan Gohman
127bb03b8c
Take the FrameOffset into account when computing the alignment
...
of stack objects. This fixes PR2656.
llvm-svn: 54646
2008-08-11 18:27:03 +00:00
Dan Gohman
4e2f3ace2c
Add an EXTRACTPSmr pattern to match the pattern that
...
X86ISelLowering creates.
llvm-svn: 54544
2008-08-08 18:30:21 +00:00
Dan Gohman
527ca7e253
Re-enable elimination of unnecessary SUBREG_TO_REG instructions in
...
LowerSubregs, and fix an x86-64 isel bug that this exposed.
SUBREG_TO_REG for x86-64 implicit zero extension is only safe for
isel to generate when the source is known to always have zeros in
the high 32 bits. The EXTRACT_SUBREG instruction does not clear
the high 32 bits.
llvm-svn: 54444
2008-08-07 02:54:50 +00:00
Dan Gohman
a8dbaeb1df
Add an extra example that shouldn't get an and instruction.
...
llvm-svn: 54443
2008-08-07 02:23:06 +00:00
Dan Gohman
91c2c432c0
Re-introduce the 8-bit subreg zext-inreg patterns for x86-32,
...
this time using MOV32to32_ and MOV16to16_. Thanks to Evan for
suggesting this.
llvm-svn: 54418
2008-08-06 18:27:21 +00:00
Evan Cheng
7823a411d5
Fix PR2620: Fix X86cmppd selection code so it expects operands to be v2f64.
...
llvm-svn: 54376
2008-08-05 22:19:15 +00:00
Evan Cheng
aa33b932bd
Fix PR2596: out of bound reference.
...
llvm-svn: 54375
2008-08-05 21:51:46 +00:00
Owen Anderson
b358c8f482
Update the remaining tests not to use -disable-correct-folding, and remove two
...
that couldn't be updated.
llvm-svn: 54359
2008-08-05 18:19:14 +00:00
Owen Anderson
8ccb8d35c4
One more -disable-correct-folding case removed.
...
llvm-svn: 54358
2008-08-05 18:08:56 +00:00
Owen Anderson
ae0af127a6
Remove another -disable-correct-folding use.
...
llvm-svn: 54357
2008-08-05 18:05:58 +00:00
Owen Anderson
31a0dea5c0
Eliminate another use of -disable-correct-folding.
...
llvm-svn: 54356
2008-08-05 18:03:01 +00:00
Evan Cheng
0ca10c9572
Fix PR2568: Fix bug that cause redudant kill marker after its live interval has been extended due to coalescing.
...
llvm-svn: 54346
2008-08-05 07:10:38 +00:00
Owen Anderson
5018bd032a
Update these tests to work by disabling the new correct CFG generation. This flag should ONLY be used to for tests like these.
...
llvm-svn: 54334
2008-08-04 23:55:29 +00:00
Dan Gohman
90c724cadc
Fix SDISel lowering of PHI nodes to use ComputeValueVTs.
...
This allows it to work correctly on aggregate values.
This fixes PR2623.
llvm-svn: 54331
2008-08-04 23:42:46 +00:00
Dale Johannesen
962178e77e
Make sse2 explicit, for non-x86 hosts.
...
llvm-svn: 54251
2008-07-31 20:16:33 +00:00
Dan Gohman
345d63ccf2
Improve dagcombining for sext-loads and sext-in-reg nodes.
...
llvm-svn: 54239
2008-07-31 00:50:31 +00:00
Dan Gohman
f8cc0aa87d
I missed this file in r54223. movzbl is now used instead
...
of movzbw here.
llvm-svn: 54224
2008-07-30 18:23:34 +00:00
Dan Gohman
86b06335aa
Reapply r54147 with a constraint to only use the 8-bit
...
subreg form on x86-64, to avoid the problem with x86-32
having GPRs that don't have 8-bit subregs.
Also, change several 16-bit instructions to use
equivalent 32-bit instructions. These have a smaller
encoding and avoid partial-register updates.
llvm-svn: 54223
2008-07-30 18:09:17 +00:00
Mon P Wang
2c839d4b1e
Added support for overloading intrinsics (atomics) based on pointers
...
to different address spaces. This alters the naming scheme for those
intrinsics, e.g., atomic.load.add.i32 => atomic.load.add.i32.p0i32
llvm-svn: 54195
2008-07-30 04:36:53 +00:00
Dan Gohman
43105328d3
Revert 54147.
...
llvm-svn: 54148
2008-07-29 01:02:18 +00:00
Dan Gohman
26ec56c75c
Add x86 isel patterns to match what would be a ZERO_EXTEND_INREG operation,
...
which is represented in codegen as an 'and' operation. This matches them
with movz instructions, instead of leaving them to be matched by and
instructions with an immediate field.
llvm-svn: 54147
2008-07-28 22:18:25 +00:00
Dan Gohman
108c58aef4
Fix embedded CRLF characters.
...
llvm-svn: 54125
2008-07-27 18:37:58 +00:00
Nate Begeman
59063cb315
Fix test RUN line
...
llvm-svn: 54040
2008-07-25 19:08:59 +00:00
Nate Begeman
283b2da27a
Disable mov{L, LP, HP, HLP, *DUP} shuffles for mmx
...
mmx needs its own fancy shuffle logic based on unpack; for now we get correct but awful code.
Also commit Mon Ping's VSETCC patch
llvm-svn: 54039
2008-07-25 19:05:58 +00:00
Dan Gohman
4e6a25ac45
This test needs -aggressive-remat enabled.
...
llvm-svn: 54015
2008-07-25 15:25:32 +00:00
Dan Gohman
09b0448dbc
Enable rematerialization of constants using AliasAnalysis::pointsToConstantMemory,
...
and knowledge of PseudoSourceValues. This unfortunately isn't sufficient to allow
constants to be rematerialized in PIC mode -- the extra indirection is a
complication.
llvm-svn: 54000
2008-07-25 00:02:30 +00:00
Dan Gohman
d9f9e2e260
Add target triples so these tests behave as expected on non-darwin hosts.
...
llvm-svn: 53991
2008-07-24 18:08:01 +00:00
Evan Cheng
c353a5f3b3
New test case.
...
llvm-svn: 53971
2008-07-24 00:22:05 +00:00
Evan Cheng
a2b4b4ad99
Fix PR2485: do all 4-element SSE shuffles in max. of 2 shuffle instructions.
...
Based on patch by Nicolas Capens.
llvm-svn: 53939
2008-07-23 00:22:17 +00:00
Duncan Sands
775e509525
LegalizeTypes support for VSETCC. Fixes PR2575.
...
llvm-svn: 53938
2008-07-22 23:54:03 +00:00
Evan Cheng
b8ff223f26
Fix pr2566: incorrect assumption about bit_convert. It doesn't not have to output a vector value. Patch by Nicolas Capens!
...
llvm-svn: 53932
2008-07-22 20:42:56 +00:00
Evan Cheng
0384670141
Fix PR2574: implement v2f32 scalar_to_vector.
...
llvm-svn: 53927
2008-07-22 18:39:19 +00:00
Bill Wendling
75840e6435
Fix for first part of PR2562. Generate the "pinsrw" instruction for inserts
...
into v4i16 vectors.
llvm-svn: 53807
2008-07-20 02:32:23 +00:00
Anton Korobeynikov
74d3fa165c
Testcase for PR2549
...
llvm-svn: 53785
2008-07-19 06:31:12 +00:00
Evan Cheng
cefd6e62fa
Subreg live interval valno may not have a corresponding def machineinstr since it's less precise.
...
llvm-svn: 53734
2008-07-17 19:48:53 +00:00
Evan Cheng
c5add64f0a
Add nounwind.
...
llvm-svn: 53733
2008-07-17 19:48:04 +00:00
Evan Cheng
e0a352e8e7
Fix PR2536: a nasty spiller bug. If a two-address instruction uses a register but the use portion of its live range is not part of its liveinterval, it must be defined by an implicit_def. In that case, do not spill the use. e.g.
...
8 %reg1024<def> = IMPLICIT_DEF
12 %reg1024<def> = INSERT_SUBREG %reg1024<kill>, %reg1025, 2
The live range [12, 14) are not part of the r1024 live interval since it's defined by an implicit def. It will not conflicts with live interval of r1025. Now suppose both registers are spilled, you can easily see a situation where both registers are reloaded before the INSERT_SUBREG and both target registers that would overlap.
llvm-svn: 53503
2008-07-12 01:56:02 +00:00
Duncan Sands
d9948110a6
Port a shift-by-1 optimization from LegalizeDAG: it
...
was presumably added after the rest of the code was
copied to LegalizeTypes.
llvm-svn: 53459
2008-07-11 16:54:57 +00:00
Bill Wendling
5774466a33
The frame address on an x86-64 box needs to be offset by -8, not -4.
...
llvm-svn: 53450
2008-07-11 07:18:52 +00:00
Evan Cheng
71b7398463
Fix for PR2472. Use movss to set lower 32-bits of a zero XMM vector.
...
llvm-svn: 53386
2008-07-10 01:08:23 +00:00
Anton Korobeynikov
dad7a35acf
Testcase for PR2024
...
llvm-svn: 53327
2008-07-09 14:09:41 +00:00
Evan Cheng
03001cb820
Fix two serious LSR bugs.
...
1. LSR runOnLoop is always returning false regardless if any transformation is made.
2. AddUsersIfInteresting can create new instructions that are added to DeadInsts. But there is a later early exit which prevents them from being freed.
llvm-svn: 53193
2008-07-07 19:51:32 +00:00
Dale Johannesen
3bfe3b0e76
Considering predecessors of exit blocks gets
...
us a little more tail merging.
llvm-svn: 52986
2008-07-01 21:50:49 +00:00
Chris Lattner
501c0b82d6
test doesn't need eh info
...
llvm-svn: 52811
2008-06-27 03:14:20 +00:00
Dale Johannesen
0e28b3b9ee
Allow for rounding up of stack frame.
...
llvm-svn: 52751
2008-06-26 01:55:32 +00:00
Chris Lattner
b1e66ce3bb
when we know the signbit of an input to uint_to_fp is zero,
...
change it to sint_to_fp on targets where that is cheaper (and
visaversa of course). This allows us to compile uint_to_fp to:
_test:
movl 4(%esp), %eax
shrl $23, %eax
cvtsi2ss %eax, %xmm0
movl 8(%esp), %eax
movss %xmm0, (%eax)
ret
instead of:
.align 3
LCPI1_0: ## double
.long 0 ## double least significant word 4.5036e+15
.long 1127219200 ## double most significant word 4.5036e+15
.text
.align 4,0x90
.globl _test
_test:
subl $12, %esp
movl 16(%esp), %eax
shrl $23, %eax
movl %eax, (%esp)
movl $1127219200, 4(%esp)
movsd (%esp), %xmm0
subsd LCPI1_0, %xmm0
cvtsd2ss %xmm0, %xmm0
movl 20(%esp), %eax
movss %xmm0, (%eax)
addl $12, %esp
ret
llvm-svn: 52747
2008-06-26 00:16:49 +00:00
Evan Cheng
3fc2372d3a
- Fix a x86 vector isel bug: illegal transformation of a vector_shuffle into a
...
shift.
- Add a readme entry for a missing vector_shuffle optimization that results in
awful codegen.
llvm-svn: 52740
2008-06-25 20:52:59 +00:00
Mon P Wang
6a490371c9
Added MemOperands to Atomic operations since Atomics touches memory.
...
Added abstract class MemSDNode for any Node that have an associated MemOperand
Changed atomic.lcs => atomic.cmp.swap, atomic.las => atomic.load.add, and
atomic.lss => atomic.load.sub
llvm-svn: 52706
2008-06-25 08:15:39 +00:00
Evan Cheng
73db52ebf8
Enable two-address remat by default.
...
llvm-svn: 52701
2008-06-25 01:16:38 +00:00
Dale Johannesen
6316bc705e
v2f32 is now a valid (MMX) type which breaks this
...
test (doesn't work for any MMX vector types, it's
not me). Rewritten to use v2i16 which is generic
and going to stay that way; I think that preserves
the point of the test.
llvm-svn: 52692
2008-06-24 22:03:36 +00:00
Evan Cheng
3f2ceac565
If it's determined safe, remat MOV32r0 (i.e. xor r, r) and others as it is instead of using the longer MOV32ri instruction.
...
llvm-svn: 52670
2008-06-24 07:10:51 +00:00
Bill Wendling
559067c55c
Make test work on non-x86 machines (like my G4 PPC).
...
llvm-svn: 52619
2008-06-23 06:16:31 +00:00
Evan Cheng
f593a65497
Undo spill weight tweak. Need to investigate the performance regressions.
...
llvm-svn: 52572
2008-06-21 06:45:54 +00:00
Eli Friedman
8d66e98c92
Fix a bug with <8 x i16> shuffle lowering on X86 where parts of the
...
shuffle could be skipped. The check is invalid because the loop index i
doesn't correspond to the element actually inserted. The correct check is
already done a few lines earlier, for whether the element is already in
the right spot, so this shouldn't have any effect on the codegen for
code that was already correct.
llvm-svn: 52486
2008-06-19 06:09:51 +00:00
Evan Cheng
b139ee95b4
New test case.
...
llvm-svn: 52483
2008-06-19 01:50:24 +00:00
Evan Cheng
d6d9cfa72a
This also got better (55 - 51 instructions). But doing one more re-materialization.
...
llvm-svn: 52482
2008-06-19 01:50:13 +00:00
Evan Cheng
1d1907d778
This got better.
...
llvm-svn: 52481
2008-06-19 01:46:43 +00:00
Evan Cheng
1cde1f8d5e
Do not issue identity copies.
...
llvm-svn: 52373
2008-06-16 22:52:53 +00:00
Evan Cheng
49bad4c9e1
- Add "Commutative" property to intrinsics. This allows tblgen to generate the commuted variants for dagisel matching code.
...
- Mark lots of X86 intrinsics as "Commutative" to allow load folding.
llvm-svn: 52353
2008-06-16 20:29:38 +00:00
Evan Cheng
fb79059b85
Teach the spiller to commute instructions in order to fold a reload. This hits 410 times on 444.namd and 122 times on 252.eon.
...
llvm-svn: 52266
2008-06-13 23:58:02 +00:00
Duncan Sands
8651e9c584
Disable some DAG combiner optimizations that may be
...
wrong for volatile loads and stores. In fact this
is almost all of them! There are three types of
problems: (1) it is wrong to change the width of
a volatile memory access. These may be used to
do memory mapped i/o, in which case a load can have
an effect even if the result is not used. Consider
loading an i32 but only using the lower 8 bits. It
is wrong to change this into a load of an i8, because
you are no longer tickling the other three bytes. It
is also unwise to make a load/store wider. For
example, changing an i16 load into an i32 load is
wrong no matter how aligned things are, since the
fact of loading an additional 2 bytes can have
i/o side-effects. (2) it is wrong to change the
number of volatile load/stores: they may be counted
by the hardware. (3) it is wrong to change a volatile
load/store that requires one memory access into one
that requires several. For example on x86-32, you
can store a double in one processor operation, but to
store an i64 requires two (two i32 stores). In a
multi-threaded program you may want to bitcast an i64
to a double and store as a double because that will
occur atomically, and be indivisible to other threads.
So it would be wrong to convert the store-of-double
into a store of an i64, because this will become two
i32 stores - no longer atomic. My policy here is
to say that the number of processor operations for
an illegal operation is undefined. So it is alright
to change a store of an i64 (requires at least two
stores; but could be validly lowered to memcpy for
example) into a store of double (one processor op).
In short, if the new store is legal and has the same
size then I say that the transform is ok. It would
also be possible to say that transforms are always
ok if before they were illegal, whether after they
are illegal or not, but that's more awkward to do
and I doubt it buys us anything much.
However this exposed an interesting thing - on x86-32
a store of i64 is considered legal! That is because
operations are marked legal by default, regardless of
whether the type is legal or not. In some ways this
is clever: before type legalization this means that
operations on illegal types are considered legal;
after type legalization there are no illegal types
so now operations are only legal if they really are.
But I consider this to be too cunning for mere mortals.
Better to do things explicitly by testing AfterLegalize.
So I have changed things so that operations with illegal
types are considered illegal - indeed they can never
map to a machine operation. However this means that
the DAG combiner is more conservative because before
it was "accidentally" performing transforms where the
type was illegal because the operation was nonetheless
marked legal. So in a few such places I added a check
on AfterLegalize, which I suppose was actually just
forgotten before. This causes the DAG combiner to do
slightly more than it used to, which resulted in the X86
backend blowing up because it got a slightly surprising
node it wasn't expecting, so I tweaked it.
llvm-svn: 52254
2008-06-13 19:07:40 +00:00
Evan Cheng
2d788ce3fb
Fix some tests.
...
llvm-svn: 52245
2008-06-12 21:23:38 +00:00
Dale Johannesen
01b7cae58d
Fix parameter spelling: sse not sse1
...
llvm-svn: 52185
2008-06-10 17:57:58 +00:00
Matthijs Kooijman
07f4eecd2a
Fix some more quoting issues in RUN lines, this time regarding unintended
...
variable expansions involving the $ character.
This fixes 4 tests that were not running properly before.
llvm-svn: 52183
2008-06-10 16:10:32 +00:00
Matthijs Kooijman
400c49c781
Remove double pipes in RUN commandlines.
...
This fixes 5 testcases that were not being run properly before.
llvm-svn: 52180
2008-06-10 15:11:36 +00:00
Dan Gohman
1b095b443c
Convert several tests to use temporary files instead of redundantly
...
executing the test commands.
llvm-svn: 52163
2008-06-10 00:36:41 +00:00
Rafael Espindola
29479df2ac
add support for PIC on linux x86-64
...
llvm-svn: 52139
2008-06-09 09:52:31 +00:00
Evan Cheng
976b1eee81
Fix a memcpy lowering bug. Even though the memcpy alignment is smaller than the desired alignment, the frame destination alignment may still be larger than the desired alignment. Don't change its alignment to something smaller.
...
llvm-svn: 51970
2008-06-04 23:37:54 +00:00
Dan Gohman
92d62b43c2
Fix the position of MemOperands in nodes that use variadic_ops
...
in DAGISelEmitter output. This bug was recently uncovered by the
addition of patterns for CALL32m and CALL64m, which are nodes
that now have both MemOperands and variadic_ops.
This bug was especially visible with PIC in various configurations,
because the new patterns are matching the indirect call code used
in many PIC configurations.
llvm-svn: 51877
2008-06-02 17:40:38 +00:00
Dan Gohman
96af4ddb62
Add patterns for CALL32m and CALL64m. They aren't matched in most
...
cases due to an isel deficiency already noted in
lib/Target/X86/README.txt, but they can be matched in this fold-call.ll
testcase, for example.
This is interesting mainly because it exposes a tricky tblgen bug;
tblgen was incorrectly computing the starting index for variable_ops
in the case of a complex pattern.
llvm-svn: 51706
2008-05-29 21:50:34 +00:00
Dan Gohman
714663ab94
Expand small memmovs using inline code. Set the X86 threshold for expanding
...
memmove to a more plausible value, now that it's actually being used.
llvm-svn: 51696
2008-05-29 19:42:22 +00:00
Evan Cheng
5e28227dbd
Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq.
...
llvm-svn: 51667
2008-05-29 08:22:04 +00:00
Evan Cheng
6892c5507f
Add nounwind.
...
llvm-svn: 51665
2008-05-29 07:09:24 +00:00
Evan Cheng
68079268f5
Fix PR2289: vr defined by multiple implicit_def as result of coalescing.
...
llvm-svn: 51648
2008-05-28 17:40:10 +00:00
Evan Cheng
427412e7c8
Teach local register allocator to deal with landing pad MBB's.
...
llvm-svn: 51647
2008-05-28 17:22:32 +00:00
Dan Gohman
221e9d0d22
Specify a target so that this tests tests what it's intended to test.
...
llvm-svn: 51600
2008-05-27 17:55:57 +00:00
Dan Gohman
923a375053
Make this test independent of the target-triple; the stack alignment
...
is specifically what this test depends on.
llvm-svn: 51599
2008-05-27 17:44:23 +00:00
Nick Lewycky
213e114a2c
The Linux ABI emits an extra "movl %esp, %ebp" in function prologue and
...
sometimes a "mov %ebp, %esp" in the epilogue.
Force these tests that rely on counting 'mov' to use i686-apple-darwin8.8.0
where they were written.
llvm-svn: 51568
2008-05-26 20:18:56 +00:00
Evan Cheng
948627aadd
New loadl_pd and loadh_pd tests.
...
llvm-svn: 51525
2008-05-24 00:10:02 +00:00
Evan Cheng
04d24edcbb
Use movlps / movhps to modify low / high half of 16-byet memory location.
...
llvm-svn: 51501
2008-05-23 21:23:16 +00:00
Dan Gohman
3388d022ac
Use PMULDQ for v2i64 multiplies when SSE4.1 is available. And add
...
load-folding table entries for PMULDQ and PMULLD.
llvm-svn: 51489
2008-05-23 17:49:40 +00:00
Evan Cheng
f3be7a7ea7
Bug: rcpps can only folds a load if the address is 16-byte aligned. Fixed many 'ps' load folding patterns in X86InstrSSE.td which are missing the proper alignment checks.
...
Also fixed some 80 col. violations.
llvm-svn: 51462
2008-05-23 00:37:07 +00:00
Evan Cheng
a1100782d5
Add a couple of test cases.
...
llvm-svn: 51441
2008-05-22 21:19:19 +00:00
Evan Cheng
53963b775e
Add missing patterns.
...
llvm-svn: 51435
2008-05-22 18:56:56 +00:00
Chris Lattner
a87f1a568c
testcase for PR2267
...
llvm-svn: 51408
2008-05-22 04:45:22 +00:00
Evan Cheng
a5d27ae586
Fix PR2343. An *interesting* coalescer bug.
...
BB1:
vr1025 = copy vr1024
..
BB2:
vr1024 = op
= op vr1025
<loop eventually branch back to BB1>
Even though vr1025 is copied from vr1024, it's not safe to coalesced them since live range of vr1025 intersects the def of vr1024. This happens when vr1025 is assigned the value of the previous iteration of vr1024 in the loop.
llvm-svn: 51394
2008-05-21 22:34:12 +00:00
Gabor Greif
1e427c3264
sabre brings to my attention that the 'tr' suffix is also obsolete
...
llvm-svn: 51349
2008-05-20 21:00:03 +00:00
Gabor Greif
f45ff35bfe
Rename the last test with .llx extension to .ll, resolve duplicate test by renaming to isnan2. Now that no test has llx ending there is no need to search for them from dg.exp too.
...
llvm-svn: 51328
2008-05-20 19:52:04 +00:00
Dan Gohman
cd2e772d08
Run vortex-bug as x86-64, which is what the original bug was triggered on.
...
llvm-svn: 51289
2008-05-20 00:54:39 +00:00
Dale Johannesen
0bf92b14f1
Use common where we mean common, not weak.
...
llvm-svn: 51173
2008-05-16 00:52:30 +00:00
Dan Gohman
0a0fa7cf78
Fix a bug in LoopStrengthReduce that caused it to emit IR with
...
use-before-def. The problem comes up in code with multiple PHIs where
one PHI is being rewritten in terms of the other, but the other needs
to be casted first. LLVM rules requre the cast instruction to be
inserted after any PHI instructions, but when instructions were
inserted to replace the second PHI value with a function of the first,
they were ended up going before the cast instruction. Avoid this
problem by remembering the location of the cast instruction, when one
is needed, and inserting the expansion of the new value after it.
This fixes a bug that surfaced in 255.vortex on x86-64 when
instcombine was removed from the middle of the loop optimization
passes.
llvm-svn: 51169
2008-05-15 23:26:57 +00:00
Dan Gohman
3ab94df276
When bit-twiddling CondCode values for integer comparisons produces
...
SETOEQ, is it does with (SETEQ & SETULE), map it to SETEQ.
llvm-svn: 51112
2008-05-14 18:17:09 +00:00
Evan Cheng
1120279ae6
Instead of a vector load, shuffle and then extract an element. Load the element from address with an offset.
...
pshufd $1, (%rdi), %xmm0
movd %xmm0, %eax
=>
movl 4(%rdi), %eax
llvm-svn: 51026
2008-05-13 08:35:03 +00:00
Evan Cheng
3f40c69083
On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for i8 anyext load to i16.
...
llvm-svn: 51019
2008-05-13 00:54:02 +00:00
Evan Cheng
b980f6fb3d
Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other.
...
llvm-svn: 51008
2008-05-12 23:04:07 +00:00
Dale Johannesen
e6942c31ea
New test for tail merging
...
llvm-svn: 51007
2008-05-12 22:59:44 +00:00
Evan Cheng
71b9afb053
When transforming a vector_shuffle to a load, the base address must not be an undef.
...
llvm-svn: 50940
2008-05-10 06:46:49 +00:00
Evan Cheng
9c4d685165
Add nounwind.
...
llvm-svn: 50931
2008-05-10 02:22:25 +00:00
Evan Cheng
bec201fa06
If all sources of a PHI node are defined by an implicit_def, just emit an implicit_def instead of a copy.
...
llvm-svn: 50927
2008-05-10 00:17:50 +00:00
Evan Cheng
867af2678f
Add a pattern to do move the low element of a v4f32 and zero extend the rest.
...
llvm-svn: 50922
2008-05-09 23:37:55 +00:00
Evan Cheng
961339bbdb
Handle a few more cases of folding load i64 into xmm and zero top bits.
...
Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch.
llvm-svn: 50918
2008-05-09 21:53:03 +00:00
Evan Cheng
0352e63e39
Simplify test.
...
llvm-svn: 50911
2008-05-09 19:56:32 +00:00
Evan Cheng
0360ecbec1
Use movq to move low half of XMM register and zero-extend the rest.
...
llvm-svn: 50874
2008-05-08 22:35:02 +00:00
Evan Cheng
78af38c392
Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine.
...
llvm-svn: 50838
2008-05-08 00:57:18 +00:00
Evan Cheng
0d6311d46c
Add nounwind.
...
llvm-svn: 50837
2008-05-07 22:59:08 +00:00
Evan Cheng
7ca4a67ca1
Yet another nasty spiller bug.
...
%ecx = op
store %cl<kill>, (addr)
(addr) = op %al
It's not safe to unfold the last operand and eliminate store even though %cl is marked kill. It's a sub-register use which means one of its super-register(s) may be used below.
llvm-svn: 50794
2008-05-07 00:49:28 +00:00
Anton Korobeynikov
f5d2c3b45a
Use target triple in tests, not 'realign-stack=0' option. Per request.
...
llvm-svn: 50778
2008-05-06 23:09:29 +00:00
Evan Cheng
ef3faa1b1c
Fix PR2287. Darwin passes mmx values in register in 64-mode, not Linux.
...
llvm-svn: 50716
2008-05-06 07:23:50 +00:00
Mon P Wang
3e58393c3d
Added addition atomic instrinsics and, or, xor, min, and max.
...
llvm-svn: 50663
2008-05-05 19:05:59 +00:00
Chris Lattner
9c0c60d080
no need for eh info
...
llvm-svn: 50658
2008-05-05 18:24:33 +00:00
Dan Gohman
bcde172222
Add AsmPrinter support for emitting a directive to declare that
...
the code being generated does not require an executable stack.
Also, add target-specific code to make use of this on Linux
on x86.
llvm-svn: 50634
2008-05-05 00:28:39 +00:00
Evan Cheng
d9481366e3
Select vector shift with non-immediate i32 shift amount operand by first moving the operand into the right register.
...
llvm-svn: 50619
2008-05-04 09:15:50 +00:00
Evan Cheng
cdf22f2953
Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This allow us to simplify the horribly complicated matching code.
...
llvm-svn: 50601
2008-05-03 00:52:09 +00:00
Chris Lattner
34931afff7
specify an arch for non-x86 hosts.
...
llvm-svn: 50576
2008-05-02 15:11:58 +00:00
Chris Lattner
d4b2a67cf3
don't randomly miscompile seto/setuo just because we are in
...
ffastmath mode. This fixes rdar://5902801, a miscompilation
of gcc.dg/builtins-8.c.
Bill, please pull this into Tak.
llvm-svn: 50523
2008-05-01 07:26:11 +00:00
Arnold Schwaighofer
68988d13f6
Really commit the test checking the argument lowering behaviour on x86-64 :).
...
llvm-svn: 50478
2008-04-30 09:19:47 +00:00
Chris Lattner
5c88f7b1ad
make the vector conversion magic handle multiple results.
...
We now compile test2/test3 to:
_test2:
## InlineAsm Start
set %xmm0, %xmm1
## InlineAsm End
addps %xmm1, %xmm0
ret
_test3:
## InlineAsm Start
set %xmm0, %xmm1
## InlineAsm End
paddd %xmm1, %xmm0
ret
as expected.
llvm-svn: 50389
2008-04-29 04:48:56 +00:00
Chris Lattner
f9a49c4322
add support for multiple return values in inline asm. This is a step
...
towards PR2094. It now compiles the attached .ll file to:
_sad16_sse2:
movslq %ecx, %rax
## InlineAsm Start
%ecx %rdx %rax %rax %r8d %rdx %rsi
## InlineAsm End
## InlineAsm Start
set %eax
## InlineAsm End
ret
which is pretty decent for a 3 output, 4 input asm.
llvm-svn: 50386
2008-04-29 04:29:54 +00:00
Evan Cheng
11b98b6612
Another extract_subreg coalescing bug.
...
e.g.
vr1024<2> extract_subreg vr1025, 2
If vr1024 do not have the same register class as vr1025, it's not safe to coalesce this away. For example, vr1024 might be a GPR32 while vr1025 might be a GPR64.
llvm-svn: 50385
2008-04-29 01:41:44 +00:00
Evan Cheng
73c3b4741a
Add -march=x86.
...
llvm-svn: 50380
2008-04-28 23:31:41 +00:00
Evan Cheng
315e3cb9a3
Test case.
...
llvm-svn: 50377
2008-04-28 22:14:34 +00:00
Chris Lattner
2237973438
Implement a signficant optimization for inline asm:
...
When choosing between constraints with multiple options,
like "ir", test to see if we can use the 'i' constraint and
go with that if possible. This produces more optimal ASM in
all cases (sparing a register and an instruction to load it),
and fixes inline asm like this:
void test () {
asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14));
}
Previously we would dump "42" into a memory location (which
is ok for the 'm' constraint) which would cause a problem
because the 'c' modifier is not valid on memory operands.
Isn't it great how inline asm turns 'missed optimization'
into 'compile failed'??
Incidentally, this was the todo in
PowerPC/2007-04-24-InlineAsm-I-Modifier.ll
Please do NOT pull this into Tak.
llvm-svn: 50315
2008-04-27 00:37:18 +00:00
Nate Begeman
98f0898e42
Feedback from chris
...
llvm-svn: 50305
2008-04-25 21:47:35 +00:00
Nate Begeman
f10b493fc0
Add a testcase for the recent "handle variable vector insert elt in mem" patch
...
llvm-svn: 50303
2008-04-25 21:26:59 +00:00
Evan Cheng
402572a149
Update tests.
...
llvm-svn: 50293
2008-04-25 20:13:47 +00:00
Evan Cheng
ccde6dd016
Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers.
...
llvm-svn: 50289
2008-04-25 19:11:04 +00:00
Evan Cheng
df38b35a1e
MMX argument passing fixes:
...
On Darwin / Linux x86-32, v8i8, v4i16, v2i32 values are passed in MM[0-2].
On Darwin / Linux x86-32, v1i64 values are passed in memory.
On Darwin x86-64, v8i8, v4i16, v2i32 values are passed in XMM[0-7].
On Darwin x86-64, v1i64 values are passed in 64-bit GPRs.
llvm-svn: 50257
2008-04-25 07:56:45 +00:00
Chris Lattner
741c7a3b49
Loosen up an assertion to allow intrinsics. I really have no
...
idea what this code (findNonImmUse) does, so I'm only guessing
that this is the right thing. It would be really really nice
if this had comments and perhaps switched to SmallPtrSet
(hint hint) :)
This fixes rdar://5886601, a crash on gcc.target/i386/sse4_1-pblendw.c
llvm-svn: 50252
2008-04-25 05:13:01 +00:00
Evan Cheng
9165e165dc
Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not handled by rep instructions, a new memcpy / memset is introduced for them. However, since source / destination addresses are already adjusted, their offsets should be zero.
...
llvm-svn: 50239
2008-04-25 00:26:43 +00:00
Anton Korobeynikov
dd4ef2e30c
Disable stack realignment for these tests
...
llvm-svn: 50172
2008-04-23 18:25:44 +00:00
Anton Korobeynikov
c3ada5c9c4
Fix test becase ABI stack alignment dropped to 'normal' value
...
llvm-svn: 50171
2008-04-23 18:25:16 +00:00
Anton Korobeynikov
955a8a9101
Fix test, instruction count is valid only if stack is not realigned
...
llvm-svn: 50170
2008-04-23 18:24:48 +00:00
Dan Gohman
f166d2d0d6
Implement an x86-64 ABI detail of passing structs by hidden first
...
argument. The x86-64 ABI requires the incoming value of %rdi to
be copied to %rax on exit from a function that is returning a
large C struct.
Also, add a README-X86-64 entry detailing the missed optimization
opportunity and proposing an alternative approach.
llvm-svn: 50075
2008-04-21 23:59:07 +00:00
Chris Lattner
470ab00c76
A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2.
...
llvm-svn: 49986
2008-04-20 05:52:46 +00:00
Chris Lattner
a124f1e219
Not all x86-64 machines have sse3 apparently.
...
llvm-svn: 49985
2008-04-20 05:47:56 +00:00
Chris Lattner
50fb77f829
rename *.llx -> *.ll
...
llvm-svn: 49970
2008-04-19 22:29:10 +00:00
Evan Cheng
7e4a55bc58
Be more careful with insert_subreg and extract_subreg where either source or destination operand has already been coalesced with another register that's defined by a insert_subreg or extract_subreg.
...
llvm-svn: 49843
2008-04-17 07:58:04 +00:00
Evan Cheng
c8c3a899c0
Fix a sub-register indice propagation bug.
...
llvm-svn: 49832
2008-04-17 00:06:42 +00:00
Evan Cheng
147cb764b5
Don't forget about sub-register indices when rematting instructions.
...
llvm-svn: 49830
2008-04-16 23:44:44 +00:00
Evan Cheng
7b989d853e
Really test what's intended.
...
llvm-svn: 49802
2008-04-16 18:21:55 +00:00
Evan Cheng
e45b8f89c5
Rewrite LiveVariable liveness computation. The new implementation is much simplified. It eliminated the nasty recursive routines and removed the partial def / use bookkeeping. There is also potential for performance improvement by replacing the conservative handling of partial physical register definitions. The code is currently disabled until live interval analysis is taught of the name scheme.
...
This patch also fixed a couple of nasty corner cases.
llvm-svn: 49784
2008-04-16 09:46:40 +00:00
Dan Gohman
d43d3beeb0
Add support for the form of the SSE41 extractps instruction that
...
puts its result in a 32-bit GPR.
llvm-svn: 49762
2008-04-16 02:32:24 +00:00
Dan Gohman
8c99ccaf96
Recreate the size SDNode instead of reusing the old one in the x86
...
memcpy lowering code; this ensures that the size node has the desired
result type. This fixes a regression from r49572 with @llvm.memcpy.i64
on x86-32.
llvm-svn: 49761
2008-04-16 01:32:32 +00:00