Bruno Cardoso Lopes
f61d1c072e
Fix vbroadcast matching logic to early unmatch if the node doesn't have
...
only one use. Fix PR10825.
llvm-svn: 138951
2011-09-01 18:15:06 +00:00
Jakob Stoklund Olesen
6357fa2f06
Prevent remat of partial register redefinitions.
...
An instruction that redefines only part of a larger register can never
be rematerialized since the virtual register value depends on the old
value in other parts of the register.
This was fixed for the inline spiller in r138794. This patch fixes the
problem for all register allocators, and includes a small test case.
<rdar://problem/10032939>
llvm-svn: 138944
2011-09-01 17:18:50 +00:00
Andrew Trick
832a6a1909
PreRA scheduler should avoid cloning compares.
...
Added canClobberReachingPhysRegUse() to handle a particular pattern in
which a two-address instruction could be forced to interfere with
EFLAGS, causing a compare to be unnecessarilly cloned.
Fixes rdar://problem/5875261
llvm-svn: 138924
2011-09-01 00:54:31 +00:00
Bill Wendling
55fb73a6e0
Remove old declare statements.
...
llvm-svn: 138905
2011-08-31 21:41:20 +00:00
Bill Wendling
22055c713f
Update more tests to the new EH scheme.
...
llvm-svn: 138904
2011-08-31 21:40:15 +00:00
Bill Wendling
d4e871404d
Update more tests to the new EH scheme.
...
llvm-svn: 138903
2011-08-31 21:39:05 +00:00
Bill Wendling
c4c24f03e9
Revert r138894. This was failing on cmake-clang-i686-msvc10.
...
llvm-svn: 138900
2011-08-31 21:20:25 +00:00
Bill Wendling
e6174a2c85
Update more tests to the new EH scheme.
...
llvm-svn: 138894
2011-08-31 21:04:11 +00:00
Eli Friedman
7c3bdede25
Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg; implements 64-bit atomic load/store for ARM.
...
llvm-svn: 138872
2011-08-31 18:26:09 +00:00
Eli Friedman
1ccecbb9d3
64-bit atomic cmpxchg for ARM.
...
llvm-svn: 138868
2011-08-31 17:52:22 +00:00
David Greene
cdef71f4f3
Compress Repeated Byte Output
...
Emit a repeated sequence of bytes using .zero. This saves an enormous
amount of asm file space for certain programs.
llvm-svn: 138864
2011-08-31 17:30:56 +00:00
Benjamin Kramer
5247ca0ae5
This test requires sse, otherwise x87 ops will block tailcall optimization
...
llvm-svn: 138859
2011-08-31 16:49:05 +00:00
Bruno Cardoso Lopes
9fc6b8be03
- Move all MOVSS and MOVSD patterns close to their definitions
...
- Duplicate some store patterns to their AVX forms!
- Catched a bug while restricting the patterns subtarget, fix it
and update a testcase to check it properly
llvm-svn: 138851
2011-08-31 03:04:20 +00:00
Evan Cheng
cb1e5bae4c
Fix (movhps load) lowering / pattern to match more cases. rdar://10050549
...
llvm-svn: 138848
2011-08-31 02:05:24 +00:00
Eli Friedman
2c7bb52f56
Some minor cleanups for r138845.
...
llvm-svn: 138846
2011-08-31 00:41:05 +00:00
Eli Friedman
c3f9c4a852
Some 64-bit atomic operations on ARM. 64-bit cmpxchg coming next.
...
llvm-svn: 138845
2011-08-31 00:31:29 +00:00
Benjamin Kramer
50cabb5de4
Fix test typo.
...
llvm-svn: 138843
2011-08-31 00:02:59 +00:00
Rafael Espindola
1450f61e8f
Add a triple.
...
llvm-svn: 138831
2011-08-30 21:19:37 +00:00
Rafael Espindola
9f2edc8d2c
Some test code to check if correct code is being generated.
...
Patch by Sanjoy Das.
llvm-svn: 138820
2011-08-30 19:51:29 +00:00
Roman Divacky
71038e7021
Set CR1EQ only when lowering vararg floating arguments (not any vararg
...
arguments as before), unset CR1EQ otherwise.
llvm-svn: 138802
2011-08-30 17:04:16 +00:00
Evan Cheng
e891654a58
Change ARM / Thumb2 addc / adde and subc / sube modeling to use physical
...
register dependency (rather than glue them together). This is general
goodness as it gives scheduler more freedom. However it is motivated by
a nasty bug in isel.
When a i64 sub is expanded to subc + sube.
libcall #1
\
\ subc
\ / \
\ / \
\ / libcall #2
sube
If the libcalls are not serialized (i.e. both have chains which are dag
entry), legalizer can serialize them in arbitrary orders. If it's
unlucky, it can force libcall #2 before libcall #1 in the above case.
subc
|
libcall #2
|
libcall #1
|
sube
However since subc and sube are "glued" together, this ends up being a
cycle when the scheduler combine subc and sube as a single scheduling
unit.
The right solution is to fix LegalizeType too chains the libcalls together.
However, LegalizeType is not processing nodes in order so that's harder than
it should be. For now, the move to physical register dependency will do.
rdar://10019576
llvm-svn: 138791
2011-08-30 01:34:54 +00:00
Eli Friedman
850b9a9a84
Explicitly zero out parts of a vector which are required to be zero by the algorithm in LowerUINT_TO_FP_i32. This only has a substantial effect on the generated code when the input is extracted from a vector register; other ways of loading an i32 do the appropriate zeroing implicitly. Fixes PR10802.
...
llvm-svn: 138768
2011-08-29 21:15:46 +00:00
Owen Anderson
baa7edb06a
Add testcase for r138746.
...
llvm-svn: 138747
2011-08-29 18:02:40 +00:00
Duncan Sands
4d63542b82
Fix PR5329: pay attention to constructor/destructor priority
...
when outputting them. With this, the entire LLVM testsuite
passes when built with dragonegg.
llvm-svn: 138724
2011-08-28 13:17:22 +00:00
Bill Wendling
2f92d2cdae
Update to new EH scheme.
...
llvm-svn: 138699
2011-08-27 04:53:41 +00:00
Bill Wendling
46c720994a
Cannot have an llvm.eh.exception call in a non-landing pad block.
...
llvm-svn: 138698
2011-08-27 04:53:28 +00:00
Eli Friedman
5e5704277f
Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction.
...
llvm-svn: 138660
2011-08-26 21:21:21 +00:00
Bill Wendling
ac88ab7cce
Revert r138606 until LowerInvoke has been converted to the new EH scheme.
...
llvm-svn: 138656
2011-08-26 21:11:23 +00:00
Eli Friedman
452aae6202
Atomic load/store on ARM/Thumb.
...
I don't really like the patterns, but I'm having trouble coming up with a
better way to handle them.
I plan on making other targets use the same legalization
ARM-without-memory-barriers is using... it's not especially efficient, but
if anyone cares, it's not that hard to fix for a given target if there's
some better lowering.
llvm-svn: 138621
2011-08-26 02:59:24 +00:00
Bill Wendling
62fe9e9aa6
Update to the new EH scheme.
...
llvm-svn: 138606
2011-08-25 23:48:37 +00:00
Bruno Cardoso Lopes
8347b86293
Add support for AVX 256-bit version of MOVDDUP!
...
llvm-svn: 138588
2011-08-25 21:40:37 +00:00
Andrew Trick
6446bf780a
ARM fix for missing implicit operands on ldmia_ret.
...
rdar://10005094: miscompile of 176.gcc
llvm-svn: 138568
2011-08-25 17:50:53 +00:00
Bill Wendling
3fb137f7ef
LSR wants to split the landing pad's critical edge. Let it do it, but use the
...
proper function to do it.
llvm-svn: 138550
2011-08-25 05:55:40 +00:00
Bruno Cardoso Lopes
296256fb32
Add support for 256-bit versions of VSHUFPD and VSHUFPS.
...
llvm-svn: 138546
2011-08-25 02:58:26 +00:00
Eli Friedman
9c73a57b20
Hook up 64-bit atomic load/store on x86-32. I plan to write more efficient implementations eventually.
...
llvm-svn: 138505
2011-08-24 22:33:28 +00:00
Eli Friedman
5aabaaa367
Basic tests for atomic load and store on x86.
...
llvm-svn: 138486
2011-08-24 21:16:59 +00:00
Richard Osborne
6e3c83eb1c
Add Uses=[SP] to call instructions. This fixes a miscompilation with a
...
variable sized alloca.
llvm-svn: 138433
2011-08-24 13:32:43 +00:00
Craig Topper
de92622aa5
Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711.
...
llvm-svn: 138427
2011-08-24 06:14:18 +00:00
Bruno Cardoso Lopes
9e9f2ce32d
Fix a nasty bug where a v4i64 was being wrong emitted with 32-bit
...
permutations. Also tidy up some patterns and make them close to their
instruction definition!
llvm-svn: 138392
2011-08-23 22:06:37 +00:00
Nick Lewycky
4c8ff77f1b
PerformSubCombine to work on integers larger than i128. Fixes a crasher.
...
llvm-svn: 138354
2011-08-23 19:01:24 +00:00
Craig Topper
6612e35b0d
Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding sclarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712.
...
llvm-svn: 138321
2011-08-23 04:36:33 +00:00
Bruno Cardoso Lopes
2a3ffb5d97
Introduce a pass to insert vzeroupper instructions to avoid AVX to
...
SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper"
llc command line option. This is only the first step (very naive and
conservative one) to sketch out the idea, but proper DFA is coming next
to allow smarter decisions. Comments and ideas now and in further commits
will be very appreciated.
llvm-svn: 138317
2011-08-23 01:14:17 +00:00
Bruno Cardoso Lopes
74f090d44c
Add support for breaking 256-bit int VETCC into two 128-bit ones,
...
avoding scalarization of the compare. Reduces code from 59 to 6
instructions. Fix PR10712.
llvm-svn: 138271
2011-08-22 20:31:04 +00:00
Chad Rosier
d6641af80c
With the fix in r138164: "Add <imp-def> operands to QQ and QQQQ stack loads."
...
-verify-machineinstrs can be enabled for this test case.
llvm-svn: 138171
2011-08-20 00:34:45 +00:00
Chad Rosier
be7625161e
VMOVQQQQs pseudo instructions are only created by ARMBaseInstrInfo::copyPhysReg.
...
Therefore, rather then generate a pseudo instruction, which is later expanded,
generate the necessary instructions in place.
llvm-svn: 138163
2011-08-20 00:17:25 +00:00
Devang Patel
59e27c5f12
Do not use named md nodes to track variables that are completely optimized. This does not scale while doing LTO with debug info. New approach is to include list of variables in the subprogram info directly.
...
llvm-svn: 138145
2011-08-19 23:28:12 +00:00
Jim Grosbach
8d77bb5f06
Use regex to remove false dependencies on register allocation.
...
llvm-svn: 138137
2011-08-19 23:10:31 +00:00
Jim Grosbach
066e9ec1e4
Update tests.
...
llvm-svn: 138116
2011-08-19 22:19:48 +00:00
Jakob Stoklund Olesen
90b6018c8f
Add test case for r138018.
...
llvm-svn: 138033
2011-08-19 04:30:24 +00:00
Akira Hatanaka
fb4161ae88
Use subword loads instead of a 4-byte load when the size of a structure (or a
...
piece of it) that is being passed by value is smaller than a word.
llvm-svn: 138007
2011-08-18 23:39:37 +00:00