Commit Graph

15370 Commits

Author SHA1 Message Date
Chandler Carruth 18dfac385b The logic for breaking the CFG in the presence of hot successors didn't
properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.

The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.

This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]

llvm-svn: 145010
2011-11-20 11:22:06 +00:00
Benjamin Kramer 650c09aa4d XFAIL this test until I figure out what indvars is doing here (or find someone who does)
llvm-svn: 145008
2011-11-20 11:10:03 +00:00
Chandler Carruth 20df3953d3 Add some comments to the latest test case I added here to document what
is actually being tested. Also add some FileCheck goodness to much more
carefully ensure that the result is the desired result. Before this test
would only have failed through an assert failure if the underlying fix
were reverted.

Also, add some weight metadata and a comment explaining exactly what is
going on to a trick section of the test case. Originally, we were
getting very unlucky and trying to form a block chain that isn't
actually profitable. I'm working on a fix to avoid forming these
unprofitable chains, and that would also have masked any failure from
this test case. The easy solution is to add some metadata that makes it
*really* profitable to form the bad chain here.

llvm-svn: 145006
2011-11-20 09:30:40 +00:00
Craig Topper e79761df73 Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.
llvm-svn: 145005
2011-11-20 00:12:05 +00:00
Craig Topper a3a6583694 Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled.
llvm-svn: 145004
2011-11-19 22:34:59 +00:00
Chandler Carruth f3dc9eff16 Move the handling of unanalyzable branches out of the loop-driven chain
formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fallback on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can caused major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

llvm-svn: 144994
2011-11-19 10:26:02 +00:00
Craig Topper 6d77f4ae14 Test cases for SSSE3/AVX integer horizontal add/sub.
llvm-svn: 144990
2011-11-19 09:03:33 +00:00
Craig Topper de6b73bb4d Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
llvm-svn: 144987
2011-11-19 07:07:26 +00:00
Andrew Trick 6b4d578f54 Fix a corner case in updating LoopInfo after fully unrolling an outer loop.
The loop tree's inclusive block lists are painful and expensive to
update. (I have no idea why they're inclusive). The design was
supposed to handle this case but the implementation missed it and my
unit tests weren't thorough enough.

Fixes PR11335: loop unroll update.

llvm-svn: 144970
2011-11-18 03:42:41 +00:00
Nadav Rotem 1ec141d0f9 Add AVX2 vpbroadcast support
llvm-svn: 144967
2011-11-18 02:49:55 +00:00
Kostya Serebryany 1cdc6e9567 [asan] workaround for reg alloc bug 11395: don't instrument functions with large chunks of inline assembler
llvm-svn: 144962
2011-11-18 01:41:06 +00:00
Devang Patel 107e8ec30d DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or ad DISignedSubrange.
llvm-svn: 144937
2011-11-17 23:43:15 +00:00
Andrew Trick 949045864d Fix an overly general check in SimplifyIndvar to handle useless phi cycles.
The right way to check for a binary operation is
cast<BinaryOperator>. The original check: cast<Instruction> &&
numOperands() == 2 would match phi "instructions", leading to an
infinite loop in extreme corner case: a useless phi with operands
[self, constant] that prior optimization passes failed to remove,
being used in the loop by another useless phi, in turn being used by an
lshr or udiv.

Fixes PR11350: runaway iteration assertion.

llvm-svn: 144935
2011-11-17 23:36:35 +00:00
Kostya Serebryany 65e2211b95 fall back to explicit list of allowed linkages when instrumenting globals in asan; add a test check that asan does not touch linkonce_odr
llvm-svn: 144933
2011-11-17 23:14:59 +00:00
Chad Rosier f83ab704e4 When fast iseling a GEP, accumulate the offset rather than emitting a series of
ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, too low of a threshold we'll miss opportunities to 
coalesce ADDs.
rdar://10412592

llvm-svn: 144886
2011-11-17 07:15:58 +00:00
Eli Friedman 489c0ff4a4 Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom
names for fwrite and fputs.

Fixes <rdar://problem/9815881>.

llvm-svn: 144876
2011-11-17 01:27:36 +00:00
Daniel Dunbar 586aabc44a build/make/test: Get rid of unused BUGPOINT_TOPTS variable.
llvm-svn: 144864
2011-11-16 23:56:03 +00:00
Eli Friedman ff1eaa7578 Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
2011-11-16 23:50:22 +00:00
Jim Grosbach f4d2e0d458 Remove obsolete test.
The PLD encoding is checked via the .s file now.

llvm-svn: 144853
2011-11-16 22:50:38 +00:00
Jim Grosbach d3f02cbce9 Generalize the fixup info for ARM mode.
We don't (yet) have the granularity in the fixups to be specific about which
bitranges are affected. That's a future cleanup, but we're not there yet.

llvm-svn: 144852
2011-11-16 22:48:37 +00:00
Jim Grosbach d66cb5ab33 Update test for r144842.
llvm-svn: 144851
2011-11-16 22:46:27 +00:00
Evan Cheng 011538dc79 Another missing X86ISD::MOVLPD pattern. rdar://10450317
llvm-svn: 144839
2011-11-16 22:24:44 +00:00
Evan Cheng 822ddde50d Disable expensive two-address optimizations at -O0. rdar://10453055
llvm-svn: 144806
2011-11-16 18:44:48 +00:00
Nick Lewycky db408f1857 Fix typo in test.
llvm-svn: 144774
2011-11-16 03:56:38 +00:00
Nick Lewycky c7f1e7993c Merge isObjectPointerWithTrustworthySize with getPointerSize. Use it when
looking at the size of the pointee. Fixes PR11390!

llvm-svn: 144773
2011-11-16 03:49:48 +00:00
Eli Friedman e6270395e3 Fix testcase.
llvm-svn: 144769
2011-11-16 03:03:52 +00:00
Eli Friedman 87f92512c3 CONCAT_VECTORS can have more than two operands. PR11389.
llvm-svn: 144768
2011-11-16 02:52:39 +00:00
Kostya Serebryany 6e6b03ec46 AddressSanitizer, first commit (compiler module only)
llvm-svn: 144758
2011-11-16 01:35:23 +00:00
Andrew Trick 90c7a108ca Fix SCEV overly optimistic back edge taken count for multi-exit loops.
Fixes PR11375: Different results for 'clang++ huh.cpp'...

llvm-svn: 144746
2011-11-16 00:52:40 +00:00
Jim Grosbach e891fe8d6c ARM assembly parsing for register range syntax for VLD/VST register lists.
For example,
vld1.f64 {d2-d5}, [r2,:128]!

Should be equivalent to:
vld1.f64 {d2,d3,d4,d5}, [r2,:128]!

It's not documented syntax in the ARM ARM, but it is consistent with what's
accepted for VLDM/VSTM and is unambiguous in meaning, so it's a good thing to
support.

rdar://10451128

llvm-svn: 144727
2011-11-15 23:19:15 +00:00
Nadav Rotem 37010002f2 AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code.
llvm-svn: 144720
2011-11-15 22:50:37 +00:00
NAKAMURA Takumi f6b315c081 test/CodeGen/X86/dec-eflags-lower.ll: Relax expression for win32 x64.
llvm-svn: 144714
2011-11-15 22:30:37 +00:00
Jim Grosbach 75fb4abcdc ARM assembly parsing two operand forms for shift instructions.
llvm-svn: 144713
2011-11-15 22:27:54 +00:00
Pete Cooper 7c7ba1baa1 Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Jim Grosbach 131b45e632 ARM alternate size suffices for VTRN instructions.
rdar://10435076

llvm-svn: 144694
2011-11-15 20:49:46 +00:00
Jim Grosbach 5803f6d5a2 ARM assembly parsing for optional datatype suffix on VFP VMOV GPR<->VFP insns.
Yet more of rdar://10435076.

llvm-svn: 144691
2011-11-15 20:29:42 +00:00
Jim Grosbach c5b1bc561e ARM assembly parsing for two-operand form of 'mul' instruction.
rdar://10449856.

llvm-svn: 144689
2011-11-15 20:14:51 +00:00
Jim Grosbach 72dfd20aba ARM assembly parsing for two-operand form of 'mul' instruction.
Ongoing rdar://10435114.

llvm-svn: 144688
2011-11-15 20:02:06 +00:00
Jim Grosbach 68c899c211 Testcase for r144684.
llvm-svn: 144685
2011-11-15 19:56:17 +00:00
Owen Anderson 0ac9058f89 Fix an ambiguous decoding where we failed to properly decode VMOVv2f32 and VMOVv4f32.
llvm-svn: 144683
2011-11-15 19:55:00 +00:00
Jim Grosbach 6efa7b9852 Thumb2 assembly parsing for mul.w in IT block fix.
When the 3rd operand is not a low-register, and the first two operands are
the same low register, the parser was incorrectly trying to use the 16-bit
instruction encoding.

rdar://10449281

llvm-svn: 144679
2011-11-15 19:29:45 +00:00
Rafael Espindola f11e7f1305 We currently use a callback to handle an IL pass deleting a BB that still
has a reference to it. Unfortunately, that doesn't work for codegen passes
since we don't get notified of MBB's being deleted (the original BB stays).

Use that fact to our advantage and after printing a function, check if
any of the IL BBs corresponds to a symbol that was not printed. This fixes
pr11202.

llvm-svn: 144674
2011-11-15 19:08:46 +00:00
Jakob Stoklund Olesen 4949b9a283 Revert r144611 and r144613.
These tests are actually correct, clang was miscompiling ExeDepsFix::processUses.

Evan fixed the miscompilation in r144628.

llvm-svn: 144630
2011-11-15 07:13:03 +00:00
Chandler Carruth 9b548a7fcf Rather than trying to use the loop block sequence *or* the function
block sequence when recovering from unanalyzable control flow
constructs, *always* use the function sequence. I'm not sure why I ever
went down the path of trying to use the loop sequence, it is
fundamentally not the correct sequence to use. We're trying to preserve
the incoming layout in the cases of unreasonable control flow, and that
is only encoded at the function level. We already have a filter to
select *exactly* the sub-set of blocks within the function that we're
trying to form into a chain.

The resulting code layout is also significantly better because of this.
In several places we were ending up with completely unreasonable control
flow constructs due to the ordering chosen by the loop structure for its
internal storage. This change removes a completely wasteful vector of
basic blocks, saving memory allocation in the common case even though it
costs us CPU in the fairly rare case of unnatural loops. Finally, it
fixes the latest crasher reduced out of GCC's single source. Thanks
again to Benjamin Kramer for the reduction, my bugpoint skills failed at
it.

llvm-svn: 144627
2011-11-15 06:26:43 +00:00
Craig Topper 05baa85f58 Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370.
llvm-svn: 144622
2011-11-15 05:55:35 +00:00
Jakob Stoklund Olesen 9c0de9bb6b Really fix test.
llvm-svn: 144613
2011-11-15 03:17:01 +00:00
Jakob Stoklund Olesen 14b66375a9 Allow for depencendy-breaking instructions before cvt*.
This should unbreak clang-x86_64-darwin10-RA, but I can't actually
reproduce the failure.

llvm-svn: 144611
2011-11-15 02:29:48 +00:00
Evan Cheng 7ca4b6eb5c Add vmov.f32 to materialize f32 immediate splats which cannot be handled by
integer variants. rdar://10437054

llvm-svn: 144608
2011-11-15 02:12:34 +00:00
Jakob Stoklund Olesen f8ad336bc4 Break false dependencies before partial register updates.
Two new TargetInstrInfo hooks lets the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.

The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previoius N instructions.

The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.

llvm-svn: 144602
2011-11-15 01:15:30 +00:00
Jim Grosbach a498af2b1d ARM parsing datatype suffix variants for non-writeback VST1 instructions.
rdar://10435076

llvm-svn: 144593
2011-11-14 23:43:46 +00:00
Jim Grosbach 72838a0345 ARM parsing datatype suffix variants for non-writeback VLD1 instructions.
rdar://10435076

llvm-svn: 144592
2011-11-14 23:32:59 +00:00
Jim Grosbach 3d6c0e0bb2 ARM parsing optional datatype suffix for VAND/VEOR/VORR instructions.
rdar://10435076

llvm-svn: 144587
2011-11-14 23:11:19 +00:00
Jim Grosbach 3e2c6f380c ARM VLDR/VSTR instructions don't need a size suffix.
Canonicallize on the non-suffixed form, but continue to accept assembly that
has any correctly sized type suffix.

llvm-svn: 144583
2011-11-14 23:03:21 +00:00
Nick Lewycky 7013a19e8a Refactor capture tracking (which already had a couple flags for whether returns
and stores capture) to permit the caller to see each capture point and decide
whether to continue looking.

Use this inside memdep to do an analysis that basicaa won't do. This lets us
solve another devirtualization case, fixing PR8908!

llvm-svn: 144580
2011-11-14 22:49:42 +00:00
Chad Rosier 4e88fbebde Add newline to end of file. Thanks, Eli.
llvm-svn: 144579
2011-11-14 22:48:33 +00:00
Chad Rosier ab7223e99a Add support for inlining small memcpys.
rdar://10412592

llvm-svn: 144578
2011-11-14 22:46:17 +00:00
Chad Rosier 45110fdf8d Fix a performance regression from r144565. Positive offsets were being lowered
into registers, rather then encoded directly in the load/store.

llvm-svn: 144576
2011-11-14 22:34:48 +00:00
Evan Cheng fb13d32b3f Add a missing pattern for X86ISD::MOVLPD. rdar://10436044
llvm-svn: 144566
2011-11-14 20:35:52 +00:00
Chad Rosier adfd200bcb Add support for Thumb load/stores with negative offsets.
rdar://10412592

llvm-svn: 144565
2011-11-14 20:22:27 +00:00
Evan Cheng 30f44ad785 Teach two-address pass to re-schedule two-address instructions (or the kill
instructions of the two-address operands) in order to avoid inserting copies.
This fixes the few regressions introduced when the two-address hack was
disabled (without regressing the improvements).
rdar://10422688

llvm-svn: 144559
2011-11-14 19:48:55 +00:00
Pete Cooper 890e02e854 Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered
Constant idx case is still done in tablegen but other cases are then expanded

Fixes <rdar://problem/10435460>

llvm-svn: 144557
2011-11-14 19:38:42 +00:00
Jakob Stoklund Olesen 7e6004a3c1 Fix early-clobber handling in shrinkToUses.
I broke this in r144515, it affected most ARM testers.

<rdar://problem/10441389>

llvm-svn: 144547
2011-11-14 18:45:38 +00:00
Jakob Stoklund Olesen 7e07b388ac Delete stale comment.
llvm-svn: 144542
2011-11-14 18:03:05 +00:00
Chandler Carruth ed5aa547bc Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on
the sum of the edge weights not overflowing uint32, and crashed when
they did. This is generally safe as BranchProbabilityInfo tries to
provide this guarantee. However, the CFG can get modified during codegen
in a way that grows the *sum* of the edge weights. This doesn't seem
unreasonable (imagine just adding more blocks all with the default
weight of 16), but it is hard to come up with a case that actually
triggers 32-bit overflow. Fortuately, the single-source GCC build is
good at this. The solution isn't very pretty, but its no worse than the
previous code. We're already summing all of the edge weights on each
query, we can sum them, check for an overflow, compute a scale, and sum
them again.

I've included a *greatly* reduced test case out of the GCC source that
triggers it. It's a pretty lame test, as it clearly is just barely
triggering the overflow. I'd like to have something that is much more
definitive, but I don't understand the fundamental pattern that triggers
an explosion in the edge weight sums.

The buggy code is duplicated within this file. I'll colapse them into
a single implementation in a subsequent commit.

llvm-svn: 144526
2011-11-14 08:50:16 +00:00
Chad Rosier 2a1df883d0 Add support for ARM halfword load/stores and signed byte loads with negative
offsets.
rdar://10412592

llvm-svn: 144518
2011-11-14 04:09:28 +00:00
Chandler Carruth 1071cfa4ae Teach machine block placement to cope with unnatural loops. These don't
get loop info structures associated with them, and so we need some way
to make forward progress selecting and placing basic blocks. The
technique used here is pretty brutal -- it just scans the list of blocks
looking for the first unplaced candidate. It keeps placing blocks like
this until the CFG becomes tractable.

The cost is somewhat unfortunate, it requires allocating a vector of all
basic block pointers eagerly. I have some ideas about how to simplify
and optimize this, but I'm trying to get the logic correct first.

Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly
there are other bugs that GCC is tickling that I'm reducing and working
on now.

llvm-svn: 144516
2011-11-14 00:00:35 +00:00
Chandler Carruth 8d15078927 Rewrite #3 of machine block placement. This is based somewhat on the
second algorithm, but only loosely. It is more heavily based on the last
discussion I had with Andy. It continues to walk from the inner-most
loop outward, but there is a key difference. With this algorithm we
ensure that as we visit each loop, the entire loop is merged into
a single chain. At the end, the entire function is treated as a "loop",
and merged into a single chain. This chain forms the desired sequence of
blocks within the function. Switching to a single algorithm removes my
biggest problem with the previous approaches -- they had different
behavior depending on which system triggered the layout. Now there is
exactly one algorithm and one basis for the decision making.

The other key difference is how the chain is formed. This is based
heavily on the idea Andy mentioned of keeping a worklist of blocks that
are viable layout successors based on the CFG. Having this set allows us
to consistently select the best layout successor for each block. It is
expensive though.

The code here remains very rough. There is a lot that needs to be done
to clean up the code, and to make the runtime cost of this pass much
lower. Very much WIP, but this was a giant chunk of code and I'd rather
folks see it sooner than later. Everything remains behind a flag of
course.

I've added a couple of tests to exercise the issues that this iteration
was motivated by: loop structure preservation. I've also fixed one test
that was exhibiting the broken behavior of the previous version.

llvm-svn: 144495
2011-11-13 11:20:44 +00:00
Chad Rosier 1198d894d0 The order in which the predicate is added differs between Thumb and ARM mode. Fix predicate when in ARM mode and restore SelectIntrinsicCall.
llvm-svn: 144494
2011-11-13 09:44:21 +00:00
Chad Rosier a476e391f1 Temporarily disable SelectIntrinsicCall when in ARM mode. This is causing failures.
llvm-svn: 144492
2011-11-13 05:14:43 +00:00
Chad Rosier c8cfd3a8fb Add support for emitting both signed- and zero-extend loads. Fix
SimplifyAddress to handle either a 12-bit unsigned offset or the ARM +/-imm8
offsets (addressing mode 3).  This enables a load followed by an integer 
extend to be folded into a single load.

For example:
ldrb r1, [r0]       ldrb r1, [r0]
uxtb r2, r1     =>
mov  r3, r2         mov  r3, r1

llvm-svn: 144488
2011-11-13 02:23:59 +00:00
Jakob Stoklund Olesen 6ddb767fb5 Remove the -color-ss-with-regs option.
It was off by default.

The new register allocators don't have the problems that made it
necessary to reallocate registers during stack slot coloring.

llvm-svn: 144481
2011-11-13 00:31:23 +00:00
Jakob Stoklund Olesen 7ef502f6d1 Delete the 'standard' spiller with used the old spilling framework.
The current register allocators all use the inline spiller.

llvm-svn: 144477
2011-11-12 23:29:02 +00:00
Jakob Stoklund Olesen ce4ef9f8d5 Remove histogram tests.
Counting the number of occurences of each opcode is not a useful test.

llvm-svn: 144474
2011-11-12 22:39:40 +00:00
Jakob Stoklund Olesen 0eac531bc2 RAGreedy is better about hinting now.
Or maybe we are just getting lucky.

llvm-svn: 144473
2011-11-12 22:39:37 +00:00
Jakob Stoklund Olesen 8ec1a92afd Linear scan is going away.
llvm-svn: 144472
2011-11-12 22:39:34 +00:00
Jakob Stoklund Olesen 654d60888e XFAIL test that depends on linear scan to remove dead code.
Filed PR11364 to track the problem.  Should the register allocator
eliminate dead code?

llvm-svn: 144471
2011-11-12 22:39:30 +00:00
Jakob Stoklund Olesen fa3a8ee6e2 Remove obsolete test.
This test was committed with a bugfix to RemoveCopyByCommutingDef, but
that optimization is no longer triggered by this test.

llvm-svn: 144470
2011-11-12 22:39:27 +00:00
Jakob Stoklund Olesen 80b3d299a9 Remove obsolete test.
This test is for a very specific LocalRewriter bug.  LocalRewriter is
going away.

llvm-svn: 144469
2011-11-12 22:39:24 +00:00
Jakob Stoklund Olesen 0c7d9d90ef Remove obsolete test.
I don't think this test does what is was supposed to do, and
LocalRewriter is going away anyway.

llvm-svn: 144463
2011-11-12 20:37:57 +00:00
Jakob Stoklund Olesen 126f9779c3 Eliminate more linear scan tests.
llvm-svn: 144462
2011-11-12 20:35:26 +00:00
Jakob Stoklund Olesen 9d090daa33 Switch a couple -O0 tests to RABasic.
llvm-svn: 144461
2011-11-12 20:11:04 +00:00
Jakob Stoklund Olesen 4deff7bc1d Switch a few tests off linearscan.
llvm-svn: 144460
2011-11-12 19:53:52 +00:00
Jakob Stoklund Olesen 6ac6aa782d Delete old test of a VirtRegRewriter feature.
This test doesn't expose the issue with RAGreedy.

I filed PR11363 to track the missing InlineSpiller feature.

llvm-svn: 144459
2011-11-12 19:53:48 +00:00
Jakob Stoklund Olesen 74d091b395 Remove old test that doesn't make sense.
The test is checking that the output doesn't contains any 'mov '
strings. It does contain movl, though.

llvm-svn: 144458
2011-11-12 19:53:45 +00:00
Craig Topper 3dc75f9e3b Add more AVX2 shift lowering support. Move AVX2 variable shift to use patterns instead of custom lowering code.
llvm-svn: 144457
2011-11-12 09:58:49 +00:00
Nick Lewycky d48ab84556 Don't try to loop on iterators that are potentially invalidated inside the loop. Fixes PR11361!
llvm-svn: 144454
2011-11-12 03:09:12 +00:00
Eli Friedman ecb453805d Make sure scalarrepl picks the correct alloca when it rewrites a bitcast. Fixes PR11353.
llvm-svn: 144442
2011-11-12 02:07:50 +00:00
Rafael Espindola e7cc8bff82 The dwarf standard says that the only differences between a out-of-line
instance and a concrete inlined instance are the use of DW_TAG_subprogram
instead of DW_TAG_inlined_subroutine and the who owns the tree.

We were also omitting DW_AT_inline from the abstract roots. To fix this,
make sure we mark abstract instance roots with DW_AT_inline even when
we have only out-of-line instances referring to them with DW_AT_abstract_origin.

FileCheck is not a very good tool for tests like this, maybe we should add
a -verify mode to llvm-dwarfdump.

llvm-svn: 144441
2011-11-12 01:57:54 +00:00
Eli Friedman 9d448e4a42 Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029.
llvm-svn: 144438
2011-11-12 00:35:34 +00:00
Jim Grosbach 609d113874 ARM optional size suffix for VLDR/VSTR syntax.
llvm-svn: 144427
2011-11-11 23:34:43 +00:00
Chad Rosier a7ebc5617d Add support in fast-isel for selecting memset/memcpy/memmove intrinsics.
llvm-svn: 144426
2011-11-11 23:31:03 +00:00
Chad Rosier ab1a4a2301 Loosen test by using REs. Approved by Devang.
llvm-svn: 144425
2011-11-11 23:25:38 +00:00
Andrew Trick 28c1d18434 Preserve MachineMemOperands in ARMLoadStoreOptimizer.
Fixes PR8113.

llvm-svn: 144409
2011-11-11 22:18:09 +00:00
Jim Grosbach 85a2343b01 ARM allow Q registers in vldm/vstm register lists.
rdar://9672822

llvm-svn: 144407
2011-11-11 21:27:40 +00:00
Devang Patel a90cb76489 Move X86 specific test in X86 directory.
llvm-svn: 144395
2011-11-11 18:13:19 +00:00
Devang Patel a39794b029 Move X86 specific test in X86 directory.
llvm-svn: 144394
2011-11-11 18:10:38 +00:00
Dan Bailey 089cc53232 allow non-device function calls in PTX when natively handling device-side printf
llvm-svn: 144388
2011-11-11 14:45:12 +00:00
Craig Topper ea28a34c43 Add lowering for AVX2 shift instructions.
llvm-svn: 144380
2011-11-11 07:39:23 +00:00
Chad Rosier 7ddd63ce4e Add support for using immediates with select instructions.
rdar://10412592

llvm-svn: 144376
2011-11-11 06:20:39 +00:00
Eli Friedman c4a001478c Make sure to expand SIGN_EXTEND_INREG for NEON vectors. PR11319, round 3.
llvm-svn: 144361
2011-11-11 03:16:38 +00:00
Eli Friedman 0a309292c4 Get rid of an optimization in SCCP which appears to have many issues. Specifically, it doesn't handle many cases involving undef correctly, and it is missing other checks which
lead to it trying to re-mark a value marked as a constant with a different value.  It also appears to trigger very rarely.

Fixes PR11357.

llvm-svn: 144352
2011-11-11 01:16:15 +00:00
Chad Rosier 2a3503e061 Add support for using MVN to materialize negative constants.
rdar://10412592

llvm-svn: 144348
2011-11-11 00:36:21 +00:00
Jim Grosbach 9bded9dc24 Thumb2 parsing for push/pop w/ hi registers in the reglist.
rdar://10130228.

llvm-svn: 144331
2011-11-10 23:17:11 +00:00
Rafael Espindola 79278365d3 Check in getOrCreateSubprogramDIE if a declaration exists and if so output
it first.

This is a more general fix to pr11300.

llvm-svn: 144324
2011-11-10 22:34:29 +00:00
Jim Grosbach 5a5ce63742 Thumb MUL assembly parsing for 3-operand form.
Get the source register that isn't tied to the destination register correct,
even when the assembly source operand order is backwards.

rdar://10428630

llvm-svn: 144322
2011-11-10 22:10:12 +00:00
Chad Rosier d1762e00e2 When in ARM mode, LDRH/STRH require special handling of negative offsets.
For correctness, disable this for now.
rdar://10418009

llvm-svn: 144316
2011-11-10 21:09:49 +00:00
Jim Grosbach c14871cc67 ARM assembly parsing for LSR/LSL/ROR(immediate).
More of rdar://9704684

llvm-svn: 144301
2011-11-10 19:18:01 +00:00
Jim Grosbach 61db5a59f7 ARM assembly parsing for ASR(immediate).
Start of rdar://9704684

llvm-svn: 144293
2011-11-10 16:44:55 +00:00
NAKAMURA Takumi 270354100a test/CodeGen/X86/lsr-loop-exit-cond.ll: Try to appease linux and freebsd bots to specify explicit -mtriple=x86_64-darwin.
I guess it expects -relocation-model=pic.

llvm-svn: 144290
2011-11-10 14:18:59 +00:00
Evan Cheng d33b2d6b7a Use a bigger hammer to fix PR11314 by disabling the "forcing two-address
instruction lower optimization" in the pre-RA scheduler.

The optimization, rather the hack, was done before MI use-list was available.
Now we should be able to implement it in a better way, perhaps in the
two-address pass until a MI scheduler is available.

Now that the scheduler has to backtrack to handle call sequences. Adding
artificial scheduling constraints is just not safe. Furthermore, the hack
is not taking all the other scheduling decisions into consideration so it's just
as likely to pessimize code. So I view disabling this optimization goodness
regardless of PR11314.

llvm-svn: 144267
2011-11-10 07:43:16 +00:00
Chad Rosier 3fbd094ad9 For immediate encodings of icmp, zero or sign extend first. Then
determine if the value is negative and flip the sign accordingly.
rdar://10422026

llvm-svn: 144258
2011-11-10 01:30:39 +00:00
Jakob Stoklund Olesen eef48b6938 Strip old implicit operands after foldMemoryOperand.
The TII.foldMemoryOperand hook preserves implicit operands from the
original instruction.  This is not what we want when those implicit
operands refer to the register being spilled.

Implicit operands referring to other registers are preserved.

This fixes PR11347.

llvm-svn: 144247
2011-11-10 00:17:03 +00:00
Jim Grosbach 25bc090170 Thumb2 assembly parsing STMDB w/ optional .w suffix.
rdar://10422955

llvm-svn: 144242
2011-11-09 23:44:23 +00:00
Eli Friedman 2d4055b683 Make sure we correctly unroll conversions between v2f64 and v2i32 on ARM.
llvm-svn: 144241
2011-11-09 23:36:02 +00:00
Pete Cooper 856977cb15 DeadStoreElimination can now trim the size of a store if the end of the store is dead.
Currently checks alignment and killing stores on a power of 2 boundary as this is likely
to trim the size of the earlier store without breaking large vector stores into scalar ones.

Fixes <rdar://problem/10140300>

llvm-svn: 144239
2011-11-09 23:07:35 +00:00
Eli Friedman 53218b6fcc Add check so we don't try to perform an impossible transformation. Fixes issue from PR11319.
llvm-svn: 144216
2011-11-09 22:25:12 +00:00
Nadav Rotem 1938482bfa AVX2: Add patterns for variable shift operations
llvm-svn: 144212
2011-11-09 21:22:13 +00:00
Chad Rosier c22f6518b2 Use REs to remove dependencies on the register allocation order.
llvm-svn: 144209
2011-11-09 20:06:13 +00:00
Duncan Sands 635e4efca0 Speculatively revert commit 144124 (djg) in the hope that the 32 bit
dragonegg self-host buildbot will recover (it is complaining about object
files differing between different build stages).  Original commit message:

Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.

llvm-svn: 144188
2011-11-09 14:20:48 +00:00
Nadav Rotem 79135d844d Add AVX2 support for vselect of v32i8
llvm-svn: 144187
2011-11-09 13:21:28 +00:00
Craig Topper f87a2bef51 Enable execution dependency fix pass for YMM registers when AVX2 is enabled. Add AVX2 logical operations to list of replaceable instructions.
llvm-svn: 144179
2011-11-09 09:37:21 +00:00
Craig Topper c9eb09d3b8 Add instruction selection for AVX2 integer comparisons.
llvm-svn: 144176
2011-11-09 08:06:13 +00:00
Craig Topper 8c8a431057 Add AVX2 instruction lowering for add, sub, and mul.
llvm-svn: 144174
2011-11-09 07:28:55 +00:00
Nick Lewycky 0485d51a76 Don't forget to check FlagNW when determining whether an AddRecExpr will wrap
or not. Patch by Brendon Cahoon!

llvm-svn: 144173
2011-11-09 07:11:37 +00:00
Chad Rosier 595d419427 Add support for encoding immediates in icmp and fcmp. Hopefully, this will
remove a fair number of unnecessary materialized constants.
rdar://10412592

llvm-svn: 144163
2011-11-09 03:22:02 +00:00
Jakob Stoklund Olesen 3dc89c9768 Collapse DomainValues across loop back-edges.
During the initial RPO traversal of the basic blocks, remember the ones
that are incomplete because of back-edges from predecessors that haven't
been visited yet.

After the initial RPO, revisit all those loop headers so the incoming
DomainValues on the back-edges can be properly collapsed.

This will properly fix execution domains on software pipelined code,
like the included test case.

llvm-svn: 144151
2011-11-09 01:06:56 +00:00
Dan Gohman a4bc6171a5 Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.

llvm-svn: 144124
2011-11-08 21:29:06 +00:00
Evan Cheng c3770ac687 Add workaround for Cortex-M3 errata 602117 by replacing ldrd x, y, [x] with ldm or ldr pairs.
llvm-svn: 144123
2011-11-08 21:21:09 +00:00
Eli Friedman 0bae8b2cfb Fix code to match comment. Fixes PR11340, a regression from r143209.
llvm-svn: 144121
2011-11-08 21:08:02 +00:00
Pete Cooper 9ee220915b LICM pass now understands invariant load metadata. Nothing generates this yet so it will currently never get used in real tests
llvm-svn: 144107
2011-11-08 19:30:00 +00:00
Pete Cooper fbbbd04705 Adding test for machine-licm operating on invariant load instructions
llvm-svn: 144104
2011-11-08 19:06:53 +00:00
Lang Hames b85fcd07df Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported.
Add support for trimming constants to GetDemandedBits. This fixes some funky
constant generation that occurs when stores are expanded for targets that don't
support unaligned stores natively.

llvm-svn: 144102
2011-11-08 18:56:23 +00:00
NAKAMURA Takumi d8d583f766 test/CodeGen/X86/vec_shuffle-39.ll: Add explicit -mtriple=x86_64-linux. Passing packed value is not compatible on Win32 x64.
llvm-svn: 144068
2011-11-08 03:46:39 +00:00
NAKAMURA Takumi ac9ef21f02 test/CodeGen/X86/vec_shuffle-38.ll: Relax expression for Win32 x64.
llvm-svn: 144067
2011-11-08 03:46:32 +00:00
NAKAMURA Takumi 33dac06330 test/CodeGen/X86/vec_shuffle.ll: Add explicit -mtriple=i686-linux. We may see some suboptimal frame (%ebp) emission on certain hosts. Possible [PR11031]
llvm-svn: 144066
2011-11-08 03:46:25 +00:00
Eli Friedman 6f84fed675 Make sure to mark vector extload's as expand on ARM. Fixes PR11319.
llvm-svn: 144057
2011-11-08 01:43:53 +00:00
Eli Friedman f2a9bd4b1e Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
Re-commit of r144034, with an extra fix so that RemoveDeadNode doesn't blow up.

llvm-svn: 144055
2011-11-08 01:25:24 +00:00
Evan Cheng 91b56e0390 Add x86 isel logic and patterns to match movlps from clang generated IR for _mm_loadl_pi(). rdar://10134392, rdar://10050222
llvm-svn: 144052
2011-11-08 00:31:58 +00:00
Bill Wendling 2a917595d2 Convert to the new EH model.
llvm-svn: 144050
2011-11-08 00:23:01 +00:00
Bill Wendling 2197b015c8 Convert to the new EH model.
llvm-svn: 144049
2011-11-08 00:17:28 +00:00
Bill Wendling 9b7942a543 Convert tests to the new EH model.
llvm-svn: 144048
2011-11-08 00:09:27 +00:00
Chad Rosier 5de1bea5c9 Enable support for returning i1, i8, and i16. Nothing special todo as it's the
callee's responsibility to sign or zero-extend the return value.  The additional
test case just checks to make sure the calls are selected (i.e., -fast-isel-abort
doesn't assert).

llvm-svn: 144047
2011-11-08 00:03:32 +00:00
Pete Cooper 2dc40434aa Added missing newline
llvm-svn: 144046
2011-11-08 00:03:24 +00:00
Eli Friedman a35a5295e0 Revert r144034 while I try to track down a crash.
llvm-svn: 144044
2011-11-07 23:53:20 +00:00
Jakob Stoklund Olesen 9279f9efbc Fix test for Windows as well.
llvm-svn: 144038
2011-11-07 23:10:43 +00:00
Jakob Stoklund Olesen a70e9417fb Kill and collapse outstanding DomainValues.
DomainValues that are only used by "don't care" instructions are now
collapsed to the first possible execution domain after all basic blocks
have been processed.  This typically means the PS domain on x86.

For example, the vsel_i64 and vsel_double functions in sse2-blend.ll are
completely collapsed to the PS domain instead of containing a mix of
execution domains created by isel.

llvm-svn: 144037
2011-11-07 23:08:21 +00:00
Pete Cooper 7a4be01ac8 InstCombine now optimizes vector udiv by power of 2 to shifts
Fixes r8429

llvm-svn: 144036
2011-11-07 23:04:49 +00:00
Eli Friedman 55a86d32d3 Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
llvm-svn: 144034
2011-11-07 22:51:10 +00:00
Benjamin Kramer 69d57cf9c4 Simplify some uses of utohexstr.
As a side effect hex is printed lowercase instead of uppercase now.

llvm-svn: 144013
2011-11-07 21:00:59 +00:00
Jakob Stoklund Olesen 7f076cb6cc Fix test for Linux.
llvm-svn: 144003
2011-11-07 20:47:23 +00:00
Jakob Stoklund Olesen 0241308954 Expand V_SET0 to xorps by default.
The xorps instruction is smaller than pxor, so prefer that encoding.

The ExecutionDepsFix pass will switch the encoding to pxor and xorpd
when appropriate.

llvm-svn: 143996
2011-11-07 19:15:58 +00:00
Craig Topper a6d409d543 Add AVX2 variable shift instructions and intrinsics.
llvm-svn: 143915
2011-11-07 08:26:24 +00:00
Craig Topper ff39be0afc Add AVX2 VPMOVMASK instructions and intrinsics.
llvm-svn: 143904
2011-11-07 03:20:35 +00:00
Craig Topper e122dcbf4a Add AVX2 VEXTRACTI128 and VINSERTI128 instructions. Fix VPERM2I128 to be qualified with HasAVX2 instead of HasAVX. Mark VINSERTF128 and VEXTRACTF128 as never having side effects.
llvm-svn: 143902
2011-11-07 02:00:04 +00:00
Craig Topper f01f1b5cb9 More AVX2 instructions and their intrinsics.
llvm-svn: 143895
2011-11-06 23:04:08 +00:00
Craig Topper 05d1cb98e7 Add more AVX2 instructions and intrinsics.
llvm-svn: 143861
2011-11-06 06:12:20 +00:00
Chad Rosier d0191a53c9 Add support for passing i1, i8, and i16 call parameters. Also, be sure to
zero-extend the constant integer encoding.  Test case provides testing for
both call parameters and materialization of i1, i8, and i16 types.

llvm-svn: 143821
2011-11-05 20:16:15 +00:00
Benjamin Kramer 807cf22b55 Update lit's list of tools.
llvm-svn: 143815
2011-11-05 16:20:52 +00:00
Benjamin Kramer c74798d5cf Add an option to pad an uleb128 to MCObjectWriter and remove the uleb128 encoding from the DWARF asm printer.
As a side effect we now print dwarf ulebs with .ascii directives.

llvm-svn: 143809
2011-11-05 11:52:44 +00:00
Nick Lewycky f2905afe62 Do simple cross-block DSE when we encounter a free statement. Fixes PR11240.
llvm-svn: 143808
2011-11-05 10:48:42 +00:00
Eli Friedman 8f249600e7 Enhanced vzeroupper insertion pass that avoids inserting vzeroupper where it is unnecessary through local analysis. Patch from Bruno Cardoso Lopes, with some additional changes.
I'm going to wait for any review comments and perform some additional testing before turning this on by default.

llvm-svn: 143750
2011-11-04 23:46:11 +00:00
Daniel Dunbar 21079cad11 build/cmake: Change to require Python be available.
llvm-svn: 143742
2011-11-04 23:04:05 +00:00
Rafael Espindola c2a8401ad2 Add triple to test.
llvm-svn: 143735
2011-11-04 20:20:34 +00:00
Rafael Espindola 6cf4e830ce Emit declarations before definitions if they are available. This causes DW_AT_specification to
point back in the file in the included testcase. Fixes PR11300.

llvm-svn: 143726
2011-11-04 19:00:29 +00:00
Dan Gohman ce3d6248b2 Add tests for existing InstSimplify features.
llvm-svn: 143721
2011-11-04 18:39:16 +00:00
Dan Gohman 85977e6ab4 Teach instsimplify to simplify calls to undef.
llvm-svn: 143719
2011-11-04 18:32:42 +00:00
Craig Topper b9a46e6b83 Add intrinsics for X86 vcvtps2ph and vcvtph2ps instructions
llvm-svn: 143682
2011-11-04 06:59:21 +00:00
Chad Rosier f3e73ad5da Add fast-isel support for returning i1, i8, and i16.
llvm-svn: 143669
2011-11-04 00:50:21 +00:00
Daniel Dunbar e6d40de414 Speculatively revert "DeadStoreElimination can now trim the size of a store if
the end of it is dead.", which appears to break bootstrapping LLVM.

llvm-svn: 143668
2011-11-04 00:48:26 +00:00
Dan Gohman 198b7ffc11 Reapply r143206, with fixes. Disallow physical register lifetimes
across calls, and only check for nested dependences on the special
call-sequence-resource register.

llvm-svn: 143660
2011-11-03 21:49:52 +00:00
Pete Cooper 65ba66c660 Reverted r143600 - selector reference change
llvm-svn: 143646
2011-11-03 20:47:50 +00:00
Dan Bailey b68515c232 fixed global array handling for ptx to use the correct bit widths
llvm-svn: 143640
2011-11-03 19:24:46 +00:00
Pete Cooper 8a95aedb5d DeadStoreElimination can now trim the size of a store if the end of it is dead.
Only currently done if the later store is writing to a power of 2 address or 
has the same alignment as the earlier store as then its likely to not break up
large stores into smaller ones

Fixes <rdar://problem/10140300>

llvm-svn: 143630
2011-11-03 18:01:56 +00:00
Craig Topper 0e7cbbabea Add new X86 AVX2 VBROADCAST instructions.
llvm-svn: 143612
2011-11-03 07:35:53 +00:00
Chad Rosier bf5f4bec1a Add support for sign-extending non-legal types in SelectSIToFP().
llvm-svn: 143603
2011-11-03 02:04:59 +00:00
Pete Cooper e6173d81ae Treat objc selector reference globals as invariant so that MachineLICM can hoist them out of loops. Fixes <rdar://problem/6027699>
llvm-svn: 143600
2011-11-03 00:56:36 +00:00
Lang Hames 9929c423a1 Try to lower memset/memcpy/memmove to vector instructions on ARM where the alignment permits.
llvm-svn: 143582
2011-11-02 22:52:45 +00:00
Nick Lewycky 000307fef9 I added the first test to run llvm-dwarfdump.
llvm-svn: 143571
2011-11-02 21:02:27 +00:00
Nick Lewycky d1ee7f8cf1 Don't emit a directory entry for the value in DW_AT_comp_dir, that is always
implied by directory index zero.

llvm-svn: 143570
2011-11-02 20:55:33 +00:00
Chad Rosier 9cf803c4bf Add support for comparing integer non-legal types.
llvm-svn: 143559
2011-11-02 18:08:25 +00:00
Owen Anderson fbb704f551 Fix the issue that r143552 was trying to address the _right_ way. One-register lists are legal on LDM/STM instructions, but we should not print the PUSH/POP aliases when they appear. This fixes round tripping on this instruction.
llvm-svn: 143557
2011-11-02 18:03:14 +00:00
Daniel Dunbar 5aba1b4ea3 tests: Clean up tests/CMakeLists.txt to drop some variable configuration we no
longer need substitutions for.

llvm-svn: 143555
2011-11-02 17:54:51 +00:00
Andrew Trick c2c79c90f2 Rewrite LinearFunctionTestReplace to handle pointer-type IVs.
We've been hitting asserts in this code due to the many supported
combintions of modes (iv-rewrite/no-iv-rewrite) and IV types. This
second rewrite of the code attempts to deal with these cases systematically.

llvm-svn: 143546
2011-11-02 17:19:57 +00:00
Craig Topper a47b05c7f3 More AVX2 instructions and intrinsics.
llvm-svn: 143536
2011-11-02 06:54:17 +00:00
Craig Topper 682b850602 Add a bunch more X86 AVX2 instructions and their corresponding intrinsics.
llvm-svn: 143529
2011-11-02 04:42:13 +00:00
Andrew Trick 0dae890346 Broaden an assert to handle enable-iv-rewrite=true following r143183.
Narrowest possible fix for PR11279.

llvm-svn: 143522
2011-11-02 00:02:45 +00:00
Kevin Enderby 82ed3be1fb Fixed a bug in the code to create a dwarf file and directory table entires when
it is separating the directory part from the basename of the FileName.  Noticed 
that this:

  .file 1 "dir/foo"

when assembled got the two parts switched.  Using the Mac OS X dwarfdump tool
it can be seen easily:

% dwarfdump -a a.out
include_directories[  1] = 'foo'
                Dir  Mod Time   File Len   File Name
                ---- ---------- ---------- ---------------------------
file_names[  1]    1 0x00000000 0x00000000 dir
...

Which should be:
...
include_directories[  1] = 'dir'
                Dir  Mod Time   File Len   File Name
                ---- ---------- ---------- ---------------------------
file_names[  1]    1 0x00000000 0x00000000 foo

llvm-svn: 143521
2011-11-01 23:39:05 +00:00
Owen Anderson 69e54a740c Fix disassembly of some VST1 instructions.
llvm-svn: 143507
2011-11-01 22:18:13 +00:00
Eli Friedman 3f5eccbe7a Teach the x86 backend a couple tricks for dealing with v16i8 sra by a constant splat value. Fixes PR11289.
llvm-svn: 143498
2011-11-01 21:18:39 +00:00
Richard Osborne 56ce0932db Don't fold negative offsets into cp / dp accesses to avoid relocation errors.
This can happen if the address + addend is less than the start of the cp / dp.

llvm-svn: 143459
2011-11-01 11:31:53 +00:00
Richard Osborne 37fe7d6641 Combine various XCore tests for floating point intrinsic support into a single test.
llvm-svn: 143458
2011-11-01 10:51:48 +00:00
Richard Osborne 8591b6b0ab Move various XCore tests to FileCheck
llvm-svn: 143457
2011-11-01 10:41:28 +00:00
Craig Topper fec80c6ad2 Fix operand type for x86 pmadd_ub_sw intrinsic.
llvm-svn: 143455
2011-11-01 07:25:22 +00:00
Eli Friedman a49b828f8f Make sure we use the right insertion point when instcombine replaces a PHI with another instruction. (Specifically, don't insert an arbitrary instruction before a PHI.) Fixes PR11275.
llvm-svn: 143437
2011-11-01 04:49:29 +00:00
Eli Friedman 0eb88775ef Move x86-specific tests into X86 folder.
llvm-svn: 143424
2011-11-01 03:21:48 +00:00
Eli Friedman 6185a2aa7c Move another test requiring x86 into X86 directory.
llvm-svn: 143421
2011-11-01 03:12:47 +00:00
Eli Friedman 2cd281ea67 Move test requiring x86 backend into X86 directory.
llvm-svn: 143420
2011-11-01 03:11:41 +00:00
Matt Beaumont-Gay 1c1a2b8123 Change the actual tests to match the input directory rename (duh)
llvm-svn: 143404
2011-10-31 23:56:52 +00:00
Matt Beaumont-Gay da5e57cba1 Rename "TestObjectFiles" to "Inputs" (like the pattern for Clang tests)
llvm-svn: 143400
2011-10-31 23:46:38 +00:00
Rafael Espindola 300dcb8e37 Move test to the X86 directory, note the PR number and only run MC once.
llvm-svn: 143352
2011-10-31 17:23:09 +00:00
Owen Anderson 40703f4252 More not-crashing NEON disassembly updates for the vld refactoring.
llvm-svn: 143351
2011-10-31 17:17:32 +00:00
Craig Topper 9821e75e64 Fix operand type for int_x86_ssse3_phadd_sw_128 intrinsic
llvm-svn: 143336
2011-10-31 07:16:37 +00:00
Craig Topper 242d1f8c73 Test case for X86 FS/GS Base intrinsics
llvm-svn: 143332
2011-10-31 02:15:47 +00:00
Craig Topper cfcfdf2aab Begin adding AVX2 instructions. No selection support yet other than intrinsics.
llvm-svn: 143331
2011-10-31 02:15:10 +00:00
Nick Lewycky aab6169ef6 Switch new .file directive emission off by default, change llc's flag for it to
-enable-dwarf-directory.

llvm-svn: 143326
2011-10-31 01:06:02 +00:00
Duncan Sands 3d5692a475 Reapply commit 143214 with a fix: m_ICmp doesn't match conditions
with the given predicate, it matches any condition and returns the
predicate - d'oh!  Original commit message:
The expression icmp eq (select (icmp eq x, 0), 1, x), 0 folds to false.
Spotted by my super-optimizer in 186.crafty and 450.soplex.  We really
need a proper infrastructure for handling generalizations of this kind
of thing (which occur a lot), however this case is so simple that I decided
to go ahead and implement it directly.

llvm-svn: 143318
2011-10-30 19:56:36 +00:00
Benjamin Kramer 7402ee6ec2 X86: Emit logical shift by constant splat of <16 x i8> as a <8 x i16> shift and zero out the bits where zeros should've been shifted in.
llvm-svn: 143315
2011-10-30 17:31:21 +00:00
Craig Topper 9cdb9ffa43 Fix return type for X86 mpsadbw instrinsic. The instruction takes in a vector of 8-bit integers, but produces a vector of 16-bit integers.
llvm-svn: 143313
2011-10-30 17:22:45 +00:00
Nadav Rotem c602b2c4de Fix pr11266.
On x86: (shl V, 1) -> add V,V

Hardware support for vector-shift is sparse and in many cases we scalarize the
result. Additionally, on sandybridge padd is faster than shl.

llvm-svn: 143311
2011-10-30 13:24:22 +00:00
Nadav Rotem 1dda6a8ce1 Stabilize the test by specifying an exact cpu target
llvm-svn: 143307
2011-10-30 08:07:50 +00:00
Nadav Rotem bf6568b5d6 Add a new DAGCombine optimization for BUILD_VECTOR.
If all of the inputs are zero/any_extended, create a new simple BV
which can be further optimized by other BV optimizations.

llvm-svn: 143297
2011-10-29 21:23:04 +00:00
Benjamin Kramer 932de2bc86 Force SSE for this test.
llvm-svn: 143291
2011-10-29 19:43:44 +00:00
Benjamin Kramer 594ee77964 SimplifyLibCalls: Use IRBuilder.CreateGlobalString when creating a string for printf->puts, which correctly sets the unnamed_addr bit on the resulting GlobalVariable.
Fixes PR11264.

llvm-svn: 143289
2011-10-29 19:43:31 +00:00
Eli Friedman 3af3c046a9 Revert r143214; it's breaking a bunch of stuff.
llvm-svn: 143265
2011-10-29 00:56:07 +00:00
Dan Gohman 9b9c970148 Revert r143206, as there are still some failing tests.
llvm-svn: 143262
2011-10-29 00:41:52 +00:00
NAKAMURA Takumi 6e315dd8ba test/CodeGen/PowerPC/2008-10-17-AsmMatchingOperands.ll: [PR11218] Mark "REQUIRES: asserts" for now.
llvm-svn: 143247
2011-10-28 23:11:03 +00:00
Jim Grosbach b009a872d7 Add Thumb2 alias for "mov Rd, #imm" to "mvn Rd, #~imm".
When '~imm' is encodable as a t2_so_imm but plain 'imm' is not. For example,
  mov r2, #-3
becomes
  mvn r2, #2

rdar://10349224

llvm-svn: 143235
2011-10-28 22:36:30 +00:00
Owen Anderson 5524ce7d82 Fix illegal disassembly testcase.
llvm-svn: 143231
2011-10-28 21:45:09 +00:00
Duncan Sands 280bc553b3 The expression icmp eq (select (icmp eq x, 0), 1, x), 0 folds to false.
Spotted by my super-optimizer in 186.crafty and 450.soplex.  We really
need a proper infrastructure for handling generalizations of this kind
of thing (which occur a lot), however this case is so simple that I decided
to go ahead and implement it directly.

llvm-svn: 143214
2011-10-28 19:01:20 +00:00
Duncan Sands 985ba6386d A shift of a power of two is a power of two or zero.
For completeness - not spotted in the wild.

llvm-svn: 143211
2011-10-28 18:30:05 +00:00
Duncan Sands 92af0a8a7f Fold icmp ugt (udiv X, Y), X to false. Spotted by my super-optimizer
in 186.crafty.

llvm-svn: 143209
2011-10-28 18:17:44 +00:00
Owen Anderson dde461c8b1 Reapply r143202, with a manual decoding hook for SWP. This change inadvertantly exposed a decoding ambiguity between SWP and CPS that the auto-generated decoder can't handle.
llvm-svn: 143208
2011-10-28 18:02:13 +00:00
Dan Gohman 73057ad24f Reapply r143177 and r143179 (reverting r143188), with scheduler
fixes: Use a separate register, instead of SP, as the
calling-convention resource, to avoid spurious conflicts with
actual uses of SP. Also, fix unscheduling of calling sequences,
which can be triggered by pseudo-two-address dependencies.

llvm-svn: 143206
2011-10-28 17:55:38 +00:00
Jim Grosbach 7a49575d7f Thumb2 ADD/SUB instructions encoding selection outside IT block.
Outside an IT block, "add r3, #2" should select a 32-bit wide encoding
rather than generating an error indicating the 16-bit encoding is only
legal in an IT block (outside, the 'S' suffic is required for the 16-bit
encoding).

rdar://10348481

llvm-svn: 143201
2011-10-28 16:57:07 +00:00
NAKAMURA Takumi 7636f55348 test/MC/AsmParser/2011-09-06-NoNewline.s: Add explicit -mtriple=i386. It uses X86 instruction.
FIXME: Would it be reproduced without target-specific operands?
FIXME: Why run llvm-mc as the same input by 3 times?
llvm-svn: 143195
2011-10-28 14:12:30 +00:00
NAKAMURA Takumi 29ccdd8207 Dwarf: [PR11022] Fix emitting DW_AT_const_value(>i64), to be host-endian-neutral.
Don't assume APInt::getRawData() would hold target-aware endianness nor host-compliant endianness. rawdata[0] holds most lower i64, even on big endian host.

FIXME: Add a testcase for big endian target.

FIXME: Ditto on CompileUnit::addConstantFPValue() ?
llvm-svn: 143194
2011-10-28 14:12:22 +00:00
NAKAMURA Takumi 88dd835f09 test/CodeGen/X86/2010-08-10-DbgConstant.ll: Add explicit -mtriple=i686-linux. It must be for elf!
llvm-svn: 143189
2011-10-28 10:50:52 +00:00
Duncan Sands 225a7037d6 Speculatively disable Dan's commits 143177 and 143179 to see if
it fixes the dragonegg self-host (it looks like gcc is miscompiled).
Original commit messages:
Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

Delete #if 0 code accidentally left in.

llvm-svn: 143188
2011-10-28 09:55:57 +00:00
Nick Lewycky cc64ae140d Always use the string pool, even when it makes the .o larger. This may help
tools that read the debug info in the .o files by making the DIE sizes more
consistent.

llvm-svn: 143186
2011-10-28 05:29:47 +00:00
Andrew Trick effdca9441 LFTR should avoid a type mismatch with null pointer IVs.
Fixes rdar://10359193 Indvar LinearFunctionTestReplace assertion

llvm-svn: 143183
2011-10-28 03:45:11 +00:00
Dan Gohman 4db3f7dd83 Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

llvm-svn: 143177
2011-10-28 01:29:32 +00:00
Jim Grosbach 080a499ee0 ARM Allow 'q' registers in VLD/VST vector lists.
Just treat it as if the constituent D registers where specified.

rdar://10348896

llvm-svn: 143167
2011-10-28 00:06:50 +00:00
Dan Gohman 4c9fca99c9 Remove the Alpha backend.
llvm-svn: 143164
2011-10-27 22:56:32 +00:00
Owen Anderson f211416dde Add testcase for r143162.
llvm-svn: 143163
2011-10-27 22:54:14 +00:00
Jakob Stoklund Olesen e5a6adceac Also set addrmode6 alignment when align==size.
Previously, we were only setting the alignment bits on over-aligned
loads and stores.

llvm-svn: 143160
2011-10-27 22:39:16 +00:00
Evan Cheng f4807a19e8 Avoid partial CPSR dependency from loop backedges. rdar://10357570
llvm-svn: 143145
2011-10-27 21:21:05 +00:00
Daniel Dunbar a054790390 tests: Rip out a bunch of now unused test code relating to use of llvm-gcc in LLVM tests.
llvm-svn: 143143
2011-10-27 20:59:26 +00:00
Daniel Dunbar f7223fcc80 tests: Remove llvm2cpp, I'm pretty sure no one uses this.
llvm-svn: 143142
2011-10-27 20:59:21 +00:00
Duncan Sands 7cb61e5a0e Reapply commit 143028 with a fix: the problem was casting a ConstantExpr Mul
using BinaryOperator (which only works for instructions) when it should have
been a cast to OverflowingBinaryOperator (which also works for constants).
While there, correct a few other dubious looking uses of BinaryOperator.
Thanks to Chad Rosier for the testcase.  Original commit message:
My super-optimizer noticed that we weren't folding this expression to
true: (x *nsw x) sgt 0, where x = (y | 1).  This occurs in 464.h264ref.

llvm-svn: 143125
2011-10-27 19:16:21 +00:00
Benjamin Kramer 652f576a70 2>&1 doesn't work here, it just creates an empty file called "&1"
llvm-svn: 143117
2011-10-27 18:27:45 +00:00
Pete Cooper ce63700797 Changed test to check for correct load size instead of shift as the shift might change if optimised
llvm-svn: 143116
2011-10-27 18:15:58 +00:00
Kevin Enderby 49e6a0da7e Change the sysexit mnemonic (and sysexitl) to never have the REX.W prefix and
not depend on In32BitMode.  Use the sysexitq mnemonic for the version with the
REX.W prefix and only allow it only In64BitMode.  rdar://9738584

llvm-svn: 143112
2011-10-27 17:40:41 +00:00
Jim Grosbach 6ed3845530 Thumb2 t2LDMDB[_UPD] assembly parsing to recognize .w suffix.
rdar://10348844

llvm-svn: 143110
2011-10-27 17:33:59 +00:00
Jim Grosbach ba7f90c7df Thumb2 t2MVNi assembly parsing to recognize ".w" suffix.
rdar://10348584

llvm-svn: 143108
2011-10-27 17:16:55 +00:00
Bob Wilson 1455ce27e4 Revert Duncan's r143028 expression folding which appears to be the culprit
behind a compile failure on 483.xalancbmk.

llvm-svn: 143102
2011-10-27 15:47:25 +00:00
Nick Lewycky d59c0cac6c Teach our Dwarf emission to use the string pool.
llvm-svn: 143097
2011-10-27 06:44:11 +00:00
Eli Friedman e9e356ad6b Don't crash on 128-bit sdiv by constant. Found by inspection.
llvm-svn: 143095
2011-10-27 02:06:39 +00:00
Eli Friedman 73beaf7bbc It is not safe to sink an alloca into a stacksave/stackrestore pair, so don't do that. <rdar://problem/10352360>
llvm-svn: 143093
2011-10-27 01:33:51 +00:00
Chad Rosier d24e7e1d9b A branch predicated on a constant can just FastEmit an unconditional branch.
llvm-svn: 143086
2011-10-27 00:21:16 +00:00
Jim Grosbach 61fdba048f Thumb2 ldr pc-relative encoding fixes.
We were parsing label references to the i12 encoding, which isn't right.
They need to go to the pci variant instead.

More of rdar://10348687

llvm-svn: 143068
2011-10-26 22:22:01 +00:00
Rafael Espindola f5a15529a7 Run test with -verify-machineinstrs.
Patch by Sanjoy Das.

llvm-svn: 143066
2011-10-26 21:20:26 +00:00
Rafael Espindola b3285224cd Fixes an issue reported by -verify-machineinstrs.
Patch by Sanjoy Das.

llvm-svn: 143064
2011-10-26 21:16:41 +00:00
Rafael Espindola 66393c127d This commit introduces two fake instructions MORESTACK_RET and
MORESTACK_RET_RESTORE_R10; which are lowered to a RET and a RET
followed by a MOV respectively.  Having a fake instruction prevents
the verifier from seeing a MachineBasicBlock end with a
non-terminator (MOV).  It also prevents the rather eccentric case of a
MachineBasicBlock ending with RET but having successors nevertheless.

Patch by Sanjoy Das.

llvm-svn: 143062
2011-10-26 21:12:27 +00:00
Lang Hames c47e283430 Make sure short memsets on ARM lower to stores, even when optimizing for size.
llvm-svn: 143055
2011-10-26 20:56:52 +00:00
Duncan Sands ba286d7c73 The maximum power of 2 dividing a power of 2 is itself. This occurs
in 403.gcc and was spotted by my super-optimizer.

llvm-svn: 143054
2011-10-26 20:55:21 +00:00
Jim Grosbach 25d4707c4d Thumb2 remove redundant ".w" suffix from t2MVNCCi pattern.
llvm-svn: 143034
2011-10-26 17:28:15 +00:00
Duncan Sands 1d2bb9882d My super-optimizer noticed that we weren't folding this expression to
true: (x *nsw x) sgt 0, where x = (y | 1).  This occurs in 464.h264ref.

llvm-svn: 143028
2011-10-26 15:31:51 +00:00
James Molloy dd9137aa56 Revert r142530 at least temporarily while a discussion is had on llvm-commits regarding exactly how much optsize should optimize for size over performance.
llvm-svn: 143023
2011-10-26 08:53:19 +00:00
Evan Cheng 043c9d3f7a Revert part of r142530. The patch potentially hurts performance especially
on Darwin platforms where -Os means optimize for size without hurting
performance.

llvm-svn: 143002
2011-10-26 01:17:44 +00:00
Mon P Wang 6ebf401412 The bitcode reader can create an shuffle with a place holder mask which it will
fix up later. For this special case, allow such a mask to be considered valid.
<rdar://problem/8622574>

llvm-svn: 142992
2011-10-26 00:34:48 +00:00
Michael J. Spencer 8ab7b036f7 Object: change test to create archive.
llvm-svn: 142982
2011-10-25 22:30:58 +00:00
Chad Rosier 67a5df5329 Add a few test cases to ensure the bitcode reader is backward compatible with
LLVM 2.9.  My understanding is that we plan to maintain compatibility with 2.9
until the 3.1 release.  At that time we can generate new test cases using LLVM
3.0.

llvm-svn: 142958
2011-10-25 20:33:19 +00:00
Chad Rosier 1248500425 Simplify tests by not piping them through llvm-dis.
llvm-svn: 142948
2011-10-25 19:59:50 +00:00
Duncan Sands a370f3e34e Restore commits 142790 and 142843 - they weren't breaking the build
bots.  Original commit messages:
- Reapply r142781 with fix. Original message:

  Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

- Now that we look at all the header PHIs, we need to consider all the header PHIs
  when deciding that the loop has stopped evolving. Fixes miscompile in the gcc
  torture testsuite!

llvm-svn: 142919
2011-10-25 12:28:52 +00:00
Chandler Carruth 32f46e7c07 Fix the API usage in loop probability heuristics. It was incorrectly
classifying many edges as exiting which were in fact not. These mainly
formed edges into sub-loops. It was also not correctly classifying all
returning edges out of loops as leaving the loop. With this match most
of the loop heuristics are more rational.

Several serious regressions on loop-intesive benchmarks like perlbench's
loop tests when built with -enable-block-placement are fixed by these
updated heuristics. Unfortunately they in turn uncover some other
regressions. There are still several improvemenst that should be made to
loop heuristics including trip-count, and early back-edge management.

llvm-svn: 142917
2011-10-25 09:47:41 +00:00
Duncan Sands 805c5b92c8 Speculatively revert commits 142790 and 142843 to see if it fixes
the dragonegg and llvm-gcc self-host buildbots.  Original commit
messages:
- Reapply r142781 with fix. Original message:

  Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

- Now that we look at all the header PHIs, we need to consider all the header PHIs
when deciding that the loop has stopped evolving. Fixes miscompile in the gcc
torture testsuite!

llvm-svn: 142916
2011-10-25 09:26:43 +00:00
Chad Rosier 48d436618d Fix these test cases to not use .bc files. Otherwise, we run into issues with
bitcode reader/writer backward compatibility.

llvm-svn: 142896
2011-10-25 01:22:20 +00:00
Jim Grosbach 17ec1a19e5 ARM assembly parsing and encoding for VLD1 with writeback.
Four entry register lists.

llvm-svn: 142882
2011-10-25 00:14:01 +00:00
Dan Gohman b43c36f391 Remove the Blackfin backend.
llvm-svn: 142880
2011-10-25 00:05:42 +00:00
Dan Gohman dfc96aea90 Remove the SystemZ backend.
llvm-svn: 142878
2011-10-24 23:48:32 +00:00
Jim Grosbach 92fd05ecdc ARM assembly parsing and encoding for VLD1 w/ writeback.
Three entry register list variation.

llvm-svn: 142876
2011-10-24 23:26:05 +00:00
Eli Friedman a5e244c08d Don't crash on variable insertelement on ARM. PR10258.
llvm-svn: 142871
2011-10-24 23:08:52 +00:00
Bill Wendling 57e3aaad89 Check the visibility of the global variable before placing it into the stubs
table. A hidden variable could potentially end up in both lists.
<rdar://problem/10336715>

llvm-svn: 142869
2011-10-24 23:05:43 +00:00
Jim Grosbach 3ea0657d54 ARM assembly parsing and encoding for VLD1 w/ writeback.
One and two length register list variants.

llvm-svn: 142861
2011-10-24 22:16:58 +00:00
Nick Lewycky a58fb48a55 Now that we look at all the header PHIs, we need to consider all the header PHIs
when deciding that the loop has stopped evolving. Fixes miscompile in the gcc
torture testsuite!

llvm-svn: 142843
2011-10-24 21:02:38 +00:00
Owen Anderson 295b1e84ce Fix a NEON disassembly case that was broken in the recent refactorings. As more of this code gets refactored, a lot of these manual decoding hooks should get smaller and/or go away entirely.
llvm-svn: 142817
2011-10-24 18:04:29 +00:00
Dan Gohman 2c9bda1512 Remove the explicit request for "Latency" scheduling from MSP430,
as the Latency scheduler is going away.

llvm-svn: 142811
2011-10-24 17:53:16 +00:00
Dan Gohman c32af340fc Change the default scheduler from Latency to ILP, since Latency
is going away.

llvm-svn: 142810
2011-10-24 17:45:02 +00:00
Jim Grosbach 3adec13c3e Update test for r142801.
llvm-svn: 142806
2011-10-24 17:26:26 +00:00
Benjamin Kramer d48d52e7b2 XFAIL test on leak checkers.
llvm-svn: 142804
2011-10-24 17:24:05 +00:00
Chandler Carruth 7111f4564c Remove return heuristics from the static branch probabilities, and
introduce no-return or unreachable heuristics.

The return heuristics from the Ball and Larus paper don't work well in
practice as they pessimize early return paths. The only good hitrate
return heuristics are those for:
 - NULL return
 - Constant return
 - negative integer return

Only the last of these three can possibly require significant code for
the returning block, and even the last is fairly rare and usually also
a constant. As a consequence, even for the cold return paths, there is
little code on that return path, and so little code density to be gained
by sinking it. The places where sinking these blocks is valuable (inner
loops) will already be weighted appropriately as the edge is a loop-exit
branch.

All of this aside, early returns are nearly as common as all three of
these return categories, and should actually be predicted as taken!
Rather than muddy the waters of the static predictions, just remain
silent on returns and let the CFG itself dictate any layout or other
issues.

However, the return heuristic was flagging one very important case:
unreachable. Unfortunately it still gave a 1/4 chance of the
branch-to-unreachable occuring. It also didn't do a rigorous job of
finding those blocks which post-dominate an unreachable block.

This patch builds a more powerful analysis that should flag all branches
to blocks known to then reach unreachable. It also has better worst-case
runtime complexity by not looping through successors for each block. The
previous code would perform an N^2 walk in the event of a single entry
block branching to N successors with a switch where each successor falls
through to the next and they finally fall through to a return.

Test case added for noreturn heuristics. Also doxygen comments improved
along the way.

llvm-svn: 142793
2011-10-24 12:01:08 +00:00
Nick Lewycky 9be7f277e4 Reapply r142781 with fix. Original message:
Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

llvm-svn: 142790
2011-10-24 06:57:05 +00:00
Nick Lewycky dd1d3df524 A dead malloc, a free(NULL) and a free(undef) are all trivially dead
instructions.

This doesn't introduce any optimizations we weren't doing before (except
potentially due to pass ordering issues), now passes will eliminate them sooner
as part of their own cleanups.

llvm-svn: 142787
2011-10-24 04:35:36 +00:00
Nick Lewycky 9d28c26d77 Speculatively revert r142781. Bots are showing
Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed.
coming out of indvars.

llvm-svn: 142786
2011-10-24 04:00:25 +00:00
Nick Lewycky 1700007ecc Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
loop header when computing the trip count.

With this, we now constant evaluate:
  struct ListNode { const struct ListNode *next; int i; };
  static const struct ListNode node1 = {0, 1};
  static const struct ListNode node2 = {&node1, 2};
  static const struct ListNode node3 = {&node2, 3};
  int test() {
    int sum = 0;
    for (const struct ListNode *n = &node3; n != 0; n = n->next)
      sum += n->i;
    return sum;
  }

llvm-svn: 142781
2011-10-23 23:43:14 +00:00
Craig Topper b05d9e9bea Add X86 SARX, SHRX, and SHLX instructions.
llvm-svn: 142779
2011-10-23 22:18:24 +00:00
Chandler Carruth 1c8ace0e89 Teach the BranchProbabilityInfo pass to print its results, and use that
to bring it under direct test instead of merely indirectly testing it in
the BlockFrequencyInfo pass.

The next step is to start adding tests for the various heuristics
employed, and to start fixing those heuristics once they're under test.

llvm-svn: 142778
2011-10-23 21:21:50 +00:00
Chandler Carruth bd1be4d01c Completely re-write the algorithm behind MachineBlockPlacement based on
discussions with Andy. Fundamentally, the previous algorithm is both
counter productive on several fronts and prioritizing things which
aren't necessarily the most important: static branch prediction.

The new algorithm uses the existing loop CFG structure information to
walk through the CFG itself to layout blocks. It coalesces adjacent
blocks within the loop where the CFG allows based on the most likely
path taken. Finally, it topologically orders the block chains that have
been formed. This allows it to choose a (mostly) topologically valid
ordering which still priorizes fallthrough within the structural
constraints.

As a final twist in the algorithm, it does violate the CFG when it
discovers a "hot" edge, that is an edge that is more than 4x hotter than
the competing edges in the CFG. These are forcibly merged into
a fallthrough chain.

Future transformations that need te be added are rotation of loop exit
conditions to be fallthrough, and better isolation of cold block chains.
I'm also planning on adding statistics to model how well the algorithm
does at laying out blocks based on the probabilities it receives.

The old tests mostly still pass, and I have some new tests to add, but
the nested loops are still behaving very strangely. This almost seems
like working-as-intended as it rotated the exit branch to be
fallthrough, but I'm not convinced this is actually the best layout. It
is well supported by the probabilities for loops we currently get, but
those are pretty broken for nested loops, so this may change later.

llvm-svn: 142743
2011-10-23 09:18:45 +00:00
Craig Topper 980d59832a Add X86 RORX instruction
llvm-svn: 142741
2011-10-23 07:34:00 +00:00
Cameron Zwarich 057fbb1a10 The element insertion code in scalar replacement doesn't handle incorrect
element types, even though the element extraction code does. It is surprising
that this bug has been here for so long. Fixes <rdar://problem/10318778>.

llvm-svn: 142740
2011-10-23 07:02:10 +00:00
Craig Topper e94d277db8 Add X86 MULX instruction for disassembler.
llvm-svn: 142738
2011-10-23 00:33:32 +00:00
Nick Lewycky 52340ac5f8 Oops! Fix test I forgot to submit as part of r142735.
llvm-svn: 142736
2011-10-22 22:07:31 +00:00
Nick Lewycky 32f8051d66 A non-escaping malloc in the entry block is not unlike an alloca. Do dead-store
elimination on them too.

llvm-svn: 142735
2011-10-22 21:59:35 +00:00
Nick Lewycky a6674c7fc9 Make SCEV's brute force analysis stronger in two ways. Firstly, we should be
able to constant fold load instructions where the argument is a constant.
Second, we should be able to watch multiple PHI nodes through the loop; this
patch only supports PHIs in loop headers, more can be done here.

With this patch, we now constant evaluate:
  static const int arr[] = {1, 2, 3, 4, 5};
  int test() {
    int sum = 0;
    for (int i = 0; i < 5; ++i) sum += arr[i];
    return sum;
  }

llvm-svn: 142731
2011-10-22 19:58:20 +00:00
Nadav Rotem e649d66552 Fix pr11193.
SHL inserts zeros from the right, thus even when the original
sign_extend_inreg value was of 1-bit, we need to sra.

llvm-svn: 142724
2011-10-22 12:39:25 +00:00
Jim Grosbach 11c0b347c6 Assembly parsing for 4-register sequential variant of VLD2.
llvm-svn: 142704
2011-10-21 23:58:57 +00:00
Jim Grosbach 118b38cbf1 Assembly parsing for 2-register sequential variant of VLD2.
llvm-svn: 142691
2011-10-21 22:21:10 +00:00
Eli Friedman 688db1d6d0 Remap blockaddress correctly when inlining a function. Fixes PR10162.
llvm-svn: 142684
2011-10-21 20:45:19 +00:00
Jim Grosbach 846bcff7c7 Assembly parsing for 4-register variant of VLD1.
llvm-svn: 142682
2011-10-21 20:35:01 +00:00
Jim Grosbach c4360fe575 Assembly parsing for 3-register variant of VLD1.
llvm-svn: 142675
2011-10-21 20:02:19 +00:00
Eli Friedman ce818277fc Extend instcombine's shufflevector simplification to handle more cases where the input and output vectors have different sizes. Patch by Xiaoyi Guo.
llvm-svn: 142671
2011-10-21 19:06:29 +00:00
Jim Grosbach 2f2e3c4737 ARM VLD parsing and encoding.
Next step in the ongoing saga of NEON load/store assmebly parsing. Handle
VLD1 instructions that take a two-register register list.

Adjust the instruction definitions to only have the single encoded register
as an operand. The super-register from the pseudo is kept as an implicit def,
so passes which come after pseudo-expansion still know that the instruction
defines the other subregs.

llvm-svn: 142670
2011-10-21 18:54:25 +00:00
Nadav Rotem 5e00bb5feb Fix pr11194. When promoting and splitting integers we need to use
ZExtPromotedInteger and SExtPromotedInteger based on the operation we legalize.

SetCC return type needs to be legalized via PromoteTargetBoolean.

llvm-svn: 142660
2011-10-21 17:35:19 +00:00
Chandler Carruth 70a38058b1 Don't hard code the desired alignment for loops -- it isn't 16-bytes on
all x86 systems. Sorry for the breakage.

llvm-svn: 142656
2011-10-21 16:41:39 +00:00
Nadav Rotem d315157f12 1. Fix the widening of SETCC in WidenVecOp_SETCC. Use the correct return CC type.
2. Fix a typo in CONCAT_VECTORS which exposed the bug in #1.

llvm-svn: 142648
2011-10-21 11:42:07 +00:00
Chandler Carruth 8b9737cb54 Add loop aligning to MachineBlockPlacement based on review discussion so
it's a bit more plausible to use this instead of CodePlacementOpt. The
code for this was shamelessly stolen from CodePlacementOpt, and then
trimmed down a bit. There doesn't seem to be much utility in returning
true/false from this pass as we may or may not have rewritten all of the
blocks. Also, the statistic of counting how many loops were aligned
doesn't seem terribly important so I removed it. If folks would like it
to be included, I'm happy to add it back.

This was probably the most egregious of the missing features, and now
I'm going to start gathering some performance numbers and looking at
specific loop structures that have different layout between the two.

Test is updated to include both basic loop alignment and nested loop
alignment.

llvm-svn: 142645
2011-10-21 08:57:37 +00:00
Chandler Carruth ddfeaafdfb Add a very basic test for MachineBlockPlacement. This is essentially the
canonical example I used when developing it, and is one of the primary
motivating real-world use cases for __builtin_expect (when burried under
a macro).

I'm working on more test cases here, but I'm trying to make sure both
that the pass is doing the right thing with the test cases and that they
aren't too brittle to changes elsewhere in the code generation pipeline.

Feedback and/or suggestions on how to test this are very welcome.
Especially feedback on whether testing the block comments is a good
strategy; I couldn't find any good examples to steal from but all the
other ideas I had were a lot uglier or more fragile.

llvm-svn: 142644
2011-10-21 08:01:56 +00:00
Craig Topper 039a79067a Remove intrinsics for X86 BLSI, BLSMSK, and BLSR intrinsics and replace with custom isel lowering code.
llvm-svn: 142642
2011-10-21 06:55:01 +00:00
Owen Anderson 16c8fc5191 Revert r142618, r142622, and r142624, which were based on an incorrect reading of the ARMv7 docs.
llvm-svn: 142626
2011-10-20 22:23:58 +00:00
Owen Anderson 608c60c773 Fix decoding tests for fixed MSR encodings.
llvm-svn: 142624
2011-10-20 22:01:48 +00:00
Owen Anderson 48da0ed477 Fix tests for corrected MSR encodings.
llvm-svn: 142622
2011-10-20 21:53:19 +00:00
Jim Grosbach 9036c5cf2b ARM VLD1/VST1 (one register, no writeback) assembly parsing and encoding.
llvm-svn: 142583
2011-10-20 15:04:25 +00:00
Jim Grosbach 3ad44e50b3 Tidy up formatting.
llvm-svn: 142582
2011-10-20 14:57:47 +00:00
Jim Grosbach 8db25984a9 ARM VTBX (one register) assembly parsing and encoding.
llvm-svn: 142581
2011-10-20 14:48:50 +00:00
Eli Friedman 1923a330e6 Refactor code from inlining and globalopt that checks whether a function definition is unused, and enhance it so it can tell that functions which are only used by a blockaddress are in fact dead. This probably doesn't happen much on most code, but the Linux kernel's _THIS_IP_ can trigger this issue with blockaddress. (GlobalDCE can also handle the given tescase, but we only run that at -O3.) Found while looking at PR11180.
llvm-svn: 142572
2011-10-20 05:23:42 +00:00
Nick Lewycky 462098824f "@string = constant i8 0" is a value i8* string of length zero. Analyze that
correctly in GetStringLength, fixing PR11181!

llvm-svn: 142558
2011-10-20 00:34:35 +00:00
Chad Rosier add38c12b8 Revert 142337. Thumb1 still doesn't support dynamic stack realignment. :(
llvm-svn: 142557
2011-10-20 00:07:12 +00:00
Evan Cheng 54d678fff4 Fix TLS lowering bug. The CopyFromReg must be glued to the TLSCALL. rdar://10291355
llvm-svn: 142550
2011-10-19 22:22:54 +00:00
Nadav Rotem 8824472a25 Improve code generation for vselect on SSE2:
When checking the availability of instructions using the TLI, a 'promoted'
instruction IS available. It means that the value is bitcasted to another type
for which there is an operation. The correct check for the availablity of an
instruction is to check if it should be expanded.

llvm-svn: 142542
2011-10-19 20:43:16 +00:00
Rafael Espindola e0d0908356 Fix parsing of a line with only a # in it.
llvm-svn: 142537
2011-10-19 18:48:52 +00:00
James Molloy 2d768fd379 Use literal pool loads instead of MOVW/MOVT for materializing global addresses when optimizing for size.
On spec/gcc, this caused a codesize improvement of ~1.9% for ARM mode and ~4.9% for Thumb(2) mode. This is
codesize including literal pools.

The pools themselves doubled in size for ARM mode and quintupled for Thumb mode, leaving suggestion that there
is still perhaps redundancy in LLVM's use of constant pools that could be decreased by sharing entries.

Fixes PR11087.

llvm-svn: 142530
2011-10-19 14:11:07 +00:00
David Greene 13c8360c26 Add Paste Test
This tests TableGen's paste functionality.

llvm-svn: 142526
2011-10-19 13:04:50 +00:00
David Greene d699161a99 Add NAME Member
Add a Value named "NAME" to each Record.  This will be set to the def or defm
name when instantiating multiclasses.  This will replace the #NAME# processing
hack once paste functionality is in place.

llvm-svn: 142518
2011-10-19 13:04:13 +00:00
Chandler Carruth deac50cba9 Generalize the reading of probability metadata to work for both branches
and switches, with arbitrary numbers of successors. Still optimized for
the common case of 2 successors for a conditional branch.

Add a test case for switch metadata showing up in the BlockFrequencyInfo pass.

llvm-svn: 142493
2011-10-19 10:32:19 +00:00
Chandler Carruth d27a7a947b Teach the BranchProbabilityInfo analysis pass to read any metadata
encoding of probabilities. In the absense of metadata, it continues to
fall back on static heuristics.

This allows __builtin_expect, after lowering through llvm.expect
a branch instruction's metadata, to actually enter the branch
probability model. This is one component of resolving PR2577.

llvm-svn: 142492
2011-10-19 10:30:30 +00:00
Chandler Carruth 343fad44ea Add pass printing support to BlockFrequencyInfo pass. The implementation
layer already had support for printing the results of this analysis, but
the wiring was missing.

Now that printing the analysis works, actually bring some of this
analysis, and the BranchProbabilityInfo analysis that it wraps, under
test! I'm planning on fixing some bugs and doing other work here, so
having a nice place to add regression tests and a way to observe the
results is really useful.

llvm-svn: 142491
2011-10-19 10:12:41 +00:00
Nadav Rotem 6652e22bad Add support for the vector-widening of vselect and vector-setcc
llvm-svn: 142488
2011-10-19 09:45:11 +00:00
Craig Topper ef309c3384 Rename PEXTR to PEXT. Add intrinsics for BMI instructions.
llvm-svn: 142480
2011-10-19 07:48:35 +00:00
Lang Hames 20a04e74e3 Added testcase for <rdar://problem/10215997>
llvm-svn: 142462
2011-10-18 23:50:52 +00:00
Nadav Rotem 0d3393356b Add additional element-promotion tests.
llvm-svn: 142442
2011-10-18 23:05:33 +00:00
Nadav Rotem 75c2229f41 Fix a bug in the legalization of vector anyext-load and trunc-store. Mem Index starts with zero.
llvm-svn: 142434
2011-10-18 22:32:43 +00:00
Jim Grosbach 43f1d206b9 Tidy up formatting.
llvm-svn: 142422
2011-10-18 21:09:01 +00:00
Jim Grosbach 1f63e04b2c Tidy up formatting.
llvm-svn: 142421
2011-10-18 21:08:16 +00:00
Jim Grosbach 4e5c764b65 Enable more encoded immediate tests.
llvm-svn: 142415
2011-10-18 20:20:51 +00:00
Jim Grosbach 89f9e1dca4 More vmov lane testcases.
llvm-svn: 142414
2011-10-18 20:19:48 +00:00
Jim Grosbach e9f204c197 ARM vmla/vmls assembly parsing for the lane index operand.
llvm-svn: 142413
2011-10-18 20:14:56 +00:00
Jim Grosbach 712f3670fd ARM vmov assembly parsing for the lane index operand.
llvm-svn: 142412
2011-10-18 20:10:47 +00:00
Michael J. Spencer bfa067862c llvm-objdump: Add static symbol table dumping.
llvm-svn: 142404
2011-10-18 19:32:17 +00:00
Jim Grosbach 611450071c ARM vmla/vmls assembly parsing for the lane index operand.
llvm-svn: 142389
2011-10-18 18:27:07 +00:00
Owen Anderson 40ec1da2ab Another failing encoding.
llvm-svn: 142388
2011-10-18 18:23:03 +00:00
Jim Grosbach 32b83a4e16 Fix NEON mul encoding tests. Wrong file contents previously.
llvm-svn: 142387
2011-10-18 18:14:55 +00:00
Jim Grosbach c8eff0327a ARM vqdmulh assembly parsing for the lane index operand.
llvm-svn: 142386
2011-10-18 18:12:09 +00:00
Jim Grosbach d1bc6da657 Remove duplicate test.
llvm-svn: 142383
2011-10-18 18:05:50 +00:00
Jim Grosbach b3ecff77cd Tidy up formatting.
llvm-svn: 142382
2011-10-18 18:05:16 +00:00
Jim Grosbach e6fbca3a61 ARM vmul assembly parsing for the lane index operand.
llvm-svn: 142381
2011-10-18 18:01:52 +00:00
Jim Grosbach f416cb16c0 Tidy up.
llvm-svn: 142380
2011-10-18 18:01:09 +00:00
Owen Anderson c91064551a Add a few more testcases.
llvm-svn: 142379
2011-10-18 17:57:31 +00:00
Owen Anderson 2a498c1107 Add several FIXME cases for ARM encodings.
llvm-svn: 142377
2011-10-18 17:50:22 +00:00
Bob Wilson 9258b76d8d Fix incorrect check for sign-extended constant BUILD_VECTOR.
<rdar://problem/10298332>

llvm-svn: 142371
2011-10-18 17:34:51 +00:00
Bob Wilson 681561901d Fix a DAG combiner assertion failure when constant folding BUILD_VECTORS.
svn r139159 caused SelectionDAG::getConstant() to promote BUILD_VECTOR operands
with illegal types, even before type legalization.  For this testcase, that led
to one BUILD_VECTOR with i16 operands and another with promoted i32 operands,
which triggered the assertion.

llvm-svn: 142370
2011-10-18 17:34:47 +00:00
Jim Grosbach 8206790ab0 Tests for 142365.
llvm-svn: 142368
2011-10-18 17:23:34 +00:00
Jim Grosbach 95135982cd Tidy up formatting.
llvm-svn: 142367
2011-10-18 17:22:53 +00:00
Jim Grosbach e4454e0de2 ARM assembly parsing and encoding for VMOV.i64.
llvm-svn: 142356
2011-10-18 16:18:11 +00:00
Justin Holewinski 1fb5bb126e PTX: Fix disabling of MAD instruction selection
llvm-svn: 142352
2011-10-18 13:39:20 +00:00
Chad Rosier 0ffe593a16 Add support for dynamic stack realignment when in thumb1 mode.
rdar://10288916

llvm-svn: 142337
2011-10-18 05:28:00 +00:00
Jim Grosbach 8211c051ca ARM assembly parsing and encoding for VMOV/VMVN/VORR/VBIC.i32.
llvm-svn: 142321
2011-10-18 00:22:00 +00:00
Michael J. Spencer 81c80ddb0c Revert "llvm-objdump: Add static symbol table dumping."
This reverts commit 0c30d4e4f5f9110c5a67bd0ca84444dc58697596.

llvm-svn: 142320
2011-10-18 00:17:04 +00:00
Michael J. Spencer 6b22ef8af2 llvm-objdump: Add static symbol table dumping.
llvm-svn: 142319
2011-10-17 23:55:22 +00:00
Jim Grosbach 26bfc9e5da Enable a few more NEON immediate tests.
llvm-svn: 142313
2011-10-17 23:50:19 +00:00
Jim Grosbach cda32ae372 ARM assembly parsing and encoding for VMOV/VMVN/VORR/VBIC.i16.
llvm-svn: 142303
2011-10-17 23:09:09 +00:00
Nick Lewycky 40f8f2ff24 Add support for a new extension to the .file directive:
.file filenumber "directory" "filename"

This removes one join+split of the directory+filename in MC internals. Because
bitcode files have independent fields for directory and filenames in debug info,
this patch may change the .o files written by existing .bc files.

llvm-svn: 142300
2011-10-17 23:05:28 +00:00
Dan Gohman a7107f992e Teach the ARC optimizer about the !clang.arc.copy_on_escape metadata
tag on objc_retainBlock calls, which indicates that they may be
optimized away. rdar://10211286.

llvm-svn: 142298
2011-10-17 22:53:25 +00:00
Jim Grosbach 741cd73aab ARM NEON "vmov.i8" immediate assembly parsing and encoding.
NEON immediates are "interesting". Start of the work to handle parsing them
in an 'as' compatible manner. Getting the matcher to play nicely with
these and the floating point immediates from VFP is an extra fun wrinkle.

llvm-svn: 142293
2011-10-17 22:26:03 +00:00
Lang Hames e7594abd87 Fixed quoting on default data layout option.
llvm-svn: 142286
2011-10-17 21:54:43 +00:00
Bill Wendling c68c8cb8d4 Add support for the Objective-C personality function to the instruction
combining of the landingpad instruction. The ObjC personality function acts
almost identically to the C++ personality function. In particular, it uses
"null" as a "catch-all" value.

llvm-svn: 142256
2011-10-17 21:20:24 +00:00
Nadav Rotem d2c72d6d03 Add CHECKs and document PR11158.
llvm-svn: 142240
2011-10-17 20:23:23 +00:00
Nadav Rotem 28e4f67f26 stabalize tests by specifying the exact sse level
llvm-svn: 142229
2011-10-17 19:45:38 +00:00
Dan Gohman 1736c14b85 Suppress partial retain+release elimination when there's a
possibility that it will span multiple CFG diamonds/triangles which
could have different controlling predicates.  rdar://10282956

llvm-svn: 142222
2011-10-17 18:48:25 +00:00
Bill Wendling 63a4ea1859 Correct over-zealous removal of hack.
Some code want to check that *any* call within a function has the 'returns
twice' attribute, not just that the current function has one.

llvm-svn: 142221
2011-10-17 18:43:40 +00:00
Bill Wendling 42cf65fe51 Temporarily XFAIL waiting for a fix.
llvm-svn: 142215
2011-10-17 18:25:32 +00:00
Michael J. Spencer 4e25c02487 llvm-objdump: Add -s, which prints the contents of each section.
llvm-svn: 142199
2011-10-17 17:13:22 +00:00
Michael J. Spencer b4f19a5d86 llvm-objdump: Add tests.
llvm-svn: 142198
2011-10-17 17:13:05 +00:00
Hal Finkel 9dda3e0d13 use FileCheck and not grep in new tests
llvm-svn: 142189
2011-10-17 16:01:41 +00:00
Nadav Rotem 83d1a93cf4 Clean the triple, add check lines.
llvm-svn: 142183
2011-10-17 07:07:51 +00:00
Nadav Rotem 89c282e96f Previously v2i32 vectors were legalized to v4i32. Now, they are legalized to
v2i64. These tests do not check MMX nor zmoving into them.

llvm-svn: 142182
2011-10-17 06:59:01 +00:00
Hal Finkel 7ccb391d21 Test case for CanLowerReturn fix (r141981)
llvm-svn: 142172
2011-10-17 04:03:59 +00:00
Hal Finkel ad677b64db Add PPC 440 scheduler and some associated tests (new files)
llvm-svn: 142171
2011-10-17 04:03:55 +00:00
Chandler Carruth 3e8aa65bc2 Add a routine to swap branch instruction operands, and update any
profile metadata at the same time. Use it to preserve metadata attached
to a branch when re-writing it in InstCombine.

Add metadata to the canonicalize_branch InstCombine test, and check that
it is tranformed correctly.

Reviewed by Nick Lewycky!

llvm-svn: 142168
2011-10-17 01:11:57 +00:00
Nadav Rotem 053a7358d6 Add tripple and stabalize a few more tests.
llvm-svn: 142158
2011-10-16 21:20:54 +00:00
Nadav Rotem a6b6566db6 Add triple to tests.
llvm-svn: 142154
2011-10-16 20:53:20 +00:00
Nadav Rotem 9513104d2a fix a typo in the test
llvm-svn: 142153
2011-10-16 20:43:41 +00:00
Nadav Rotem 486ff59a9f Enable element promotion type legalization by deafault.
Changed tests which assumed that vectors are legalized by widening them.

llvm-svn: 142152
2011-10-16 20:31:33 +00:00
Nick Lewycky 84baea77ea Oops! Fix testcase.
llvm-svn: 142151
2011-10-16 20:20:15 +00:00
Nick Lewycky 0a7e9ccf04 When looking for dependencies on the src pointer, scan the src pointer. Scanning
on the memcpy call will pull up other unrelated stuff. Fixes PR11142.

llvm-svn: 142150
2011-10-16 20:13:32 +00:00
Nadav Rotem 2130a07687 Remove the the test which checks the saving of a vector of booleans into memory.
The decision was to pack the bits. Currently no codegen supports this.
Currently, all of the bits in the vector are saved into the same address
in memory.

llvm-svn: 142149
2011-10-16 19:06:06 +00:00
Craig Topper 96fa597828 Add X86 PEXTR and PDEP instructions.
llvm-svn: 142141
2011-10-16 16:50:08 +00:00
Nadav Rotem bc25b6eb67 Fix a bug in LowerV2I64Splat, which generated a BUILD_VECTOR for which there was
no pattern.

llvm-svn: 142130
2011-10-16 10:02:06 +00:00
Craig Topper aea148c366 Add X86 BZHI instruction as well as BMI2 feature detection.
llvm-svn: 142122
2011-10-16 07:55:05 +00:00
Craig Topper 0ae8d4d738 Add X86 INVPCID instruction. Add 32/64-bit predicates to INVEPT, INVVPID, VMREAD, and VMWRITE to remove hack from X86RecognizableInstr.
llvm-svn: 142117
2011-10-16 07:05:40 +00:00
Chris Lattner a3a0681083 Enhance llvm::SourceMgr to support diagnostic ranges, the same way clang does. Enhance
the X86 asmparser to produce ranges in the one case that was annoying me, for example:

test.s:10:15: error: invalid operand for instruction
movl 0(%rax), 0(%edx)
              ^~~~~~~

It should be straight-forward to enhance filecheck, tblgen, and/or the .ll parser to use 
ranges where appropriate if someone is interested.

llvm-svn: 142106
2011-10-16 04:47:35 +00:00
Craig Topper 25ea4e5ad3 Add X86 BEXTR instruction. This instruction uses VEX.vvvv to encode Operand 3 instead of Operand 2 so needs special casing in the disassembler and code emitter. Ultimately, should pass this information from tablegen
llvm-svn: 142105
2011-10-16 03:51:13 +00:00
NAKAMURA Takumi 4f539312df test/Makefile: Inspect $(PROJ_OBJ_ROOT)/tools/clang/Makefile instead of $(PROJ_SRC_ROOT)/tools/clang for "check-all".
llvm-svn: 142100
2011-10-16 02:54:14 +00:00
Craig Topper 27ad12539d Add support for X86 blsr, blsmsk, and blsi instructions. Required extra work because these are the first VEX encoded instructions to use the reg field as an opcode extension.
llvm-svn: 142082
2011-10-15 20:46:47 +00:00
Nico Weber 3a46b375e2 Let this test pass even if 'int' is somewhere in its directory path.
On my machine, grep matched:

  ; ModuleID = '/Volumes/MacintoshHD2/src/chrome-git/src/third_party/llvm/test/Linker/2011-08-18-unique-debug-type.ll'
  !9 = metadata !{i32 720932, null, metadata !"int", null, i32 0, i64 32, i64 32, i64 0, i32 0, i32 5} ; [ DW_TAG_base_type ]

Explicitly filter out the ModuleID line.

llvm-svn: 142077
2011-10-15 18:07:16 +00:00
Andrew Trick fd4ca0f4ac Fix SCEVExpander assert during LSR: "argument of incompatible type".
Just because we're dealing with a GEP doesn't mean we can assert the
SCEV has a pointer type. The fix is simply to ignore the SCEV pointer
type, which we really didn't need.
Fixes PR11138 webkit crash.

llvm-svn: 142058
2011-10-15 06:19:55 +00:00
Eli Friedman 74d1da5a05 Add missing correctness check to ARMTargetLowering::ReconstructShuffle. Fixes PR11129.
llvm-svn: 142022
2011-10-14 23:58:49 +00:00
Owen Anderson 220f2643fa Update test for disabling of code/data marker labels in ELF.
llvm-svn: 142003
2011-10-14 21:12:55 +00:00
Torok Edwin 3e3ed6d21b OCaml bindings: add some missing functions and testcases.
The C bindings exposed some APIs that weren't covered by the OCaml bindings

llvm-svn: 141997
2011-10-14 20:38:33 +00:00
Torok Edwin 90e6edf7c4 OCaml bindings: fix infinite recursion on string_of_lltype
llvm-svn: 141994
2011-10-14 20:38:14 +00:00
Jakob Stoklund Olesen 06b6ccfe90 Update live-in lists when splitting critical edges.
Fixes PR10814. Patch by Jan Sjödin!

llvm-svn: 141960
2011-10-14 17:25:46 +00:00