Commit Graph

5147 Commits

Author SHA1 Message Date
Bill Wendling 57e3aaad89 Check the visibility of the global variable before placing it into the stubs
table. A hidden variable could potentially end up in both lists.
<rdar://problem/10336715>

llvm-svn: 142869
2011-10-24 23:05:43 +00:00
Dan Gohman 2c9bda1512 Remove the explicit request for "Latency" scheduling from MSP430,
as the Latency scheduler is going away.

llvm-svn: 142811
2011-10-24 17:53:16 +00:00
Dan Gohman c32af340fc Change the default scheduler from Latency to ILP, since Latency
is going away.

llvm-svn: 142810
2011-10-24 17:45:02 +00:00
Chandler Carruth bd1be4d01c Completely re-write the algorithm behind MachineBlockPlacement based on
discussions with Andy. Fundamentally, the previous algorithm is both
counter productive on several fronts and prioritizing things which
aren't necessarily the most important: static branch prediction.

The new algorithm uses the existing loop CFG structure information to
walk through the CFG itself to layout blocks. It coalesces adjacent
blocks within the loop where the CFG allows based on the most likely
path taken. Finally, it topologically orders the block chains that have
been formed. This allows it to choose a (mostly) topologically valid
ordering which still priorizes fallthrough within the structural
constraints.

As a final twist in the algorithm, it does violate the CFG when it
discovers a "hot" edge, that is an edge that is more than 4x hotter than
the competing edges in the CFG. These are forcibly merged into
a fallthrough chain.

Future transformations that need te be added are rotation of loop exit
conditions to be fallthrough, and better isolation of cold block chains.
I'm also planning on adding statistics to model how well the algorithm
does at laying out blocks based on the probabilities it receives.

The old tests mostly still pass, and I have some new tests to add, but
the nested loops are still behaving very strangely. This almost seems
like working-as-intended as it rotated the exit branch to be
fallthrough, but I'm not convinced this is actually the best layout. It
is well supported by the probabilities for loops we currently get, but
those are pretty broken for nested loops, so this may change later.

llvm-svn: 142743
2011-10-23 09:18:45 +00:00
Nadav Rotem e649d66552 Fix pr11193.
SHL inserts zeros from the right, thus even when the original
sign_extend_inreg value was of 1-bit, we need to sra.

llvm-svn: 142724
2011-10-22 12:39:25 +00:00
Nadav Rotem 5e00bb5feb Fix pr11194. When promoting and splitting integers we need to use
ZExtPromotedInteger and SExtPromotedInteger based on the operation we legalize.

SetCC return type needs to be legalized via PromoteTargetBoolean.

llvm-svn: 142660
2011-10-21 17:35:19 +00:00
Chandler Carruth 70a38058b1 Don't hard code the desired alignment for loops -- it isn't 16-bytes on
all x86 systems. Sorry for the breakage.

llvm-svn: 142656
2011-10-21 16:41:39 +00:00
Nadav Rotem d315157f12 1. Fix the widening of SETCC in WidenVecOp_SETCC. Use the correct return CC type.
2. Fix a typo in CONCAT_VECTORS which exposed the bug in #1.

llvm-svn: 142648
2011-10-21 11:42:07 +00:00
Chandler Carruth 8b9737cb54 Add loop aligning to MachineBlockPlacement based on review discussion so
it's a bit more plausible to use this instead of CodePlacementOpt. The
code for this was shamelessly stolen from CodePlacementOpt, and then
trimmed down a bit. There doesn't seem to be much utility in returning
true/false from this pass as we may or may not have rewritten all of the
blocks. Also, the statistic of counting how many loops were aligned
doesn't seem terribly important so I removed it. If folks would like it
to be included, I'm happy to add it back.

This was probably the most egregious of the missing features, and now
I'm going to start gathering some performance numbers and looking at
specific loop structures that have different layout between the two.

Test is updated to include both basic loop alignment and nested loop
alignment.

llvm-svn: 142645
2011-10-21 08:57:37 +00:00
Chandler Carruth ddfeaafdfb Add a very basic test for MachineBlockPlacement. This is essentially the
canonical example I used when developing it, and is one of the primary
motivating real-world use cases for __builtin_expect (when burried under
a macro).

I'm working on more test cases here, but I'm trying to make sure both
that the pass is doing the right thing with the test cases and that they
aren't too brittle to changes elsewhere in the code generation pipeline.

Feedback and/or suggestions on how to test this are very welcome.
Especially feedback on whether testing the block comments is a good
strategy; I couldn't find any good examples to steal from but all the
other ideas I had were a lot uglier or more fragile.

llvm-svn: 142644
2011-10-21 08:01:56 +00:00
Craig Topper 039a79067a Remove intrinsics for X86 BLSI, BLSMSK, and BLSR intrinsics and replace with custom isel lowering code.
llvm-svn: 142642
2011-10-21 06:55:01 +00:00
Chad Rosier add38c12b8 Revert 142337. Thumb1 still doesn't support dynamic stack realignment. :(
llvm-svn: 142557
2011-10-20 00:07:12 +00:00
Evan Cheng 54d678fff4 Fix TLS lowering bug. The CopyFromReg must be glued to the TLSCALL. rdar://10291355
llvm-svn: 142550
2011-10-19 22:22:54 +00:00
Nadav Rotem 8824472a25 Improve code generation for vselect on SSE2:
When checking the availability of instructions using the TLI, a 'promoted'
instruction IS available. It means that the value is bitcasted to another type
for which there is an operation. The correct check for the availablity of an
instruction is to check if it should be expanded.

llvm-svn: 142542
2011-10-19 20:43:16 +00:00
James Molloy 2d768fd379 Use literal pool loads instead of MOVW/MOVT for materializing global addresses when optimizing for size.
On spec/gcc, this caused a codesize improvement of ~1.9% for ARM mode and ~4.9% for Thumb(2) mode. This is
codesize including literal pools.

The pools themselves doubled in size for ARM mode and quintupled for Thumb mode, leaving suggestion that there
is still perhaps redundancy in LLVM's use of constant pools that could be decreased by sharing entries.

Fixes PR11087.

llvm-svn: 142530
2011-10-19 14:11:07 +00:00
Nadav Rotem 6652e22bad Add support for the vector-widening of vselect and vector-setcc
llvm-svn: 142488
2011-10-19 09:45:11 +00:00
Craig Topper ef309c3384 Rename PEXTR to PEXT. Add intrinsics for BMI instructions.
llvm-svn: 142480
2011-10-19 07:48:35 +00:00
Lang Hames 20a04e74e3 Added testcase for <rdar://problem/10215997>
llvm-svn: 142462
2011-10-18 23:50:52 +00:00
Nadav Rotem 0d3393356b Add additional element-promotion tests.
llvm-svn: 142442
2011-10-18 23:05:33 +00:00
Nadav Rotem 75c2229f41 Fix a bug in the legalization of vector anyext-load and trunc-store. Mem Index starts with zero.
llvm-svn: 142434
2011-10-18 22:32:43 +00:00
Bob Wilson 9258b76d8d Fix incorrect check for sign-extended constant BUILD_VECTOR.
<rdar://problem/10298332>

llvm-svn: 142371
2011-10-18 17:34:51 +00:00
Bob Wilson 681561901d Fix a DAG combiner assertion failure when constant folding BUILD_VECTORS.
svn r139159 caused SelectionDAG::getConstant() to promote BUILD_VECTOR operands
with illegal types, even before type legalization.  For this testcase, that led
to one BUILD_VECTOR with i16 operands and another with promoted i32 operands,
which triggered the assertion.

llvm-svn: 142370
2011-10-18 17:34:47 +00:00
Justin Holewinski 1fb5bb126e PTX: Fix disabling of MAD instruction selection
llvm-svn: 142352
2011-10-18 13:39:20 +00:00
Chad Rosier 0ffe593a16 Add support for dynamic stack realignment when in thumb1 mode.
rdar://10288916

llvm-svn: 142337
2011-10-18 05:28:00 +00:00
Nick Lewycky 40f8f2ff24 Add support for a new extension to the .file directive:
.file filenumber "directory" "filename"

This removes one join+split of the directory+filename in MC internals. Because
bitcode files have independent fields for directory and filenames in debug info,
this patch may change the .o files written by existing .bc files.

llvm-svn: 142300
2011-10-17 23:05:28 +00:00
Nadav Rotem d2c72d6d03 Add CHECKs and document PR11158.
llvm-svn: 142240
2011-10-17 20:23:23 +00:00
Nadav Rotem 28e4f67f26 stabalize tests by specifying the exact sse level
llvm-svn: 142229
2011-10-17 19:45:38 +00:00
Hal Finkel 9dda3e0d13 use FileCheck and not grep in new tests
llvm-svn: 142189
2011-10-17 16:01:41 +00:00
Nadav Rotem 83d1a93cf4 Clean the triple, add check lines.
llvm-svn: 142183
2011-10-17 07:07:51 +00:00
Nadav Rotem 89c282e96f Previously v2i32 vectors were legalized to v4i32. Now, they are legalized to
v2i64. These tests do not check MMX nor zmoving into them.

llvm-svn: 142182
2011-10-17 06:59:01 +00:00
Hal Finkel 7ccb391d21 Test case for CanLowerReturn fix (r141981)
llvm-svn: 142172
2011-10-17 04:03:59 +00:00
Hal Finkel ad677b64db Add PPC 440 scheduler and some associated tests (new files)
llvm-svn: 142171
2011-10-17 04:03:55 +00:00
Nadav Rotem 053a7358d6 Add tripple and stabalize a few more tests.
llvm-svn: 142158
2011-10-16 21:20:54 +00:00
Nadav Rotem a6b6566db6 Add triple to tests.
llvm-svn: 142154
2011-10-16 20:53:20 +00:00
Nadav Rotem 9513104d2a fix a typo in the test
llvm-svn: 142153
2011-10-16 20:43:41 +00:00
Nadav Rotem 486ff59a9f Enable element promotion type legalization by deafault.
Changed tests which assumed that vectors are legalized by widening them.

llvm-svn: 142152
2011-10-16 20:31:33 +00:00
Nadav Rotem 2130a07687 Remove the the test which checks the saving of a vector of booleans into memory.
The decision was to pack the bits. Currently no codegen supports this.
Currently, all of the bits in the vector are saved into the same address
in memory.

llvm-svn: 142149
2011-10-16 19:06:06 +00:00
Nadav Rotem bc25b6eb67 Fix a bug in LowerV2I64Splat, which generated a BUILD_VECTOR for which there was
no pattern.

llvm-svn: 142130
2011-10-16 10:02:06 +00:00
Eli Friedman 74d1da5a05 Add missing correctness check to ARMTargetLowering::ReconstructShuffle. Fixes PR11129.
llvm-svn: 142022
2011-10-14 23:58:49 +00:00
Jakob Stoklund Olesen 06b6ccfe90 Update live-in lists when splitting critical edges.
Fixes PR10814. Patch by Jan Sjödin!

llvm-svn: 141960
2011-10-14 17:25:46 +00:00
Craig Topper 965de2c197 Add X86 ANDN instruction. Including instruction selection.
llvm-svn: 141947
2011-10-14 07:06:56 +00:00
Craig Topper 3657fe4b17 Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 processor which is gcc's name for Haswell.
llvm-svn: 141939
2011-10-14 03:21:46 +00:00
Jakob Stoklund Olesen 7fb5632e73 Add value numbers when spilling dead defs.
When spilling around an instruction with a dead def, remember to add a
value number for the def.

The missing value number wouldn't normally create problems since there
would be an incoming live range as well.  However, due to another bug
we could spill a dead V_SET0 instruction which doesn't read any values.

The missing value number caused an empty live range to be created which
is dangerous since it doesn't interfere with anything.

This fixes part of PR11125.

llvm-svn: 141923
2011-10-14 00:34:31 +00:00
Benjamin Kramer f07d898ae1 Force CPU type on test so it doesn't accidentally emit movbe instead of bswap on Intel Atom CPUs.
llvm-svn: 141863
2011-10-13 14:27:54 +00:00
Kalle Raiskila 3815de8d50 Mark 'branch indirect' instruction as an indirect branch.
Not having it confused assembly printing of jumptables.

llvm-svn: 141862
2011-10-13 11:40:03 +00:00
Bill Wendling 25f6d3e321 More closely follow libgcc, which has code after the `ret' instruction to
release the stack segment and reset the stack pointer. Place the code in its own
MBB to make the verifier happy.

llvm-svn: 141859
2011-10-13 08:24:19 +00:00
Bill Wendling 063f55ffdd Revert r141854 because it was causing failures:
http://lab.llvm.org:8011/builders/llvm-x86_64-linux/builds/101

--- Reverse-merging r141854 into '.':
U    test/MC/Disassembler/X86/x86-32.txt
U    test/MC/Disassembler/X86/simple-tests.txt
D    test/CodeGen/X86/bmi.ll
U    lib/Target/X86/X86InstrInfo.td
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86.td
U    lib/Target/X86/X86Subtarget.h

llvm-svn: 141857
2011-10-13 07:48:07 +00:00
Bill Wendling 22a690e3db Should not add instructions to a BB after a return instruction. The machine instruction verifier doesn't like this, nor do I.
llvm-svn: 141856
2011-10-13 07:42:32 +00:00
Craig Topper 8cc9388073 Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 processor which is gcc's name for Haswell.
llvm-svn: 141854
2011-10-13 07:09:14 +00:00
Jakob Stoklund Olesen 068dc91de9 Also inflate register classes around inline asm.
Now that MI->getRegClassConstraint() can also handle inline assembly,
don't bail when recomputing the register class of a virtual register
used by inline asm.

This fixes PR11078.

llvm-svn: 141836
2011-10-12 23:37:40 +00:00