shouldn't do AU.setPreservesCFG(), because even though CodeGen passes
don't modify the LLVM IR CFG, they may modify the MachineFunction CFG,
and passes like MachineLoop are registered with isCFGOnly set to true.
llvm-svn: 77691
failures when building assorted projects with clang.
--- Reverse-merging r77654 into '.':
U include/llvm/CodeGen/Passes.h
U include/llvm/CodeGen/MachineFunctionPass.h
U include/llvm/CodeGen/MachineFunction.h
U include/llvm/CodeGen/LazyLiveness.h
U include/llvm/CodeGen/SelectionDAGISel.h
D include/llvm/CodeGen/MachineFunctionAnalysis.h
U include/llvm/Function.h
U lib/Target/CellSPU/SPUISelDAGToDAG.cpp
U lib/Target/PowerPC/PPCISelDAGToDAG.cpp
U lib/CodeGen/LLVMTargetMachine.cpp
U lib/CodeGen/MachineVerifier.cpp
U lib/CodeGen/MachineFunction.cpp
U lib/CodeGen/PrologEpilogInserter.cpp
U lib/CodeGen/MachineLoopInfo.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
D lib/CodeGen/MachineFunctionAnalysis.cpp
D lib/CodeGen/MachineFunctionPass.cpp
U lib/CodeGen/LiveVariables.cpp
llvm-svn: 77661
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").
llvm-svn: 75640
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.
llvm-svn: 75379
- Fix fabs, fneg for f32 and f64.
- Use BuildVectorSDNode.isConstantSplat, now that the functionality exists
- Continue to improve i64 constant lowering. Lower certain special constants
to the constant pool when they correspond to SPU's shufb instruction's
special mask values. This avoids the overhead of performing a shuffle on a
zero-filled vector just to get the special constant when the memory load
suffices.
llvm-svn: 67067
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
llvm-svn: 66875
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.
llvm-svn: 65296
- Rename fcmp.ll test to fcmp32.ll, start adding new double tests to fcmp64.ll
- Fix select_bits.ll test
- Capitulate to the DAGCombiner and move i64 constant loads to instruction
selection (SPUISelDAGtoDAG.cpp).
<rant>DAGCombiner will insert all kinds of 64-bit optimizations after
operation legalization occurs and now we have to do most of the work that
instruction selection should be doing twice (once to determine if v2i64
build_vector can be handled by SelectCode(), which then runs all of the
predicates a second time to select the necessary instructions.) But,
CellSPU is a good citizen.</rant>
llvm-svn: 62990
- Ensure that (operation) legalization emits proper FDIV libcall when needed.
- Fix various bugs encountered during llvm-spu-gcc build, along with various
cleanups.
- Start supporting double precision comparisons for remaining libgcc2 build.
Discovered interesting DAGCombiner feature, which is currently solved via
custom lowering (64-bit constants are not legal on CellSPU, but DAGCombiner
insists on inserting one anyway.)
- Update README.
llvm-svn: 62664
and into the ScheduleDAGInstrs class, so that they don't get
destructed and re-constructed for each block. This fixes a
compile-time hot spot in the post-pass scheduler.
To help facilitate this, tidy and do some minor reorganization
in the scheduler constructor functions.
llvm-svn: 62275
sequences in SPUDAGToDAGISel.cpp and SPU64InstrInfo.td, killing custom
DAG node types as needed.
- i64 mul is now a legal instruction, but emits an instruction sequence
that stretches tblgen and the imagination, as well as violating laws of
several small countries and most southern US states (just kidding, but
looking at a function with 80+ parameters is really weird and just plain
wrong.)
- Update tests as needed.
llvm-svn: 62254
instruction sequence and cannot ordinarily be simplified by DAGcombine
into the various target description files or SPUDAGToDAGISel.cpp.
This makes some 64-bit operations legal.
- Eliminate target-dependent ISD enums.
- Update tests.
llvm-svn: 61508
DAGcombine's ability to find reasons to remove truncates when they were not
needed. Consequently, the CellSPU backend would produce correct, but _really
slow and horrible_, code.
Replaced with instruction sequences that do the equivalent truncation in
SPUInstrInfo.td.
- Re-examine how unaligned loads and stores work. Generated unaligned
load code has been tested on the CellSPU hardware; see the i32operations.c
and i64operations.c in CodeGen/CellSPU/useful-harnesses. (While they may be
toy test code, it does prove that some real world code does compile
correctly.)
- Fix truncating stores in bug 3193 (note: unpack_df.ll will still make llc
fault because i64 ult is not yet implemented.)
- Added i64 eq and neq for setcc and select/setcc; started new instruction
information file for them in SPU64InstrInfo.td. Additional i64 operations
should be added to this file and not to SPUInstrInfo.td.
llvm-svn: 61447
- Fix bug 3185, with misc other cleanups.
- Needed to implement SPUInstrInfo::InsertBranch(). CAUTION: Not sure what
gets or needs to get passed to InsertBranch() to insert a conditional
branch. This will abort for now until a good test case shows up.
llvm-svn: 60811
- First patch from Nehal Desai, a new contributor at Aerospace. Nehal's patch
fixes sign/zero/any-extending loads for integers and floating point. Example
code, compiled w/o debugging or optimization where he first noticed the bug:
int main(void) {
float a = 99.0;
printf("%d\n", a);
return 0;
}
Verified that this code actually works on a Cell SPU.
Changes by Scott Michel:
- Fix bug in the value type list constructed by SPUISD::LDRESULT to include
both the load result's result and chain, not just the chain alone.
- Simplify LowerLOAD and remove extraneous and unnecessary chains.
- Remove unused SPUISD pseudo instructions.
llvm-svn: 60526
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
(add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32
llvm-svn: 60358
(a) Remove conditionally removed code in SelectXAddr. Basically, hope for the
best that the A-form and D-form address predicates catch everything before
the code decides to emit a X-form address.
(b) Expand vector store test cases to include the usual suspects.
llvm-svn: 60034
priority function. Instead, just iterate over the AllNodes list, which is
already in topological order. This eliminates a fair amount of bookkeeping,
and speeds up the isel phase by about 15% on many testcases.
The impact on most targets is that AddToISelQueue calls can be simply removed.
In the x86 target, there are two additional notable changes.
The rule-bending AND+SHIFT optimization in MatchAddress that creates new
pre-isel nodes during isel is now a little more verbose, but more robust.
Instead of either creating an invalid DAG or creating an invalid topological
sort, as it has historically done, it can now just insert the new nodes into
the node list at a position where they will be consistent with the topological
ordering.
Also, the address-matching code has logic that checked to see if a node was
"already selected". However, when a node is selected, it has all its uses
taken away via ReplaceAllUsesWith or equivalent, so it won't recieve any
further visits from MatchAddress. This code is now removed.
llvm-svn: 58748
flag. Then in a debugger developers can set breakpoints at these calls
to see waht is about to be selected and what the resulting subgraph
looks like. This really helps when debugging instruction selection.
llvm-svn: 58278
process up to a higher level. This allows FastISel to leverage
more of SelectionDAGISel's infastructure, such as updating Machine
PHI nodes.
Also, implement transitioning from SDISel back to FastISel in
the middle of a block, so it's now possible to go back and
forth. This allows FastISel to hand individual CallInsts and other
complicated things off to SDISel to handle, while handling the rest
of the block itself.
To help support this, reorganize the SelectionDAG class so that it
is allocated once and reused throughout a function, instead of
being completely reallocated for each block.
llvm-svn: 55219
replacement of multiple values. This is slightly more efficient
than doing multiple ReplaceAllUsesOfValueWith calls, and theoretically
could be optimized even further. However, an important property of this
new function is that it handles the case where the source value set and
destination value set overlap. This makes it feasible for isel to use
SelectNodeTo in many very common cases, which is advantageous because
SelectNodeTo avoids a temporary node and it doesn't require CSEMap
updates for users of values that don't change position.
Revamp MorphNodeTo, which is what does all the work of SelectNodeTo, to
handle operand lists more efficiently, and to correctly handle a number
of corner cases to which its new wider use exposes it.
This commit also includes a change to the encoding of post-isel opcodes
in SDNodes; now instead of being sandwiched between the target-independent
pre-isel opcodes and the target-dependent pre-isel opcodes, post-isel
opcodes are now represented as negative values. This makes it possible
to test if an opcode is pre-isel or post-isel without having to know
the size of the current target's post-isel instruction set.
These changes speed up llc overall by 3% and reduce memory usage by 10%
on the InstructionCombining.cpp testcase with -fast and -regalloc=local.
llvm-svn: 53728
of apint codegen failure is the DAG combiner doing
the wrong thing because it was comparing MVT's using
< rather than comparing the number of bits. Removing
the < method makes this mistake impossible to commit.
Instead, add helper methods for comparing bits and use
them.
llvm-svn: 52098