initializer problem, a minor tweak to the way the
DAGISelEmitter finds load/store nodes, and a renaming of the
new PseudoSourceValue objects.
llvm-svn: 46827
DAGUpdateListener object pointer instead of just returning a vector
of deleted nodes. This makes the interfaces more efficient (no more
allocating a vector [at least a malloc], filling it in, then walking
it) and more clean. This also allows the client to be notified of
nodes that are *changed* but not deleted.
llvm-svn: 46677
Added ISD::DECLARE node type to represent llvm.dbg.declare intrinsic. Now the intrinsic calls are lowered into a SDNode and lives on through out the codegen passes.
For now, since all the debugging information recording is done at isel time, when a ISD::DECLARE node is selected, it has the side effect of also recording the variable. This is a short term solution that should be fixed in time.
llvm-svn: 46659
the pattern when generating matchin code.
The first (and currently, only) attribute causes the immediate parent node of the ComplexPattern operand to be passed into the matching code rather than the node at the root of the entire DAG containing the pattern.
llvm-svn: 46606
in the backend. Introduce a new SDNode type, MemOperandSDNode, for
holding a MemOperand in the SelectionDAG IR, and add a MemOperand
list to MachineInstr, and code to manage them. Remove the offset
field from SrcValueSDNode; uses of SrcValueSDNode that were using
it are all all using MemOperandSDNode now.
Also, begin updating some getLoad and getStore calls to use the
PseudoSourceValue objects.
Most of this was written by Florian Brander, some
reorganization and updating to TOT by me.
llvm-svn: 46585
out, DAGISelEmitter can compute it in its ctor, which simplifies some code.
Now we can use CodegenDAGPatterns in other parts of tblgen that want access
to dag pattern info, woo!
llvm-svn: 45636
take a deleted nodes vector, instead of requiring it.
One more significant change: Implement the start of a legalizer that
just works on types. This legalizer is designed to run before the
operation legalizer and ensure just that the input dag is transformed
into an output dag whose operand and result types are all legal, even
if the operations on those types are not.
This design/impl has the following advantages:
1. When finished, this will *significantly* reduce the amount of code in
LegalizeDAG.cpp. It will remove all the code related to promotion and
expansion as well as splitting and scalarizing vectors.
2. The new code is very simple, idiomatic, and modular: unlike
LegalizeDAG.cpp, it has no 3000 line long functions. :)
3. The implementation is completely iterative instead of recursive, good
for hacking on large dags without blowing out your stack.
4. The implementation updates nodes in place when possible instead of
deallocating and reallocating the entire graph that points to some
mutated node.
5. The code nicely separates out handling of operations with invalid
results from operations with invalid operands, making some cases
simpler and easier to understand.
6. The new -debug-only=legalize-types option is very very handy :),
allowing you to easily understand what legalize types is doing.
This is not yet done. Until the ifdef added to SelectionDAGISel.cpp is
enabled, this does nothing. However, this code is sufficient to legalize
all of the code in 186.crafty, olden and freebench on an x86 machine. The
biggest issues are:
1. Vectors aren't implemented at all yet
2. SoftFP is a mess, I need to talk to Evan about it.
3. No lowering to libcalls is implemented yet.
4. Various operations are missing etc.
5. There are FIXME's for stuff I hax0r'd out, like softfp.
Hey, at least it is a step in the right direction :). If you'd like to help,
just enable the #ifdef in SelectionDAGISel.cpp and compile code with it. If
this explodes it will tell you what needs to be implemented. Help is
certainly appreciated.
Once this goes in, we can do three things:
1. Add a new pass of dag combine between the "type legalizer" and "operation
legalizer" passes. This will let us catch some long-standing isel issues
that we miss because operation legalization often obfuscates the dag with
target-specific nodes.
2. We can rip out all of the type legalization code from LegalizeDAG.cpp,
making it much smaller and simpler. When that happens we can then
reimplement the core functionality left in it in a much more efficient and
non-recursive way.
3. Once the whole legalizer is non-recursive, we can implement whole-function
selectiondags maybe...
llvm-svn: 42981
1.
[(set GR32:$dst, (add GR32:$src1, GR32:$src2)),
(modify EFLAGS)]
This indicates the source pattern expects the instruction would produce 2 values. The first is the result of the addition. The second is an implicit definition in register EFLAGS.
2.
def : Pat<(parallel (addc GR32:$src1, GR32:$src2), (modify EFLAGS)), ()>
Similar to #1 except this is used for def : Pat patterns.
llvm-svn: 41897
InOperandList. This gives one piece of important information: # of results
produced by an instruction.
An example of the change:
def ADD32rr : I<0x01, MRMDestReg, (ops GR32:$dst, GR32:$src1, GR32:$src2),
"add{l} {$src2, $dst|$dst, $src2}",
[(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
=>
def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
"add{l} {$src2, $dst|$dst, $src2}",
[(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
llvm-svn: 40033
that there were two input operands before the variable operand portion. This
*happened* to be true for all call instructions, which took a chain and a
destination, but was not true for the PPC BCTRL instruction, whose destination
is implicit.
Making this code more general allows elimination of the custom selection logic
for BCTRL.
llvm-svn: 31732
X86ISD::CMP, etc.) instead of SDNode names (add, x86cmp, etc). We now allow
multiple SDNodes to map to the same SelectionDAG node (e.g. store, indexed
store).
llvm-svn: 31575
chain operand to point to the load being folded. Now we relax this, traversing
up the chain, if it doesn't reach the load, then it's ok. We will create a
TokenFactor (of all the chain operands and the load's chain) to capture all
the control flow dependencies.
llvm-svn: 30897
The dag/inst combiners often 'simplify' the masked value based on whether
or not the bits are live or known zero/one. This is good and dandy, but
often causes special case patterns to fail, such as alpha's CMPBGE pattern,
which looks like "(set GPRC:$RC, (setuge (and GPRC:$RA, 255), (and GPRC:$RB, 255)))".
Here the pattern for (and X, 255) should match actual dags like (and X, 254) if
the dag combiner proved that the missing bits are already zero (one for 'or').
For CodeGen/Alpha/cmpbge.ll:test2 for example, this results in:
sll $16,1,$0
cmpbge $0,$17,$0
ret $31,($26),1
instead of:
sll $16,1,$0
and $0,254,$0
and $17,255,$1
cmpule $1,$0,$0
ret $31,($26),1
... and requires no target-specific code.
llvm-svn: 30871
the branch's chain is also produced by cmp.
[ch, r : ld]
^ ^
| |
[XX]--/ \- [flag : cmp]
^ ^
| |
\---[br flag]-
Remove an isel check which prevents loads from being folded into cmp / test
instructions.
2) Whenever possible, delete a selected node to allow more load folding
opportunities. Note not all nodes can be deleted after it has been
selected. Some may have simply morphed; some have not changed at all (e.g.
EntryToken).
llvm-svn: 30242
- Clean up the code generated by tablegen:
* AddToISelQueue now takes one argument.
* ComplexPattern matching condition can now be shared.
* Eliminate passing unnecessary arguments to emit routines.
* Eliminate some unneeded SDOperand declarations in select routines.
* Other minor clean ups.
- This reduces foot print slightly: X86ISelDAGToDAG.o is reduced from 971k
to 823k.
llvm-svn: 29892
introduced by previous commit.
- SelectCode now returns a SDNode*. If it is not null, the selected node
produces the same number of results as the input node. The seletion loop
is responsible for calling ReplaceAllUsesWith() to replace the input node
with the output target node. For other cases, e.g. when load is folded,
the selection code is responsible for calling ReplaceAllUsesOfValueWith()
and SelectCode returns NULL.
- Other clean ups.
llvm-svn: 29602
in the start of an array and a count of operands where applicable. In many
cases, the number of operands is known, so this static array can be allocated
on the stack, avoiding the heap. In many other cases, a SmallVector can be
used, which has the same benefit in the common cases.
I updated a lot of code calling getNode that takes a vector, but ran out of
time. The rest of the code should be updated, and these methods should be
removed.
We should also do the same thing to eliminate the methods that take a
vector of MVT::ValueTypes.
It would be extra nice to convert the dagiselemitter to avoid creating vectors
for operands when calling getTargetNode.
llvm-svn: 29566
per possible ValueType of the node. e.g. Select_add is split into Select_add_i8,
Select_add_i16, etc.
For opcodes which do not produce a non-chain result, it is split on the
ValueType of its first non-chain operand. e.g. Select_store.
On X86 / Mac OS X, Select_store used to be the largest function. It had a stack
frame size of 8.5k. Now the largest one is Store_i32 with a frame size of 3.1k.
llvm-svn: 29404
code that emit target specific nodes into emit functions that are uniquified
and shared among selection routines.
e.g. This reduces X86ISelDAGToDAG.o (release) from ~2M to ~1.5M. Stack frame
size of Select_store from ~13k down to ~8k.
This is the first step. Further work to enable more sharing will follow.
llvm-svn: 29158
because information about one can help refine the other. This allows us to
write:
def : Pat<(i32 (extload xaddr:$src, i8)),
(LBZX xaddr:$src)>;
as:
def : Pat<(extload xaddr:$src, i8),
(LBZX xaddr:$src)>;
because tblgen knows LBZX returns i32.
llvm-svn: 28865
instruction, and the result type of the instruction to refine the pattern.
This allows us to write things like this:
def : Pat<(v2i64 (bitconvert (v16i8 VR128:$src))), (v2i64 VR128:$src)>;
as:
def : Pat<(v2i64 (bitconvert (v16i8 VR128:$src))), (VR128:$src)>
and fixes a ppc64 issue.
llvm-svn: 28863
a cycle. This increase the search space and will increase compile time (in
practice it appears to be small, e.g. 176.gcc goes from 62 sec to 65 sec)
that will be addressed later.
llvm-svn: 28476
patterns that look like this:
def : Pat<(i32 (X86Wrapper tconstpool :$dst)), (MOV32ri tconstpool :$dst)>;
InsertOneTypeCheck should copy the type from the resolved pattern to the
unresolved one as long as there types are different.
llvm-svn: 28389
1. Use expects a chain output.
2. Node is expanded into multiple target ops.
3. One of the inner node produces a chain, the outer most one doesn't.
llvm-svn: 28209
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.
llvm-svn: 27947
tblgen: In STVEBX: Intrinsic 'llvm.ppc.altivec.stvebx' expects 3 operands, not 2 operands!
instead of like this:
tblgen: In STVEBX: Intrinsic 'intrinsic_void expects 3 operands, not 2 operands!
llvm-svn: 27185
intrinsics that don't take pointer arguments now work. For example, we can
compile this:
int test3( __m128d *A) {
return _mm_movemask_pd(*A);
}
int test4( __m128 *A) {
return _mm_movemask_ps(*A);
}
to this:
_test3:
movl 4(%esp), %eax
movapd (%eax), %xmm0
movmskpd %xmm0, %eax
ret
_test4:
movl 4(%esp), %eax
movaps (%eax), %xmm0
movmskps %xmm0, %eax
ret
llvm-svn: 27090
The instruction patterns do not contain enough information to resolve the
exact type of the destination if it of a generic vector type.
llvm-svn: 26892
if (N1.getOpcode() == ISD::ADD &&
...)
if (... &&
(N1.getNumOperands() == 1 || !isNonImmUse(N1.Val, N10.Val))) &&
...)
TableGen knows N1 must have more than one operand.
llvm-svn: 26592
due to ordering issue. i.e. they were selected for chain use first.
Now at load select time, check if it is being selected for a chain use and if
it has only a single real use. If so, return a HANDLENODE (with the load as
its operand) in its place and record it.
When it is folded or the load is selected for a real use, the isel records it
as the replacement for the HANDLENODE. The replacement is done when all nodes
are selected.
This scheme exposed a couple of problems where cycles can happen. (See comments
in EmitMatchCode() for descriptions of the problems and their workaround /
solutions.) These problems have been resolved with a small compile time
penality.
llvm-svn: 25995
Chain is initially set to the chain operand of store node, when it reaches
load, if it matches the load then Chain is set to the chain operand of the
load.
However, if the matching code that follows this fails, isel moves on to the
next pattern but it does not restore Chain to the chain operand of the store.
So when it tries to match the next store / op / load pattern it would fail on
the Chain == load.getOperand(0) test.
The solution is for each chain operand to get a unique name. e.g. Chain10.
llvm-svn: 25931
if (N1.getOpcode() == ISD::LOAD &&
N1.hasOneUse() &&
!CodeGenMap.count(N1.getValue(0)) &&
!CodeGenMap.count(N1.getValue(1))) {
instead of this:
if (N1.getOpcode() == ISD::LOAD) {
if (N1.hasOneUse()) {
if (!CodeGenMap.count(N1.getValue(0))) {
if (!CodeGenMap.count(N1.getValue(1))) {
llvm-svn: 25763
"if" statements (indenting it appropriately, of course) instead of using goto's.
This inverts the logic for all of the if statements, which makes things simpler
to understand in addition to making the generated code easier to read.
llvm-svn: 25757
directly to the output file. This makes things simple because the code doesn't
have to worry about indentation or the case when there is no goto. It also
allows us to indent the code better without touching everything :)
llvm-svn: 25756
If store's chain operand is load, then use load's chain operand instead. If
it isn't (likely a TokenFactor), then do not allow the folding.
llvm-svn: 25708
has already been selected. The number of use check is not strong enough since
a node can be replaced with newly created target node. e.g. If the original
node has two uses, when it is selected for one of the uses it is replaced with
another. Each node now has a single use but isel still should not fold it.
llvm-svn: 25651
instruction to produce a result. e.g MUL8m, the instruction does not
produce a explicit result. However it produces an implicit result in
AL which would be copied to a temp. The root operator of the matching
pattern is a mul so the use would expect it to produce a result.
llvm-svn: 25458