Now it will factor things like this:
CheckType i32
...
CheckOpcode ISD::AND
CheckType i64
...
into:
SwitchType:
i32: ...
i64:
CheckOpcode ISD::AND
...
This shrinks the table by a few bytes, nothing spectacular.
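
A minimal Python sketch of the regrouping shown above, not the DAGISelMatcherOpt code itself. The tuple encoding of matcher ops and the name form_switch_type are hypothetical. The sketch also assumes it is always safe to hoist a CheckType above the checks that precede it in a child, which a real optimizer would have to verify.

```python
def form_switch_type(children):
    """Group scope children by the type they check.

    Each child is a list of (op_name, operand) tuples. Returns a dict mapping
    type -> list of op sequences with the CheckType removed, or None if some
    child has no CheckType that could be hoisted.
    """
    cases = {}
    for child in children:
        # Find the first CheckType anywhere in the child, not just at the front.
        idx = next((i for i, op in enumerate(child) if op[0] == "CheckType"), None)
        if idx is None:
            return None
        ty = child[idx][1]
        # ASSUMPTION: moving the type check ahead of the earlier ops is safe.
        rest = child[:idx] + child[idx + 1:]
        cases.setdefault(ty, []).append(rest)
    return cases
```

With the two children from the example, the i64 case keeps its CheckOpcode ISD::AND and only the type check moves into the switch.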
llvm-svn: 97908
sequence, just emit instruction predicates right before them. This
exposes yet more factoring opportunities, shrinking the X86 table
to 79144 bytes.
llvm-svn: 97704
as the very last thing before node emission. This should
dramatically reduce the number of times we do 'MatchAddress'
on X86, speeding up compile time. This also improves comments
in the tables and shrinks the table a bit, now down to
80506 bytes for x86.
llvm-svn: 97703
SwitchOpcodeMatcher) and have DAGISelMatcherOpt form it. This
speeds up selection, particularly for X86 which has lots of
variants of instructions with only type differences.
llvm-svn: 97645
stuff now that we don't care about emulating the old broken
behavior of the old isel. This eliminates the
'CheckChainCompatible' check (along with IsChainCompatible) which
did an incorrect and inefficient scan *up* the chain nodes which
happened as the pattern was being formed and does the validation
at the end in HandleMergeInputChains when it forms a structural
pattern. This scans "down" the graph, which means that it is
quickly bounded by nodes already selected. This also handles
token factors that get "trapped" in the dag.
Removing the CheckChainCompatible nodes also shrinks the
generated tables by about 6K for X86 (down to 83K).
There are two pieces remaining before I can nuke PreprocessRMW:
1. I xfailed a test because we're now producing worse code in a
case that has nothing to do with the change: it turns out that
our use of MorphNodeTo will leave dead nodes in the graph
which (depending on how the graph is walked) end up causing
bogus uses of chains and blocking matches. This is really
bad for other reasons, so I'll fix this in a follow-up patch.
2. CheckFoldableChainNode needs to be improved to handle the TF.
llvm-svn: 97539
EmitMergeInputChainsMatcher node up into EmitResultCode. This
doesn't have much of an effect on the generated code; the X86
table is exactly the same size.
llvm-svn: 97514
ordered correctly. Previously it would get in trouble when
two patterns were too similar and give them nondet ordering.
We force this by using the record ID order as a fallback.
The testsuite diff is due to alpha patterns being ordered
slightly differently, the change is a semantic noop afaict:
< lda $0,-100($16)
---
> subq $16,100,$0
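
A small Python sketch of the tie-breaking idea, assuming patterns are otherwise compared on a single score. The Pattern class and its complexity and record_id fields are hypothetical stand-ins, not the actual TableGen data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    complexity: int  # hypothetical primary sort key
    record_id: int   # unique per record, assigned in definition order


def order_patterns(patterns):
    # Higher complexity first. Ties fall back to the record ID, so the
    # comparison is a total order and the result no longer depends on the
    # order in which the patterns happened to be collected.
    return sorted(patterns, key=lambda p: (-p.complexity, p.record_id))
```

Without the record_id component, two patterns with equal complexity would keep whatever relative order the input had, which is the kind of nondeterminism the fallback removes.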
llvm-svn: 97509
This allows formation of OpcodeSwitch for top level patterns, in
particular on X86. This saves about 1K of data space in the x86
table and makes the dispatch much more efficient.
llvm-svn: 97440
ComplexPattern at the root be generated multiple times, once
for each opcode they are part of. This encourages factoring
because the opcode checks get treated just like everything
else in the matcher.
llvm-svn: 97439
to a scope where every child starts with a CheckOpcode, but
executes more efficiently. Enhance DAGISelMatcherOpt to
form it.
This also fixes a bug in CheckOpcode: apparently the SDNodeInfo
objects are not pointer comparable, we have to compare the
enum name.
llvm-svn: 97438
so that we get grouping at the top level.
Add an optimization to reorder type check & record nodes
after opcode checks. We prefer to expose tree shape
matching which improves grouping and will enhance the next
optimization.
llvm-svn: 97432
dispatcher method. This eliminates the dependence of the new isel's
generated code on the old isel's predicates, however some random
hand written isel code still uses them.
llvm-svn: 97431
specifies whether there is an output flag or not. Use this
instead of redundantly encoding the chain/flag results in the
output vtlist.
llvm-svn: 97419
even some the old isel didn't. There are several parts of
this that make me feel dirty, but it's no worse than the
old isel. I'll clean up the parts I can do without ripping
out the old one next.
llvm-svn: 97415
node is always guaranteed to have a particular type
instead of hacking in ISD::STORE explicitly. This allows
us to use implied types for a broad range of nodes, even
target specific ones.
llvm-svn: 97355
with getType() == MVT::i32 etc. Teach it that two different
integer constants are contradictory. This cuts 1K off the X86
table, down to 98K.
llvm-svn: 97314
predicates. For example if we have:
Scope:
CheckType i32
ABC
CheckType f32
DEF
CheckType i32
GHI
Then we know that we can transform this into:
Scope:
CheckType i32
Scope
ABC
GHI
CheckType f32
DEF
This reorders the check for the 'GHI' predicate above
the check for the 'DEF' predicate. However, it is safe to do this
in this situation because we know that a node cannot have both an
i32 and f32 type.
We're now doing more factoring than the old isel did.
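
The reordering argument above can be sketched as a toy Python model. This is an illustration under simplifying assumptions, not the matcher optimizer's code: children are (predicate, body) pairs, the names contradictory and factor_scope are hypothetical, and the only contradiction it knows about is two CheckType predicates with different types.

```python
def contradictory(a, b):
    # Two type checks on the same node with different types can never both pass.
    return a[0] == "CheckType" and b[0] == "CheckType" and a[1] != b[1]


def factor_scope(children):
    """children: list of (predicate, body). Returns list of (predicate, [bodies])."""
    remaining = list(children)
    result = []
    while remaining:
        pred, body = remaining.pop(0)
        group = [body]
        kept = []
        blocked = False
        for other_pred, other_body in remaining:
            if not blocked and other_pred == pred:
                # Everything skipped so far contradicts pred, so hoisting is safe.
                group.append(other_body)
            else:
                kept.append((other_pred, other_body))
                if not contradictory(pred, other_pred):
                    # A child that could also match blocks any further hoisting.
                    blocked = True
        result.append((pred, group))
        remaining = kept
    return result
```

On the example above this merges ABC and GHI under one CheckType i32 and leaves DEF under CheckType f32. If the middle child were a check that does not contradict i32, GHI would stay where it is and pattern order would be preserved.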
llvm-svn: 97312
as deeply into the pattern as we can get away with. In practice, this
means "all the way to the emitter code, but not across
ComplexPatterns". This substantially increases the amount of factoring
we get.
llvm-svn: 97305
gross little neighbor merging implementation. This one has
the benefit of not violating the ordering of patterns, so it
generates code that passes tests again.
llvm-svn: 97218
current design. This generates a matcher that successfully
runs, but it turns out that the factoring we're doing violates
the ordering of patterns, so we end up matching (e.g.) movups
where we want movaps. This won't do, but I'll address this in
a follow-on patch. It's nice to not be on by default yet! :)
llvm-svn: 97215
instead of to have a chained series of scope nodes. This makes
the generated table smaller, improves the efficiency of the
interpreter, and makes the factoring optimization much more
reasonable to implement.
llvm-svn: 97160
splitting all the patterns under scope nodes into equality sets
based on their first node. The second step is to rewrite the
graph into a form that exposes the sharing. Before I do this,
I want to redesign the Scope node.
llvm-svn: 97130
movechild/record -> recordchild/movechild and
movechild/moveparent -> noop xforms. This slightly shrinks the tables
(x86 to 117454) and enables adding future improvements.
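
A toy Python sketch of the two rewrites named above, applied until nothing changes. The tuple encoding of ops and the function name peephole are hypothetical. It assumes Record captures the current node, so recording child N before moving into it is equivalent to moving into it and then recording.

```python
def peephole(ops):
    """Apply MoveChild/Record and MoveChild/MoveParent rewrites to a fixpoint."""
    changed = True
    while changed:
        changed = False
        out = []
        i = 0
        while i < len(ops):
            cur = ops[i]
            nxt = ops[i + 1] if i + 1 < len(ops) else None
            if cur[0] == "MoveChild" and nxt is not None and nxt[0] == "Record":
                # movechild N, record -> recordchild N, movechild N
                out += [("RecordChild", cur[1]), cur]
                i += 2
                changed = True
            elif cur[0] == "MoveChild" and nxt is not None and nxt[0] == "MoveParent":
                # movechild N, moveparent -> nothing
                i += 2
                changed = True
            else:
                out.append(cur)
                i += 1
        ops = out
    return ops
```

Run together, the two rules reduce a movechild/record/moveparent triple to a single recordchild, while a sequence that still has work to do inside the child keeps its movechild and moveparent.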
llvm-svn: 97051
the old one around for comparative purposes: have the
ENABLE_NEW_ISEL #define (which is not enabled on mainline) stop
emitting the old isel at all, yay for build time win.
llvm-svn: 97033
the new isel: fold movechild+record+moveparent into a
single recordchild N node. This shrinks the X86 table
from 125443 to 117502 bytes.
llvm-svn: 97031
internal nodes with flag results. Record these with a new
OPC_MarkFlagResults opcode and use this to update the interior
nodes' flag results properly. This fixes CodeGen/X86/i256-add.ll
with the new isel.
llvm-svn: 97021
Needed to correctly handle things like 'llvmc -framework Foo foo.o -framework
Bar bar.o' - before this commit all '-framework' options would've been grouped
together in the beginning.
Due to our dependence on CommandLine this turned out to be a giant hack; we will
migrate away from CommandLine eventually.
llvm-svn: 96922
input/output patterns have the same type. It turns out that
this triggers all the time because we don't infer types
between these boundaries. Until we do, don't turn this on.
llvm-svn: 96905
ridiculously ginormous patterns and need more than one byte
of displacement for encodings. This fixes CellSPU/fdiv.ll.
SPU is still doing something else ridiculous though.
llvm-svn: 96833
well as the operands produced when the pattern is matched. This
allows CheckSame to work correctly when matching replicated
names involving ComplexPatterns. This fixes a bunch of MSP430
failures, we're down to 13 failures, two of which are
due to a sched bug.
llvm-svn: 96824
sure to only run the complex pattern on nodes where the target opts in.
This patch only handles targets with one opcode specified so far, but
fixes 16 failures, only 34 left.
llvm-svn: 96813
result nodes correctly. Note that this includes a horrible hack
in DAGISelHeader which cannot be fixed reasonably without
eliminating (parallel) from input patterns. That, in turn,
can't be done until we support writing multiple result patterns
for the X86and_flag and related multiple-result nodes.
llvm-svn: 96767
the point where it is to the 95% feature complete mark, it just
needs result updating to be done (then testing, optimization
etc).
More specifically, this adds support for chain and flag handling
on the result nodes, support for sdnodexforms, support for variadic
nodes, memrefs, pinned physreg inputs, and probably lots of other
stuff.
In the old DAGISelEmitter, this deletes the dead code related to
OperatorMap, cleans up a variety of dead stuff handling "implicit
remapping" from things like globaladdr -> targetglobaladdr (which
is no longer used because globaladdr always needs to be legalized),
and some minor formatting fixes.
llvm-svn: 96716
'ischaincompatible' when a pattern has more than one input chain. Need
to do some commenting and cleanup now that I understand how this works.
llvm-svn: 96443
into a roundss intrinsic, producing a cyclic dag. The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection. Eliminate a copy of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.
llvm-svn: 96408
use and only call IsProfitableToFold/IsLegalToFold on the load
being folded, like the old dagiselemitter does. This
substantially simplifies the code and improves opportunities for
sharing.
llvm-svn: 96368
with chains. On interior nodes that lead up to them, we just directly
check that there is a single use. This generates slightly more
efficient code.
llvm-svn: 96366
IsLegalToFold and IsProfitableToFold. The generic version of the latter simply checks whether the folding candidate has a single use.
This allows the target isel routines more flexibility in deciding whether folding makes sense. The specific case we are interested in is folding constant pool loads with multiple uses.
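
A hedged Python sketch of how such a hook split might look. The class names, the Node fields, and the constant-pool policy are all hypothetical illustrations; only the "single use" default comes from the description above.

```python
class Node:
    """Toy DAG node that tracks only what this sketch needs."""

    def __init__(self, kind, num_uses):
        self.kind = kind
        self.num_uses = num_uses


class GenericISel:
    def is_profitable_to_fold(self, node):
        # Generic policy: fold only candidates with a single use.
        return node.num_uses == 1


class ConstantPoolFriendlyISel(GenericISel):
    def is_profitable_to_fold(self, node):
        # HYPOTHETICAL target policy: a constant pool load may be folded
        # into each of its users, even when it has several.
        if node.kind == "constant_pool_load":
            return True
        return super().is_profitable_to_fold(node)
```

Keeping the profitability question separate from legality lets a target relax the heuristic without touching the correctness check.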
llvm-svn: 96255
produce a table based matcher instead of gobs of C++ Code.
Though it's not done yet, the shrinkage seems promising,
the table for the X86 ISel is 75K and still has a lot of
optimization to come (compare to the ~1.5M of .o generated
the old way, much of which will go away).
The code is currently disabled by default (the #if 0 in
DAGISelEmitter.cpp). When enabled it generates a dead
SelectCode2 function in the DAGISel Header which will
eventually replace SelectCode.
There is still a lot of stuff left to do, which is
documented with a trail of FIXMEs.
llvm-svn: 96215
that predated -fast-isel which attempted to speed up the dag pattern
matchers at -O0. Since fast-isel is around, this is basically
obsolete and removing it shrinks the generated dag isels.
llvm-svn: 96188