PostGenericScheduler uses either the new machine model or the hazard
checker for top-down scheduling. Most of the infrastructure for PreRA
machine scheduling is reused.
With a some tuning, this should allow MachineScheduler to be default
for all ARM targets, including cortex-A9, using the new machine
model. Likewise, with additional tuning, it should be able to replace
PostRAScheduler for all targets.
The PostMachineScheduler pass does not currently run the
AntiDepBreaker. There is less need for it on targets that are already
running preRA MachineScheduler. I want to prove it's necessary before
committing to the maintenance burden.
The PostMachineScheduler also currently removes kill flags and adds
them all back later. This is a bit ridiculous. I'd prefer passes to
directly use a liveness utility than rely on flags.
A test case that enables this scheduler will be included in a
subsequent checkin that updates the A9 model.
llvm-svn: 198122
These helper classes take care of the book-keeping the drives the
GenericScheduler heuristics. It is likely that developers writing
target-specific schedulers that work similarly to GenericScheduler
will want to use these helpers too. The immediate goal is to develop a
GenericPostScheduler that can run in place of the old PostRAScheduler,
but will use the new machine model.
No functionality change intended.
llvm-svn: 196643
Not only does it trigger -Wparentheses, I think the assert actually
relies on incorrect operator precedence.
Also, the grammar as questionable, but I might not know enough about the
problem at hand.
llvm-svn: 196567
This allows a target to use MI-Sched as an in-order scheduler that
will model strict resource conflicts without defining a processor
itinerary. Instead, the target can now use the new per-operand machine
model and define in-order resources with BufferSize=0. For example,
this would allow restricting the type of operations that can be formed
into a dispatch group. (Normally NumMicroOps is sufficient to enforce
dispatch groups).
If the intent is to model latency in in-order pipeline, as opposed to
resource conflicts, then a resource with BufferSize=1 should be
defined instead.
This feature is only casually tested as there are no in-tree targets
using it yet. However, Hal will be experimenting with POWER7.
llvm-svn: 196517
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file. The memory leaks in this version have been fixed. Thanks
Alexey for pointing them out.
Differential Revision: http://llvm-reviews.chandlerc.com/D2068
Reviewed by Andy
llvm-svn: 195064
This change is incorrect. If you delete virtual destructor of both a base class
and a subclass, then the following code:
Base *foo = new Child();
delete foo;
will not cause the destructor for members of Child class. As a result, I observe
plently of memory leaks. Notable examples I investigated are:
ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl.
llvm-svn: 194997
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file.
Differential Revision: http://llvm-reviews.chandlerc.com/D2068
Reviewed by Andy
llvm-svn: 194865
The global registry is used to allow command line override of the
scheduler selection, but does not work well as the normal selection
API. For example, the same LLVM process should be able to target
multiple targets or subtargets.
llvm-svn: 191071
This was an experimental scheduler a year ago. It's now used by
several subtargets, both in-order and out-of-order, and it
is about to be enabled by default for x86 and armv7. It will be the
new GenericScheduler for subtargets that don't provide their own
SchedulingStrategy.
llvm-svn: 191051
Arnold's idea.
I generally try to avoid stateful heuristics because it can make
debugging harder. However, we need a way to prevent the latency
priority from dominating, and it somewhat makes sense to schedule
aggressively for latency only within an issue group.
Swift in particular likes this, and it doesn't hurt anyone else:
| Benchmarks/MiBench/consumer-lame | 10.39% |
| Benchmarks/Misc/himenobmtxpa | 9.63% |
llvm-svn: 190360
Allow subtargets to customize the generic scheduling strategy.
This is convenient for targets that don't need to add new heuristics
by specializing the strategy.
llvm-svn: 190176
Fast register pressure tracking currently only takes effect during
bottom up scheduling. Forcing this is a bit faster and simpler for
targets that don't have many scheduling constraints and don't need
top-down scheduling.
llvm-svn: 190014
If the instruction window is < NumRegs/2, pressure tracking is not
likely to be effective. The scheduler has to process a very large
number of tiny blocks. We want this to be fast.
llvm-svn: 189991
Register pressure tracking is half the complexity of the
scheduler. It's useful to be able to turn it off for compile time and
performance comparisons.
llvm-svn: 189987
There was one case that we could hit a DebugValue where I didn't think
to check. DebugValues are evil. No checkinable test case, sorry. It's
an obvious fix.
llvm-svn: 189717
Created SUPressureDiffs array to hold the per node PDiff computed during DAG building.
Added a getUpwardPressureDelta API that will soon replace the old
one. Compute PressureDelta here from the precomputed PressureDiffs.
Updating for liveness will come next.
llvm-svn: 189640
Estimate the cyclic critical path within a single block loop. If the
acyclic critical path is longer, then the loop will exhaust OOO
resources after some number of iterations. If lag between the acyclic
critical path and cyclic critical path is longer the the time it takes
to issue those loop iterations, then aggressively schedule for
latency.
llvm-svn: 189120
When registers must be live throughout the scheduling region, increase
the limit for the register class. Once we exceed the original limit,
they will be spilled, and there's no point further reducing pressure.
This isn't a perfect heuristics but avoids a situation where the
scheduler could become trapped by trying to achieve the impossible.
llvm-svn: 187436