The SSE rsqrt instruction (a fast reciprocal square root estimate) was
grouped in the same scheduling IIC_SSE_SQRT* class as the accurate (but very
slow) SSE sqrt instruction. For code which uses rsqrt (possibly with
newton-raphson iterations) this poor scheduling was affecting performances.
This patch splits off the rsqrt instruction from the sqrt instruction scheduling
classes and creates new IIC_SSE_RSQER* classes with latency values based on
Agner's table.
Differential Revision: http://reviews.llvm.org/D5370
Patch by Simon Pilgrim.
llvm-svn: 218517
This is a first pass at a scheduling model for Jaguar.
It's structured largely on the existing SandyBridge and SLM sched models.
Using this model, in addition to turning on the PostRA scheduler, results in
some perf wins on internal and 3rd party benchmarks. There's not much difference
in LLVM's test-suite benchmarking subset of tests.
Differential Revision: http://reviews.llvm.org/D5229
llvm-svn: 217457