To extract a preoptimized LLVM-IR file from a C-file run:
clang -Xclang -load -Xclang LLVMPolly.so -O0 -mllvm -polly file.c -S -emit-llvm
On the generated file you can directly run passes such as:
opt -view-scops file.s
llvm-svn: 146560
Previously, the scheduler split a band at the level at which it detected that
a split was necessary. This may introduce an additional level of bands, which
can be avoided by backtracking and splitting at a higher level. Additional
splits reduce the number of loops that can be tiled, so avoiding splits and
maximizing the band depth seems preferable.
As a first data point we looked at 2mm and 3mm from the polybench test suite.
For both, maximizing the tilable bands results in a significant (5-10x)
performance improvement.
This patch enables the isl scheduler option to maximize the band depth.
llvm-svn: 146557
If larger coefficients appear as part of the input dependences, the schedule
calculation can take a very long time. We observed that the main overhead in
this calculation is due to optimizing the constant coefficients. They are
misused to increase locality by merging several unrelated dimensions into a
single dimension. This unwanted optimization increases the complexity of the
generated code and furthermore slows it down.
We use a new isl scheduler option to bound the values in the constant dimension
by a user-defined value (20 in our case). If the right value is chosen, costly
overoptimization is prevented.
This solution works, but requires a specific (here almost randomly chosen)
value by which the constants are bound. For the moment, this is our best
solution, but we hope to find a more generic one later on.
After this patch, the extremely long compile time for simple kernels like 2mm or
3mm is reduced to a reasonable amount of time (no more than a couple of seconds,
even in debug mode).
llvm-svn: 146556
dispatch functions that are implemented in hand-written assembly.
There are also hand-written eh_frame instructions for unwinding
from these functions.
Normally we don't use eh_frame instructions for the currently
executing function, preferring the assembly instruction profiling
method. But in these hand-written dispatch functions, the
profiling is doomed and we should use the eh_frame instructions.
Unfortunately there's no easy way to flag/extend the eh_frame/debug_frame
sections to annotate if the unwind instructions are accurate at
all addresses ("asynchronous") or if they are only accurate at locations
that can throw an exception ("synchronous" and the normal case for
gcc/clang generated eh_frame/debug_frame CFI).
<rdar://problem/10508134>
llvm-svn: 146551
to finalize MI bundles (i.e. adding the BUNDLE instruction and computing the
register def and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of the MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT blocks to MI bundles and delete the hazard recognizer hack to
prevent IT blocks from being broken apart.
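As a rough illustration of what computing the def and use lists of a bundle
involves, here is a minimal sketch on a toy model. The Instr and BundleSummary
types and the summarizeBundle function are hypothetical and are not LLVM's
classes. The sketch assumes that a register read inside the bundle counts as a
use of the bundle only if no earlier instruction in the same bundle defined it.
The real pass tracks more detail than this.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy instruction: registers are plain integers (hypothetical model).
struct Instr {
    std::vector<int> defs;  // registers written by this instruction
    std::vector<int> uses;  // registers read by this instruction
};

// Summary that a bundle header would carry in this toy model.
struct BundleSummary {
    std::set<int> defs;  // every register defined somewhere in the bundle
    std::set<int> uses;  // registers read before any def inside the bundle
};

// Walk the bundled instructions in order and collect the externally
// visible defs and uses.
BundleSummary summarizeBundle(const std::vector<Instr> &bundle) {
    BundleSummary summary;
    for (const Instr &mi : bundle) {
        // A read is external only if no earlier bundled instruction
        // has already defined the register.
        for (int reg : mi.uses)
            if (summary.defs.count(reg) == 0)
                summary.uses.insert(reg);
        for (int reg : mi.defs)
            summary.defs.insert(reg);
    }
    return summary;
}
```

With two bundled instructions, the first defining r1 from r2 and r3 and the
second defining r4 from r1 and r2, the summary lists r1 and r4 as defs and only
r2 and r3 as uses, because the read of r1 is satisfied inside the bundle.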
llvm-svn: 146542
the expression parser to locate instances where
dyn_cast<>() and isa<>() are used on types, and
replace them with getAs<>() as appropriate.
The difference is that dyn_cast<>() and isa<>()
are essentially LLVM/Clang's equivalent of RTTI
-- that is, they try to downcast the object and
return NULL if they cannot -- but getAs<>() can
traverse typedefs to perform a semantic cast.
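The difference can be sketched on a toy type hierarchy. Everything below is a
hypothetical stand-in and not Clang's actual classes: dynCastExact models a
plain kind check in the style of dyn_cast<>(), and getAsDesugared models the
typedef-traversing behaviour described for getAs<>().

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy type nodes (illustrative only, not Clang's Type classes).
struct Type {
    enum Kind { Builtin, Pointer, Typedef };
    explicit Type(Kind k) : kind(k) {}
    virtual ~Type() = default;
    Kind kind;
};

struct BuiltinType : Type {
    explicit BuiltinType(std::string n) : Type(Builtin), name(std::move(n)) {}
    static bool classof(const Type *t) { return t->kind == Builtin; }
    std::string name;
};

struct PointerType : Type {
    explicit PointerType(const Type *p) : Type(Pointer), pointee(p) {}
    static bool classof(const Type *t) { return t->kind == Pointer; }
    const Type *pointee;
};

struct TypedefType : Type {
    TypedefType(std::string n, const Type *u)
        : Type(Typedef), name(std::move(n)), underlying(u) {}
    static bool classof(const Type *t) { return t->kind == Typedef; }
    std::string name;
    const Type *underlying;
};

// dyn_cast-style check: looks only at the node itself.
template <typename T> const T *dynCastExact(const Type *t) {
    return (t && T::classof(t)) ? static_cast<const T *>(t) : nullptr;
}

// getAs-style check: strips typedef layers until a match or a dead end.
template <typename T> const T *getAsDesugared(const Type *t) {
    while (t) {
        if (T::classof(t))
            return static_cast<const T *>(t);
        if (t->kind != Type::Typedef)
            return nullptr;  // not sugar, so nothing left to look through
        t = static_cast<const TypedefType *>(t)->underlying;
    }
    return nullptr;
}
```

For a typedef of a pointer type, dynCastExact<PointerType> returns a null
pointer because the node itself is a typedef, while getAsDesugared<PointerType>
returns the underlying pointer node.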
llvm-svn: 146537
Some of the test cases do not currently work because the analyzer core
does not seem to call checkers for pre/post DeclRefExpr visits.
(Opened radar://10573500. To be fixed later on.)
llvm-svn: 146536