In case a GEP instruction references into a fixed size array e.g., an access
A[i][j] into an array A[100x100], LLVM-IR does not guarantee that the subscripts
always compute values that are within array bounds. We now derive the set of
parameter values for which all accesses are within bounds and add the assumption
that the scop is only every executed with this set of parameter values.
Example:
void foo(float A[][20], long n, long m {
for (long i = 0; i < n; i++)
for (long j = 0; j < m; j++)
A[i][j] = ...
This loop yields out-of-bound accesses if m is at least 20 and at the same time
at least one iteration of the outer loop is executed. Hence, we assume:
n <= 0 or m <= 20.
Doing so simplifies the dependence analysis problem, allows us to perform
more optimizations and generate better code.
TODO: The location where the GEP instruction is executed is not necessarily the
location where the memory is actually accessed. As a result scanning for GEP[s]
is imprecise. Even though this is not a correctness problem, this imprecision
may result in missed optimizations or non-optimal run-time checks.
In polybench where this mismatch between parametric loop bounds and fixed size
arrays is common, we see with this patch significant reductions in compile time
(up to 50%) and execution time (up to 70%). We see two significant compile time
regressions (fdtd-2d, jacobi-2d-imper), and one execution time regression
(trmm). Both regressions arise due to additional optimizations that have been
enabled by this patch. They can be addressed in subsequent commits.
http://reviews.llvm.org/D6369
llvm-svn: 222754
stored rather than the pointer type.
This change is analogous to r220138 which changed the canonicalization
for loads. The rationale is the same: memory does not have a type,
operations (and thus the values they produce) have a type. We should
match that type as closely as possible rather than reading some form of
semantics into the pointer type.
With this change, loads and stores should no longer be made with
nonsensical types for the values that tehy load and store. This is
particularly important when trying to match specific loaded and stored
types in the process of doing other instcombines, which is what led me
down this twisty maze of miscanonicalization.
I've put quite some effort into looking through IR to find places where
LLVM's optimizer was being unreasonably conservative in the face of
mismatched load and store types, however it is possible (let's say,
likely!) I have missed some. If you see regressions here, or from
r220138, the likely cause is some part of LLVM failing to cope with load
and store types differing. Test cases appreciated, it is important that
we root all of these out of LLVM.
llvm-svn: 222748
Fix ARMAttributeParser::CPU_arch_profile so that it doesn't switch on the value
'0' as a legal value of this build attribute.
Change-Id: Ie05a08900a82bb10b78c841b437df747ce3bb38e
llvm-svn: 222743
Summary:
This resolves [[ http://llvm.org/bugs/show_bug.cgi?id=17391 | PR17391 ]].
GCC's sources were used as a guide (couldn't find much information in ARM documentation).
Reviewers: doug.gregor, asl
Reviewed By: asl
Subscribers: asl, aemerson, cfe-commits
Differential Revision: http://reviews.llvm.org/D6339
llvm-svn: 222741
clearly only exactly equal width ptrtoint and inttoptr casts are no-op
casts, it says so right there in the langref. Make the code agree.
Original log from r220277:
Teach the load analysis to allow finding available values which require
inttoptr or ptrtoint cast provided there is datalayout available.
Eventually, the datalayout can just be required but in practice it will
always be there today.
To go with the ability to expose available values requiring a ptrtoint
or inttoptr cast, helpers are added to perform one of these three casts.
These smarts are necessary to finish canonicalizing loads and stores to
the operational type requirements without regressing fundamental
combines.
I've added some test cases. These should actually improve as the load
combining and store combining improves, but they may fundamentally be
highlighting some missing combines for select in addition to exercising
the specific added logic to load analysis.
llvm-svn: 222739
Summary:
First, remove lit configuration that sets ASAN_OPTIONS to detect_leaks=1
because this is already the default when leak detection is supported.
This removes a bit of duplication between various lit.cfg files.
Second, add a new feature 'leak-detection' if we're targetting x86_64
(not i386) on Linux.
Third, change a couple of tests that need leak detection to require the
new 'leak-detection' feature.
Reviewers: kcc, earthdok, samsonov
Reviewed By: samsonov
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6396
llvm-svn: 222738
Only the super register flat_scr was marked as reserved,
so in some cases with high register usage it would still
try to allocate the subregisters.
llvm-svn: 222737
Rethrowing exceptions in the MS model is very simple: just call
_CxxThrowException with nullptr for both arguments.
N.B. They chose stdcall as the calling convention for x86 but cdecl for
all other platforms.
llvm-svn: 222733
The pattern matching failed to recognize all instances of "-1", because when
comparing against "-1" we didn't use an APInt of the same bitwidth.
This commit fixes this and also adds inverse versions of the conditon to catch
more cases.
llvm-svn: 222722
This handles cases where we are comparing a masked value against itself.
The analysis could be further improved by making it recursive but such
expense is not currently justified.
llvm-svn: 222716
The attn instruction is not part of the Power ISA, but is documented in the A2
user manual, and is accepted by the GNU assembler for the A2 and the POWER4+.
Reported as part of PR21650.
llvm-svn: 222712
This does not matter on newer cores (where we can use reciprocal estimates in
fast-math mode anyway), but for older cores this allows us to generate better
fast-math code where we have multiple FDIVs with a common divisor.
llvm-svn: 222710
We were matching against the assume intrinsic in every check. Since we know that it must be an assume, this is just wasted work. Somewhat surprisingly, matching an intrinsic id is actually relatively expensive. It devolves to a string construction and comparison in Function::isIntrinsic.
I originally spotted this because it showed up in a performance profile of my compiler. I've since discovered a separate issue which seems to be the actual root cause, but this is minor perf goodness regardless.
I'm likely to follow up with another change to factor out the comparison matching. There's no need to match the compare instruction in every single one of the tests.
Differential Revision: http://reviews.llvm.org/D6312
llvm-svn: 222709
Summary:
This patch adds CMake support for building and testing libc++abi without threads.
1. Add `LIBCXXABI_ENABLE_THREADS` option to CMake.
2. Propagate `LIBCXXABI_ENABLE_THREADS` to lit via lit.site.cfg.in
3. Configure tests for `LIBCXXABI_ENABLE_THREADS=OFF
Currently the test suite does not work when libc++abi is built without threads because that information does not propagate to the test suite.
Reviewers: danalbert, mclow.lists, jroelofs
Reviewed By: jroelofs
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6393
llvm-svn: 222702