This allows instruction sequences such as
vmovaps -96(%rbx), %xmm1
vinsertf128 $1, %xmm1, %ymm0, %ymm0
to be combined into
vinsertf128 $1, -96(%rbx), %ymm0, %ymm0
rdar://10643481
llvm-svn: 152762
locations for diagnostics we're not going to emit, and don't track the subobject
designator outside C++11 (since we're not going to use it anyway).
This seems to give about a 0.5% speedup on 403.gcc/combine.c, but the results
were sufficiently noisy that I can't reject the null hypothesis.
llvm-svn: 152761
-fno-inline-functions.
This behaves much like -fno-inline in gcc, but based on a discussion with
Daniel it was decided that -fno-inline-functions should subsume -fno-inline.
Please speak up if you object. The -fno-inline flag remains ignored.
Final part of rdar://10972766
llvm-svn: 152754
correlated pairs of pointer arguments at the callsite. This is designed
to recognize the common C++ idiom of begin/end pointer pairs when the
end pointer is a constant offset from the begin pointer. With the
C-based idiom of a pointer and size, the inline cost saw the constant
size calculation, and this provides the same level of information for
begin/end pairs.
In order to propagate this information we have to search for candidate
operations on a pair of pointer function arguments (or derived from
them) which would be simplified if the pointers had a known constant
offset. Then the callsite analysis looks for such pointer pairs in the
argument list, and applies the appropriate bonus.
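To illustrate (a hedged sketch, not code from the patch; sumRange and caller are
hypothetical names): once the call below is inlined, the analysis sees that 'last'
is a constant offset of four elements from 'first', so the loop trip count and its
bounds checks become constant-foldable, just as with a pointer-plus-size interface.
static unsigned sumRange(const unsigned *first, const unsigned *last) {
  unsigned total = 0;
  for (; first != last; ++first)   // begin/end pointer pair
    total += *first;
  return total;
}
unsigned caller() {
  unsigned buf[4] = {1, 2, 3, 4};
  return sumRange(buf, buf + 4);   // end is a constant offset from begin
}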
This helps LLVM detect that half of bounds-checked STL algorithms
(such as hash_combine_range, and some hybrid sort implementations)
disappear when inlined with a constant size input. However, it's not
a complete fix due to the inaccuracy of our cost metric for constants in
general. I'm looking into that next.
Benchmarks showed no significant code size change, and very minor
performance changes. However, specific code such as hashing is showing
significantly cleaner inlining decisions.
llvm-svn: 152752
scoped enumeration members. Later uses of an enumeration temploid as a nested
name specifier should cause its instantiation. Plus some groundwork for
explicit specialization of member enumerations of class templates.
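As a sketch of the nested-name-specifier case (illustrative names, not code from
the patch):
template<typename T> struct Wrapper {
  enum class Color { Red, Green };  // member scoped enumeration of a class template
};
// Naming the enumeration through the class template must instantiate it.
Wrapper<int>::Color c = Wrapper<int>::Color::Red;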
llvm-svn: 152750
Commit r152704 exposed a latent MSVC limitation (aka bug).
Both ilist and iplist contain the same function:
template<class InIt> void insert(iterator where, InIt first, InIt last) {
for (; first != last; ++first) insert(where, *first);
}
Also, ilist inherits from iplist and contains a "using iplist<NodeTy>::insert;".
MSVC doesn't know which one to pick and complains with an error.
I think it is safe to delete ilist::insert since it is redundant anyway.
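A reduced sketch of the pattern that trips MSVC (hypothetical Base/Derived names
standing in for iplist/ilist):
struct Base {
  template<class InIt> void insert(int where, InIt first, InIt last) {}
};
struct Derived : Base {
  using Base::insert;
  // Same signature as the inherited template; a conforming compiler lets the
  // derived declaration hide it, but MSVC reports the call below as ambiguous.
  template<class InIt> void insert(int where, InIt first, InIt last) {}
};
void use(int *p) { Derived().insert(0, p, p); }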
llvm-svn: 152746
Like GCC, provide the warning for conversion of NULL to a non-pointer type under
a separate flag, on by default. GCC's flag is "conversion-null", which we provide
for cross compatibility, but in the interests of consistency (with
-Wint-conversion, -Wbool-conversion, etc.) the canonical Clang flag is called
-Wnull-conversion.
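A small example of the code this diagnoses (the diagnostic text shown is
illustrative):
#include <cstddef>
int f() {
  int x = NULL;   // warning: implicit conversion of NULL constant to 'int' [-Wnull-conversion]
  bool b = NULL;  // likewise for other non-pointer types
  return x + b;
}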
Patch by Lubos Lunak.
Review feedback by myself, Chandler Carruth, and Chad Rosier.
llvm-svn: 152745
http://llvm.org/bugs/show_bug.cgi?id=12232
Fixed a case where a missing "break" in a switch statement could cause an assertion to fire and kill the debug session.
The fix was derived from the findings of Andrea Bigagli, thanks Andrea.
llvm-svn: 152741
which are small enough to themselves be inlined. Delaying in this manner
can be harmful if the function is ineligible for inlining in some (or
many) contexts as it pessimizes the code of the function itself in the
event that inlining does not eventually happen.
Previously the check was written to only do this delaying of inlining
for static functions in the hope that they could be entirely deleted and
in the knowledge that all callers of static functions will have the
opportunity to inline if it is in fact profitable. However, with C++ we
get two other important sources of functions where the definition is
always available for inlining: inline functions and templated functions.
This patch generalizes the inliner to allow linkonce-ODR (the linkage
such C++ routines receive) to also qualify for this delay-based
inlining.
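For reference, a sketch of the two new sources of always-available definitions
(hypothetical functions, typically defined in a header):
inline int addOne(int x) { return x + 1; }              // inline function: linkonce_odr
template <typename T> T doubled(T v) { return v + v; }  // template: each instantiation is linkonce_odr
int use(int n) { return addOne(n) + doubled(n); }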
Benchmarking across a range of large real-world applications shows
roughly 2% size increase across the board, but an average speedup of
about 0.5%. Some benchmarks improved by over 2%, and the 'clang' binary
itself (when bootstrapped with this feature) shows a 1% -O0 performance
improvement when run over all Sema, Lex, and Parse source code smashed
into a single file. A clean re-build of Clang+LLVM with a bootstrapped
Clang shows approximately 2% improvement, but that measurement is often
noisy.
llvm-svn: 152737
- This may seem superfluous, but actually this allows the optimizer to more
easily eliminate the isActive() checks needed by the SemaDiagnosticBuilder
and DiagnosticBuilder dtors. And by more easily, I mean the current LLVM is
actually able to do one and not the other. :)
This is good for another 20k code size reduction.
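Roughly the shape of the check being eliminated (a simplified sketch, not the
actual DiagnosticBuilder code):
class BuilderSketch {
  bool Active;
public:
  BuilderSketch() : Active(true) {}
  bool isActive() const { return Active; }
  void emit();
  ~BuilderSketch() {
    if (isActive())  // with everything inline, the optimizer can see Active's
      emit();        // value across the object's lifetime and fold this check
  }
};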
llvm-svn: 152709
- As with DiagnosticBuilder, it is very important that SemaDiagnosticBuilder be
completely inline to ensure that the compiler can rip it apart and sink it to
registers.
This is good for another 30k reduction in code size.
llvm-svn: 152708