This patch completely replaces the instruction scheduling information for the Haswell architecture target by modifying the file X86SchedHaswell.td located under the X86 Target.
We used the scheduling information retrieved from the Haswell architects in order to replace and modify the existing scheduling.
The patch continues the scheduling replacement effort started with the SNB target in r307529 and r310792.
Information includes latency, number of micro-Ops and used ports by each HSW instruction.
Please expect some performance fluctuations due to code alignment effects.
Reviewers: RKSimon, zvi, aymanmus, craig.topper, m_zuckerman, igorb, dim, chandlerc, aaboud
Differential Revision: https://reviews.llvm.org/D36663
llvm-svn: 311879
This adds builtin_cpu_init which will emit a call to cpu_indicator_init in libgcc or compiler-rt.
This is needed to support builtin_cpu_supports/builtin_cpu_is in an ifunc resolver.
Differential Revision: https://reviews.llvm.org/D36336
llvm-svn: 311874
Summary:
XRay has erroneously been returning the address of the first sled in the
instrumentation map for a function id instead of the (runtime-relocated)
functison address. This causes confusion and issues for applications
where:
- The first sled in the function may not be an entry sled (due to
re-ordering or some other reason).
- The caller attempts to find a symbol associated with the pointer at
runtime, because the sled may not be exactly where the function's
known address is (in case of inlined functions or those that have an
external definition for symbols).
This fixes http://llvm.org/PR34340.
Reviewers: eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37202
llvm-svn: 311871
handleExpected is similar to handleErrors, but takes an Expected<T> as its first
input value and a fallback functor as its second, followed by an arbitary list
of error handlers (equivalent to the handler list of handleErrors). If the first
input value is a success value then it is returned from handleErrors
unmodified. Otherwise the contained error(s) are passed to handleErrors, along
with the handlers. If handleErrors returns success (indicating that all errors
have been handled) then handleExpected runs the fallback functor and returns its
result. If handleErrors returns a failure value then the failure value is
returned and the fallback functor is never run.
This simplifies the process of re-trying operations that return Expected values.
Without this utility such retry logic is cumbersome as the internal Error must
be explicitly extracted from the Expected value, inspected to see if its
handleable and then consumed:
enum FooStrategy { Aggressive, Conservative };
Expected<Foo> tryFoo(FooStrategy S);
Expected<Foo> Result;
(void)!!Result; // "Check" Result so that it can be safely overwritten.
if (auto ValOrErr = tryFoo(Aggressive))
Result = std::move(ValOrErr);
else {
auto Err = ValOrErr.takeError();
if (Err.isA<HandleableError>()) {
consumeError(std::move(Err));
Result = tryFoo(Conservative);
} else
return std::move(Err);
}
with handleExpected, this can be re-written as:
auto Result =
handleExpected(
tryFoo(Aggressive),
[]() { return tryFoo(Conservative); },
[](HandleableError&) { /* discard to handle */ });
llvm-svn: 311870
Heretofore asan_handle_no_return was used only by interceptors,
i.e. code private to the ASan runtime. However, on systems without
interceptors, code like libc++abi is built with -fsanitize=address
itself and should call asan_handle_no_return directly from
__cxa_throw so that no interceptor is required.
Patch by Roland McGrath
Differential Revision: https://reviews.llvm.org/D36811
llvm-svn: 311869
In cases where the entry block of a scop was not contained in a loop that was
part of the scop region and at the same time there was a loop surrounding the
scop, we missed to count the loops in the scop and consequently did not consider
the scop profitable. We correct this by only moving to the loop parent, in case
the current loop is loop contained in the scop.
This increases the number of loops in COSMO which we assume to be profitable
from 3974 to 4981.
llvm-svn: 311863
This patch enables generation of NMADD and NMSUB instructions when fneg node
is present. These instructions are currently only generated if fsub node is
present.
Patch by Stanislav Ocovaj.
Differential Revision: https://reviews.llvm.org/D34507
llvm-svn: 311862
By exposing the constant initializer, the optimizer can fold many
of these constructs.
Differential Revision: https://reviews.llvm.org/D34992
llvm-svn: 311857
Prior to this patch clang would not error here:
template <class T> struct B;
template <class T> struct A {
void foo();
void foo2();
void test1() {
B<T>::foo(); // OK, foo is declared in A<int> - matches type of 'this'.
B<T>::foo2(); // This should be an error!
// foo2 is found in B<int>, 'base unrelated' to 'this'.
}
};
template <class T> struct B : A<T> {
using A<T>::foo2;
};
llvm-svn: 311851
Original commit r311077 of D32871 was reverted in r311304 due to failures
reported in PR34248.
This recommit fixes PR34248 by restricting the packing of predicated scalars
into vectors only when vectorizing, avoiding doing so when unrolling w/o
vectorizing. Added a test derived from the reproducer of PR34248.
llvm-svn: 311849
This allows multi-module / incremental compilation environments to have unique
initializer symbols.
Patch by Axel Naumann with minor modifications by me!
llvm-svn: 311844
When isIncrementalProcessingEnabled is on we might want to produce multiple
llvm::Modules. This patch allows the clients to start a new llvm::Module,
allowing CodeGen to continue working after a HandleEndOfTranslationUnit call.
This should give the necessary facilities to write a unittest for D34059.
As discussed in the review this is meant to give us a way to proceed forward
in our efforts to upstream our interpreter-related patches. The design of this
will likely change soon.
llvm-svn: 311843
Remove the explicit i686 target that is completely duplicate to
the i386 target, with the latter being used more commonly.
1. The runtime built for i686 will be identical to the one built for
i386.
2. Supporting both -i386 and -i686 suffixes causes unnecessary confusion
on the clang end which has to expect either of them.
3. The checks are based on wrong assumption that __i686__ is defined for
all newer x86 CPUs. In fact, it is only declared when -march=i686 is
explicitly used. It is not available when a more specific (or newer)
-march is used.
Curious enough, if CFLAGS contain -march=i686, the runtime will be built
both for i386 and i686. For any other value, only i386 variant will be
built.
Differential Revision: https://reviews.llvm.org/D26764
llvm-svn: 311842
Prior to this patch, clang would do the wrong thing here (see inline comments for pre-patch behavior):
struct A {
void bar(int) { }
static void bar(double) { }
void g(int*);
static void g(char *);
};
struct B {
void f() {
A::bar(3); // selects (double) ??!!
A::g((int*)0); // Instead of no object argument, states conversion error?!!
}
};
The fix is as follows: When we detect that what appears to be an implicit member function call (A::bar) is actually a call to a member of a class (A) unrelated to the type (B) that contains the member function (B::f) from which the call is being made, don't treat it (A::bar) as an Implicit Member Call Expression.
P.S. I wonder if there is an existing bug report related to this? (Surprisingly, a cursory search did not find one).
llvm-svn: 311839
We used to do a late DAG combine to move the bitcasts out of the way, but I'm starting to think that it's better to canonicalize extract_subvector's type to match the type of its input. I've seen some cases where we've formed two different extract_subvector from the same node where one had a bitcast and the other didn't.
Add some more test cases to ensure we've also got most of the zero masking covered too.
llvm-svn: 311837
Use llvm::Triple::getArchTypeName() when looking for compiler-rt
libraries, rather than the exact arch string from the triple. This is
more correct as it matches the values used when building compiler-rt
(builtin-config-ix.cmake) which are the subset of the values allowed
in triples.
For example, this fixes an issue when the compiler set for
i686-pc-linux-gnu triple would not find an i386 compiler-rt library,
while this is the exact arch that is detected by compiler-rt. The same
applies to any other i?86 variant allowed by LLVM.
This also makes the special case for MSVC unnecessary, since now i386
will be used reliably for all 32-bit x86 variants.
Differential Revision: https://reviews.llvm.org/D26796
llvm-svn: 311836
Summary:
Remove redundant explicit template instantiation.
This was reported by Andrew Kelley building release_50 with gcc7.2.0 on MacOS: duplicate symbol llvm::DominatorTreeBase.
Reviewers: kuhar, andrewrk, davide, hans
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37185
llvm-svn: 311835
Summary:
If all the operands of a BUILD_VECTOR extract elements from same vector then split the
vector efficiently based on the maximum vector access index.
This will also fix PR 33784
Reviewers: zvi, delena, RKSimon, thakis
Reviewed By: RKSimon
Subscribers: chandlerc, eladcohen, llvm-commits
Differential Revision: https://reviews.llvm.org/D35788
llvm-svn: 311833