On AVX and AVX2, BROADCAST instructions can load a scalar into all elements of a target vector.
This patch improves the lowering of 'splat' shuffles of a loaded vector into a broadcast - currently the lowering only works for cases where we are splatting the zero'th element, which is now generalised to any element.
Fix for PR23022
Differential Revision: http://reviews.llvm.org/D15310
llvm-svn: 255061
If you externalize debug info for unit tests the test runner finds the mach-o inside the dsym bundle and tries to execute it as a test.
llvm-svn: 255056
This patch teaches the fully redundant load part of EarlyCSE how to forward from atomic and volatile loads and stores, and how to eliminate unordered atomics (only). This patch does not include dead store elimination support for unordered atomics, that will follow in the near future.
The basic idea is that we allow all loads and stores to be tracked by the AvailableLoad table. We store a bit in the table which tracks whether load/store was atomic, and then only replace atomic loads with ones which were also atomic.
No attempt is made to refine our handling of ordered loads or stores. Those are still treated as full fences. We could pretty easily extend the release fence handling to release stores, but that should be a separate patch.
Differential Revision: http://reviews.llvm.org/D15337
llvm-svn: 255054
Summary:
Before ARMv5T, Thumb1 code could not pop PC, as described at D14357 and D14986;
so we need the special fixup in the epilogue.
Reviewers: jroelofs, qcolombet
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D15126
llvm-svn: 255047
Per LangRef: "Globals with available_externally linkage are
allowed to be discarded at will, and are otherwise the same
as linkonce_odr", since linkonce_odr is in this list it makes
sense to have available_externally there as well.
Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D15323
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255043
AND/BIC instructions do accept SP/PC, so the register class should be
more generic (rGPR -> GPR) to cope with that case. Adding more tests.
llvm-svn: 255034
Summary:
We don't check the size operand on ext/dext*/ins/dins* yet because the
permitted range depends on the pos argument and we can't check that using
this mechanism.
The bug was that dextu/dinsu accepted 0..31 in the pos operand instead of 32..63.
Reviewers: vkalintiris
Subscribers: llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D15190
llvm-svn: 255015
ARMv8.2-A adds 16-bit floating point versions of all existing SIMD
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.
Note that VFP without SIMD is not a valid combination for any version of
ARMv8-A, but I have ensured that these instructions all depend on both
FeatureNEON and FeatureFullFP16 for consistency.
The ".2h" vector type specifier is now legal (for the scalar pairwise
reduction instructions), so some unrelated tests have been modified as
different error messages are emitted. This is not a problem as the
invalid operands are still caught.
llvm-svn: 255010
Reduces the scope over which the struct is visible, making its usages
obvious. I did not move structs in cases where this wasn't a clear
win (the struct is too large, or is grouped in some other interesting
way).
llvm-svn: 255003
The StringRef constructor is unnecessary (since we're converting to
std::string anyway), and having it requires an explicit call to
StringRef's or std::string's constructor.
llvm-svn: 255000
Currently, vectors of halfs end up as ConstantVectors, but there isn't
a good reason they can't be ConstantDataVectors. This should save some
memory.
llvm-svn: 254991
It's strange to duplicate the logic for emitting FP values into
emitGlobalConstantDataSequential, and it's even stranger that we end
up printing the verbose assembly comments differently between the two
paths. Just call into emitGlobalConstantFP rather than crudely
duplicating its logic.
llvm-svn: 254988
Summary:
Also add a stricter post-condition for IndVarSimplify.
Fixes PR25578. Test case by Michael Zolotukhin.
Reviewers: hfinkel, atrick, mzolotukhin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15059
llvm-svn: 254977
Summary:
(Note: the problematic invocation of hoistIVInc that caused PR24804 came
from IndVarSimplify, not from SCEVExpander itself)
Fixes PR24804. Test case by David Majnemer.
Reviewers: hfinkel, majnemer, atrick, mzolotukhin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15058
llvm-svn: 254976
We were using unneccessarily large initial sizes for these SmallVectors. This was wasting around 50kb of memory for the O3 pipeline, even after the uniquing changes. We're still using around 20kb which is a bit much, but it's definitely better. This is about a 6% improvement in total O3 memory usage.
Note: The raw data on structure size which were used to pick these thresholds can be found in the review thread.
Differential Revision: http://reviews.llvm.org/D15244
llvm-svn: 254974
A manipulation (in this case, mkdir) can make slack between creating and touching %t.older/evenlen.
I would make this rewrote with python if this were still unstable.
llvm-svn: 254965