of a member function of a class template that is defined outside the template.
This substitution can actually fail in some weird cases.
llvm-svn: 220085
The generic code trying to use findCommutedOpIndices won't
understand that it needs to swap the modifier operands also,
so it should fail if they are set.
llvm-svn: 220064
When the input to a store instruction was a zero vector, the backend
always selected a normal vector store regardless of the non-temporal
hint. This is fixed by this patch.
This fixes PR19370.
llvm-svn: 220054
Hoist the IgnoreParens so that we ignore it around attributes as well in order
to future-proof the code. Addresses Richard's comments for SVN r219974.
llvm-svn: 220053
Clang would previously not get into C++ mode when invoked as 'clang++3.6'
(though clang++-3.6 would work).
I found the previous loop logic in this function confusing; hopefully this
makes it a little clearer.
Differential Revision: http://reviews.llvm.org/D5833
llvm-svn: 220052
We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same.
While we're at it, avoid writing to std::vector::end()...
Bug found with random testing and a lot of coffee.
llvm-svn: 220051
The current documentation does not explain that the standalone build requires
the LLVM sources. This patch updates the documentation to reflect this
requirement and explains how to manually specify the location of the required
files.
llvm-svn: 220049
Currently the VSX support enables use of lxvd2x and stxvd2x for 2x64
types, but does not yet use lxvw4x and stxvw4x for 4x32 types. This
patch adds that support.
As with lxvd2x/stxvd2x, this involves straightforward overriding of
the patterns normally recognized for lvx/stvx, with preference given
to the VSX patterns when VSX is enabled.
In addition, the logic for permitting misaligned memory accesses is
modified so that v4r32 and v4i32 are treated the same as v2f64 and
v2i64 when VSX is enabled. Finally, the DAG generation for unaligned
loads is changed to just use a normal LOAD (which will become lxvw4x)
on P8 and later hardware, where unaligned loads are preferred over
lvsl/lvx/lvx/vperm.
A number of tests now generate the VSX loads/stores instead of
lvx/stvx, so this patch adds VSX variants to those tests. I've also
added <4 x float> tests to the vsx.ll test case, and created a
vsx-p8.ll test case to be used for testing code generation for the
P8Vector feature. For now, that simply tests the unaligned load/store
behavior.
This has been tested along with a temporary patch to enable the VSX
and P8Vector features, with no new regressions encountered with or
without the temporary patch applied.
llvm-svn: 220047
v2: use dyn_cast
fixup comments
v3: use cast
Reviewed-by: Matt Arsenault <arsenm2@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 220044
It broke some builders. I guess it'd be reproducible with --vg.
Failing Tests (3):
Clang :: CXX/except/except.spec/p1.cpp
Clang :: SemaTemplate/instantiate-exception-spec-cxx11.cpp
Clang :: SemaTemplate/instantiate-exception-spec.cpp
llvm-svn: 220038
DSE's overlap checking contained special logic, used only when no DataLayout
was available, which inferred a complete overwrite when the pointee types were
equal. This logic seems fine for regular loads/stores, but does not work for
memcpy and friends. Instead of fixing this, I'm just removing it.
Philosophically, transformations should not contain enhanced behavior used only
when data layout is lacking (data layout should be strictly additive), and
maintaining these rarely-tested code paths seems not worthwhile at this stage.
Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case
(slightly reduced from that provided by Aliaksei) replaces the original
contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few
other tests have been updated to have a data layout.
llvm-svn: 220035
RecordType->getDecl() which maps to TagType::getDecl() is not a simple
accessor but a loop on redecls in getInterestingTagDecl.
isStructureOrClassType() was calling getDecl() three times performing
three times the work actually required. It is optimized by calling
RT->getDecl() once and reusing the result three times.
llvm-svn: 220033
non-dependent types, in CXXScalarValueInitExprs and in the
nested-name-specifier or template arguments of a DeclRefExpr in particular.
llvm-svn: 220028