This restores the commit from SVN r219899 with an additional change to ensure
that the CodeGen is correct for the case that was identified as being incorrect
(originally PR7272).
In the case that during inlining we need to synthesize a value on the stack
(i.e. for passing a value byval), then any function involving that alloca must
be stripped of its tailness as the restriction that it does not access the
parent's stack no longer holds. Unfortunately, a single alloca can cause a
rippling effect through out the inlining as the value may be aliased or may be
mutated through an escaped external call. As such, we simply track if an alloca
has been introduced in the frame during inlining, and strip any tail calls.
llvm-svn: 220811
Before:
ReturnType MACRO
FunctionName() {}
After:
ReturnType MACRO
FunctionName() {}
This fixes llvm.org/PR21404.
I wonder what the motivation for that if-condition was. But as no test
breaks, ...
llvm-svn: 220801
Apparently, people are very much divided on what the "correct"
indentation is. So, best to give them a choice.
The new flag is called ObjCBlockIndentWidth and the default is now set
to the same value as IndentWidth for the pre-defined styles.
llvm-svn: 220784
explicitly using the resulting .pcm file. Unlike for an implicit module build,
we don't need nor want to require these flags to match between the module
and its users.
llvm-svn: 220780
This transformation worked if selector is produced by SETCC, however SETCC is needed only if we consider to swap operands. So I replaced SETCC check for this case.
Added tests for vselect of <X x i1> values.
llvm-svn: 220777
Ffter commit at rev219046 512-bit broadcasts lowering become non-optimal. Most of tests on broadcasting and embedded broadcasting were changed and they doesn’t produce efficient code.
Example below is from commit changes (it’s the first test from test/CodeGen/X86/avx512-vbroadcast.ll):
define <16 x i32> @_inreg16xi32(i32 %a) {
; CHECK-LABEL: _inreg16xi32:
; CHECK: ## BB#0:
-; CHECK-NEXT: vpbroadcastd %edi, %zmm0
+; CHECK-NEXT: vmovd %edi, %xmm0
+; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0
+; CHECK-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0
; CHECK-NEXT: retq
%b = insertelement <16 x i32> undef, i32 %a, i32 0
%c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
ret <16 x i32> %c
}
Here, 256-bit broadcast was generated instead of 512-bit one.
In this patch
1) I added vector-shuffle lowering through broadcasts
2) Removed asserts and branches likes because this is incorrect
- assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI");
3) Fixed lowering tests
llvm-svn: 220774
Summary:
http://llvm.org/bugs/show_bug.cgi?id=18345
Tuple's constructor and assignment operators for "tuple-like" types evaluates __make_tuple_types unnecessarily. In the case of a large array this can blow the template instantiation depth.
Ex:
```
#include <array>
#include <tuple>
#include <memory>
typedef std::array<int, 1256> array_t;
typedef std::tuple<array_t> tuple_t;
int main() {
array_t a;
tuple_t t(a); // broken
t = a; // broken
// make_shared uses tuple behind the scenes. This bug breaks this code.
std::make_shared<array_t>(a);
}
```
To prevent this from happening we delay the instantiation of `__make_tuple_types` until after we perform the length check. Currently `__make_tuple_types` is instantiated at the same time that the length check .
Test Plan: Two tests have been added. One for the "tuple-like" constructors and another for the "tuple-like" assignment operator.
Reviewers: mclow.lists, EricWF
Reviewed By: EricWF
Subscribers: K-ballo, cfe-commits
Differential Revision: http://reviews.llvm.org/D4467
llvm-svn: 220769
This is a Microsoft calling convention that supports both x86 and x86_64
subtargets. It passes vector and floating point arguments in XMM0-XMM5,
and passes them indirectly once they are consumed.
Homogenous vector aggregates of up to four elements can be passed in
sequential vector registers, but this part is not implemented in LLVM
and will be handled in Clang.
On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as
integer register parameters and is callee cleanup. On x86_64, it
delegates to the normal win64 calling convention.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D5943
llvm-svn: 220745
Benchmarks have shown that it's harmless to the performance there, and having a
unified set of passes between the two cores where possible helps big.LITTLE
deployment.
Patch by Z. Zheng.
llvm-svn: 220744
I noticed that it was untested, and forcing it on caused some tests to fail:
LLVM :: Linker/metadata-a.ll
LLVM :: Linker/prefixdata.ll
LLVM :: Linker/type-unique-odr-a.ll
LLVM :: Linker/type-unique-simple-a.ll
LLVM :: Linker/type-unique-simple2-a.ll
LLVM :: Linker/type-unique-simple2.ll
LLVM :: Linker/type-unique-type-array-a.ll
LLVM :: Linker/unnamed-addr1-a.ll
LLVM :: Linker/visibility1.ll
If it is to be resurrected, it has to be fixed and we should probably have a
-preserve-source command line option in llvm-mc and run tests with and without
it.
llvm-svn: 220741