Also, move the l-value emission code into CGObjC.cpp and teach it, for
completeness, to store away self for a super send.
Also, inline the super cases for property gets and sets and make them
use the correct result type for implicit getter/setter calls.
llvm-svn: 120887
when an initializer is variable (I handled the constant case in a previous
patch). This has three pieces:
1. Enhance AggValueSlot to have a 'isZeroed' bit to tell CGExprAgg that
the memory being stored into has previously been memset to zero.
2. Teach CGExprAgg to not emit stores of zero to isZeroed memory.
3. Teach CodeGenFunction::EmitAggExpr to scan initializers to determine
whether they are profitable to emit a memset + inividual stores vs
stores for everything.
The heuristic used is that a global has to be more than 16 bytes and
has to be 3/4 zero to be candidate for this xform. The two testcases
are illustrative of the scenarios this catches. We now codegen test9 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 400, i32 4, i1 false)
%.array = getelementptr inbounds [100 x i32]* %Arr, i32 0, i32 0
%tmp = load i32* %X.addr, align 4
store i32 %tmp, i32* %.array
and test10 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 392, i32 8, i1 false)
%tmp = getelementptr inbounds %struct.b* %S, i32 0, i32 0
%tmp1 = getelementptr inbounds %struct.a* %tmp, i32 0, i32 0
%tmp2 = load i32* %X.addr, align 4
store i32 %tmp2, i32* %tmp1, align 4
%tmp5 = getelementptr inbounds %struct.b* %S, i32 0, i32 3
%tmp10 = getelementptr inbounds %struct.a* %tmp5, i32 0, i32 4
%tmp11 = load i32* %X.addr, align 4
store i32 %tmp11, i32* %tmp10, align 4
Previously we produced 99 stores of zero for test9 and also tons for test10.
This xforms should substantially speed up -O0 builds when it kicks in as well
as reducing code size and optimizer heartburn on insane cases. This resolves
PR279.
llvm-svn: 120692
assignment to volatiles in C. This in effect reverts some of mjs's
work in and around r72572. Basically, the C++ standard is quite
clear, except that it lies about volatile behavior approximating
C's, whereas the C standard is almost actively misleading.
llvm-svn: 119344
data members by delaying the emission of the initializer until after
linkage and visibility have been set on the global. Also, don't
emit a guard unless the variable actually ends up with vague linkage,
and don't use thread-safe statics in any case.
llvm-svn: 118336
__builtin_ia32_vec_init_v8qi
__builtin_ia32_vec_init_v4hi
__builtin_ia32_vec_init_v2si
They are lowered to bitcasts. (These are all ready tested by the gcc testsuite.)
<rdar://problem/8529957>
llvm-svn: 116147
one of them) was causing a series of failures:
http://google1.osuosl.org:8011/builders/clang-x86_64-darwin10-selfhost/builds/4518
svn merge -c -114929 https://llvm.org/svn/llvm-project/cfe/trunk
--- Reverse-merging r114929 into '.':
U include/clang/Sema/Sema.h
U include/clang/AST/DeclCXX.h
U lib/Sema/SemaDeclCXX.cpp
U lib/Sema/SemaTemplateInstantiateDecl.cpp
U lib/Sema/SemaDecl.cpp
U lib/Sema/SemaTemplateInstantiate.cpp
U lib/AST/DeclCXX.cpp
svn merge -c -114925 https://llvm.org/svn/llvm-project/cfe/trunk
--- Reverse-merging r114925 into '.':
G include/clang/AST/DeclCXX.h
G lib/Sema/SemaDeclCXX.cpp
G lib/AST/DeclCXX.cpp
svn merge -c -114924 https://llvm.org/svn/llvm-project/cfe/trunk
--- Reverse-merging r114924 into '.':
G include/clang/AST/DeclCXX.h
G lib/Sema/SemaDeclCXX.cpp
G lib/Sema/SemaDecl.cpp
G lib/AST/DeclCXX.cpp
U lib/AST/ASTContext.cpp
svn merge -c -114921 https://llvm.org/svn/llvm-project/cfe/trunk
--- Reverse-merging r114921 into '.':
G include/clang/AST/DeclCXX.h
G lib/Sema/SemaDeclCXX.cpp
G lib/Sema/SemaDecl.cpp
G lib/AST/DeclCXX.cpp
llvm-svn: 114933
the cleanup might not be dominated by the allocation code.
In this case, we have to store aside all the delete arguments
in case we need them later. There's room for optimization here
in cases where we end up not actually needing the cleanup in
different branches (or being able to pop it after the
initialization code).
Also make sure we only call this operator delete along the path
where we actually allocated something.
Fixes rdar://problem/8439196.
llvm-svn: 114145
slot. The easiest way to do that was to bundle up the information
we care about for aggregate slots into a new structure which demands
that its creators at least consider the question.
I could probably be convinced that the ObjC 'needs GC' bit should
be rolled into this structure.
Implement generalized copy elision. The main obstacle here is that
IR-generation must be much more careful about making sure that exactly
llvm-svn: 113962
implement ARM array cookies. Also fix a few unfortunate bugs:
- throwing dtors in deletes prevented the allocation from being deleted
- adding the cookie to the new[] size was not being considered for
overflow (and, more seriously, was screwing up the earlier checks)
- deleting an array via a pointer to array of class type was not
causing any destructors to be run and was passing the unadjusted
pointer to the deallocator
- lots of address-space problems, in case anyone wants to support
free store in a variant address space :)
llvm-svn: 112814
update callers as best I can.
- This is a work in progress, our alignment handling is very horrible / sketchy -- I am just aiming for monotonic improvement.
- Serious review appreciated.
llvm-svn: 111707
This takes some trickery since CastExpr has subclasses (and indeed,
is abstract).
Also, smoosh the CastKind into the bitfield from Expr.
Drops two words of storage from Expr in the common case of expressions
which don't need inheritance paths. Avoids a separate allocation and
another word of overhead in cases needing inheritance paths. Also has
the advantage of not leaking memory, since destructors for AST nodes are
never run.
llvm-svn: 110507
enclosing normal cleanup, not the top of the EH stack. I'm *really*
surprised this hasn't been causing more problems.
Fixes rdar://problem/8231514.
llvm-svn: 109569
initializer of (). Make sure to use a simple memset() when we can, or
fall back to generating a loop when a simple memset will not
suffice. Fixes <rdar://problem/8212208>, a regression due to my work
in r107857.
llvm-svn: 108977
which generates more efficient and more obviously conformant
code. We now test for overflow of the multiply then force
the result to -1 if so. On X86, this generates nice code
like this:
__Z4testl: ## @_Z4testl
## BB#0: ## %entry
subl $12, %esp
movl $4, %eax
mull 16(%esp)
testl %edx, %edx
movl $-1, %ecx
cmovel %eax, %ecx
movl %ecx, (%esp)
call __Znam
addl $12, %esp
ret
llvm-svn: 108927
causing clang to compile this code into something that correctly throws a
length error, fixing a potential integer overflow security attack:
void *test(long N) {
return new int[N];
}
int main() {
test(1L << 62);
}
We do this even when exceptions are disabled, because it is better for the
code to abort than for the attack to succeed.
This is heavily based on a patch that Fariborz wrote.
llvm-svn: 108915
mostly in avoiding unnecessary work at compile time but also in producing more
sensible block orderings.
Move the destructor cleanups for local variables over to use lazy cleanups.
Eventually all cleanups will do this; for now we have some awkward code
duplication.
Tell IR generation just to never produce landing pads in -fno-exceptions.
This is a much more comprehensive solution to a problem which previously was
half-solved by checks in most cleanup-generation spots.
llvm-svn: 108270
emit metadata associating allocas and global values with a Decl*. This feature
is controlled by an option that (intentionally) cannot be enabled on the command
line.
To use this feature, simply set
CodeGenOptions.EmitDeclMetadata = true;
and then interpret the completely underspecified metadata. :)
llvm-svn: 107739
self-host. Hopefully these results hold up on different platforms.
I tried to keep the GNU ObjC runtime happy, but it's hard for me to test.
Reimplement how clang generates IR for exceptions. Instead of creating new
invoke destinations which sequentially chain to the previous destination,
push a more semantic representation of *why* we need the cleanup/catch/filter
behavior, then collect that information into a single landing pad upon request.
Also reorganizes how normal cleanups (i.e. cleanups triggered by non-exceptional
control flow) are generated, since it's actually fairly closely tied in with
the former. Remove the need to track which cleanup scope a block is associated
with.
Document a lot of previously poorly-understood (by me, at least) behavior.
The new framework implements the Horrible Hack (tm), which requires every
landing pad to have a catch-all so that inlining will work. Clang no longer
requires the Horrible Hack just to make exceptions flow correctly within
a function, however. The HH is an unfortunate requirement of LLVM's EH IR.
llvm-svn: 107631
alloca for an argument. Make sure the argument gets the proper
decl alignment, which may be different than the type alignment.
This fixes PR7567
llvm-svn: 107627
have CGF create and make accessible standard int32,int64 and
intptr types. This fixes a ton of 80 column violations
introduced by LLVMContextification and cleans up stuff a lot.
llvm-svn: 106977
load/store nonsense in the epilog. For example, for:
int foo(int X) {
int A[100];
return A[X];
}
we used to generate:
%arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
%tmp1 = load i32* %arrayidx ; <i32> [#uses=1]
store i32 %tmp1, i32* %retval
%0 = load i32* %retval ; <i32> [#uses=1]
ret i32 %0
}
which codegen'd to this code:
_foo: ## @foo
## BB#0: ## %entry
subq $408, %rsp ## imm = 0x198
movl %edi, 400(%rsp)
movl 400(%rsp), %edi
movslq %edi, %rax
movl (%rsp,%rax,4), %edi
movl %edi, 404(%rsp)
movl 404(%rsp), %eax
addq $408, %rsp ## imm = 0x198
ret
Now we generate:
%arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
%tmp1 = load i32* %arrayidx ; <i32> [#uses=1]
ret i32 %tmp1
}
and:
_foo: ## @foo
## BB#0: ## %entry
subq $408, %rsp ## imm = 0x198
movl %edi, 404(%rsp)
movl 404(%rsp), %edi
movslq %edi, %rax
movl (%rsp,%rax,4), %eax
addq $408, %rsp ## imm = 0x198
ret
This actually does matter, cutting out 2000 lines of IR from CGStmt.ll
for example.
Another interesting effect is that altivec.h functions which are dead
now get dce'd by the inliner. Hence all the changes to
builtins-ppc-altivec.c to ensure the calls aren't dead.
llvm-svn: 106970
'self' variable arising from uses of the 'super' keyword. Also reorganize
some code so that BlockInfo (now CGBlockInfo) can be opaque outside of
CGBlocks.cpp.
Fixes rdar://problem/8010633.
llvm-svn: 104312
__cxa_guard_abort along the exceptional edge into (in effect) a nested
"try" that rethrows after aborting. Fixes PR7144 and the remaining
Boost.ProgramOptions failures, along with the regressions that r103880
caused.
The crucial difference between this and r103880 is that we now follow
LLVM's little dance with the llvm.eh.exception and llvm.eh.selector
calls, then use _Unwind_Resume_or_Rethrow to rethrow.
llvm-svn: 103892
__cxa_guard_abort along the exceptional edge into (in effect) a nested
"try" that rethrows after aborting. Fixes PR7144 and the remaining
Boost.ProgramOptions failures.
llvm-svn: 103880
implicitly-generated copy constructor. Previously, Sema would perform
some checking and instantiation to determine which copy constructors,
etc., would be called, then CodeGen would attempt to figure out which
copy constructor to call... but would get it wrong, or poke at an
uninstantiated default argument, or fail in other ways.
The new scheme is similar to what we now do for the implicit
copy-assignment operator, where Sema performs all of the semantic
analysis and builds specific ASTs that look similar to the ASTs we'd
get from explicitly writing the copy constructor, so that CodeGen need
only do a direct translation.
However, it's not quite that simple because one cannot explicit write
elementwise copy-construction of an array. So, I've extended
CXXBaseOrMemberInitializer to contain a list of indexing variables
used to copy-construct the elements. For example, if we have:
struct A { A(const A&); };
struct B {
A array[2][3];
};
then we generate an implicit copy assignment operator for B that looks
something like this:
B::B(const B &other) : array[i0][i1](other.array[i0][i1]) { }
CodeGen will loop over the invented variables i0 and i1 to visit all
elements in the array, so that each element in the destination array
will be copy-constructed from the corresponding element in the source
array. Of course, if we're dealing with arrays of scalars or class
types with trivial copy-assignment operators, we just generate a
memcpy rather than a loop.
Fixes PR6928, PR5989, and PR6887. Boost.Regex now compiles and passes
all of its regression tests.
Conspicuously missing from this patch is handling for the exceptional
case, where we need to destruct those objects that we have
constructed. I'll address that case separately.
llvm-svn: 103079
assignment operators.
Previously, Sema provided type-checking and template instantiation for
copy assignment operators, then CodeGen would synthesize the actual
body of the copy constructor. Unfortunately, the two were not in sync,
and CodeGen might pick a copy-assignment operator that is different
from what Sema chose, leading to strange failures, e.g., link-time
failures when CodeGen called a copy-assignment operator that was not
instantiation, run-time failures when copy-assignment operators were
overloaded for const/non-const references and the wrong one was
picked, and run-time failures when by-value copy-assignment operators
did not have their arguments properly copy-initialized.
This implementation synthesizes the implicitly-defined copy assignment
operator bodies in Sema, so that the resulting ASTs encode exactly
what CodeGen needs to do; there is no longer any special code in
CodeGen to synthesize copy-assignment operators. The synthesis of the
body is relatively simple, and we generate one of three different
kinds of copy statements for each base or member:
- For a class subobject, call the appropriate copy-assignment
operator, after overload resolution has determined what that is.
- For an array of scalar types or an array of class types that have
trivial copy assignment operators, construct a call to
__builtin_memcpy.
- For an array of class types with non-trivial copy assignment
operators, synthesize a (possibly nested!) for loop whose inner
statement calls the copy constructor.
- For a scalar type, use built-in assignment.
This patch fixes at least a few tests cases in Boost.Spirit that were
failing because CodeGen picked the wrong copy-assignment operator
(leading to link-time failures), and I suspect a number of undiagnosed
problems will also go away with this change.
Some of the diagnostics we had previously have gotten worse with this
change, since we're going through generic code for our
type-checking. I will improve this in a subsequent patch.
llvm-svn: 102853
in a throw expression. Use EmitAnyExprToMem to emit the throw expression,
which magically elides the final copy-constructor call (which raises a new
strict-compliance bug, but baby steps). Give __cxa_throw a destructor pointer
if the exception type has a non-trivial destructor.
llvm-svn: 102039
of the block descriptor field. This field is the ObjC style @encode
signature of the implementation function, and was to this point
conditionally provided in the block literal data structure. That
provisional support is removed.
Additionally, eliminate unused enumerations for the block literal flags field.
The first shipping ABI unconditionally set (1<<29) but this bit is unused
by the runtime, so the second ABI will unconditionally have (1<<30) set so
that the runtime can in fact distinguish whether the additional data is
present or not.
llvm-svn: 96989