when an initializer is variable (I handled the constant case in a previous
patch). This has three pieces:
1. Enhance AggValueSlot to have a 'isZeroed' bit to tell CGExprAgg that
the memory being stored into has previously been memset to zero.
2. Teach CGExprAgg to not emit stores of zero to isZeroed memory.
3. Teach CodeGenFunction::EmitAggExpr to scan initializers to determine
whether they are profitable to emit a memset + inividual stores vs
stores for everything.
The heuristic used is that a global has to be more than 16 bytes and
has to be 3/4 zero to be candidate for this xform. The two testcases
are illustrative of the scenarios this catches. We now codegen test9 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 400, i32 4, i1 false)
%.array = getelementptr inbounds [100 x i32]* %Arr, i32 0, i32 0
%tmp = load i32* %X.addr, align 4
store i32 %tmp, i32* %.array
and test10 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 392, i32 8, i1 false)
%tmp = getelementptr inbounds %struct.b* %S, i32 0, i32 0
%tmp1 = getelementptr inbounds %struct.a* %tmp, i32 0, i32 0
%tmp2 = load i32* %X.addr, align 4
store i32 %tmp2, i32* %tmp1, align 4
%tmp5 = getelementptr inbounds %struct.b* %S, i32 0, i32 3
%tmp10 = getelementptr inbounds %struct.a* %tmp5, i32 0, i32 4
%tmp11 = load i32* %X.addr, align 4
store i32 %tmp11, i32* %tmp10, align 4
Previously we produced 99 stores of zero for test9 and also tons for test10.
This xforms should substantially speed up -O0 builds when it kicks in as well
as reducing code size and optimizer heartburn on insane cases. This resolves
PR279.
llvm-svn: 120692
a global is larger than 32 bytes and has fewer than 6 non-zero values in the
initializer. Previously we'd turn something like this:
char test8(int X) {
char str[10000] = "abc";
into a 10K global variable which we then memcpy'd from. Now we generate:
%str = alloca [10000 x i8], align 16
%tmp = getelementptr inbounds [10000 x i8]* %str, i64 0, i64 0
call void @llvm.memset.p0i8.i64(i8* %tmp, i8 0, i64 10000, i32 16, i1 false)
store i8 97, i8* %tmp, align 16
%0 = getelementptr [10000 x i8]* %str, i64 0, i64 1
store i8 98, i8* %0, align 1
%1 = getelementptr [10000 x i8]* %str, i64 0, i64 2
store i8 99, i8* %1, align 2
Which is much smaller in space and also likely faster.
This is part of PR279
llvm-svn: 120645
I mistakenly thought that this was checking for vector name mangling, but
it is not. Since we're no longer wrapping Neon vectors in structs, this
test can just return a vector directly. There are already other tests for
that, so just to make this interesting, change the test to return a struct
of two vectors.
llvm-svn: 119434
assignment to volatiles in C. This in effect reverts some of mjs's
work in and around r72572. Basically, the C++ standard is quite
clear, except that it lies about volatile behavior approximating
C's, whereas the C standard is almost actively misleading.
llvm-svn: 119344
Return the result of a complex assignment with the original values,
not by performing a load from the l-value; this is the correct
semantics in C, although not in C++.
llvm-svn: 119037
- tags with C linkage should ignore visibility=hidden
- functions and variables with explicit visibility attributes should
ignore the linkage of their types
Either of these should be sufficient to fix PR8457.
Also, FileCheck-ize a test case.
llvm-svn: 117351
completes support for C1X anonymous struct/union init features:
* Indexed anonymous member initializers should not be expanded. Doing so makes
little sense and would cause unresolvable semantic ambiguity in valid code
(regression introduced by r69153).
* Subobject initialization of (possibly nested) anonymous members are now
referred to with paths relative to the naming record context, eliminating the
synthesis of incorrect implicit InitListExprs that caused CodeGen to assert.
* Field lookup was missing a null check in IdentifierInfo comparison which
caused lookup for a known (already resolved) field to match the first unnamed
data member it encountered leading to silent miscompilation.
* Subobject paths are no longer built using the general purpose
Sema::BuildAnonymousStructUnionMemberPath(). If any corner cases crop up, we
will now assert earlier in Sema instead of passing invalid InitListExprs
through to CodeGen.
Fixes PR6955, from Alp Toker!
llvm-svn: 116098