Currently ASan instrumentation pass creates a string with global name
for each instrumented global (to include global names in the error report). Global
name is already mangled at this point, and we may not be able to demangle it
at runtime (e.g. there is no __cxa_demangle on Android).
Instead, create a string with fully qualified global name in Clang, and pass it
to ASan instrumentation pass in llvm.asan.globals metadata. If there is no metadata
for some global, ASan will use the original algorithm.
This fixes https://code.google.com/p/address-sanitizer/issues/detail?id=264.
llvm-svn: 212872
See https://code.google.com/p/address-sanitizer/issues/detail?id=299 for the
original feature request.
Introduce llvm.asan.globals metadata, which Clang (or any other frontend)
may use to report extra information about global variables to ASan
instrumentation pass in the backend. This metadata replaces
llvm.asan.dynamically_initialized_globals that was used to detect init-order
bugs. llvm.asan.globals contains the following data for each global:
1) source location (file/line/column info);
2) whether it is dynamically initialized;
3) whether it is blacklisted (shouldn't be instrumented).
Source location data is then emitted in the binary and can be picked up
by ASan runtime in case it needs to print error report involving some global.
For example:
0x... is located 4 bytes to the right of global variable 'C::array' defined in '/path/to/file:17:8' (0x...) of size 40
These source locations are printed even if the binary doesn't have any
debug info.
This is an ABI-breaking change. ASan initialization is renamed to
__asan_init_v4(). Pre-built libraries compiled with older Clang will not work
with the fresh runtime.
llvm-svn: 212188
Init-order and use-after-return modes can currently be enabled
by runtime flags. use-after-scope mode is not really working at the
moment.
The only problem I see is that users won't be able to disable extra
instrumentation for init-order and use-after-scope by a top-level Clang flag.
But this instrumentation was implicitly enabled for quite a while and
we didn't hear from users hurt by it.
llvm-svn: 210924
Add driver and frontend support for the GCC -Wframe-larger-than=bytes warning.
This is the first GCC-compatible backend diagnostic built around LLVM's
reporting feature.
This commit adds infrastructure to perform reverse lookup from mangled names
emitted after LLVM IR generation. We use that to resolve precise locations and
originating AST functions, lambdas or block declarations to produce seamless
codegen-guided diagnostics.
An associated change, StringMap now maintains unique mangled name strings
instead of allocating copies. This is a net memory saving in C++ and a small
hit for C where we no longer reuse IdentifierInfo storage, pending further
optimisation.
llvm-svn: 210293
The only remaining user didn't actually use the non-dynamic storage facility
this class provides.
The std::string is transitional and likely to be StringRefized shortly.
llvm-svn: 210058
Summary:
A reference temporary should inherit the linkage of the variable it
initializes. Otherwise, we may hit cases where a reference temporary
wouldn't have the same value in all translation units.
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D3515
llvm-svn: 207451
data members by addition of CXXDefaultInitExpr node to the initializer expression,
it has broken treatment of arc code for such initializations. Reviewed by John McCall.
// rdar://16299964
llvm-svn: 203935
LLVM currently has a hack (shouldEmitUsedDirectiveFor) that causes it to not
print no_dead_strip for symbols starting with 'l' or 'L'. These are exactly the
ones that the clang's objc codegen is producing. The net result, is that it is
equivalent to llvm.compiler.used.
The need for putting the private symbol in llvm.compiler.used should be clear
(the objc runtime uses them). The reason for also putting the weak symbols in
it is for LTO: ld64 will not ask us to preserve the it.
llvm-svn: 203172
When a non-trivial parameter is present, clang now gathers up all the
parameters that lack inreg and puts them into a packed struct. MSVC
always aligns each parameter to 4 bytes and no more, so this is a pretty
simple struct to lay out.
On win64, non-trivial records are passed indirectly. Prior to this
change, clang was incorrectly using byval on win64.
I'm able to self-host a working clang with this change and additional
LLVM patches.
Reviewers: rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D2636
llvm-svn: 200597
Summary:
MSVC destroys arguments in the callee from left to right. Because C++
objects have to be destroyed in the reverse order of construction, Clang
has to construct arguments from right to left and destroy arguments from
left to right.
This patch fixes the ordering by reversing the order of evaluation of
all call arguments under the MS C++ ABI.
Fixes PR18035.
Reviewers: rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D2275
llvm-svn: 196402
CodeGenABITypes is a wrapper built on top of CodeGenModule that exposes
some of the functionality of CodeGenTypes (held by CodeGenModule),
specifically methods that determine the LLVM types appropriate for
function argument and return values.
I addition to CodeGenABITypes.h, CGFunctionInfo.h is introduced, and the
definitions of ABIArgInfo, RequiredArgs, and CGFunctionInfo are moved
into this new header from the private headers ABIInfo.h and CGCall.h.
Exposing this functionality is one part of making it possible for LLDB
to determine the actual ABI locations of function arguments and return
values, making it possible for it to determine this for any supported
target without hard-coding ABI knowledge in the LLDB code.
llvm-svn: 193717
The key insight here is that weak linkage for a static local variable
should always mean linkonce_odr, because every file that needs it will
generate a definition. We don't actually care about the precise linkage
of the parent context. I feel a bit silly that I didn't realize this before.
llvm-svn: 185381
Blocks, like lambdas, can be written in contexts which are required to be
treated as the same under ODR. Unlike lambdas, it isn't possible to actually
take the address of a block, so the mangling of the block itself doesn't
matter. However, objects like static variables inside a block do need to
be mangled in a consistent way.
There are basically three components here. One, block literals need a
consistent numbering. Two, objects/types inside a block literal need
to be mangled using it. Three, objects/types inside a block literal need
to have their linkage computed correctly.
llvm-svn: 185372
Itanium destroys them in the caller at the end of the full expression,
but MSVC destroys them in the callee. This is further complicated by
the need to emit EH-only destructor cleanups in the caller.
This should help clang compile MSVC's debug iterators more correctly.
There is still an outstanding issue in PR5064 of a memcpy emitted by the
LLVM backend, which is not correct for C++ records.
Fixes PR16226.
Reviewers: rjmccall
Differential Revision: http://llvm-reviews.chandlerc.com/D929
llvm-svn: 184543
Introduce CXXStdInitializerListExpr node, representing the implicit
construction of a std::initializer_list<T> object from its underlying array.
The AST representation of such an expression goes from an InitListExpr with a
flag set, to a CXXStdInitializerListExpr containing a MaterializeTemporaryExpr
containing an InitListExpr (possibly wrapped in a CXXBindTemporaryExpr).
This more detailed representation has several advantages, the most important of
which is that the new MaterializeTemporaryExpr allows us to directly model
lifetime extension of the underlying temporary array. Using that, this patch
*drastically* simplifies the IR generation of this construct, provides IR
generation support for nested global initializer_list objects, fixes several
bugs where the destructors for the underlying array would accidentally not get
invoked, and provides constant expression evaluation support for
std::initializer_list objects.
llvm-svn: 183872
were lacking ExprWithCleanups nodes in some cases where the new approach to
lifetime extension needed them).
Original commit message:
Rework IR emission for lifetime-extended temporaries. Instead of trying to walk
into the expression and dig out a single lifetime-extended entity and manually
pull its cleanup outside the expression, instead keep a list of the cleanups
which we'll need to emit when we get to the end of the full-expression. Also
emit those cleanups early, as EH-only cleanups, to cover the case that the
full-expression does not terminate normally. This allows IR generation to
properly model temporary lifetime when multiple temporaries are extended by the
same declaration.
We have a pre-existing bug where an exception thrown from a temporary's
destructor does not clean up lifetime-extended temporaries created in the same
expression and extended to automatic storage duration; that is not fixed by
this patch.
llvm-svn: 183859
into the expression and dig out a single lifetime-extended entity and manually
pull its cleanup outside the expression, instead keep a list of the cleanups
which we'll need to emit when we get to the end of the full-expression. Also
emit those cleanups early, as EH-only cleanups, to cover the case that the
full-expression does not terminate normally. This allows IR generation to
properly model temporary lifetime when multiple temporaries are extended by the
same declaration.
We have a pre-existing bug where an exception thrown from a temporary's
destructor does not clean up lifetime-extended temporaries created in the same
expression and extended to automatic storage duration; that is not fixed by
this patch.
llvm-svn: 183721
This resolves the last of the PR14606 failures in the GDB 7.5 test
suite. (but there are still unresolved issues in the imported_decl case
- we need to implement optional/lazy decls for functions & variables
like we already do for types)
llvm-svn: 182329
This reverts commit r181947 (git d2990ce56a16050cac0d7937ec9919ff54c6df62 )
This addresses one of the two issues identified in r181947, ensuring
that types imported via using declarations only result in a declaration
being emitted for the type, not a definition. The second issue (emitting
using declarations that are unused) is hopefully an acceptable increase
as the real fix for this would be a bit difficult (probably at best we
could record which using directives were involved in lookups - but may
not have been the result of the lookup).
This also ensures that DW_TAG_imported_declarations (& directives) are
not emitted in line-tables-only mode as well as ensuring that typedefs
only require/emit declarations (rather than definitions) for referenced
types.
llvm-svn: 182231
This reverts commit r181393 (git 3923d6a87fe7b2c91cc4a7dbd90c4ec7e2316bcd).
This seems to be emitting too much extra debug info for two (known)
reasons:
* full class definitions are emitted when only declarations are expected
* unused using declarations still produce DW_TAG_imported_declarations
llvm-svn: 181947
Basic support is implemented here - it still doesn't account for
declared-but-not-defined variables or functions. It cannot handle out of
order (declared, 'using', then defined) cases for variables, but can
handle that for functions (& can handle declared, 'using'd, and not
defined at all cases for types).
llvm-svn: 181393
Add CapturedDecl to be the DeclContext for CapturedStmt, and perform semantic
analysis. Currently captures all variables by reference.
TODO: templates
Author: Ben Langmuir <ben.langmuir@intel.com>
Differential Revision: http://llvm-reviews.chandlerc.com/D433
llvm-svn: 179618
http://lab.llvm.org:8011/builders/clang-x86_64-darwin10-gdb went back green
before it processed the reverted 178663, so it could not have been the culprit.
Revert "Revert 178663."
This reverts commit 4f8a3eb2ce5d4ba422483439e20c8cbb4d953a41.
llvm-svn: 178682
For variables and functions clang used to store two storage classes. The one
"as written" in the code and a patched one, which, for example, propagates
static to the following decls.
This apparently is from the days clang lacked linkage computation. It is now
redundant and this patch removes it.
llvm-svn: 178663
* Store the .block_descriptor (instead of self) in the alloca so we
can guarantee that all captured variables are available at -O0.
* Add the missing OpDeref for the alloca.
rdar://problem/12767564
llvm-svn: 178361
aggregate types in a profoundly wrong way that has to be
worked around in every call site, to getEvaluationKind,
which classifies and distinguishes between all of these
cases.
Also, normalize the API for loading and storing complexes.
I'm working on a larger patch and wanted to pull these
changes out, but it would have be annoying to detangle
them from each other.
llvm-svn: 176656
Introduce a new AST Decl node "EmptyDecl" to model empty-declaration. Have attributes from attribute-declaration appertain
to the EmptyDecl node by creating the AST representations of these attributes and attach them to the EmptyDecl node so these
attributes can be sema checked just as attributes attached to "normal" declarations.
llvm-svn: 175900
arguments in function prologue is done
with objc_StoreStrong to pair it with
similar objc_StoreStrong for release in function
epilogue. This is done with -O0 only.
// rdar://13145317
llvm-svn: 175698
/// \param array - a value of type elementType*
/// \param destructionKind - the kind of destruction required
/// \param initializedElementCount - a value of type size_t* holding the number of successfully-constructed elements
llvm-svn: 171013
When we are visiting the extern declaration of 'i' in
static int i = 99;
int foo() {
extern int i;
return i;
}
We should not try to handle it as if it was an function static. That is, we
must consider the written storage class.
Fixing this then exposes that the assert in EmitGlobalVarDeclLValue and the
if leading to its call are not completely accurate. They were passing before
because the second decl was marked as having external storage. I changed them
to check the linkage, which I find easier to understand.
Last but not least, there is something strange going on with cuda and opencl.
My guess is that the linkage computation for these languages needs to be
audited, but I didn't want to change that in this patch so I just updated
the storage classes to keep the current behavior.
Thanks to Reed Kotler for reporting this.
llvm-svn: 170827
uncovered.
This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.
I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.
llvm-svn: 169237
This adds support for the tls_model attribute. This allows the user to
choose a TLS model that is better than what LLVM would select by
default. For example, a variable might be declared as:
__thread int x __attribute__((tls_model("initial-exec")));
if it will not be used in a shared library that is dlopen'ed.
This depends on LLVM r159077.
llvm-svn: 159078
initializer need be null initialized before initializer takes
hold, just like any other initialized retainable object pointer.
// rdar://11016025
llvm-svn: 158738
It reduces the amount of emitted debug information:
1) DIEs in .debug_info have types DW_TAG_compile_unit, DW_TAG_subprogram,
DW_TAG_inlined_subroutine (for opt builds) and DW_TAG_lexical_block only.
2) .debug_str contains only function names.
3) No debug data for types/namespaces/variables is emitted.
4) The data in .debug_line is enough to produce valid stack traces with
function names and line numbers.
Reviewed by Eric Christopher.
llvm-svn: 156160
jump into these scopes, and the cleanup-entering code sometimes wants
to do some operations first (e.g. a GEP), which can leave us with
unparented IR.
llvm-svn: 154684
the function body, but do so in a way that doesn't make any assumptions
about the static local actually having a proper, unique mangling,
since apparently we don't do that correctly at all.
llvm-svn: 153776
These patches cause us to miscompile and/or reject code with static
function-local variables in an extern-C context. Previously, we were
papering over this as long as the variables are within the same
translation unit, and had not seen any failures in the wild. We still
need a proper fix, which involves mangling static locals inside of an
extern-C block (as GCC already does), but this patch causes pretty
widespread regressions. Firefox, and many other applications no longer
build.
Lots of test cases have been posted to the list in response to this
commit, so there should be no problem reproducing the issues.
llvm-svn: 153768
the case that the variable already exists. Partly this is just
protection against people making crazy declarations with custom
asm labels or extern "C" names that intentionally collide with
the manglings of such variables, but the main reason is that we
can actually emit a static local variable twice with the
requirement that it match up. There may be other cases with
(e.g.) the various nested functions, but the main exemplar is
with constructor variants, where we can be forced into
double-emitting the function body under certain circumstances
like (currently) the presence of virtual bases.
llvm-svn: 153723
track whether the referenced declaration comes from an enclosing
local context. I'm amenable to suggestions about the exact meaning
of this bit.
llvm-svn: 152491
We now generate temporary arrays to back std::initializer_list objects
initialized with braces. The initializer_list is then made to point at
the array. We support both ptr+size and start+end forms, although
the latter is untested.
Array lifetime is correct for temporary std::initializer_lists (e.g.
call arguments) and local variables. It is untested for new expressions
and member initializers.
Things left to do:
Massively increase the amount of testing. I need to write tests for
start+end init lists, temporary objects created as a side effect of
initializing init list objects, new expressions, member initialization,
creation of temporary objects (e.g. std::vector) for initializer lists,
and probably more.
Get lifetime "right" for member initializers and new expressions. Not
that either are very useful.
Implement list-initialization of array new expressions.
llvm-svn: 150803
optional argument passed through the variadic ellipsis)
potentially affects how we need to lower it. Propagate
this information down to the various getFunctionInfo(...)
overloads on CodeGenTypes. Furthermore, rename those
overloads to clarify their distinct purposes, and make
sure we're calling the right one in the right place.
This has a nice side-effect of making it easier to construct
a function type, since the 'variadic' bit is no longer
separable.
This shouldn't really change anything for our existing
platforms, with one minor exception --- we should now call
variadic ObjC methods with the ... in the "right place"
(see the test case), which I guess matters for anyone
running GNUStep on MIPS. Mostly it's just a substantial
clean-up.
llvm-svn: 150788
constructor, and that constructor is used to initialize an object of static
storage duration such that all members and bases are initialized by constant
expressions, constant initialization is performed. In this case, the object
can still have a non-trivial destructor, and if it does, we must emit a dynamic
initializer which performs no initialization and instead simply registers that
destructor.
llvm-svn: 150419
- Add atomic-to/from-nonatomic cast types
- Emit atomic operations for arithmetic on atomic types
- Emit non-atomic stores for initialisation of atomic types, but atomic stores and loads for every other store / load
- Add a __atomic_init() intrinsic which does a non-atomic store to an _Atomic() type. This is needed for the corresponding C11 stdatomic.h function.
- Enables the relevant __has_feature() checks. The feature isn't 100% complete yet, but it's done enough that we want people testing it.
Still to do:
- Make the arithmetic operations on atomic types (e.g. Atomic(int) foo = 1; foo++;) use the correct LLVM intrinsic if one exists, not a loop with a cmpxchg.
- Add a signal fence builtin
- Properly set the fenv state in atomic operations on floating point values
- Correctly handle things like _Atomic(_Complex double) which are too large for an atomic cmpxchg on some platforms (this requires working out what 'correctly' means in this context)
- Fix the many remaining corner cases
llvm-svn: 148242
APValue::Array and APValue::MemberPointer. All APValue values can now be emitted
as constants.
Add new CGCXXABI entry point for emitting an APValue MemberPointer. The other
entrypoints dealing with constant member pointers are no longer necessary and
will be removed in a later change.
Switch codegen from using EvaluateAsRValue/EvaluateAsLValue to
VarDecl::evaluateValue. This performs caching and deals with the nasty cases in
C++11 where a non-const object's initializer can refer indirectly to
previously-initialized fields within the same object.
Building the intermediate APValue object incurs a measurable performance hit on
pathological testcases with huge initializer lists, so we continue to build IR
directly from the Expr nodes for array and record types outside of C++11.
llvm-svn: 148178
Also temporarily remove the assumption from IR gen that we can emit IR for every
constant we can fold, since it isn't currently true in C++11, to fix PR11676.
Original comment from r147271:
constexpr: perform zero-initialization prior to / instead of performing a
constructor call when appropriate. Thanks to Eli for spotting this.
llvm-svn: 147384
mutable member and a constant initializer. We'd previously promoted such
variables to global constants, resulting in nasal demons if the mutable member
was modified.
This is only a temporary fix. The subtle interplay between isConstantInitializer
and CGExprConstant is very bug-prone; there are some other issues in this area
which I will be addressing in subsequent, more major reworking of this code.
llvm-svn: 145654
full-expression. Naturally they're inactive before we enter
the block literal expression. This restores the intended
behavior that blocks belong to their enclosing scope.
There's a useful -O0 / compile-time optimization that we're
missing here with activating cleanups following straight-line
code from their inactive beginnings.
llvm-svn: 144268
language options. Use that .def file to declare the LangOptions class
and initialize all of its members, eliminating a source of annoying
initialization bugs.
AST serialization changes are next up.
llvm-svn: 139605
emit call results into potentially aliased slots. This allows us
to properly mark indirect return slots as noalias, at the cost
of requiring an extra memcpy when assigning an aggregate call
result into a l-value. It also brings us into compliance with
the x86-64 ABI.
llvm-svn: 138599
Example:
template <class T>
class A {
public:
template <class U> void f(U p) { }
template <> void f(int p) { } // <== class scope specialization
};
This extension is necessary to parse MSVC standard C++ headers, MFC and ATL code.
BTW, with this feature in, clang can parse (-fsyntax-only) all the MSVC 2010 standard header files without any error.
llvm-svn: 137573
__block variables where the act of initialization/assignment
itself causes the __block variable to be copied to the heap
because the variable is of block type and is being assigned
a block literal which captures the variable.
rdar://problem/9814099
llvm-svn: 136337
- an off-by-one error in emission of irregular array limits for
InitListExprs
- use an EH partial-destruction cleanup within the normal
array-destruction cleanup
- get the branch destinations right for the empty check
Also some refactoring which unfortunately obscures these changes.
llvm-svn: 134890
expecting so much concentrated oddity on what seemed like a
trivial feature. Thanks to François Pichet for doing the
MSVC legwork here.
llvm-svn: 134813
- Emit default-initialization of arrays that were partially initialized
with initializer lists with a loop, rather than emitting the default
initializer N times;
- support destroying VLAs of non-trivial type, although this is not
yet exposed to users; and
- support the partial destruction of arrays initialized with
initializer lists when an initializer throws an exception.
llvm-svn: 134784
trivial default constructors. This generated-code regression was
caused by r131796, which had simplified the handling of default
initialization in Sema. Fixes <rdar://problem/9694300>.
llvm-svn: 134260
they should still be officially __strong for the purposes of errors,
block capture, etc. Make a new bit on variables, isARCPseudoStrong(),
and set this for 'self' and these enumeration-loop variables. Change
the code that was looking for the old patterns to look for this bit,
and change IR generation to find this bit and treat the resulting
variable as __unsafe_unretained for the purposes of init/destroy in
the two places it can come up.
llvm-svn: 133243
Language-design credit goes to a lot of people, but I particularly want
to single out Blaine Garst and Patrick Beard for their contributions.
Compiler implementation credit goes to Argyrios, Doug, Fariborz, and myself,
in no particular order.
llvm-svn: 133103
- llvm.dbg.declare already receives line number information from ParmDecl
- Additional extra stoppoint messes up gdb's understanding of where function body starts.
llvm-svn: 133065
Emit debug info only if there is an insertion point. The debug info should not force an insertion point. Codegen may later on decide to not emit code for some reason, see extensive comment in CodeGenFunction::EmitStmt(), and debug info should not get in the way.
llvm-svn: 132610
add support for the OpenCL __private, __local, __constant and
__global address spaces, as well as the __read_only, _read_write and
__write_only image access specifiers. Patch originally by ARM;
language-specific address space support by myself.
llvm-svn: 127915
invocation function into the debug info. Rather than faking up a class,
which is tricky because of the custom layout we do, we just emit a struct
directly from the layout information we've already got.
Also, don't emit an unnecessarily parameter alloca for this "variable".
llvm-svn: 126255
_Block_object_* flags; it's just BLOCK_HAS_COPY_DISPOSE or not.
Also, we don't need to chase forwarding pointers prior to calling
_Block_object_dispose; _Block_object_dispose in fact already does
this.
rdar://problem/9006315
llvm-svn: 125823
bugs from other clients that don't expect to see a LabelDecl in a DeclStmt,
but if so they should be easy to fix.
This implements most of PR3429 and rdar://8287027
llvm-svn: 125817
LabelDecl and LabelStmt. There is a 1-1 correspondence between the
two, but this simplifies a bunch of code by itself. This is because
labels are the only place where we previously had references to random
other statements, causing grief for AST serialization and other stuff.
This does cause one regression (attr(unused) doesn't silence unused
label warnings) which I'll address next.
This does fix some minor bugs:
1. "The only valid attribute " diagnostic was capitalized.
2. Various diagnostics printed as ''labelname'' instead of 'labelname'
3. This reduces duplication of label checking between functions and blocks.
Review appreciated, particularly for the cindex and template bits.
llvm-svn: 125733
- Have CGM precompute a number of commonly-used types
- Have CGF copy that during initialization instead of recomputing them
- Use TBAA info when initializing a parameter variable
- Refactor the scalar ++/-- code
llvm-svn: 125562
- BlockDeclRefExprs always store VarDecls
- BDREs no longer store copy expressions
- BlockDecls now store a list of captured variables, information about
how they're captured, and a copy expression if necessary
With that in hand, change IR generation to use the captures data in
blocks instead of walking the block independently.
Additionally, optimize block layout by emitting fields in descending
alignment order, with a heuristic for filling in words when alignment
of the end of the block header is insufficient for the most aligned
field.
llvm-svn: 125005
fixing a crash which probably nobody was ever going to see. In doing so,
fix a horrendous number of problems with the conditional-cleanups code.
Also, make conditional cleanups re-use the cleanup's activation variable,
which avoids some unfortunate repetitiveness.
llvm-svn: 124481
process, perform a number of refactorings:
- Move MiscNameMangler member functions to MangleContext
- Remove GlobalDecl dependency from MangleContext
- Make MangleContext abstract and move Itanium/Microsoft functionality
to their own classes/files
- Implement ASTContext::createMangleContext and have CodeGen use it
No (intended) functionality change.
llvm-svn: 123386
in asm statements:
register int foo asm("rdi");
asm("..." : ... "r" (foo) ...
We also only accept these variables if the constraint in the asm statement is "r".
This fixes most of PR3933.
llvm-svn: 122643
when an initializer is variable (I handled the constant case in a previous
patch). This has three pieces:
1. Enhance AggValueSlot to have a 'isZeroed' bit to tell CGExprAgg that
the memory being stored into has previously been memset to zero.
2. Teach CGExprAgg to not emit stores of zero to isZeroed memory.
3. Teach CodeGenFunction::EmitAggExpr to scan initializers to determine
whether they are profitable to emit a memset + inividual stores vs
stores for everything.
The heuristic used is that a global has to be more than 16 bytes and
has to be 3/4 zero to be candidate for this xform. The two testcases
are illustrative of the scenarios this catches. We now codegen test9 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 400, i32 4, i1 false)
%.array = getelementptr inbounds [100 x i32]* %Arr, i32 0, i32 0
%tmp = load i32* %X.addr, align 4
store i32 %tmp, i32* %.array
and test10 into:
call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 392, i32 8, i1 false)
%tmp = getelementptr inbounds %struct.b* %S, i32 0, i32 0
%tmp1 = getelementptr inbounds %struct.a* %tmp, i32 0, i32 0
%tmp2 = load i32* %X.addr, align 4
store i32 %tmp2, i32* %tmp1, align 4
%tmp5 = getelementptr inbounds %struct.b* %S, i32 0, i32 3
%tmp10 = getelementptr inbounds %struct.a* %tmp5, i32 0, i32 4
%tmp11 = load i32* %X.addr, align 4
store i32 %tmp11, i32* %tmp10, align 4
Previously we produced 99 stores of zero for test9 and also tons for test10.
This xforms should substantially speed up -O0 builds when it kicks in as well
as reducing code size and optimizer heartburn on insane cases. This resolves
PR279.
llvm-svn: 120692
a global is larger than 32 bytes and has fewer than 6 non-zero values in the
initializer. Previously we'd turn something like this:
char test8(int X) {
char str[10000] = "abc";
into a 10K global variable which we then memcpy'd from. Now we generate:
%str = alloca [10000 x i8], align 16
%tmp = getelementptr inbounds [10000 x i8]* %str, i64 0, i64 0
call void @llvm.memset.p0i8.i64(i8* %tmp, i8 0, i64 10000, i32 16, i1 false)
store i8 97, i8* %tmp, align 16
%0 = getelementptr [10000 x i8]* %str, i64 0, i64 1
store i8 98, i8* %0, align 1
%1 = getelementptr [10000 x i8]* %str, i64 0, i64 2
store i8 99, i8* %1, align 2
Which is much smaller in space and also likely faster.
This is part of PR279
llvm-svn: 120645
data members by delaying the emission of the initializer until after
linkage and visibility have been set on the global. Also, don't
emit a guard unless the variable actually ends up with vague linkage,
and don't use thread-safe statics in any case.
llvm-svn: 118336