'create' functions conventionally return a pointer, not a reference.
Also use an OwningPtr to get replace the delete of a reference member.
No functional change.
llvm-svn: 198126
DiagIDs are a cached resource generally only constructed from compile-time
constant or stable format strings.
Escaping arbitrary messages and constructing DiagIDs from them didn't make
sense.
llvm-svn: 197856
Now CodeGenVTables has only one VTableContext object, which is either
Itanium or Microsoft.
Fixes a FIXME with no functionality change intended.
Ideally we could avoid the downcasts by pushing the things that
reference the Itanium vtable context into ItaniumCXXABI.cpp, but we're
not there yet.
llvm-svn: 197845
These members will still be lazily added to the relevant DWARF DIEs in
LLVM but when enumerating the members they will not appear. This allows
DWARF type units to be more consistent - the type unit will never
contain these special members (so all instances of the type should have
the same DIEs without some having some special members and others having
others) and the special members will be added to the skeletal
declaration that appears in the relevant compile_unit.
llvm-svn: 197844
This matches llc's behavior.
Before this patch clang would create a TargetInfo base on -triple but a llvm
CodeGen based on the triple in the module.
llvm-svn: 197837
Unlike Itanium's VTTs, the 'most derived' boolean or bitfield is the
last parameter for non-variadic constructors, rather than the second.
For variadic constructors, the 'most derived' parameter comes after the
'this' parameter. This affects constructor calls and constructor decls
in a variety of places.
Reviewers: timurrrr
Differential Revision: http://llvm-reviews.chandlerc.com/D2405
llvm-svn: 197518
clang still doesn't emit the right llvm code when initializing multi-D arrays it seems.
For e.g. the following code would still crash for me on Windows 7, 64 bit:
auto f4 = new int[100][200][300]{{{1,2,3}, {4, 5, 6}}, {{10, 20, 30}}};
It seems that the final new loop that iterates through each outermost array and memsets it to zero gets confused with its final ptr arithmetic.
This patch ensures that it converts the pointer to the allocated type (int [200][300]) before incrementing it (instead of using the base type: 'int').
Richard somewhat squeamishly approved the patch (as a quick fix to potentially make it into 3.4) - while exhorting for a more optimized fix in the future. http://llvm-reviews.chandlerc.com/D2398
Thanks Richard!
llvm-svn: 197294
There's nothing special about type traits accepting two arguments.
This commit eliminates BinaryTypeTraitExpr and switches all related handling
over to TypeTraitExpr.
Also fixes a CodeGen failure with variadic type traits appearing in a
non-constant expression.
The BTT/TT prefix and evaluation code is retained as-is for now but will soon
be further cleaned up.
This is part of the ongoing work to unify type traits.
llvm-svn: 197273
point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.
llvm-svn: 197069
We were mistakengly giving linkonce_odr linkage instead of internal
linkage to the deleting and complete destructor thunks for classes in
anonymous namespaces.
Fixes PR17273.
llvm-svn: 197060
With the introduction of explicit address space casts into LLVM, there's
a need to provide a new cast kind the front-end can create for C/OpenCL/CUDA
and code to produce address space casts from those kinds when appropriate.
Patch by Michele Scandale!
llvm-svn: 197036
list, each element of the initializer list may provide more than one of the
base elements of the array. Be sure to initialize the right type and bump the
array pointer by the right amount.
llvm-svn: 196995
Thread an optional GV down to EmitGlobalFunctionDefinition so that it can
avoid the lookup when we already know the corresponding llvm global value.
llvm-svn: 196789
With this patch we output the in the order
C2
C1
D2
D1
D0
Which means that a destructor or constructor that call another is output after
the callee. This is a bit easier to read IHMO and a tiny bit more efficient
as we don't put a decl in DeferredDeclsToEmit.
llvm-svn: 196784
Testing has revealed that large integral constants (i.e. > INT64_MAX)
are always mangled as-if they are negative, even in places where it
would not make sense for them to be negative (like non-type template
parameters of type unsigned long long).
To address this, we change the way we model number mangling: always
mangle as-if our number is an int64_t. This should result in correct
results when we have large unsigned numbers.
N.B. Bizarrely, things that are 32-bit displacements like vbptr offsets
are mangled as-if they are unsigned 32-bit numbers. This is a pretty
egregious waste of space, it would be a 4x savings if we could mangle it
like a signed 32-bit number. Instead, we explicitly cast these
displacements to uint32_t and let the mangler proceed.
llvm-svn: 196771
Before this patch GetOrCreateLLVMFunction would add a decl to
DeferredDeclsToEmit even when it was being called by the function trying to
emit that decl.
llvm-svn: 196753
This can happen when we're trying to emit a thunk with available_externally
linkage with optimization enabled but bail because it doesn't make sense
for vararg functions.
PR18098.
llvm-svn: 196658
Summary:
In general, this type node can be used to represent any type adjustment
that occurs implicitly without losing type sugar. The immediate use of
this is to adjust the calling conventions of member function pointer
types without breaking template instantiation.
Fixes PR17996.
Reviewers: rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D2332
llvm-svn: 196451
The CodeGenOptions are not used for ABI type selection, so we will just
create one with the default constructor (there is a FloatABI option in
CodeGenOptions that is passed on to LLVM, but not used in Clang for LLVM
IR type generation).
We can use the DiagnosticsEngine on the ASTContext rather than making a
client pass one in explicitly.
llvm-svn: 196450
Summary:
MSVC destroys arguments in the callee from left to right. Because C++
objects have to be destroyed in the reverse order of construction, Clang
has to construct arguments from right to left and destroy arguments from
left to right.
This patch fixes the ordering by reversing the order of evaluation of
all call arguments under the MS C++ ABI.
Fixes PR18035.
Reviewers: rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D2275
llvm-svn: 196402
I'd misunderstood getIndirect() to mean that the argument should be passed
as a pointer at the ABI level, with the ByVal argument choosing caller-copy
semantics over no-caller-copy (callee-copy-on-write) semantics. But
getIndirect(x) actually means that x is passed by pointer at the IR
level but (at least on all other targets I looked at) directly at the
ABI level. getIndirect(x, false) selects a pointer to a caller-made
copy, which is what SystemZ was aiming for.
This fixes a miscompilation of c-index-test. Structure arguments were being
passed by pointer, but no copy was being made, so a write in the callee
stomped over a caller's local variable.
llvm-svn: 196370
This is a duplicate implementation.
E.g. this patch defines:
float64_t vabd_f64(float64_t a, float64_t b)
But there is already a similar intrinsic "vabdd_f64" with the same types.
Also, this intrinsic will be conflicted to the vector type intrinsic as following(Which is implemented by me and will be committed to trunk):
float64x1_t vabd_f64(float64x1_t a, float64x1_t b).
Two functions shouldn't have a same name in arm_neon.h.
According to ARM ACLE document, such vabd_f64 with float64_t is not existing.
So I revert this commit.
llvm-svn: 196205
clang's attribute support is mature by now so replace Sean's warning and email
address with a standard LLVM copyright heading.
Also remove a personal email address and credit docstring from CGObjCGNU that
shouldn't have been there.
This is in line with the LLVM developer policy introduced in r45410.
Contributors can add themselves to CREDITS.txt while active module owners are
listed in CODE_OWNERS.TXT.
llvm-svn: 195587
available always-inline functions. This breaks libc++'s locale
implementation. Code generation for this case should be fixed, but this
is a stop gap fix for clang 3.4.
llvm-svn: 195501
Not long ago I made the CodeGen of for loops simplify the condition at
-O0 in the same way we do for if and conditionals. Unfortunately this
ties how loops and simple conditions work together too tightly, which
makes features such as instrumentation based PGO awkward.
Ultimately, we should find a more general way to simplify the logic in
a given condition, but for now we'll just avoid using EmitBranchOnBool
for loops, like we already do for while and do loops.
llvm-svn: 195438
In OpenCL a vector of 3 elements, acts like a vector of four elements.
So for a vector of size 3 the '.hi' and '.odd' accessors, would access
the elements {2, 3} and {1, 3} respectively.
However, in EmitStoreThroughExtVectorComponentLValue we are still operating on
a vector of size 3, so we should only access {2} and {1}. We do this by checking
the last element to be accessed, and ignore it if it is out-of-bounds.
EmitLoadOfExtVectorElementLValue doesn't have a similar problem, because it does
a direct shufflevector with undef, so an out-of-bounds access just gives an undef
value.
Patch by Anastasia Stulova!
llvm-svn: 195367
This makes Clang emit a linkonce_odr definition for 'val' in the code below,
to be compatible with MSVC-compiled code:
struct Foo {
static const int val = 1;
};
Differential Revision: http://llvm-reviews.chandlerc.com/D2233
llvm-svn: 195283
Summary:
RTTI is not yet implemented for the Microsoft C++ ABI and isn't expected
soon. We could easily add the mangling, but the error is what prevents
us from silently miscompiling code that expects RTTI.
Instead, add a new mangleTypeName entry point that simply forwards to
mangleName or mangleType to produce a string that isn't part of the ABI.
Itanium can continue to use RTTI names to avoid unecessary test
breakage.
This also seems like the right design. The fact that TBAA names happen
to be RTTI names is now an implementation detail of the mangler, rather
than part of TBAA.
Differential Revision: http://llvm-reviews.chandlerc.com/D2153
llvm-svn: 195168
This adds -freroll-loops (and -fno-reroll-loops in the usual way) to enable
loop rerolling as part of the optimization pass manager. This transformation
can enable vectorization, reduce code size (or both).
Briefly, loop rerolling can transform a loop like this:
for (int i = 0; i < 3200; i += 5) {
a[i] += alpha * b[i];
a[i + 1] += alpha * b[i + 1];
a[i + 2] += alpha * b[i + 2];
a[i + 3] += alpha * b[i + 3];
a[i + 4] += alpha * b[i + 4];
}
into this:
for (int i = 0; i < 3200; ++i) {
a[i] += alpha * b[i];
}
Loop rerolling is currently disabled by default at all optimization levels.
llvm-svn: 194967
Instead of storing the vtable offset directly in the function pointer and
doing a branch to check for virtualness at each call site, the MS ABI
generates a thunk for calling the function at a specific vtable offset,
and puts that in the function pointer.
This patch adds support for emitting such thunks. However, it doesn't support
pointers to virtual member functions that are variadic, have an incomplete
aggregate return type or parameter, or are overriding a function in a virtual
base class.
Differential Revision: http://llvm-reviews.chandlerc.com/D2104
llvm-svn: 194827
This patch disables aliasing (and rauw) of derived dtors to base dtors at -O0.
This optimization can have a negative impact on the debug quality.
This was a latent bug for some time with local classes, but got noticed when it
was generalized and broke gdb's destrprint.exp.
llvm-svn: 194618
Now that the relevant tests use -mconstructor-aliases and the missing
features have been implemented, we can just drop this.
No functionality change.
llvm-svn: 194595
This adds a new option -fprofile-sample-use=filename to Clang. It
tells the driver to schedule the SampleProfileLoader pass and passes
on the name of the profile file to use.
llvm-svn: 194567
The problem was that given
template<typename T>
struct foo {
~foo() {}
};
template class foo<int>;
We would produce a alias, creating a comdat with D0 and D1, since the symbols
have to be weak. Another TU is not required to have a explicit template
instantiation definition or an explict template instantiation declaration and
for
template<typename T>
struct foo {
~foo() {}
};
foo<int> a;
we would produce a comdat with only one symbol in it.
llvm-svn: 194520
The original decls are created when used. The replacements are created at the
end of the TU in reverse order.
This makes the original order far better for testing. This is particularly
important since the replacement logic could be used even when
-mconstructor-aliases is not used, but that would make many tests hard to read.
This is a fixed version of r194357 which handles replacing a destructor with
another which is an alias to a third one.
llvm-svn: 194452
The assert this patch deletes was valid only when aliasing D2 to D1, not when
looking at a base class. Since the assert was in the path where we had already
decided to not produce an alias, just drop it.
llvm-svn: 194411
The original decls are created when used. The replacements are created at the
end of the TU in reverse order.
This makes the original order far better for testing. This is particularly
important since the replacement logic could be used even when
-mconstructor-aliases is not used, but that would make many tests hard to read.
llvm-svn: 194357
It is not safe to emit alias to undefined (not supported by ELF or COFF), but
it is safe to rauw when the alias would have been internal or linkonce_odr.
llvm-svn: 194307
whether we can safely lower a conditional operator to select was insufficient.
I've left a large comment in place to explaining the sort of problems that this
transform can encounter in clang in the hopes of discouraging others from
reimplementing it wrongly again in the future. (The test should also help with
that, but it's easy to work around any single test I might add and think that
your particular implementation doesn't miscompile any code.)
llvm-svn: 194289
Unlike an alias a rauw is always safe, so we don't need to avoid this
optimization when the replacement is not know to be available in every TU.
llvm-svn: 194288
Summary:
Similar to __FUNCTION__, MSVC exposes the name of the enclosing mangled
function name via __FUNCDNAME__. This implementation is very naive and
unoptimized, it is expected that __FUNCDNAME__ would be used rarely in
practice.
Reviewers: rnk, rsmith, thakis
CC: cfe-commits, silvas
Differential Revision: http://llvm-reviews.chandlerc.com/D2109
llvm-svn: 194181
On the microsoft ABI clang is producing one weak_odr and one linkonce_odr
destructor, which is reasonable since only one is required.
The fix is simply to move the assert past the special case treatment of
linkonce_odr.
llvm-svn: 194158
This is a small optimization on linux, but should help more on windows
where msvc only outputs one destructor if there would be two identical ones.
llvm-svn: 194095
deallocation function (and the corresponding unsized deallocation function has
been declared), emit a weak discardable definition of the function that
forwards to the corresponding unsized deallocation.
This allows a C++ standard library implementation to provide both a sized and
an unsized deallocation function, where the unsized one does not just call the
sized one, for instance by putting both in the same object file within an
archive.
llvm-svn: 194055
This is a small optimization on linux, but should help more on windows
where msvc only outputs one destructor if there would be two identical ones.
llvm-svn: 194046
With this patch we produce alias for cases like
template<typename T>
struct foobar {
foobar() {
}
};
template struct foobar<void>;
We just have to be careful to produce the same aliases in every TU because
of comdats.
llvm-svn: 194000
A while ago EmitForStmt was changed to explicitly evaluate the
condition expression and create a branch instead of using
EmitBranchOnBool, so that the condition expression could be used for
some cleanup logic. The cleanup stuff has since been reorganized, and
this is no longer necessary.
In EmitCXXForRange, the evaluated condition was never used for
anything else. The logic was presumably modeled on EmitForStmt.
llvm-svn: 193994
An initialization somehow found its way in between a comment and the
block of code the comment is about. Moving the initialization makes
this less confusing.
llvm-svn: 193993
CodeGenTypes.h instantiates llvm::FoldingSet<> with CGFunctionInfo,
and VC++ doesn't like the static_cast from FoldingSetImpl::Node* to
CGFunctionInfo* since it hasn't seen the definition of CGFunctionInfo
and that it inherits from FoldingSetImpl::Node.
llvm-svn: 193722
CodeGenABITypes is a wrapper built on top of CodeGenModule that exposes
some of the functionality of CodeGenTypes (held by CodeGenModule),
specifically methods that determine the LLVM types appropriate for
function argument and return values.
I addition to CodeGenABITypes.h, CGFunctionInfo.h is introduced, and the
definitions of ABIArgInfo, RequiredArgs, and CGFunctionInfo are moved
into this new header from the private headers ABIInfo.h and CGCall.h.
Exposing this functionality is one part of making it possible for LLDB
to determine the actual ABI locations of function arguments and return
values, making it possible for it to determine this for any supported
target without hard-coding ABI knowledge in the LLDB code.
llvm-svn: 193717
check using the ubsan runtime) and -fsanitize=local-bounds (for the middle-end
check which inserts traps).
Remove -fsanitize=local-bounds from -fsanitize=undefined. It does not produce
useful diagnostics and has false positives (PR17635), and is not a good
compromise position between UBSan's checks and ASan's checks.
Map -fbounds-checking to -fsanitize=local-bounds to restore Clang's historical
behavior for that flag.
llvm-svn: 193205
This is a fixed version of r193161. In order to handle
void foo() __attribute__((alias("bar")));
void bar() {}
void zed() __attribute__((alias("foo")));
it is not enough to delay aliases to the end of the TU, we have to do two
passes over them to find if they are defined or not.
This can be implemented by producing alias as we go and just doing the second
pass at the end. This has the advantage that other parts of clang that were
expecting alias to be processed in order don't have to be changed.
This patch also handles cyclic aliases.
llvm-svn: 193188
This reverts commit r193161.
It broke
void foo() __attribute__((alias("bar")));
void bar() {}
void zed() __attribute__((alias("foo")));
Looks like we have to fix pr17639 first :-(
llvm-svn: 193162
names. For example, with this patch we now reject
void f1(void) __attribute__((alias("g1")));
This patch is implemented in CodeGen. It is quiet a bit simpler and more
compatible with gcc than implementing it in Sema. The downside is that the
errors only fire during -emit-llvm.
llvm-svn: 193161
This uses function prefix data to store function type information at the
function pointer.
Differential Revision: http://llvm-reviews.chandlerc.com/D1338
llvm-svn: 193058
class. The instruction class includes the signed saturating doubling
multiply-add long, signed saturating doubling multiply-subtract long, and
the signed saturating doubling multiply long instructions.
llvm-svn: 192909
If a class is using the unspecified inheritance model for member
pointers and later we find the class is defined to use single
inheritance, zero out the vbptr offset field of the member pointer when
it is formed.
llvm-svn: 192664
Calling convention attributes can add sugar to methods that we have to
look through. This fixes an assertion failure in the provided test
case.
llvm-svn: 192496
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).
llvm-svn: 192362
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).
E.g. ld1(3 registers version) will load 32-bit elements {A, B, C, D, E, F} sequentially into the three 64-bit vectors list {BA, DC, FE}.
E.g. ld3 will load 32-bit elements {A, B, C, D, E, F} into the three 64-bit vectors list {DA, EB, FC}.
llvm-svn: 192351
With this patch we produce alias for cases like
template<typename T>
struct foobar {
foobar() {
}
};
template struct foobar<void>;
It is safe to use aliases to weak symbols, as long and the alias itself is also
weak.
llvm-svn: 192300
An updated version of r191586 with bug fix.
Struct-path aware TBAA generates tags to specify the access path,
while scalar TBAA only generates tags to scalar types.
We should not generate a TBAA tag with null being the first field. When
a TBAA type node is null, the tag should be null too. Make sure we
don't decorate an instruction with a null TBAA tag.
Added a testing case for the bug reported by Richard with -relaxed-aliasing
and -fsanitizer=thread.
llvm-svn: 192145
In functions that only need to use the CGCXXABI member of a CodeGenTypes
class, pass that reference around directly rather than a reference to
a CodeGenTypes class.
This makes the actual dependence on CGCXXABI clear at the call sites.
llvm-svn: 192052
These IR instructions are undefined when the amount is equal to operand
size, but NEON right shifts support such shifts. Work around that by
emitting a different IR in these cases.
llvm-svn: 191953
CodeGenTypes already has a reference to a CGCXXABI. Use this directly
rather than going through CodeGenModule to get to the same information.
This is consistent with other references to CGCXXABI in CodeGenTypes
functions defined in CGCall.cpp.
llvm-svn: 191854
This attribute allows users to use a modified C or C++ function as an ARM
exception-handling function and, with care, to successfully return control to
user-space after the issue has been dealt with.
rdar://problem/14207019
llvm-svn: 191769
The general strategy is to create template versions of the conversion function and static invoker and then during template argument deduction of the conversion function, create the corresponding call-operator and static invoker specializations, and when the conversion function is marked referenced generate the body of the conversion function using the corresponding static-invoker specialization. Similarly, Codegen does something similar - when asked to emit the IR for a specialized static invoker of a generic lambda, it forwards emission to the corresponding call operator.
This patch has been reviewed in person both by Doug and Richard. Richard gave me the LGTM.
A few minor changes:
- per Richard's request i added a simple check to gracefully inform that captures (init, explicit or default) have not been added to generic lambdas just yet (instead of the assertion violation).
- I removed a few lines of code that added the call operators instantiated parameters to the currentinstantiationscope. Not only did it not handle parameter packs, but it is more relevant in the patch for nested lambdas which will follow this one, and fix that problem more comprehensively.
- Doug had commented that the original implementation strategy of using the TypeSourceInfo of the call operator to create the static-invoker was flawed and allowed const as a member qualifier to creep into the type of the static-invoker. I currently kludge around it - but after my initial discussion with Doug, with a follow up session with Richard, I have added a FIXME so that a more elegant solution that involves the use of TrivialTypeSourceInfo call followed by the correct wiring of the template parameters to the functionprototypeloc is forthcoming.
Thanks!
llvm-svn: 191634
The intent of getTypeOperand() was to yield an unqualified type.
However QualType::getUnqualifiedType() does not strip away qualifiers on
arrays.
N.B. This worked fine when typeid() was applied to an expression
because we would inject as implicit cast to the unqualified array type
in the AST.
llvm-svn: 191487
Specifically, the following features are not included in this commit:
- any sort of capturing within generic lambdas
- generic lambdas within template functions and nested
within other generic lambdas
- conversion operator for captureless lambdas
- ensuring all visitors are generic lambda aware
(Although I have gotten some useful feedback on my patches of the above and will be incorporating that as I submit those patches for commit)
As an example of what compiles through this commit:
template <class F1, class F2>
struct overload : F1, F2 {
using F1::operator();
using F2::operator();
overload(F1 f1, F2 f2) : F1(f1), F2(f2) { }
};
auto Recursive = [](auto Self, auto h, auto ... rest) {
return 1 + Self(Self, rest...);
};
auto Base = [](auto Self, auto h) {
return 1;
};
overload<decltype(Base), decltype(Recursive)> O(Base, Recursive);
int num_params = O(O, 5, 3, "abc", 3.14, 'a');
Please see attached tests for more examples.
This patch has been reviewed by Doug and Richard. Minor changes (non-functionality affecting) have been made since both of them formally looked at it, but the changes involve removal of supernumerary return type deduction changes (since they are now redundant, with richard having committed a recent patch to address return type deduction for C++11 lambdas using C++14 semantics).
Some implementation notes:
- Add a new Declarator context => LambdaExprParameterContext to
clang::Declarator to allow the use of 'auto' in declaring generic
lambda parameters
- Add various helpers to CXXRecordDecl to facilitate identifying
and querying a closure class
- LambdaScopeInfo (which maintains the current lambda's Sema state)
was augmented to house the current depth of the template being
parsed (id est the Parser calls Sema::RecordParsingTemplateParameterDepth)
so that SemaType.cpp::ConvertDeclSpecToType may use it to immediately
generate a template-parameter-type when 'auto' is parsed in a generic
lambda parameter context. (i.e we do NOT use AutoType deduced to
a template parameter type - Richard seemed ok with this approach).
We encode that this template type was generated from an auto by simply
adding $auto to the name which can be used for better diagnostics if needed.
- SemaLambda.h was added to hold some common lambda utility
functions (this file is likely to grow ...)
- Teach Sema::ActOnStartOfFunctionDef to check whether it
is being called to instantiate a generic lambda's call
operator, and if so, push an appropriately prepared
LambdaScopeInfo object on the stack.
- various tests were added - but much more will be needed.
There is obviously more work to be done, and both Richard (weakly) and Doug (strongly)
have requested that LambdaExpr be removed form the CXXRecordDecl LambdaDefinitionaData
in a future patch which is forthcoming.
A greatful thanks to all reviewers including Eli Friedman, James Dennett,
and especially the two gracious wizards (Richard Smith and Doug Gregor)
who spent hours providing feedback (in person in Chicago and on the mailing lists).
And yet I am certain that I have allowed unidentified bugs to creep in; bugs, that I will do my best to slay, once identified!
Thanks!
llvm-svn: 191453
Patch by Ana Pazos.
1.Added support for v1ix and v1fx types.
2.Added Scalar Pairwise Reduce instructions.
3.Added initial implementation of Scalar Arithmetic instructions.
llvm-svn: 191264
LLVM supports applying conversion instructions to vectors of the same number of
elements (fptrunc, fptosi, etc.) but there had been no way for a Clang user to
cause such instructions to be generated when using builtin vector types.
C-style casting on vectors is already defined in terms of bitcasts, and so
cannot be used for these conversions as well (without leading to a very
confusing set of semantics). As a result, this adds a __builtin_convertvector
intrinsic (patterned after the OpenCL __builtin_astype intrinsic). This is
intended to aid the creation of vector intrinsic headers that create generic IR
instead of target-dependent intrinsics (in other words, this is a generic
_mm_cvtepi32_ps). As noted in the documentation, the action of
__builtin_convertvector is defined in terms of the action of a C-style cast on
each vector element.
llvm-svn: 190915
GCC ToT doesn't do this & it's worth about 3.2% on Clang's DWO file size
with Clang. Some or all of this may be due to things like r190715 which
could have source fixes/improvements, but it's not clear that's the case
and that doesn't help other source bases.
llvm-svn: 190716
This restores the sqrt -> llvm.sqrt mapping, but only in fast-math mode
(specifically, when the UnsafeFPMath or NoNaNsFPMath CodeGen options are
enabled). The @llvm.sqrt* intrinsics have slightly different semantics from the
libm call, specifically, they are undefined when given a non-zero negative
number (the libm calls will always return NaN for any negative number).
This mapping was removed in r100613, and replaced with a TODO, but at that time
the fast-math flags were not yet implemented. Now that we have these, restoring
this mapping is important because it will enable autovectorization of sqrt
calls in loops (at least in fast-math mode).
llvm-svn: 190646
The code in CGExpr was added back in 2012 (r165536) but not exercised in tests
until recently.
Detected on the MemorySanitizer bootstrap bot.
llvm-svn: 190521
Static locals requiring initialization are not thread safe on Windows.
Unfortunately, it's possible to create static locals that are actually
externally visible with inline functions and templates. As a result, we
have to implement an initialization guard scheme that is compatible with
TUs built by MSVC, which makes thread safety prohibitively difficult.
MSVC's scheme is that every function that requires a guard gets an i32
bitfield. Each static local is assigned a bit that indicates if it has
been initialized, up to 32 bits, at which point a new bitfield is
created. MSVC rejects inline functions with more than 32 static locals,
and the externally visible mangling (?_B) only allows for one guard
variable per function.
On Eli's recommendation, I used MangleNumberingContext to track which
bit each static corresponds to.
Implements PR16888.
Reviewers: rjmccall, eli.friedman
Differential Revision: http://llvm-reviews.chandlerc.com/D1416
llvm-svn: 190427
By removing the possibility of strange partial definitions with no
members that older GCC's produced for the otherwise unreferenced outer
types of referenced inner types, we can simplify debug info generation
and correct this bug. Newer (4.8.1 and ToT) GCC's don't produce this
quirky debug info, and instead produce the full definition for the outer
type (except in the case where that type is dynamic and its vtable is
not emitted in this TU).
During the creation of the context for a type, we may revisit that type
(due to the need to visit template parameters, among other things) and
used to end up visiting it first there. Then when we would reach the
original code attempting to define that type, we would lose debug info
by overwriting its members.
By avoiding the possibility of latent "defined with no members" types,
we can be sure than whenever we already have a type in a cache (either a
definition or declaration), we can just return that. In the case of a
full definition, our work is done. In the case of a partial definition,
we must already be in the process of completing it. And in the case of a
declaration, the completed/vtable/etc callbacks can handle converting it
to a definition.
llvm-svn: 190122
Debug info emission was tripping over an IRGen bug (fixed in r189996)
that was resulting in duplicate emission of static data members of class
templates in namespaces.
We could add more test coverage to debug info for this issue
specifically, but I think the underlying IRGen test is more targeted and
sufficient for the issue.
llvm-svn: 190001
A quirk of AST representation leads to class template static data member
definitions being visited twice during Clang IRGen resulting in
duplicate (benign) initializers.
Discovered while investigating a possibly-related debug info bug tickled
by the duplicate emission of these members & their associated debug
info.
With thanks to Richard Smith for help investigating, understanding, and
helping with the fix.
llvm-svn: 189996
I tried to implement this properly in r189051, but I didn't have enough
test coverage. Richard kindly provided more test cases than I could
possibly imagine and now we should have the correct condition.
llvm-svn: 189898
This fixes pr13124.
From the discussion at
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2012-June/022606.html
we know that we cannot make funcions in a weak_odr vtable also weak_odr. They
should remain linkonce_odr.
The side effect is that we cannot emit a available_externally vtable unless we
also emit a copy of the function. This also has an issue: If codegen is going
to output a function, sema has to mark it used. Given llvm.org/pr9114, it looks
like sema cannot be more aggressive at marking functions used because
of vtables.
This leaves us with a few unpleasant options:
* Marking functions in vtables used if possible. This sounds a bit sloppy, so
we should avoid it.
* Producing available_externally vtables only when all the functions in it are
already used or weak_odr. This would cover cases like
--------------------
struct foo {
virtual ~foo();
};
struct bar : public foo {
virtual void zed();
};
void f() {
foo *x(new bar);
delete x;
}
void g(bar *x) {
x->~bar(); // force the destructor to be used
}
--------------------------
and
----------------------------------
template<typename T>
struct bar {
virtual ~bar();
};
template<typename T>
bar<T>::~bar() {
}
// make the destructor weak_odr instead of linkonce_odr
extern template class bar<int>;
void f() {
bar<int> *x(new bar<int>);
delete x;
}
----------------------------
These look like corner cases, so it is unclear if it is worth it.
* And finally: Just nuke this optimization. That is what this patch implements.
llvm-svn: 189852
We use CXX mangler to generate unique identifier for external C++ struct,
union, class and enum. Types with unique identifier are added to retained
types by DIBuilder.
Testing cases are updated to reflect the unique identifier generated for types.
The order of MDNodes is changed because of retained types and testing cases
are updated accordingly.
Testing case debug-info-uuid.cpp now emits error with Itanium mangler, since
uuid is not yet handled in Itanium mangler. And it will check for the error
message.
llvm-svn: 189622
We had further discussions on how to retain types, whether to do it in front end
or in DIBuilder. And we agree to do it in DIBuilder so front ends
generating unique identifier do not need to worry about retaining them.
llvm-svn: 189609
We use CXX mangler to generate unique identifier for external C++ struct,
union, class and enum. Types with unique identifier are added to RetainedTypes
to make sure they are treated as used even when all uses are replaced with
the identifiers.
A single type can be added to RetainedTypes multiple times. For example, both
createForwardDecl and createLimitedType can add the same type to RetainedTypes.
A set is used to avoid duplication when updating AllRetainTypes in DIBuilder.
Testing cases are updated to reflect the unique identifier generated for types.
The order of MDNodes is changed because of retained types and testing cases
are updated accordingly.
Testing case debug-info-uuid.cpp now emits error with Itanium mangler, since
uuid is not yet handled in Itanium mangler.
We choose to update RetainedTypes in clang, then at finalize(), we update
AllRetainTypes in DIBuilder. The other choice is to update AllRetainTypes
in DIBuilder when creating a DICompositeType with unique identifier. This
option requires using ValueHandle for AllRetainTypes in DIBuilder since
the created DICompositeType can be modified later on by setContainingType etc.
llvm-svn: 189600
Both functions will take a Type pointer instead of a Decl pointer. This helps
with follow-up type uniquing patches, which need the Type pointer to call
CXX mangler to generate unique identifiers.
No functionality change.
llvm-svn: 189519
In the transition from declaration (with some members) to definition, we
were overwriting the list of members with the empty list when attaching
template parameters.
The fix is in llvm::DICompositeType::addMember (along with asserts that
cause this bug to be covered by existing Clang test cases), including
adding some asserts to catch this sort of issue which found issues fixed
in this commit.
llvm-svn: 189494
These operations "vector add high-half narrow" actually correspond to the
sequence:
%sum = add <4 x i32> %lhs, %rhs
%high = lshr <4 x i32> %sum, <i32 16, i32 16, i32 16, i32 16>
%res = trunc <4 x i32> %high to <4 x i16>
Now that LLVM can spot this, Clang should emit the corresponding LLVM IR.
llvm-svn: 189463
The NEON intrinsics vqdmlal and vqdmlsl are really just combinations of a
saturating-doubling-multiply (vqdmull) and a saturating add/sub, so now that
LLVM can spot those patterns Clang should emit them instead of specialised
intrinsics.
Feature already tested by existing ARM NEON intrinsics tests.
llvm-svn: 189462
This reverts commit r189320.
Alexey Samsonov and Dmitry Vyukov presented some arguments for keeping
these around - though it still seems like those tasks could be solved by
a tool just using the symbol table. In a very small number of cases,
thunks may be inlined & debug info might be able to save profilers &
similar tools from misclassifying those cases as part of the caller.
The extra changes here plumb through the VarDecl for various cases to
CodeGenFunction - this provides better fidelity through a few APIs but
generally just causes the CGF::StartFunction to fallback to using the
name of the IR function as the name in the debug info.
The changes to debug-info-global-ctor-dtor.cpp seem like goodness. The
two names that go missing (in favor of only emitting those names as
linkage names) are names that can be demangled - emitting them only as
the linkage name should encourage tools to do just that.
Again, thanks to Dinesh Dwivedi for investigation/work on this issue.
llvm-svn: 189421
Summary:
Makes functions with implicit calling convention compatible with
function types with a matching explicit calling convention. This fixes
things like calls to qsort(), which has an explicit __cdecl attribute on
the comparator in Windows headers.
Clang will now infer the calling convention from the declarator. There
are two cases when the CC must be adjusted during redeclaration:
1. When defining a non-inline static method.
2. When redeclaring a function with an implicit or mismatched
convention.
Fixes PR13457, and allows clang to compile CommandLine.cpp for the
Microsoft C++ ABI.
Excellent test cases provided by Alexander Zinenko!
Reviewers: rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D1231
llvm-svn: 189412
This was added in r166676 based on PR13942 on the basis that tools may
need debug information for any executable code/function for some fairly
broad/non-specific purposes. It seems to me (as noted in PR14569) that
the major/only purpose is in backtraces, which should generally not
apply to thunks as they won't appear in the stack themselves. By
removing them we fix PR14569 and reduce the size of Clang's debug info.
Strangely enough this doesn't seem to have a substantial impact on
Clang's self-hosted debug info (at least looking at DWO file size) size
at all. Not sure if I failed to test this correctly but I only observed
a 0.004% change in DWO file size over Clang+LLVM.
With thanks to Dinesh Dwivedi for work on this PR.
llvm-svn: 189320
CodeGenFunction is run on only one function - a new object is made for
each new function. I would add an assertion/flag to this effect, but
there's an exception: ObjC properties involve emitting helper functions
that are all emitted by the same CodeGenFunction object, so such a check
is not possible/correct.
llvm-svn: 189277
- __func__ or __FUNCTION__ returns captured statement's parent
function name, not the one compiler generated.
Differential Revision: http://llvm-reviews.chandlerc.com/D1491
Reviewed by bkramer
llvm-svn: 189219
They were mostly copy&paste of each other, move it to CodeGenFunction. Of course
the two implementations have diverged over time; the one in CGExprCXX seems to
be the more modern one so I picked that one and moved it to CGClass which feels
like a better home for it. No intended functionality change.
llvm-svn: 189203
Summary:
Previously the backend wouldn't get to see the underlying GlobalValue
that corresponds to the template argument because it would be hidden by
a cast at the IR level. Instead strip the pointer casts off of the
value until we see the underlying GlobalValue.
Reviewers: dblaikie, echristo, majnemer
Reviewed By: majnemer
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1508
llvm-svn: 189200
Summary:
This allows us to handle the general case where a non-type template
argument evaluates to a constant expression which isn't integral or a
declaration.
This fixes PR16939.
Reviewers: dblaikie, rsmith
Reviewed By: dblaikie
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1453
llvm-svn: 189165
Summary:
These typically come from static data members of class template
specializations. This accomplishes two things:
1. May expose GlobalOpt optimizations for Itanium C++ ABI code.
2. Works toward fixing double initialization in the Microsoft C++ ABI.
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1475
llvm-svn: 189051
This might be able to be optimized further by only doing this in the
absence of a key function, but it doesn't look like GCC is doing that so
I'm not rushing to do it just yet.
llvm-svn: 189022
Specifically, the following features are not included in this commit:
- any sort of capturing within generic lambdas
- nested lambdas
- conversion operator for captureless lambdas
- ensuring all visitors are generic lambda aware
As an example of what compiles:
template <class F1, class F2>
struct overload : F1, F2 {
using F1::operator();
using F2::operator();
overload(F1 f1, F2 f2) : F1(f1), F2(f2) { }
};
auto Recursive = [](auto Self, auto h, auto ... rest) {
return 1 + Self(Self, rest...);
};
auto Base = [](auto Self, auto h) {
return 1;
};
overload<decltype(Base), decltype(Recursive)> O(Base, Recursive);
int num_params = O(O, 5, 3, "abc", 3.14, 'a');
Please see attached tests for more examples.
Some implementation notes:
- Add a new Declarator context => LambdaExprParameterContext to
clang::Declarator to allow the use of 'auto' in declaring generic
lambda parameters
- Augment AutoType's constructor (similar to how variadic
template-type-parameters ala TemplateTypeParmDecl are implemented) to
accept an IsParameterPack to encode a generic lambda parameter pack.
- Add various helpers to CXXRecordDecl to facilitate identifying
and querying a closure class
- LambdaScopeInfo (which maintains the current lambda's Sema state)
was augmented to house the current depth of the template being
parsed (id est the Parser calls Sema::RecordParsingTemplateParameterDepth)
so that Sema::ActOnLambdaAutoParameter may use it to create the
appropriate list of corresponding TemplateTypeParmDecl for each
auto parameter identified within the generic lambda (also stored
within the current LambdaScopeInfo). Additionally,
a TemplateParameterList data-member was added to hold the invented
TemplateParameterList AST node which will be much more useful
once we teach TreeTransform how to transform generic lambdas.
- SemaLambda.h was added to hold some common lambda utility
functions (this file is likely to grow ...)
- Teach Sema::ActOnStartOfFunctionDef to check whether it
is being called to instantiate a generic lambda's call
operator, and if so, push an appropriately prepared
LambdaScopeInfo object on the stack.
- Teach Sema::ActOnStartOfLambdaDefinition to set the
return type of a lambda without a trailing return type
to 'auto' in C++1y mode, and teach the return type
deduction machinery in SemaStmt.cpp to process either
C++11 and C++14 lambda's correctly depending on the flag.
- various tests were added - but much more will be needed.
A greatful thanks to all reviewers including Eli Friedman,
James Dennett and the ever illuminating Richard Smith. And
yet I am certain that I have allowed unidentified bugs to creep in;
bugs, that I will do my best to slay, once identified!
Thanks!
llvm-svn: 188977
With r185721, calling mangleCXXRTTIName on C code will cause crashes.
This commit fixes crashes on C testing cases when turning on struct-path TBAA.
For C code, we simply use the Decl name without the context. This can
cause two different structs having the same name, and may cause inaccurate but
conservative alias results.
llvm-svn: 188930
1. We now print the return type of lambdas and return type deduced functions
as "auto". Trailing return types with decltype print the underlying type.
2. Use the lambda or block scope for the PredefinedExpr type instead of the
parent function. This fixes PR16946, a strange mismatch between type of the
expression and the actual result.
3. Verify the type in CodeGen.
4. The type for blocks is still wrong. They are numbered and the name is not
known until CodeGen.
llvm-svn: 188900
This reverts commit r188687 (reverts r188642 (reverts 188600 (reverts
188576))).
With added test coverage & fix for -gline-tables-only.
Thanks Michael Gottesman for reverting this patch when it demonstrated
problems & providing a reproduction/details to help me track this down.
llvm-svn: 188739
Refactor the underlying code a bit to remove unnecessary calls to
"hasErrorOccurred" & make them consistently at all the entry points to
the IRGen ASTConsumer.
llvm-svn: 188707
This reverts commit r188642.
This change is causing LTO builds to cause our 16 GB machines to swap and OOM
all weekend. I am going to work with Dave Blaikie to resolve the issue.
Sorry Dave =(.
llvm-svn: 188687
This reverts commit r188600.
r188640/r188639 fixed the root cause of the crash-on-valid that r188600
originally introduced. This now appears to bootstrap debug clang
successfully to the best of my testing.
llvm-svn: 188642
A partner to r188639, this is a somewhat heavy-handed fix to the general
issue, since even after that prior change the issue does still
unavoidably arise with template parameters (see test case).
There are other ways we could consider addressing this (see FIXME).
llvm-svn: 188640
Possible minor reduction in debug info & avoid some cases where creating
a context chain could lead to the type the context chain is being
created for, being created. (this is still possible with template
parameters - tests/fixes/improvements to follow)
llvm-svn: 188639
Fixes a crash-on-valid introduced by r188486 (which should've occurred
earlier but for a blatant bug where calling createFwdDecl from the
requireCompleteType callback was useless under -flimit-debug-info and we
were just getting lucky with other later callbacks requiring the type
anyway).
llvm-svn: 188622