PR43145 revealed two places where Clang was attempting to create a
bitcast without considering the address space of class types during
C++ class code generation.
Differential Revision: https://reviews.llvm.org/D68403
llvm-svn: 375118
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.
Original commit message:
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.
This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.
To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.
The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.
This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.
To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.
I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.
On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.
I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.
I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).
Differential revision: https://reviews.llvm.org/D63932
llvm-svn: 375094
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.
This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.
To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.
The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.
This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.
To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.
I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.
On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.
I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.
I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).
Differential revision: https://reviews.llvm.org/D63932
llvm-svn: 374539
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<> directly and if not assert will fire for us.
llvm-svn: 373918
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<RecordType> directly and if not assert will fire for us.
llvm-svn: 373584
has a constexpr destructor.
For constexpr variables, reject if the variable does not have constant
destruction. In all cases, do not emit runtime calls to the destructor
for variables with constant destruction.
llvm-svn: 373159
Reason: this commit causes crashes in the clang compiler when building
LLVM Support with libc++, see https://bugs.llvm.org/show_bug.cgi?id=42665
for details.
llvm-svn: 366429
Summary:
This patch does mainly three things:
1. It fixes a false positive error detection in Sema that is similar to
D62156. The error happens when explicitly calling an overloaded
destructor for different address spaces.
2. It selects the correct destructor when multiple overloads for
address spaces are available.
3. It inserts the expected address space cast when invoking a
destructor, if needed, and therefore fixes a crash due to the unmet
assertion in llvm::CastInst::Create.
The following is a reproducer of the three issues:
struct MyType {
~MyType() {}
~MyType() __constant {}
};
__constant MyType myGlobal{};
kernel void foo() {
myGlobal.~MyType(); // 1 and 2.
// 1. error: cannot initialize object parameter of type
// '__generic MyType' with an expression of type '__constant MyType'
// 2. error: no matching member function for call to '~MyType'
}
kernel void bar() {
// 3. The implicit call to the destructor crashes due to:
// Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
// in llvm::CastInst::Create.
MyType myLocal;
}
The added test depends on D62413 and covers a few more things than the
above reproducer.
Subscribers: yaxunl, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64569
llvm-svn: 366422
Improved classification of address space cast when qualification
conversion is performed - prevent adding addr space cast for
non-pointer and non-reference types. Take address space correctly
from the pointee.
Also pass correct address space from 'this' object using
AggValueSlot when generating addrspacecast in the constructor
call.
Differential Revision: https://reviews.llvm.org/D59988
llvm-svn: 357682
As background, when constructing a complete object, virtual bases are
constructed first. If an exception is thrown later in the ctor, those
virtual bases are destroyed, so sema marks the relevant constructors and
destructors of virtual bases as referenced. If necessary, they are
emitted.
However, an abstract class can never be used to construct a complete
object. In the Itanium C++ ABI, this works out nicely, because we never
end up emitting the "complete" constructor variant, only the "base"
constructor variant, which can be called by constructors of derived
classes. Clang's Sema::MarkBaseAndMemberDestructorsReferenced is aware
of this optimization, and it does not mark ctors and dtors of virtual
bases referenced when the constructor of an abstract class is emitted.
In the Microsoft ABI, there are no complete/base variants, so before
this change, the constructor of an abstract class could reference ctors
and dtors of a virtual base without marking them referenced. This could
lead to unresolved symbol errors at link time, as reported in PR41065.
The fix is to implement the same optimization as Sema: If the class is
abstract, don't bother initializing its virtual bases. The "is this
class the most derived class" check in the constructor will never pass,
and the virtual base constructor calls are always dead. Skip them.
I think Richard noticed this missed optimization back in 2016 when he
was implementing inheriting constructors. I wasn't able to find any bugs
or email about it, though.
Fixes PR41065
llvm-svn: 356425
The address space for the Base class pointer when up-casting
from Derived should be taken from the Derived class pointer.
Differential Revision: https://reviews.llvm.org/D53818
llvm-svn: 355606
Emit{Nounwind,}RuntimeCall{,OrInvoke} have been modified to take a
FunctionCallee as an argument, and CreateRuntimeFunction has been
modified to return a FunctionCallee. All callers have been updated.
Additionally, CreateBuiltinFunction is removed, as it was redundant
with CreateRuntimeFunction after some previous changes.
Differential Revision: https://reviews.llvm.org/D57668
llvm-svn: 353184
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
https://reviews.llvm.org/D54862 removed the usages of `ASTContext&` from
within the `CXXMethodDecl::getThisType` method. Remove the parameter
altogether, as well as all usages of it. This does not result in any
functional change because the parameter was unused since
https://reviews.llvm.org/D54862.
Test Plan: check-clang
Reviewers: akyrtzi, mikael
Reviewed By: mikael
Subscribers: mehdi_amini, dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D56509
llvm-svn: 350914
Fixes assertion
> Assertion failed: (isa<X>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file llvm/Support/Casting.h, line 255.
It was triggered by trying to cast `FunctionDecl` to `CXXMethodDecl` as
`CGF.CurCodeDecl` in `CallBaseDtor::Emit`. It was happening because
cleanups were emitted in `ScalarExprEmitter::VisitExprWithCleanups`
after destroying `InlinedInheritingConstructorScope`, so
`CodeGenFunction.CurCodeDecl` didn't correspond to expected cleanup decl.
Fix the assertion by emitting cleanups before leaving
`InlinedInheritingConstructorScope` and changing `CurCodeDecl`.
Test cases based on a patch by Shoaib Meenai.
Fixes PR36748.
rdar://problem/45805151
Reviewers: rsmith, rjmccall
Reviewed By: rjmccall
Subscribers: jkorous, dexonsmith, cfe-commits, smeenai, compnerd
Differential Revision: https://reviews.llvm.org/D55543
llvm-svn: 349848
Address spaces are cast into generic before invoking the constructor.
Added support for a trailing Qualifiers object in FunctionProtoType.
Note: This recommits the previously reverted patch,
but now it is commited together with a fix for lldb.
Differential Revision: https://reviews.llvm.org/D54862
llvm-svn: 349019
Address spaces are cast into generic before invoking the constructor.
Added support for a trailing Qualifiers object in FunctionProtoType.
Differential Revision: https://reviews.llvm.org/D54862
llvm-svn: 348927
As suggested by Richard Smith, and initially put up for review here:
https://reviews.llvm.org/D53341, this patch removes a hack that was used
to ensure that proper target-feature lists were used when emitting
cpu-dispatch (and eventually, target-clones) implementations. As a part
of this, the GlobalDecl object is proliferated to a bunch more
locations.
Originally, this was put up for review (see above) to get acceptance on
the approach, though discussion with Richard in San Diego showed he
approved of the approach taken here. Thus, I believe this is acceptable
for Review-After-commit
Differential Revision: https://reviews.llvm.org/D53341
Change-Id: I0a0bd673340d334d93feac789d653e03d9f6b1d5
llvm-svn: 346757
from those that aren't.
This patch changes the way __block variables that aren't captured by
escaping blocks are handled:
- Since non-escaping blocks on the stack never get copied to the heap
(see https://reviews.llvm.org/D49303), Sema shouldn't error out when
the type of a non-escaping __block variable doesn't have an accessible
copy constructor.
- IRGen doesn't have to use the specialized byref structure (see
https://clang.llvm.org/docs/Block-ABI-Apple.html#id8) for a
non-escaping __block variable anymore. Instead IRGen can emit the
variable as a normal variable and copy the reference to the block
literal. Byref copy/dispose helpers aren't needed either.
This reapplies r343518 after fixing a use-after-free bug in function
Sema::ActOnBlockStmtExpr where the BlockScopeInfo was dereferenced after
it was popped and deleted.
rdar://problem/39352313
Differential Revision: https://reviews.llvm.org/D51564
llvm-svn: 343542
from those that aren't.
This patch changes the way __block variables that aren't captured by
escaping blocks are handled:
- Since non-escaping blocks on the stack never get copied to the heap
(see https://reviews.llvm.org/D49303), Sema shouldn't error out when
the type of a non-escaping __block variable doesn't have an accessible
copy constructor.
- IRGen doesn't have to use the specialized byref structure (see
https://clang.llvm.org/docs/Block-ABI-Apple.html#id8) for a
non-escaping __block variable anymore. Instead IRGen can emit the
variable as a normal variable and copy the reference to the block
literal. Byref copy/dispose helpers aren't needed either.
This reapplies r341754, which was reverted in r341757 because it broke a
couple of bots. r341754 was calling markEscapingByrefs after the call to
PopFunctionScopeInfo, which caused the popped function scope to be
cleared out when the following code was compiled, for example:
$ cat test.m
struct A {
id data[10];
};
void foo() {
__block A v;
^{ (void)v; };
}
This commit calls markEscapingByrefs before calling PopFunctionScopeInfo
to prevent that from happening.
rdar://problem/39352313
Differential Revision: https://reviews.llvm.org/D51564
llvm-svn: 343518
from those that aren't.
This patch changes the way __block variables that aren't captured by
escaping blocks are handled:
- Since non-escaping blocks on the stack never get copied to the heap
(see https://reviews.llvm.org/D49303), Sema shouldn't error out when
the type of a non-escaping __block variable doesn't have an accessible
copy constructor.
- IRGen doesn't have to use the specialized byref structure (see
https://clang.llvm.org/docs/Block-ABI-Apple.html#id8) for a
non-escaping __block variable anymore. Instead IRGen can emit the
variable as a normal variable and copy the reference to the block
literal. Byref copy/dispose helpers aren't needed either.
rdar://problem/39352313
Differential Revision: https://reviews.llvm.org/D51564
llvm-svn: 341754
With this change compiler generates alignment checks for wider range
of types. Previously such checks were generated only for the record types
with non-trivial default constructor. So the types like:
struct alignas(32) S2 { int x; };
typedef __attribute__((ext_vector_type(2), aligned(32))) float float32x2_t;
did not get checks when allocated by 'new' expression.
This change also optimizes the checks generated for the arrays created
in 'new' expressions. Previously the check was generated for each
invocation of type constructor. Now the check is generated only once
for entire array.
Differential Revision: https://reviews.llvm.org/D49589
llvm-svn: 338199
Similarly to CFI on virtual and indirect calls, this implementation
tries to use program type information to make the checks as precise
as possible. The basic way that it works is as follows, where `C`
is the name of the class being defined or the target of a call and
the function type is assumed to be `void()`.
For virtual calls:
- Attach type metadata to the addresses of function pointers in vtables
(not the functions themselves) of type `void (B::*)()` for each `B`
that is a recursive dynamic base class of `C`, including `C` itself.
This type metadata has an annotation that the type is for virtual
calls (to distinguish it from the non-virtual case).
- At the call site, check that the computed address of the function
pointer in the vtable has type `void (C::*)()`.
For non-virtual calls:
- Attach type metadata to each non-virtual member function whose address
can be taken with a member function pointer. The type of a function
in class `C` of type `void()` is each of the types `void (B::*)()`
where `B` is a most-base class of `C`. A most-base class of `C`
is defined as a recursive base class of `C`, including `C` itself,
that does not have any bases.
- At the call site, check that the function pointer has one of the types
`void (B::*)()` where `B` is a most-base class of `C`.
Differential Revision: https://reviews.llvm.org/D47567
llvm-svn: 335569
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
the tail padding is not reused.
We track on the AggValueSlot (and through a couple of other
initialization actions) whether we're dealing with an object that might
share its tail padding with some other object, so that we can avoid
emitting stores into the tail padding if that's the case. We still
widen stores into tail padding when we can do so.
Differential Revision: https://reviews.llvm.org/D45306
llvm-svn: 329342
The indirect function argument is in alloca address space in LLVM IR. However,
during Clang codegen for C++, the address space of indirect function argument
should match its address space in the source code, i.e., default addr space, even
for indirect argument. This is because destructor of the indirect argument may
be called in the caller function, and address of the indirect argument may be
taken, in either case the indirect function argument is expected to be in default
addr space, not the alloca address space.
Therefore, the indirect function argument should be mapped to the temp var
casted to default address space. The caller will cast it to alloca addr space
when passing it to the callee. In the callee, the argument is also casted to the
default address space and used.
CallArg is refactored to facilitate this fix.
Differential Revision: https://reviews.llvm.org/D34367
llvm-svn: 326946
constructor.
Previously, clang would emit an over-aligned (16-byte) store to
initialize B::x in B's base constructor when compiling the following
code:
struct A {
__attribute__((aligned(16))) double data1;
};
struct B : public virtual A {
B() : x(123) {}
double a;
int x;
};
struct C : public virtual B {};
void test() { B b; C c; }
This was happening because the code in IRGen that does member
initialization was using the alignment of a complete object instead of
the non-virtual alignment.
This commit fixes the bug.
rdar://problem/36382481
Differential Revision: https://reviews.llvm.org/D42521
llvm-svn: 323578
The standard says:
[expr.static.cast] p11: "If the prvalue of type “pointer to cv1 B” points to a B
that is actually a subobject of an object of type D, the resulting pointer points
to the enclosing object of type D. Otherwise, the behavior is undefined."
Therefore, the GEP must be inbounds.
This should solve the failure to optimize away a null check shown in PR35909:
https://bugs.llvm.org/show_bug.cgi?id=35909
Differential Revision: https://reviews.llvm.org/D42249
llvm-svn: 322950
Under the Microsoft ABI, it is possible for an object not to have
a virtual table pointer of its own if all of its virtual functions
were introduced by virtual bases. In that case, we need to load the
vtable pointer from one of the virtual bases and perform the type
check using its type.
Differential Revision: https://reviews.llvm.org/D41036
llvm-svn: 320638
The information about access and type sizes is necessary for
producing TBAA metadata in the new size-aware format. With this
patch, D39955 and D39956 in place we should be able to change
CodeGenTBAA::createScalarTypeNode() and
CodeGenTBAA::getBaseTypeInfo() to generate metadata in the new
format under the -new-struct-path-tbaa command-line option. For
now, this new information remains unused.
Differential Revision: https://reviews.llvm.org/D40176
llvm-svn: 319012