-fno-inline-functions, -O0, and optnone.
These were really, really tangled together:
- We used the noinline LLVM attribute for -fno-inline
- But not for -fno-inline-functions (breaking LTO)
- But we did use it for -finline-hint-functions (yay, LTO is happy!)
- But we didn't for -O0 (LTO is sad yet again...)
- We had weird structuring of CodeGenOpts with both an inlining
enumeration and a boolean. They interacted in weird ways and
needlessly.
- A *lot* of set smashing went on with setting these, and then got worse
when we considered optnone and other inlining-effecting attributes.
- A bunch of inline affecting attributes were managed in a completely
different place from -fno-inline.
- Even with -fno-inline we failed to put the LLVM noinline attribute
onto many generated function definitions because they didn't show up
as AST-level functions.
- If you passed -O0 but -finline-functions we would run the normal
inliner pass in LLVM despite it being in the O0 pipeline, which really
doesn't make much sense.
- Lastly, we used things like '-fno-inline' to manipulate the pass
pipeline which forced the pass pipeline to be much more
parameterizable than it really needs to be. Instead we can *just* use
the optimization level to select a pipeline and control the rest via
attributes.
Sadly, this causes a bunch of churn in tests because we don't run the
optimizer in the tests and check the contents of attribute sets. It
would be awesome if attribute sets were a bit more FileCheck friendly,
but oh well.
I think this is a significant improvement and should remove the semantic
need to change what inliner pass we run in order to comply with the
requested inlining semantics by relying completely on attributes. It
also cleans up tho optnone and related handling a bit.
One unfortunate aspect of this is that for generating alwaysinline
routines like those in OpenMP we end up removing noinline and then
adding alwaysinline. I tried a bunch of other approaches, but because we
recompute function attributes from scratch and don't have a declaration
here I couldn't find anything substantially cleaner than this.
Differential Revision: https://reviews.llvm.org/D28053
llvm-svn: 290398
Much to my surprise, '-disable-llvm-optzns' which I thought was the
magical flag I wanted to get at the raw LLVM IR coming out of Clang
deosn't do that. It still runs some passes over the IR. I don't want
that, I really want the *raw* IR coming out of Clang and I strongly
suspect everyone else using it is in the same camp.
There is actually a flag that does what I want that I didn't know about
called '-disable-llvm-passes'. I suspect many others don't know about it
either. It both does what I want and is much simpler.
This removes the confusing version and makes that spelling of the flag
an alias for '-disable-llvm-passes'. I've also moved everything in Clang
to use the 'passes' spelling as it seems both more accurate (*all* LLVM
passes are disabled, not just optimizations) and much easier to remember
and spell correctly.
This is part of simplifying how Clang drives LLVM to make it cleaner to
wire up to the new pass manager.
Differential Revision: https://reviews.llvm.org/D28047
llvm-svn: 290392
It seems that there is small bug, and we can't generate assume loads
when some virtual functions have internal visibiliy
This reverts commit 982bb7d966947812d216489b3c519c9825cacbf2.
llvm-svn: 247332
Generating call assume(icmp %vtable, %global_vtable) after constructor
call for devirtualization purposes.
For more info go to:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html
Edit:
Fixed version because of PR24479.
After this patch got reverted because of ScalarEvolution bug (D12719)
Merged after John McCall big patch (Added Address).
http://reviews.llvm.org/D11859
llvm-svn: 247199
There was linker problem, and it turns out that it is not always safe
to refer to vtable. If the vtable is used, then we can refer to it
without any problem, but because we don't know when it will be used or
not, we can only check if vtable is external or it is safe to to emit it
speculativly (when class it doesn't have any inline virtual functions).
It should be fixed in the future.
http://reviews.llvm.org/D12385
llvm-svn: 246214
When an internal-linkage thunk is code gen'd, CodeGenVTables::emitThunk
will first be called with ForVTable=true (which incorrectly set the
thunk's linkage to available_externally under the Itanium ABI) and later
with ForVTable=false (which reset it to internal). Because we will always
see a call with ForVTable=false, this incorrect linkage never ended up in
the final IR. However, the temporary presence of this linkage caused us
to give such functions a comdat as a result of code introduced in r241102.
To avoid this, check that the thunk is externally visible before giving it
available_externally linkage.
llvm-svn: 241136
Functions with available_externally linkage will not be emitted to object
files (they will just be undefined symbols), so it does not make sense to
put them in comdats.
Creates a second overload of maybeSetTrivialComdat that uses the GlobalObject
instead of the Decl, and uses that in several places that had the faulty
logic.
Differential Revision: http://reviews.llvm.org/D9580
llvm-svn: 236879
Currently we emit DeferredDeclsToEmit in reverse order. This patch changes that.
The advantages of the change are that
* The output order is a bit closer to the source order. The change to
test/CodeGenCXX/pod-member-memcpys.cpp is a good example.
* If we decide to deffer more, it will not cause as large changes in the
estcases as it would without this patch.
llvm-svn: 226751
This can happen when we're trying to emit a thunk with available_externally
linkage with optimization enabled but bail because it doesn't make sense
for vararg functions.
PR18098.
llvm-svn: 196658
Various tests had sprung up over the years which had --check-prefix=ABC on the
RUN line, but "CHECK-ABC:" later on. This happened to work before, but was
strictly incorrect. FileCheck is getting stricter soon though.
Patch by Ron Ofir.
llvm-svn: 188174
like Darwin that don't support it. We should also complain about
invalid -fvisibility=protected, but that information doesn't seem
to exist at the most appropriate time, so I've left a FIXME behind.
llvm-svn: 149186
without defining them. This should be an error, but I'm paranoid about
"uses" that end up not actually requiring a definition. I'll revisit later.
Also, teach IR generation to not set internal linkage on variable
declarations, just for safety's sake. Doing so produces an invalid module
if the variable is not ultimately defined.
Also, fix several places in the test suite where we were using internal
functions without definitions.
llvm-svn: 126016
there's no return adjustment from the overridden to the overrider doesn't
mean there isn't a return adjustment from the overrider to the final
overrider. This matters if we're emitting a virtual this-adjustment thunk
because the overrider virtually inherits from the class providing the
nearest overridden method. Do the appropriate return adjustment in this case.
Fixes PR7611.
llvm-svn: 118466
a -cc1 option. The Darwin linker complains about mixed visibility when linking
gcc-built objects with clang-built objects, and the optimization isn't really
that valuable. Platforms with less ornery linkers can feel free to enable this.
llvm-svn: 110979
functions with in-line definitions, since such thunks will be emitted at any
use of the function.
Completes the feature work for rdar://problem/7523229.
llvm-svn: 110285
not make copies non-POD arguments or arguments passed by reference:
just copy the pointers directly. This eliminates another source of the
dreaded memcpy-of-non-PODs. Fixes PR7188.
llvm-svn: 104327
class type (that uses a return slot), pass the return slot to the
callee directly rather than allocating new storage and trying to copy
the object. This appears to have been the cause of the remaining two
Boost.Interprocess failures.
llvm-svn: 104215
"used" (e.g., we will refer to the vtable in the generated code) and
when they are defined (i.e., because we've seen the key function
definition). Previously, we were effectively tracking "potential
definitions" rather than uses, so we were a bit too eager about emitting
vtables for classes without key functions.
The new scheme:
- For every use of a vtable, Sema calls MarkVTableUsed() to indicate
the use. For example, this occurs when calling a virtual member
function of the class, defining a constructor of that class type,
dynamic_cast'ing from that type to a derived class, casting
to/through a virtual base class, etc.
- For every definition of a vtable, Sema calls MarkVTableUsed() to
indicate the definition. This happens at the end of the translation
unit for classes whose key function has been defined (so we can
delay computation of the key function; see PR6564), and will also
occur with explicit template instantiation definitions.
- For every vtable defined/used, we mark all of the virtual member
functions of that vtable as defined/used, unless we know that the key
function is in another translation unit. This instantiates virtual
member functions when needed.
- At the end of the translation unit, Sema tells CodeGen (via the
ASTConsumer) which vtables must be defined (CodeGen will define
them) and which may be used (for which CodeGen will define the
vtables lazily).
From a language perspective, both the old and the new schemes are
permissible: we're allowed to instantiate virtual member functions
whenever we want per the standard. However, all other C++ compilers
were more lazy than we were, and our eagerness was both a performance
issue (we instantiated too much) and a portability problem (we broke
Boost test cases, which now pass).
Notes:
(1) There's a ton of churn in the tests, because the order in which
vtables get emitted to IR has changed. I've tried to isolate some of
the larger tests from these issues.
(2) Some diagnostics related to
implicitly-instantiated/implicitly-defined virtual member functions
have moved to the point of first use/definition. It's better this
way.
(3) I could use a review of the places where we MarkVTableUsed, to
see if I missed any place where the language effectively requires a
vtable.
Fixes PR7114 and PR6564.
llvm-svn: 103718