As discussed in https://reviews.llvm.org/D22666, our current mechanism to
support -pg profiling, where we insert calls to mcount(), or some similar
function, is fundamentally broken. We insert these calls in the frontend, which
means they get duplicated when inlining, and so the accumulated execution
counts for the inlined-into functions are wrong.
Because we don't want the presence of these functions to affect optimizaton,
they should be inserted in the backend. Here's a pass which would do just that.
The knowledge of the name of the counting function lives in the frontend, so
we're passing it here as a function attribute. Clang will be updated to use
this mechanism.
Differential Revision: https://reviews.llvm.org/D22825
llvm-svn: 280347
minimal and boring form than the old pass manager's version.
This pass does the very minimal amount of work necessary to inline
functions declared as always-inline. It doesn't support a wide array of
things that the legacy pass manager did support, but is alse ... about
20 lines of code. So it has that going for it. Notably things this
doesn't support:
- Array alloca merging
- To support the above, bottom-up inlining with careful history
tracking and call graph updates
- DCE of the functions that become dead after this inlining.
- Inlining through call instructions with the always_inline attribute.
Instead, it focuses on inlining functions with that attribute.
The first I've omitted because I'm hoping to just turn it off for the
primary pass manager. If that doesn't pan out, I can add it here but it
will be reasonably expensive to do so.
The second should really be handled by running global-dce after the
inliner. I don't want to re-implement the non-trivial logic necessary to
do comdat-correct DCE of functions. This means the -O0 pipeline will
have to be at least 'always-inline,global-dce', but that seems
reasonable to me. If others are seriously worried about this I'd like to
hear about it and understand why. Again, this is all solveable by
factoring that logic into a utility and calling it here, but I'd like to
wait to do that until there is a clear reason why the existing
pass-based factoring won't work.
The final point is a serious one. I can fairly easily add support for
this, but it seems both costly and a confusing construct for the use
case of the always inliner running at -O0. This attribute can of course
still impact the normal inliner easily (although I find that
a questionable re-use of the same attribute). I've started a discussion
to sort out what semantics we want here and based on that can figure out
if it makes sense ta have this complexity at O0 or not.
One other advantage of this design is that it should be quite a bit
faster due to checking for whether the function is a viable candidate
for inlining exactly once per function instead of doing it for each call
site.
Anyways, hopefully a reasonable starting point for this pass.
Differential Revision: https://reviews.llvm.org/D23299
llvm-svn: 278896
Summary:
Port the ModuleSummaryAnalysisWrapperPass to the new pass manager.
Use it in the ported BitcodeWriterPass (similar to how we use the
legacy ModuleSummaryAnalysisWrapperPass in the legacy WriteBitcodePass).
Also, pass the -module-summary opt flag through to the new pass
manager pipeline and through to the bitcode writer pass, and add
a test that uses it.
Reviewers: mehdi_amini
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23439
llvm-svn: 278508
Summary:
Having -O0 in opt allows testing that -O0 optimization
pipeline is built correctly.
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23208
llvm-svn: 277829
This adds boilerplate code for all coroutine passes,
the passes are no-ops for now.
Also, a small test has been added to verify that passes execute in
the expected order or not at all if coroutine support is disabled.
Patch by Gor Nishanov!
Differential Revision: https://reviews.llvm.org/D22847
llvm-svn: 277033
Summary:
This is the first set of changes implementing the RFC from
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334
This is a cross-sectional patch; rather than implementing the hotness
attribute for all optimization remarks and all passes in a patch set, it
implements it for the 'missed-optimization' remark for Loop
Distribution. My goal is to shake out the design issues before scaling
it up to other types and passes.
Hotness is computed as an integer as the multiplication of the block
frequency with the function entry count. It's only printed in opt
currently since clang prints the diagnostic fields directly. E.g.:
remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)
A new API added is similar to emitOptimizationRemarkMissed. The
difference is that it additionally takes a code region that the
diagnostic corresponds to. From this, hotness is computed using BFI.
The new API is exposed via an analysis pass so that it can be made
dependent on LazyBFI. (Thanks to Hal for the analysis pass idea.)
This feature can all be enabled by setDiagnosticHotnessRequested in the
LLVM context. If this is off, LazyBFI is not calculated (D22141) so
there should be no overhead.
A new command-line option is added to turn this on in opt.
My plan is to switch all user of emitOptimizationRemark* to use this
module instead.
Reviewers: hfinkel
Subscribers: rcox2, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D21771
llvm-svn: 275583
Previously, there was a discrepancy between the population of function
passes in FPasses, and their invocation. Function passes specified on
the command line, after an optimizaton level was simply discared. This
fix PR27509.
Patch by Jesper Antonsson.
Differential Review: http://reviews.llvm.org/D20725
llvm-svn: 272770
looking for it along $PATH. This allows installs of LLVM tools outside of
$PATH to find the symbolizer and produce pretty backtraces if they crash.
llvm-svn: 272232
Having an enum member named Default is quite confusing: Is it distinct
from the others?
This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.
llvm-svn: 269988
Summary:
This is a hook to allow TargetMachine to install passes at the
EP_EarlyAsPossible PassManagerBuilder extension point.
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18614
llvm-svn: 267763
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.
LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.
Differential Revision: http://reviews.llvm.org/D18367
llvm-svn: 267223
Summary:
This is a follow-on to apply Duncan's new DIType ODR uniquing from
r266549 and r266713 in more places.
Enable enableDebugTypeODRUniquing() for ThinLTO backends invoked via
libLTO, similar to the way r266549 enabled this for ThinLTO backend
threads launched from gold-plugin.
Also enable enableDebugTypeODRUniquing in opt, similar to the way
r266549 enabled this for llvm-link (on by default, can be disabled with
new -disable-debug-info-type-map option), since we may perform ThinLTO
importing from opt.
Reviewers: dexonsmith, joker.eph
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19263
llvm-svn: 266746
The fast register-allocator cannot cope with inter-block dependencies without
spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions
where any value produced within a block is dead by the end, but not for
cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded
after regalloc.
Fortunately this is at -O0 so we don't have to care about performance. This
simplifies the various axes of expansion considerably: we assume a strong
seq_cst operation and ensure ordering via the always-present DMB instructions
rather than v8 acquire/release instructions.
Should fix the 32-bit part of PR25526.
llvm-svn: 266679
At the same time, fixes InstructionsTest::CastInst unittest: yes
you can leave the IR in an invalid state and exit when you don't
destroy the context (like the global one), no longer now.
This is the first part of http://reviews.llvm.org/D19094
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266379
Summary:
Let keep llvm-as "dumb": it converts textual IR to bitcode. This
commit removes the dependency from llvm-as to libLLVMAnalysis.
We'll add back summary in llvm-as if we get to a textual
representation for it at some point. In the meantime, opt seems
like a better place for that.
Reviewers: tejohnson
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19032
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266131
opt adds Verifier passes in AddOptimizationPasses even if
-disable-verify is on. Fix it so that the extra verification occurs
either when (1) -disable-verifier is off, or (2) -verify-each is on.
Thanks to David Jones for pointing out this behavior!
llvm-svn: 263090
Summary:
This is intended to be a performance flag, on the same level as clang
cc1 option "--disable-free". LLVM will never initialize it by default,
it will be up to the client creating the LLVMContext to request this
behavior. Clang will do it by default in Release build (just like
--disable-free).
"opt" and "llc" can opt-in using -disable-named-value command line
option.
When performing LTO on llvm-tblgen, the initial merging of IR peaks
at 92MB without this patch, and 86MB after this patch,setNameImpl()
drops from 6.5MB to 0.5MB.
The total link time goes from ~29.5s to ~27.8s.
Compared to a compile-time flag (like the IRBuilder one), it performs
very close. I profiled on SROA and obtain these results:
420ms with IRBuilder that preserve name
372ms with IRBuilder that strip name
375ms with IRBuilder that preserve name, and a runtime flag to strip
Reviewers: chandlerc, dexonsmith, bogner
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D17946
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263086
Cloning the module was supposed to guard against the possibility
that the passes may be non-idempotent. However, for some reason
I decided to put that AFTER the passes had already run on the
module, defeating the point entirely. Fix that by moving up the
CloneModule as is done in llc.
llvm-svn: 254819
`Out` can be null if no output is requested, so move any access
to it inside the conditional. Thanks to Justin Bogner for finding
this.
llvm-svn: 254804
Summary: Lately, I have submitted a number of patches to fix bugs that
only occurred when using the same pass manager to compile multiple
modules (generally these bugs are failure to reset some persistent
state). Unfortunately I don't think there is currently a way to test
that from the command line. This adds a very simple flag to both llc
and opt, under which the tools will simply re-run their respective
pass pipelines using the same pass manager on (a clone of the same
module). Additionally, we verify that both outputs are bitwise the
same.
Reviewers: yaron.keren
Subscribers: loladiro, yaron.keren, kcc, llvm-commits
Differential Revision: http://reviews.llvm.org/D14965
llvm-svn: 254774
folding the code into the main Analysis library.
There already wasn't much of a distinction between Analysis and IPA.
A number of the passes in Analysis are actually IPA passes, and there
doesn't seem to be any advantage to separating them.
Moreover, it makes it hard to have interactions between analyses that
are both local and interprocedural. In trying to make the Alias Analysis
infrastructure work with the new pass manager, it becomes particularly
awkward to navigate this split.
I've tried to find all the places where we referenced this, but I may
have missed some. I have also adjusted the C API to continue to be
equivalently functional after this change.
Differential Revision: http://reviews.llvm.org/D12075
llvm-svn: 245318
remove ExecutionEngine's dependence on CodeGen. NFC.
This is a follow-up to r238080.
Differential Revision: http://reviews.llvm.org/D9830
llvm-svn: 238244
This is part of the work to remove TargetMachine::resetTargetOptions.
In this patch, instead of updating global variable NoFramePointerElim in
resetTargetOptions, its use in DisableFramePointerElim is replaced with a call
to TargetFrameLowering::noFramePointerElim. This function determines on a
per-function basis if frame pointer elimination should be disabled.
There is no change in functionality except that cl:opt option "disable-fp-elim"
can now override function attribute "no-frame-pointer-elim".
llvm-svn: 238080
options.
This commit fixes a bug in llc and opt where "-mcpu" and "-mattr" wouldn't
override function attributes "-target-cpu" and "-target-features" in the IR.
Differential Revision: http://reviews.llvm.org/D9537
llvm-svn: 236677
Remove all the global bits to do with preserving use-list order by
moving the `cl::opt`s to the individual tools that want them. There's a
minor functionality change to `libLTO`, in that you can't send in
`-preserve-bc-uselistorder=false`, but making that bit settable (if it's
worth doing) should be through explicit LTO API.
As a drive-by fix, I removed some includes of `UseListOrder.h` that were
made unnecessary by recent commits.
llvm-svn: 234973
Unify the error messages for the various tools when `verifyModule()`
fails on an input module. The "brave new way" is:
lltool: path/to/input.ll: error: input module is broken!
llvm-svn: 233667
Change `llc` and `opt` to run `verifyModule()`. This ensures that we
check the full module before `FunctionPass::doInitialization()` ever
gets called (I was getting crashes in `DwarfDebug` instead of verifier
failures when testing a WIP patch that checks operands of compile
units). In `opt`, also move up debug-info-stripping so that it still
runs before verification.
There was a fair bit of broken code that was sitting in tree.
Interestingly, some were cases of a `select` that referred to itself in
`-instcombine` tests (apparently an intermediate result). I split them
off to `*-noverify.ll` tests with RUN lines like this:
opt < %s -S -disable-verify -instcombine | opt -S | FileCheck %s
This avoids verifying the input file (so we can get the broken code into
`-instcombine), but still verifies the output with a second call to
`opt` (to verify that `-instcombine` will clean it up like it should).
llvm-svn: 233432
Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass.
Instead, call into the `DebugInfoVerifier` from inside
`VerifierLegacyPass::finalizeModule()`. This better matches the logic
in `verifyModule()` (used by the new PassManager), avoids requiring two
separate passes to verify the IR, and makes the API for "add a pass to
verify the IR" simple.
Note: the `-verify-debug-info` flag still works (for now, at least;
eventually it might make sense to just remove it).
llvm-svn: 232772
`StripDebug` was only used by tools/opt/opt.cpp in
`AddStandardLinkPasses()`, but opt.cpp adds the same pass based on its
command-line flag before it calls `AddStandardLinkPasses()`. Stripping
debug info twice isn't very useful.
llvm-svn: 232765
Summary:
DataLayout keeps the string used for its creation.
As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().
Get rid of DataLayoutPass: the DataLayout is in the Module
The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.
Make DataLayout Non-Optional in the Module
Module->getDataLayout() will never returns nullptr anymore.
Reviewers: echristo
Subscribers: resistor, llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D7992
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270