Commit Graph

332 Commits

Author SHA1 Message Date
Arthur Eubanks 07a9df5993 [NFC] Use getParamByValType instead of pointee type
To reduce dependence on pointee types for opaque pointers.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101706
2021-05-01 21:22:41 -07:00
Roman Lebedev ba5b015b0d
[InlineCost] CallAnalyzer: use TTI info for extractvalue - they are free (PR50099)
It seems incorrect to use TTI data in some places,
and override it in others. In this case, TTI says
that `extractvalue` are free, yet we bill them.

While this doesn't address https://bugs.llvm.org/show_bug.cgi?id=50099 yet,
it reduces the cost from 55 to 50 while the threshold is 45.

Differential Revision: https://reviews.llvm.org/D101228
2021-04-30 13:55:11 +03:00
Arthur Eubanks a3a798d49d [InlineCost] Remove visitUnaryInstruction()
The simplifyInstruction() in visitUnaryInstruction() does not trigger
for all of check-llvm. Looking at all delegates to UnaryInstruction in
InstVisitor, the only instructions that either don't have a visitor in
CallAnalyzer, or redirect to UnaryInstruction, are VAArgInst and Alloca.
VAArgInst will never get simplified, and visitUnaryInstruction(Alloca)
would always return false anyway.

Reviewed By: mtrofin, lebedev.ri

Differential Revision: https://reviews.llvm.org/D101577
2021-04-29 20:33:30 -07:00
Arthur Eubanks 269b335bd7 [Inliner] Propagate SROA analysis through invariant group intrinsics
SROA can handle invariant group intrinsics, let the inliner know that
for better heuristics when the intrinsics are present.

This fixes size issues in a couple files when turning on
-fstrict-vtable-pointers in Chrome.

Reviewed By: rnk, mtrofin

Differential Revision: https://reviews.llvm.org/D100249
2021-04-12 10:54:22 -07:00
Liqiang Tao d2d6720a93 [InlineCost] Remove TODO comment that consider other forms of savings in the cost-benefit analysis
Attempts to compute savings more accurately cannot impact the set of critically important call sites.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D98577
2021-03-31 20:11:32 +08:00
Sander de Smalen 3ccbd4f3c7 NFC: Change getUserCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Depends on D97382

Reviewed By: ctetreau, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D97466
2021-03-31 10:13:09 +01:00
Kazu Hirata 9d375a40c3 Reapply [InlineCost] Enable the cost benefit analysis on FDO
This patch enables the cost-benefit-analysis-based inliner by default
if we have instrumentation profile.

- SPEC CPU 2017 shows a 0.4% improvement.

- An internal large benchmark shows a 0.9% reduction in the cycle
  count along with 14.6% reduction in the number of call instructions
  executed.

Differential Revision: https://reviews.llvm.org/D98213
2021-03-25 21:51:38 -07:00
Kazu Hirata 3c775d93a1 [InlineCost] Reject a zero entry count
This patch teaches the cost-benefit-analysis-based inliner to reject a
zero entry count so that we don't trigger a divide-by-zero.
2021-03-25 21:51:36 -07:00
Nico Weber a60ffee3f4 Revert "[InlineCost] Enable the cost benefit analysis on FDO"
This reverts commit ef69aa961d.
Makes clang assert in PGO builds, see repro tgz in
https://bugs.chromium.org/p/chromium/issues/detail?id=1192783#c6
2021-03-25 16:42:19 -04:00
Wenlei He 6869e6c1e7 [InlineCost] Make cost-benefit decision explicit
With cost-benefit analysis for inlining, we bypass the cost-threshold by returning inline result from call analyzer early.

However the cost and threshold are still available from call analyzer, and when cost is actually higher than threshold, we incorrect set the reason.

The change makes the decision from cost-benefit analysis explicit. It's mostly NFC, except that it allows the priority-based sample loader inliner used by CSSPGO to use cost-benefit heuristic.

Differential Revision: https://reviews.llvm.org/D99302
2021-03-24 16:10:58 -07:00
Kazu Hirata ef69aa961d [InlineCost] Enable the cost benefit analysis on FDO
This patch enables the cost-benefit-analysis-based inliner by default
if we have instrumentation profile.

- SPEC CPU 2017 shows a 0.4% improvement.

- An internal large benchmark shows a 0.9% reduction in the cycle
  count along with 14.6% reduction in the number of call instructions
  executed.

Differential Revision: https://reviews.llvm.org/D98213
2021-03-24 15:36:49 -07:00
Stanislav Mekhanoshin b7b99b0799 [AMDGPU] Fix -amdgpu-inline-arg-alloca-cost
Before D94153 this threshold was in a pre-scaled units.
After D94153 inlining threshold multiplier is not applied
to this portion of the threshold anymore. Restore the
threshold by applying the multiplier.

Differential Revision: https://reviews.llvm.org/D98362
2021-03-12 10:19:50 -08:00
Arthur Eubanks a11bf9a7fb [AMDGPU][Inliner] Remove amdgpu-inline and add a new TTI inline hook
Having a custom inliner doesn't really fit in with the new PM's
pipeline. It's also extra technical debt.

amdgpu-inline only does a couple of custom things compared to the normal
inliner:
1) It disables inlining if the number of BBs in a function would exceed
   some limit
2) It increases the threshold if there are pointers to private arrays(?)

These can all be handled as TTI inliner hooks.
There already exists a hook for backends to multiply the inlining
threshold.

This way we can remove the custom amdgpu-inline pass.

This caused inline-hint.ll to fail, and after some investigation, it
looks like getInliningThresholdMultiplier() was previously getting
applied twice in amdgpu-inline (https://reviews.llvm.org/D62707 fixed it
not applying at all, so some later inliner change must have fixed
something), so I had to change the threshold in the test.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D94153
2021-01-21 20:29:17 -08:00
Kazu Hirata 65cd3cbb3f [Inliner] Compute the full cost for the cost benefit analsysis
This patch teaches the inliner to compute the full cost for a call
site where the newly introduced cost benefit analysis is enabled.

Note that the cost benefit analysis requires the full cost to be
computed.  However, without this patch or the -inline-cost-full
option, the early termination logic would kick in when the cost
exceeds the threshold, so we don't get to perform the cost benefit
analysis.  For this reason, we would need to specify four clang
options:

  -mllvm -inline-cost-full
  -mllvm -inline-enable-cost-benefit-analysis

This patch eliminates the need to specify -inline-cost-full.

Differential Revision: https://reviews.llvm.org/D93658
2021-01-05 12:48:49 -08:00
Kazu Hirata 9895c7012d [InlineCost] Implement cost-benefit-based inliner
This patch adds an alternative cost metric for the inliner to take
into account both the cost (i.e. size) and cycle count savings into
account.

Without this patch, we decide to inline a given call site if the size
of inlining the call site is below the threshold that is computed
according to the hotness of the call site.

This patch adds a new cost metric, turned off by default, to take over
the handling of hot call sites.  Specifically, with the new cost
metric, we decide to inline a given call site if the ratio of cycle
savings to size exceeds a threshold.  The cycle savings are computed
from call site costs, parameter propagation, folded conditional
branches, etc, all weighted by their respective profile counts.  The
size is primarily the callee size, but we subtract call site costs and
the size of basic blocks that are never executed.

The new cost metric implicitly takes advantage of the machine function
splitter recently introduced by Snehasish Kumar, which dramatically
reduces the cost of duplicating (e.g. inlining) cold basic blocks by
placing cold basic blocks of hot functions in the .text.split
section.

We evaluated the new cost metric on clang bootstrap and SPECInt 2017.

For clang bootstrap, we observe 0.69% runtime improvement.

For SPECInt we report the change in IntRate the C/C++ benchmarks.  All
benchmarks apart from perlbench and omnetpp improve, on average by
0.21% with the max for mcf at 1.96%.

Benchmark               % Change
500.perlbench_r         -0.45
502.gcc_r                0.13
505.mcf_r                1.96
520.omnetpp_r           -0.28
523.xalancbmk_r          0.49
525.x264_r               0.00
531.deepsjeng_r          0.00
541.leela_r              0.35
557.xz_r                 0.21

Differential Revision: https://reviews.llvm.org/D92780
2020-12-18 00:37:24 -08:00
Xun Li 31e60b9133 [coroutine] should disable inline before calling coro split
This is a rework of D85812, which didn't land.
When callee coroutine function is inlined into caller coroutine function before coro-split pass, llvm will emits "coroutine should have exactly one defining @llvm.coro.begin". It seems that coro-early pass can not handle this quiet well.
So we believe that unsplited coroutine function should not be inlined.
This patch fix such issue by not inlining function if it has attribute "coroutine.presplit" (it means the function has not been splited) to fix this issue
test plan: check-llvm, check-clang

In D85812, there was suggestions on moving the macros to Attributes.td to avoid circular header dependency issue.
I believe it's not worth doing just to be able to use one constant string in one place.
Today, there are already 3 possible attribute values for "coroutine.presplit": c6543cc6b8/llvm/lib/Transforms/Coroutines/CoroInternal.h (L40-L42)
If we move them into Attributes.td, we would be adding 3 new attributes to EnumAttr, just to support this, which I think is an overkill.

Instead, I think the best way to do this is to add an API in Function class that checks whether this function is a coroutine, by checking the attribute by name directly.

Differential Revision: https://reviews.llvm.org/D92706
2020-12-08 08:53:08 -08:00
Nick Desaulniers bc044a88ee [Inline] prevent inlining on stack protector mismatch
It's common for code that manipulates the stack via inline assembly or
that has to set up its own stack canary (such as the Linux kernel) would
like to avoid stack protectors in certain functions. In this case, we've
been bitten by numerous bugs where a callee with a stack protector is
inlined into an attribute((no_stack_protector)) caller, which
generally breaks the caller's assumptions about not having a stack
protector. LTO exacerbates the issue.

While developers can avoid this by putting all no_stack_protector
functions in one translation unit together and compiling those with
-fno-stack-protector, it's generally not very ergonomic or as
ergonomic as a function attribute, and still doesn't work for LTO. See also:
https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/
https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u

SSP attributes can be ordered by strength. Weakest to strongest, they
are: ssp, sspstrong, sspreq.  Callees with differing SSP attributes may be
inlined into each other, and the strongest attribute will be applied to the
caller. (No change)

After this change:
* A callee with no SSP attributes will no longer be inlined into a
  caller with SSP attributes.
* The reverse is also true: a callee with an SSP attribute will not be
  inlined into a caller with no SSP attributes.
* The alwaysinline attribute overrides these rules.

Functions that get synthesized by the compiler may not get inlined as a
result if they are not created with the same stack protector function
attribute as their callers.

Alternative approach to https://reviews.llvm.org/D87956.

Fixes pr/47479.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed By: rnk, MaskRay

Differential Revision: https://reviews.llvm.org/D91816
2020-12-02 11:00:16 -08:00
Nick Desaulniers 91aff1d8ba [InlineCost] prefer range-for. NFC
Prefer range-for over iterators when such methods exist. Precommitted
from https://reviews.llvm.org/D91816.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D92350
2020-11-30 16:07:40 -08:00
Kazu Hirata 60e749aa23 [InlineCost] Fix indentation (NFC) 2020-11-26 18:00:55 -08:00
Mikael Holmen faf848ac32 [Inline] Fix in handling of ptrtoint in InlineCost
ConstantOffsetPtrs contains mappings from a Value to a base pointer and
an offset. The offset is typed and has a size, and at least when dealing
with ptrtoint, it could happen that we had a mapping from a ptrtoint
with type i32 to an offset with type i16. This could later cause
problems, showing up in PR 47969 and PR 38500.

In PR 47969 we ended up in an assert complaining that trunc i16 to i16
is invalid and in Pr 38500 that a cmp on an i32 and i16 value isn't
valid.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D90610
2020-11-23 14:33:06 +01:00
Hongtao Yu f3c445697d [CSSPGO] IR intrinsic for pseudo-probe block instrumentation
This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.

A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues:

1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality.
2. The counter atomics may not be fully cleaned up from the code stream eventually.
3. Extra work is needed for re-targeting.

We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flags given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but is does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality.

Let's now look at an example. Given the following LLVM IR:

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
   br i1 %cmp, label %bb1, label %bb2
bb1:
   br label %bb3
bb2:
   br label %bb3
bb3:
   ret void
}
```

The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID.

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
   %cmp = icmp eq i32 %x, 0
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
   br i1 %cmp, label %bb1, label %bb2
bb1:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
   br label %bb3
bb2:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
   br label %bb3
bb3:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
   ret void
}

```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D86490
2020-11-20 10:39:24 -08:00
Cullen Rhodes 2ccde3c96b [InlineCost] Fix scalable vectors in visitAlloca
Discovered as part of the VLS type work (see D85128).

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85848
2020-08-17 10:34:27 +00:00
Kirill Naumov d48c7859fb [InlineCost] GetElementPtr with constant operands
If the GEP instruction contanins only constants as its arguments,
then it should be recognized as a constant. For now, there was
also added a flag to turn off this simplification if it causes
any regressions ("disable-gep-const-evaluation") which is off
by default. Once I gather needed data of the effectiveness of
this simplification, the flag will be deleted.

Reviewers: apilipenko, davidxl, mtrofin

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81026
2020-06-25 18:09:51 +00:00
Amara Emerson 090c108d04 Don't inline dynamic allocas that simplify to huge static allocas.
Some sequences of optimizations can generate call sites which may never be
executed during runtime, and through constant propagation result in dynamic
allocas being converted to static allocas with very large allocation amounts.

The inliner tries to move these to the caller's entry block, resulting in the
stack limits being reached/bypassed. Avoid inlining functions if this would
result.

The threshold of 64k currently doesn't get triggered on the test suite with an
-Os LTO build on arm64, care should be taken in changing this in future to avoid
needlessly pessimising inlining behaviour.

Differential Revision: https://reviews.llvm.org/D81765
2020-06-24 17:39:03 -07:00
Kirill Naumov 7f094f7f9d [InlineCost] PrinterPass prints constants to which instructions are simplified
This patch enables printing of constants to see which instructions were
constant-folded. Needed for tests and better visiual analysis of
inliner's work.

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81024
2020-06-24 22:52:31 +00:00
Kirill Naumov 6a5d7d498c [InlineCost] InlineCostAnnotationWriterPass introduced
This class allows to see the inliner's decisions for better
optimization verifications and tests. To use, use flag
"-passes="print<inline-cost>"".

This is the second attempt to integrate the patch.
The problem from the first try has been discussed and
fixed in D82205.

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential revision: https://reviews.llvm.org/D81743
2020-06-24 21:27:07 +00:00
Kirill Naumov ca899bf90a [InlineCost] Added InlineCostCallAnalyzer::print()
For the upcoming changes, we need to have an ability to dump
InlineCostCallAnalyzer info in non-debug builds as well.

Reviewed-By: mtrofin
Differential Revision: https://reviews.llvm.org/D82205
2020-06-24 20:07:27 +00:00
Eric Christopher 10563e16aa [Analysis/Transforms/Sanitizers] As part of using inclusive language
within the llvm project, migrate away from the use of blacklist and
whitelist.
2020-06-20 00:42:26 -07:00
Kirill Naumov ea844c7520 Revert "[InlineCost] InlineCostAnnotationWriterPass introduced"
This reverts commit 37e06e8f5c.
2020-06-17 14:02:34 +00:00
Kirill Naumov dcf2a9f2ee Revert "[InlineCost] PrinterPass prints constants to which instructions are simplified"
This reverts commit 52b0db22f8.
2020-06-17 14:02:29 +00:00
Kirill Naumov 39a4505e34 Revert "[InlineCost] GetElementPtr with constant operands"
This reverts commit 34fba68d80.
2020-06-17 14:02:18 +00:00
Kirill Naumov 34fba68d80 [InlineCost] GetElementPtr with constant operands
If the GEP instruction contanins only constants as its arguments,
then it should be recognized as a constant. For now, there was
also added a flag to turn off this simplification if it causes
any regressions ("disable-gep-const-evaluation") which is off
by default. Once I gather needed data of the effectiveness of
this simplification, the flag will be deleted.

Reviewers: apilipenko, davidxl, mtrofin

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81026
2020-06-17 13:40:19 +00:00
Kirill Naumov 52b0db22f8 [InlineCost] PrinterPass prints constants to which instructions are simplified
This patch enables printing of constants to see which instructions were
constant-folded. Needed for tests and better visiual analysis of
inliner's work.

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81024
2020-06-17 13:40:18 +00:00
Kirill Naumov 37e06e8f5c [InlineCost] InlineCostAnnotationWriterPass introduced
This class allows to see the inliner's decisions for better
optimization verifications and tests. To use, use flag
"-passes="print<inline-cost>"".

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential revision: https://reviews.llvm.org/D81743
2020-06-17 13:40:17 +00:00
Kirill Naumov 1022b5eb5b [InlineCost] Preparational patch for creation of Printer pass.
- Renaming the printer class, flag
- Refactoring
- Changing some tests

This patch is a preparational stage for introducing a new printing pass and new
functionality to the existing Annotation Writer. I plan to extend
this functionality for this tool to be more useful when looking at the inline
process.
2020-06-11 22:29:03 +00:00
Kazu Hirata 347a599e5f [Inlining] Introduce -enable-npm-pgo-inline-deferral
Summary:
Experiments show that inline deferral past pre-inlining slightly
pessimizes the performance.

This patch introduces an option to control inline deferral during PGO.
The option defaults to true for now (that is, NFC).

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80776
2020-06-04 00:40:58 -07:00
Mircea Trofin 08e2386dee Revert "Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs""
This reverts commit 454de99a6f.

The problem was that one of the ctor arguments of CallAnalyzer was left
to be const std::function<>&. A function_ref was passed for it, and then
the ctor stored the value in a function_ref field. So a std::function<>
would be created as a temporary, and not survive past the ctor
invocation, while the field would.

Tested locally by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild

Original Differential Revision: https://reviews.llvm.org/D79917
2020-05-15 12:29:16 -07:00
Mircea Trofin 454de99a6f Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs"
This reverts commit 767db5be67.
2020-05-14 22:32:44 -07:00
Mircea Trofin 767db5be67 [llvm][NFC] Cleanup uses of std::function in Inlining-related APIs
Summary:
Replacing uses of std::function pointers or refs, or Optional, to
function_ref, since the usage pattern allows that. If the function is
optional, using a default parameter value (nullptr). This led to a few
parameter reshufles, to push all optionals to the end of the parameter
list.

Reviewers: davidxl, dblaikie

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79917
2020-05-14 22:13:53 -07:00
Kirill Naumov 0383253cdf [InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot
The assert checks that every instruction must be annotated by this point while it is not
necessary. If the inlining process was interrupted because the threshold was reached, the rest
of the instructions would not be annotated which triggers the assert.
The added test shows the situation in which it can happen.
This is a recommit as the original commit fail due to the absence of REQUIRES: assert in the test.

Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D79107
2020-04-30 15:38:36 +00:00
Kirill Naumov 0fa793e798 Revert "[InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot"
This reverts commit 66947d05fd.
2020-04-29 22:00:51 +00:00
Kirill Naumov 66947d05fd [InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot
The assert checks that every instruction must be annotated by this point while it is not
necessary. If the inlining process was interrupted because the threshold was reached, the rest
of the instructions would not be annotated which triggers the assert.
The added test shows the situation in which it can happen.

Reviewed-By: mtrofin
Diff: https://reviews.llvm.org/D79107
2020-04-29 20:44:10 +00:00
Sam Parker e9c9329aa4 [TTI] Add TargetCostKind argument to getUserCost
There are several different types of cost that TTI tries to provide
explicit information for: throughput, latency, code size along with
a vague 'intersection of code-size cost and execution cost'.

The vectorizer is a keen user of RecipThroughput and there's at least
'getInstructionThroughput' and 'getArithmeticInstrCost' designed to
help with this cost. The latency cost has a single use and a single
implementation. The intersection cost appears to cover most of the
rest of the API.

getUserCost is explicitly called from within TTI when the user has
been explicit in wanting the code size (also only one use) as well
as a few passes which are concerned with a mixture of size and/or
a relative cost. In many cases these costs are closely related, such
as when multiple instructions are required, but one evident diverging
cost in this function is for div/rem.

This patch adds an argument so that the cost required is explicit,
so that we can make the important distinction when necessary.

Differential Revision: https://reviews.llvm.org/D78635
2020-04-28 08:57:45 +01:00
Mircea Trofin 011a07c075 Fix missing namespace in API implementation. 2020-04-27 21:05:33 -07:00
Mircea Trofin 8a4013ed38 [llvm][NFC] Add an explicit 'ComputeFullInlineCost' API
Summary:
Added getInliningCostEstimate, which is essentially what getInlineCost
computes if passed default inlining params, and  non-null ORE or
InlineParams::ComputeFullInlineCost.

Reviewers: davidxl, eraman, jdoerfert

Subscribers: hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78730
2020-04-27 09:11:45 -07:00
Mircea Trofin 201498c6f3 [llvm][NFC] Factor out cost-model independent inling decision
Summary:
llvm::getInlineCost starts off by determining whether inlining should
happen or not because of user directives or easily determinable
unviability. This CL refactors this functionality as a reusable API.

Reviewers: davidxl, eraman

Reviewed By: davidxl, eraman

Subscribers: hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73825
2020-04-23 10:58:43 -07:00
Mircea Trofin ec73ae11a3 [llvm][NFC][CallSite] Remove CallSite from ProfileSummary
Summary: Depends on D78395.

Reviewers: craig.topper, dblaikie, wmi, davidxl

Subscribers: eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78414
2020-04-18 12:03:14 -07:00
Mehdi Amini ed03d9485e Revert "[TLI] Per-function fveclib for math library used for vectorization"
This reverts commit 60c642e74b.

This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.

Differential Revision: https://reviews.llvm.org/D77925
2020-04-11 01:05:01 +00:00
Wenlei He 60c642e74b [TLI] Per-function fveclib for math library used for vectorization
Summary:
Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads
the attributes and select available vector function list based on that. Now we also populate function list for all supported vector
libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without
duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now.

Subscribers: hiraditya, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77632
2020-04-09 18:26:38 -07:00
Mircea Trofin b4924f01a4 [llvm][nfc] InstructionCostDetail encapsulation
Ensured initialized fields; encapsulad delta calulations and evaluation
of threshold having had changed; assertion for CostThresholdMap
dereference, to indicate design intent.

Differential Revision: https://reviews.llvm.org/D77762
2020-04-09 08:21:18 -07:00