Commit Graph

2850 Commits

Author SHA1 Message Date
Weiming Zhao 8213072a45 [SimplifyLibCalls] Optimization for pow(x, n) where n is some constant
Summary:
    In order to avoid calling pow function we generate repeated fmul when n is a
    positive or negative whole number.
    
    For each exponent we pre-compute Addition Chains in order to minimize the no.
    of fmuls.
    Refer: http://wwwhomes.uni-bielefeld.de/achim/addition_chain.html
    
    We pre-compute addition chains for exponents upto 32 (which results in a max of
    7 fmuls).

    For eg:
    4 = 2+2
    5 = 2+3
    6 = 3+3 and so on
    
    Hence,
    pow(x, 4.0) ==> y = fmul x, x
                    x = fmul y, y
                    ret x

    For negative exponents, we simply compute the reciprocal of the final result.
    
    Note: This transformation is only enabled under fast-math.
    
    Patch by Mandeep Singh Grang <mgrang@codeaurora.org>

Reviewers: weimingz, majnemer, escha, davide, scanon, joerg

Subscribers: probinson, escha, llvm-commits

Differential Revision: http://reviews.llvm.org/D13994

llvm-svn: 254776
2015-12-04 22:00:47 +00:00
David Majnemer 70497c696a Move EH-specific helper functions to a more appropriate place
No functionality change is intended.

llvm-svn: 254562
2015-12-02 23:06:39 +00:00
Rafael Espindola baa3bf8f76 Bring r254336 back:
The difference is that now we don't error on out-of-comdat access to
internal global values. We copy them instead. This seems to match the
expectation of COFF linkers (see pr25686).

Original message:

    Start deciding earlier what to link.

    A traditional linker is roughly split in symbol resolution and
"copying
    stuff".

    The two tasks are badly mixed in lib/Linker.

    This starts splitting them apart.

    With this patch there are no direct call to linkGlobalValueBody or
    linkGlobalValueProto. Everything is linked via WapValue.

    This also includes a few fixes:
    * A GV goes undefined if the comdat is dropped (comdat11.ll).
    * We error if an internal GV goes undefined (comdat13.ll).
    * We don't link an unused comdat.

    The first two match the behavior of an ELF linker. The second one is
    equivalent to running globaldce on the input.

llvm-svn: 254418
2015-12-01 15:19:48 +00:00
Evgeniy Stepanov 42f3b12274 [safestack] Protect byval function arguments.
Detect unsafe byval function arguments and move them to the unsafe
stack.

llvm-svn: 254353
2015-12-01 00:40:05 +00:00
Rafael Espindola e9841a6bb5 This reverts commit r254336 and r254344.
They broke a bot and I am debugging why.

llvm-svn: 254347
2015-11-30 23:54:19 +00:00
Rafael Espindola c109200c53 Start deciding earlier what to link.
A traditional linker is roughly split in symbol resolution and "copying
stuff".

The two tasks are badly mixed in lib/Linker.

This starts splitting them apart.

With this patch there are no direct call to linkGlobalValueBody or
linkGlobalValueProto. Everything is linked via WapValue.

This also includes a few fixes:
* A GV goes undefined if the comdat is dropped (comdat11.ll).
* We error if an internal GV goes undefined (comdat13.ll).
* We don't link an unused comdat.

The first two match the behavior of an ELF linker. The second one is
equivalent to running globaldce on the input.

llvm-svn: 254336
2015-11-30 22:01:43 +00:00
Davide Italiano 1aeed6a955 [SimplifyLibCalls] Transform log(exp2(y)) to y*log(2) under fast-math.
llvm-svn: 254317
2015-11-30 19:36:35 +00:00
Davide Italiano 0b14f29285 [SimplifyLibCalls] Don't crash if the function doesn't have a name.
llvm-svn: 254265
2015-11-29 21:58:56 +00:00
Davide Italiano e2db58cfb8 [SimplifyLibCalls] Cross out implemented transformations.
llvm-svn: 254264
2015-11-29 21:00:43 +00:00
Davide Italiano b8b7133c94 [SimplifyLibCalls] Tranform log(pow(x, y)) -> y*log(x).
This one is enabled only under -ffast-math. There are cases where the
difference between the value computed and the correct value is huge
even for ffast-math, e.g. as Steven pointed out:

x = -1, y = -4
log(pow(-1), 4) = 0
4*log(-1) = NaN

I checked what GCC does and apparently they do the same optimization
(which result in the dramatic difference). Future work might try to
make this (slightly) less worse.

Differential Revision:	http://reviews.llvm.org/D14400

llvm-svn: 254263
2015-11-29 20:58:04 +00:00
Davide Italiano da3beebad1 [SimplifyLibCalls] Use any_of(). Suggested by David Blaikie!
llvm-svn: 254239
2015-11-28 22:27:48 +00:00
Benjamin Kramer 89766e5b1d [SimplifyLibCalls] Fix inverted condition that lead to an uninitialized memory read below.
Found by msan!

llvm-svn: 254238
2015-11-28 21:43:12 +00:00
Rafael Espindola 19b52383c5 Simplify the linking of recursive data.
Now the ValueMapper has two callbacks. The first one maps the
declaration. The ValueMapper records the mapping and then materializes
the body/initializer.

llvm-svn: 254209
2015-11-27 20:28:19 +00:00
Davide Italiano ac0953a2e6 [SimplifyLibCalls] Use range-based loop. NFC.
llvm-svn: 254193
2015-11-27 08:05:40 +00:00
Benjamin Kramer fb419e71f4 [SimplifyLibCalls] Don't depend on a called function having a name, it might be an indirect call.
Fixes the crasher in PR25651 and related crashers using the same pattern.

llvm-svn: 254145
2015-11-26 09:51:17 +00:00
Sanjoy Das c521c7bea5 [OperandBundles] Extract duplicated code into a helper function, NFC
llvm-svn: 254047
2015-11-25 00:42:24 +00:00
Weiming Zhao 45d4cb9a14 [Utils] Put includes in correct order. NFC.
Summary:
    Followed the guidelines in:
    http://llvm.org/docs/CodingStandards.html#include-style
    
    However, I noticed that uppercase named headers come before lowercase ones
    throughout the codebase. So kept them as is.
    
    Patch by Mandeep Singh Grang <mgrang@codeaurora.org>

Reviewers: majnemer, davide, jmolloy, atrick

Subscribers: sanjoy

Differential Revision: http://reviews.llvm.org/D14939

llvm-svn: 254005
2015-11-24 18:57:06 +00:00
Weiming Zhao 8d5c08f591 [SimplifyLibCalls] Removed some TODOs which are already implemented. NFC.
Summary:
D14302 implements tan(atan(x)) -> x
D14045 implements pow(exp(x), y) -> exp(x*y)

Patch by Mandeep Singh Grang <mgrang@codeaurora.org>

Reviewers: majnemer, davide

Differential Revision: http://reviews.llvm.org/D14882

llvm-svn: 253768
2015-11-21 06:10:20 +00:00
Dehao Chen 014fb55711 Fix the debug build breakage that getDiscriminator is called by mistake.
llvm-svn: 253597
2015-11-19 20:29:27 +00:00
Michael Zolotukhin 6c11c04db3 Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible."
The change exposed a bug in IndVarSimplify (PR25578), which led to a
failure (PR25538). When the bug is fixed, this patch can be reapplied.

The tests are kept in tree, as they're useful anyway, and will not break
with this revert.

llvm-svn: 253596
2015-11-19 20:28:32 +00:00
Dehao Chen 23e2278e27 Reimplement discriminator assignment algorithm.
Summary: The new algorithm is more efficient (O(n), n is number of basic blocks). And it is guaranteed to cover all cases of multiple BB mapped to same line.

Reviewers: dblaikie, davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14738

llvm-svn: 253594
2015-11-19 19:53:05 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Davide Italiano c5cedd195a [SimplifyLibCalls] New trick: pow(x, 0.5) -> sqrt(x) under -ffast-math.
Differential Revision:	http://reviews.llvm.org/D14466

llvm-svn: 253521
2015-11-18 23:21:32 +00:00
Davide Italiano 455ea11d13 [BuildLibCalls] EmitStrNLen() is dead code. Garbage collect.
llvm-svn: 253514
2015-11-18 22:29:38 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Igor Laevsky 7310c68e85 Revert "Revert "Strip metadata when speculatively hoisting instructions (r252604)"
Failing clang test is now fixed by the r253458.

llvm-svn: 253459
2015-11-18 14:50:18 +00:00
Sanjoy Das f79d3449c5 [OperandBundles] Tighten OperandBundleDef's interface; NFC
llvm-svn: 253446
2015-11-18 08:30:07 +00:00
Sanjoy Das 2d16145acf Teach the inliner to track deoptimization state
Summary:
This change teaches LLVM's inliner to track and suitably adjust
deoptimization state (tracked via deoptimization operand bundles) as it
inlines through call sites.  The operation is described in more detail
in the LangRef changes.

Reviewers: reames, majnemer, chandlerc, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14552

llvm-svn: 253438
2015-11-18 06:23:38 +00:00
Michael Zolotukhin 927bdba29d [PR25538]: Fix a failure caused by r253126.
In r253126 we stopped to recompute LCSSA after loop unrolling in all
cases, except the unrolling is full and at least one of the loop exits
is outside the parent loop. In other cases the transformation should not
break LCSSA, but it turned out, that we also call SimplifyLoop on the
parent loop, which might break LCSSA by itself. This fix just triggers
LCSSA recomputation in this case as well.

I'm committing it without a test case for now, but I'll try to invent
one. It's a bit tricky because in an isolated test LoopSimplify would
be scheduled before LoopUnroll, and thus will change the test and hide
the problem.

llvm-svn: 253253
2015-11-16 21:17:26 +00:00
Davide Italiano ed5cc95d22 [SimplifyLibCalls] Generalize a comment. This doesn't apply only to sqrt.
llvm-svn: 253224
2015-11-16 16:54:28 +00:00
Pavel Labath 978060ce2f Don't generate discriminators for calls to debug intrinsics
Summary:
This fails a check in Verifier.cpp, which checks for location matches between the declared
variable and the !dbg attachments.

Reviewers: dnovillo, dblaikie, danielcdh

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14657

llvm-svn: 253194
2015-11-16 10:40:38 +00:00
Keno Fischer 2ac0c27001 Also map the personality function in CloneFunctionInto
Summary: The Old personality function gets copied over, but the
Materializer didn't  have a chance to inspect it (e.g. to fix up
references to the correct module for the target function).
Also add a verifier check that makes sure the personality routine
is in the same module as the function whose personality it is.

Reviewers: majnemer

Subscribers: jevinskie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14474

llvm-svn: 253183
2015-11-16 05:13:30 +00:00
Teresa Johnson 83d03ddbf6 Fix mapping of unmaterialized global values during metadata linking
Summary:
The patch to move metadata linking after global value linking didn't
correctly map unmaterialized global values to null as desired. They
were in fact mapped to the source copy. It largely worked by accident
since most module linker clients destroyed the source module which
caused the source GVs to be replaced by null, but caused a failure with
LTO linking on Windows:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312869.html

The problem is that a null return value from materializeValueFor is
handled by mapping the value to self. This is the desired behavior when
materializeValueFor is passed a non-GlobalValue. The problem is how to
distinguish that case from the case where we really do want to map to
null.

This patch addresses this by passing in a new flag to the value mapper
indicating that unmapped global values should be mapped to null. Other
Value types are handled as before.

Note that the documented behavior of asserting on unmapped values when
the flag RF_IgnoreMissingValues isn't set is currently disabled with
FIXME notes due to bootstrap failures. I modified these disabled asserts
so when they are eventually enabled again it won't assert for the
unmapped values when the new RF_NullMapMissingGlobalValues flag is set.

I also considered using a callback into the value materializer, but a
flag seemed cleaner given that there are already existing flags.
I also considered modifying materializeValueFor to return the input
value when we want to map to source and then treat a null return
to mean map to null. However, there are other value materializer
subclasses that implement materializeValueFor, and they would all need
to be audited and the return values possibly changed, which seemed
error-prone.

Reviewers: dexonsmith, joker.eph

Subscribers: pcc, llvm-commits

Differential Revision: http://reviews.llvm.org/D14682

llvm-svn: 253170
2015-11-15 14:50:14 +00:00
Michael Zolotukhin 8ef44f93ca Don't recompute LCSSA after loop-unrolling when possible.
Summary:
Currently we always recompute LCSSA for outer loops after unrolling an
inner loop. That leads to compile time problem when we have big loop
nests, and we can solve it by avoiding unnecessary work. For instance,
if w eonly do partial unrolling, we don't break LCSSA, so we don't need
to rebuild it. Also, if all exits from the inner loop are inside the
enclosing loop, then complete unrolling won't break LCSSA either.

I replaced unconditional LCSSA recomputation with conditional recomputation +
unconditional assert and added several tests, which were failing when I
experimented with it.

Soon I plan to follow up with a similar patch for recalculation of dominators
tree.

Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14526

llvm-svn: 253126
2015-11-14 05:51:41 +00:00
Davide Italiano b883b01a8e [SimplifyLibCalls] Make a function shorter. NFC.
llvm-svn: 252970
2015-11-12 23:39:00 +00:00
Diego Novillo 0354a9f67b SamplePGO - Fix PR 25482 - Do not rely on llvm.dbg.cu for discriminators
The discriminators pass relied on the presence of llvm.dbg.cu to decide
whether to add discriminators, but this fails in the case where debug
info is only enabled partially when -fprofile-sample-use is active.

The reason llvm.dbg.cu is not present in these cases is to prevent
codegen from emitting debug info (as it is only used for the sample
profile pass).

This changes the discriminators pass to also emit discriminators even
when debug info is not being emitted.

llvm-svn: 252763
2015-11-11 17:54:37 +00:00
Renato Golin 0e77d72b0a Revert "Strip metadata when speculatively hoisting instructions"
This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as
well as some x86, et al.

llvm-svn: 252623
2015-11-10 18:01:16 +00:00
Igor Laevsky 01c3692a10 Strip metadata when speculatively hoisting instructions
This is fix for PR24059.

When we are hoisting instruction above some condition it may turn out
that metadata on this instruction was control dependant on the condition.
This metadata becomes invalid and we need to drop it.

This patch should cover most obvious places of speculative execution (which
I have found by greping isSafeToSpeculativelyExecute). I think there are more
cases but at least this change covers the severe ones.

Differential Revision: http://reviews.llvm.org/D14398

llvm-svn: 252604
2015-11-10 14:10:31 +00:00
Dehao Chen 3656e3064b Add discriminators for call instructions that are from the same line and same basic block.
Summary: Call instructions that are from the same line and same basic block needs to have separate discriminators to distinguish between different callsites.

Reviewers: davidxl, dnovillo, dblaikie

Subscribers: dblaikie, probinson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14464

llvm-svn: 252492
2015-11-09 17:30:38 +00:00
Silviu Baranga 2910a4f6b1 Allow LLE/LD and the loop versioning infrastructure to use SCEV predicates
Summary:
LAA currently generates a set of SCEV predicates that must be checked by users.
In the case of Loop Distribute/Loop Load Elimination, no such predicates could have
been emitted, since we don't allow stride versioning. However, in the future there
could be SCEV predicates that will need to be checked.

This change adds support for SCEV predicate versioning in the Loop Distribute, Loop
Load Eliminate and the loop versioning infrastructure.

Reviewers: anemet

Subscribers: mssimpso, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D14240

llvm-svn: 252467
2015-11-09 13:26:09 +00:00
Duncan P. N. Exon Smith 83c4b68720 ADT: Remove last implicit ilist iterator conversions, NFC
Some implicit ilist iterator conversions have crept back into Analysis,
Transforms, Hexagon, and llvm-stress.  This removes them.

I'll commit a patch immediately after this to disallow them (in a
separate patch so that it's easy to revert if necessary).

llvm-svn: 252371
2015-11-07 00:01:16 +00:00
Davide Italiano d9f87b4642 [SimplifyLibCalls] Don't hardcode the function name.
llvm-svn: 252342
2015-11-06 21:05:07 +00:00
Sanjoy Das 55ea67cea7 [ValueTracking] Add parameters to isImpliedCondition; NFC
Summary:
This change makes the `isImpliedCondition` interface similar to the rest
of the functions in ValueTracking (in that it takes a DataLayout,
AssumptionCache etc.).  This is an NFC, intended to make a later diff
less noisy.

Depends on D14369

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14391

llvm-svn: 252333
2015-11-06 19:01:08 +00:00
Peter Collingbourne d4bff30370 DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

llvm-svn: 252219
2015-11-05 22:03:56 +00:00
Davide Italiano a345877ce8 [SimplifyLibCalls] Use hasFloatVersion(). NFCI.
llvm-svn: 252186
2015-11-05 19:18:23 +00:00
James Molloy 9e959ac397 [SimplifyCFG] Tweak heuristic for merging conditional stores
We were correctly skipping dbginfo intrinsics and terminators, but the initial bailout wasn't, causing it to bail out on almost any block.

llvm-svn: 252152
2015-11-05 08:40:19 +00:00
Davide Italiano 51507d2ad8 [SimplifyLibCalls] New transformation: tan(atan(x)) -> x
This is enabled only under -ffast-math.
So, instead of emitting:
  4007b0:       50                      push   %rax
  4007b1:       e8 8a fd ff ff          callq  400540 <atanf@plt>
  4007b6:       58                      pop    %rax
  4007b7:       e9 94 fd ff ff          jmpq   400550 <tanf@plt>
  4007bc:       0f 1f 40 00             nopl   0x0(%rax)

for:
float mytan(float x) {
  return tanf(atanf(x));
}
we emit a single retq.

Differential Revision:	 http://reviews.llvm.org/D14302

llvm-svn: 252098
2015-11-04 23:36:56 +00:00
James Molloy 4de84ddec9 [SimplifyCFG] Merge conditional stores
We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code:

  if (c & flag1)
    *a = x;
  if (c & flag2)
    *a = y;
  ...

There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to.

It is, however, legal to merge the stores together and do the store once:

  tmp = undef;
  if (c & flag1)
    tmp = x;
  if (c & flag2)
    tmp = y;
  if (c & flag1 || c & flag2)
    *a = tmp;

The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too.

This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure.

The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertable, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate.

This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite.

llvm-svn: 252051
2015-11-04 15:28:04 +00:00
Davide Italiano c8a7913f23 [SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y)
This one is enabled only under -ffast-math (due to rounding/overflows)
but allows us to emit shorter code.

Before (on FreeBSD x86-64):
4007f0:       50                      push   %rax
4007f1:       f2 0f 11 0c 24          movsd  %xmm1,(%rsp)
4007f6:       e8 75 fd ff ff          callq  400570 <exp2@plt>
4007fb:       f2 0f 10 0c 24          movsd  (%rsp),%xmm1
400800:       58                      pop    %rax
400801:       e9 7a fd ff ff          jmpq   400580 <pow@plt>
400806:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
40080d:       00 00 00

After:
4007b0:       f2 0f 59 c1             mulsd  %xmm1,%xmm0
4007b4:       e9 87 fd ff ff          jmpq   400540 <exp2@plt>
4007b9:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

Differential Revision:	http://reviews.llvm.org/D14045

llvm-svn: 251976
2015-11-03 20:32:23 +00:00
Davide Italiano b7487e6b8d [SimplifyLibCalls] Remove variables that are not used. NFC.
llvm-svn: 251852
2015-11-02 23:07:14 +00:00