Commit Graph

25579 Commits

Author SHA1 Message Date
Stefanos Baziotis a3345300b6 [LCSSA] Doc for special treatment of PHIs
Differential Revision: https://reviews.llvm.org/D89739
2020-10-29 22:50:07 +02:00
Nikita Popov 20b386aae0 [LoopUtils] Fix neutral value for vector.reduce.fadd
Use -0.0 instead of 0.0 as the start value. The previous use of 0.0
was fine for all existing uses of this function though, as it is
always generated with fast flags right now, and thus nsz.
2020-10-29 21:45:13 +01:00
Florian Hahn 1922570489 [SLP] Consider alternatives for cost of select instructions.
Some architectures do not have general vector select instructions (e.g.
AArch64). But some cmp/select patterns can be vectorized using other
instructions/intrinsics.

One example is using min/max instructions for certain patterns.

This patch updates the cost calculations for selects in the SLP
vectorizer to consider using min/max intrinsics.

This patch does not change SLP vectorizer's codegen itself to actually
generate those intrinsics, but relies on the backends to lower the
vector cmps & selects. This keeps things simple on the SLP side and
works well in practice for AArch64.

This exposes additional SLP vectorization opportunities in some
benchmarks on AArch64 (-O3 -flto).

Metric: SLP.NumVectorInstructions

Program                                        base    slp     diff
 test-suite...ications/JM/ldecod/ldecod.test   502.00  697.00  38.8%
 test-suite...ications/JM/lencod/lencod.test   1023.00 1414.00 38.2%
 test-suite...-typeset/consumer-typeset.test    56.00   65.00  16.1%
 test-suite...6/464.h264ref/464.h264ref.test   804.00  822.00   2.2%
 test-suite...006/453.povray/453.povray.test   3335.00 3357.00  0.7%
 test-suite...CFP2000/177.mesa/177.mesa.test   2110.00 2121.00  0.5%
 test-suite...:: External/Povray/povray.test   2378.00 2382.00  0.2%

Reviewed By: RKSimon, samparker

Differential Revision: https://reviews.llvm.org/D89969
2020-10-29 20:39:50 +00:00
Dávid Bolvanský 7a2abf5aca [InferAttrs] Add nocapture/writeonly to string/mem libcalls
One step closer to fix PR47644.

Differential Revision: https://reviews.llvm.org/D89645
2020-10-29 20:06:43 +01:00
Simon Pilgrim dcb3dc101d [InstCombine] visitShl - ensure inner shifts have inrange amounts
Noticed when fixing OSS Fuzz #26716
2020-10-29 15:28:15 +00:00
Max Kazantsev a5b2e795c3 [NFC][SCEV] Refactor monotonic predicate checks to return enums instead of bools
This patch gets rid of output parameter which is not needed for most users
and prepares this API for further refactoring.
2020-10-29 16:01:25 +07:00
Johannes Doerfert d39f574dcc [Attributor][FIX] Properly promote arguments pointers to arrays
When we promote pointer arguments we did compute a wrong offset and use
a wrong type for the array case.

Bug reported and reduced by Whitney Tsang <whitneyt@ca.ibm.com>.
2020-10-29 00:45:32 -05:00
Fangrui Song 39856d5d0b [Debugify] Move global namespace functions into llvm::
Also move exportDebugifyStats from tools/opt to Debugify.cpp
2020-10-28 19:11:41 -07:00
Florian Hahn 53f4c4b2cc [InstCombine] Do not introduce bitcasts for swifterror arguments.
The following constraints hold for swifterror values:

    A swifterror value (either the parameter or the alloca) can only
    be loaded and stored from, or used as a swifterror argument.

This patch updates instcombine to not try to convert a bitcast of a
function into a bitcast of a swifterror argument.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D90258
2020-10-28 21:52:12 +00:00
Benjamin Kramer 207cf71fa9 Revert "[OpenMP] Add Passing in Original Declaration Names To Mapper API"
This reverts commit d981c7b758 and
a87d7b3d44. Test fails under msan.
2020-10-28 13:58:14 +01:00
Max Kazantsev 160a453138 Return "[IndVars] Remove monotonic checks with unknown exit count"
This reverts commit e038b60d91.
This reverts commit a0d84d8031.

This revert was a mistake. The reason of the failures was
"Use uint64_t for branch weights instead of uint32_t"

Differential Revision: https://reviews.llvm.org/D87832
2020-10-28 18:51:40 +07:00
Florian Hahn b82f80057d [DSE] Use walker to skip noalias stores between current & clobber def.
Instead of getting the defining access we should be able to use
getClobberingMemoryAccess to skip non-aliasing MemoryDefs. No additional
checks should be needed, because we only remove the starting def if it
matches the defining access of the load. All we need to worry about is
that there are no (may)alias stores between the starting def and the
load and getClobberingMemoryAccess should guarantee that.

Partly fixes PR47887.

This improves the number of redundant stores removed in some cases
(numbers below for MultiSource, SPEC2000, SPEC2006 on X86 with -flto
-O3).

Same hash: 226 (filtered out)
Remaining: 11
Metric: dse.NumRedundantStores

Program                                        base   patch1 diff
 test-suite...:: External/Povray/povray.test     1.00   5.00 400.0%
 test-suite...chmarks/MallocBench/gs/gs.test     1.00   3.00 200.0%
 test-suite...0/253.perlbmk/253.perlbmk.test    21.00  37.00 76.2%
 test-suite...0.perlbench/400.perlbench.test    24.00  37.00 54.2%
 test-suite.../Applications/SPASS/SPASS.test     3.00   4.00 33.3%
 test-suite...006/453.povray/453.povray.test    15.00  18.00 20.0%
 test-suite...T2006/445.gobmk/445.gobmk.test    27.00  29.00  7.4%
 test-suite.../CINT2006/403.gcc/403.gcc.test   136.00 137.00  0.7%
 test-suite.../CINT2000/176.gcc/176.gcc.test     6.00   6.00  0.0%
 test-suite.../Benchmarks/Bullet/bullet.test    NaN     3.00  nan%
 test-suite.../Benchmarks/Ptrdist/bc/bc.test    NaN     1.00  nan%

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89647
2020-10-28 11:01:25 +00:00
Luqman Aden 4c0a016927 Rename EHPersonality::MSVC_Win64SEH to EHPersonality::MSVC_TableSEH. NFC.
The types of SEH aren't x86(-32) vs x64 but rather stack-based exception chaining
vs table-based exception handling. x86-32 is the only arch for which Windows
uses the former. 32-bit ARM would use what is called Win64SEH today, which
is a bit confusing so instead let's just rename it to be a bit more clear.

Reviewed By: compnerd, rnk

Differential Revision: https://reviews.llvm.org/D90117
2020-10-27 23:22:13 -07:00
Kazu Hirata b2f05fae80 [JumpThreading] Remove extraneous calls to setEdgeProbability
This patch removes extraneous calls to setEdgeProbability introduced
in c91487769d.

The follow-up patch, a7b662d0f4, has
since fixed BranchProbabilityInfo::eraseBlock, so we don't need to
worry about getting stale values from getEdgeProbability.

Also, since getEdgeProbability(BB, BB->getSingleSuccessor()) returns
edge probability 1/1 by default for BB with exactly one successor
edge, we don't need to explicitly call setEdgeProbability.

This patch introduces almost no functional change, but we do end up
reducing debug messages from setEdgeProbability.

Differential Revision: https://reviews.llvm.org/D90284
2020-10-27 21:12:54 -07:00
Johannes Doerfert d13daa4018 [Attributor] Finalize the CGUpdater after each SCC
This matches the new PM model.
2020-10-27 22:07:56 -05:00
Johannes Doerfert 50d34958df [Attributor][NFC] Introduce a debug counter for `AA::manifest`
This will simplify debugging and tracking down problems.
2020-10-27 22:07:56 -05:00
Johannes Doerfert 1d57b7f503 [Attributor][NFC] Print the right value in debug output 2020-10-27 22:07:55 -05:00
Johannes Doerfert 1c2531c9e1 [Attributor][FIX] Delete all unreachable static functions
Before we used to only mark unreachable static functions as dead if all
uses were known dead. Now we optimistically assume uses to be dead until
proven otherwise.
2020-10-27 22:07:55 -05:00
Johannes Doerfert bfe05b1aff [Attributor][FIX] Do not attach range metadata to the wrong Instruction
If we are looking at a call site argument it might be a load or call
which is in a different context than the call site argument. We cannot
simply use the call site argument range for the call or load.

Bug reported and reduced by Whitney Tsang <whitneyt@ca.ibm.com>.
2020-10-27 22:07:55 -05:00
Johannes Doerfert 724fcce109 [Attributor][NFC] Clang-format 2020-10-27 22:07:55 -05:00
Johannes Doerfert d504f7b91a [Attributor][NFC] Hoist call out of a lambda
The call is not free, unsure if  this is needed but it does not make it
worse either.
2020-10-27 22:07:54 -05:00
Johannes Doerfert 30e5a1f0be [Attributor][FIX] Properly check uses in the call not uses of the call
In the AANoAlias logic we determine if a pointer may have been captured
before a call. We need to look at other uses in the call not uses of the
call.

The new code is not perfect as it does not allow trivial cases where the
call has multiple arguments but it is at least not unsound and a TODO
was added.
2020-10-27 22:07:54 -05:00
Johannes Doerfert cb813ab66a [Attributor][NFC] Improve time trace output 2020-10-27 22:07:54 -05:00
Kazu Hirata c91487769d [JumpThreading] Set edge probabilities when creating basic blocks
This patch teaches the jump threading pass to set edge probabilities
whenever the pass creates new basic blocks.

Without this patch, the compiler sometimes produces non-deterministic
results.  The non-determinism comes from the jump threading pass using
stale edge probabilities in BranchProbabilityInfo.  Specifically, when
the jump threading pass creates a new basic block, we don't initialize
its outgoing edge probability.

Edge probabilities are maintained in:

  DenseMap<Edge, BranchProbability> Probs;

in class BranchProbabilityInfo, where Edge is an ordered pair of
BasicBlock * and a successor index declared as:

  using Edge = std::pair<const BasicBlock *, unsigned>;

Probs maps edges to their corresponding probabilities.

Now, we rarely remove entries from this map, so if we happen to
allocate a new basic block at the same address as a previously deleted
basic block with an edge probability assigned, the newly created basic
block appears to have an edge probability, albeit a stale one.

This patch fixes the problem by explicitly setting edge probabilities
whenever the jump threading pass creates new basic blocks.

Differential Revision: https://reviews.llvm.org/D90106
2020-10-27 16:07:27 -07:00
Joseph Huber a87d7b3d44 [OpenMP] Add Passing in Original Declaration Names To Mapper API
Summary:
This patch adds support for passing in the original delcaration name in the
source file to the libomptarget runtime. This will allow the runtime to provide
more intelligent debugging messages. This patch takes the original expression
parsed from the OpenMP map / update clause and provides a textual
representation if it was explicitly mapped, otherwise it takes the name of the
variable declaration as a fallback. The information in passed to the runtime in
a global array of strings that matches the existing ident_t source location
strings using ";name;filename;column;row;;". See
clang/test/OpenMP/target_map_names.cpp for an example of the generated output
for a given map clause.

Reviewers: jdoervert

Differential Revision: https://reviews.llvm.org/D89802
2020-10-27 16:09:19 -04:00
Nicolai Hähnle e025d09b21 Revert multiple patches based on "Introduce CfgTraits abstraction"
These logically belong together since it's a base commit plus
followup fixes to less common build configurations.

The patches are:

Revert "CfgInterface: rename interface() to getInterface()"

This reverts commit a74fc48158.

Revert "Wrap CfgTraitsFor in namespace llvm to please GCC 5"

This reverts commit f2a06875b6.

Revert "Try to make GCC5 happy about the CfgTraits thing"

This reverts commit 03a5f7ce12.

Revert "Introduce CfgTraits abstraction"

This reverts commit c0cdd22c72.
2020-10-27 20:33:30 +01:00
Nicolai Hähnle ce6900c6cb Revert "DomTree: Extract (mostly) read-only logic into type-erased base classes"
This reverts commit 848a68a032.
2020-10-27 20:33:29 +01:00
Vedant Kumar 5a3ef55a52 [Utils] Skip RemoveRedundantDbgInstrs in MergeBlockIntoPredecessor (PR47746)
This patch changes MergeBlockIntoPredecessor to skip the call to
RemoveRedundantDbgInstrs, in effect partially reverting D71480 due to
some compile-time issues spotted in LoopUnroll and SimplifyCFG.

The call to RemoveRedundantDbgInstrs appears to have changed the
worst-case behavior of the merging utility. Loosely speaking, it seems
to have gone from O(#phis) to O(#insts).

It might not be possible to mitigate this by scanning a block to
determine whether there are any debug intrinsics to remove, since such a
scan costs O(#insts).

So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly
little fallout from this, and most of it can be addressed by doing
RemoveRedundantDbgInstrs later. The exception is (the block-local
version of) SimplifyCFG, where it might just be too expensive to call
RemoveRedundantDbgInstrs.

Differential Revision: https://reviews.llvm.org/D88928
2020-10-27 10:12:59 -07:00
Raphael Isemann e038b60d91 Revert "[IndVars] Remove monotonic checks with unknown exit count"
This reverts commit c6ca26c0bf.
This breaks stage2 builds due to hitting this assert:
```
   Assertion failed: (WeightSum <= UINT32_MAX && "Expected weights to scale down to 32 bits"), function calcMetadataWeights
```
when compiling AArch64RegisterBankInfo.cpp in LLVM.
2020-10-27 15:31:37 +01:00
Raphael Isemann a0d84d8031 Revert "[NFC] Factor away lambda's redundant parameter"
This reverts commit fdc845b361.
It seems to be a follow-up to c6372b3fb495 which will be reverted.
2020-10-27 15:30:52 +01:00
Simon Pilgrim bce770ffa6 Revert rG0905bd5c2fa42bd4c "[InstCombine] collectBitParts - add trunc support."
This reverts commit 0905bd5c2f.

Causing failures in multistage buildbots that I need to investigate
2020-10-27 13:43:54 +00:00
Nico Weber 2a4e704c92 Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c6.
Makes clang assert when building Chromium, see https://crbug.com/1142813
for a repro.
2020-10-27 09:26:21 -04:00
Simon Pilgrim 0905bd5c2f [InstCombine] collectBitParts - add trunc support.
This should allow us to remove the rather limited matchOrConcat fold and just use recognizeBSwapOrBitReverseIdiom.
2020-10-27 13:14:54 +00:00
Roman Lebedev 0ac56e8eaa
[InstCombine] Fold `(X >>? C1) << C2` patterns to shift+bitmask (PR37872)
This is essentially finalizes a revert of rL155136,
because nowadays the situation has improved, SCEV can model
all these patterns well, and we canonicalize rotate-like patterns
into a funnel shift intrinsics in InstCombine.
So this should not cause any pessimization.

I've verified the canonicalize-{a,l}shr-shl-to-masking.ll transforms
with alive, which confirms that we can freely preserve exact-ness,
and no-wrap flags.

Profs:
* base: https://rise4fun.com/Alive/gPQ
* exact-ness preservation: https://rise4fun.com/Alive/izi
* nuw preservation: https://rise4fun.com/Alive/DmD
* nsw preservation: https://rise4fun.com/Alive/SLN6N
* nuw nsw preservation: https://rise4fun.com/Alive/Qp7

Refs. https://reviews.llvm.org/D46760
2020-10-27 14:42:53 +03:00
Florian Hahn f067bc3c0a [LoopRotation] Allow loop header duplication if vectorization is forced.
-Oz normally does not allow loop header duplication so this loop wouldn't be
vectorized.  However the vectorization pragma should override this and allow
for loop rotation.

rdar://problem/49281061

Original patch by Adam Nemet.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D59832
2020-10-27 09:28:01 +00:00
Max Kazantsev fdc845b361 [NFC] Factor away lambda's redundant parameter 2020-10-27 12:56:52 +07:00
Serguei Katkov b69919b537 [GVN LoadPRE] Add an option to disable splitting backedge
GVN Load PRE can split the backedge causing breaking the loop structure where the latch
contains the conditional branch with for example induction variable.

Different optimizations expect this form of the loop, so it is better to preserve it for some time.
This CL adds an option to control an ability to split backedge.

Default value is true so technically it is NFC and current behavior is not changed.

Reviewers: fedor.sergeev, mkazantsev, nikic, reames, fhahn
Reviewed By: mkazasntsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D89854
2020-10-27 11:59:52 +07:00
Max Kazantsev c6ca26c0bf [IndVars] Remove monotonic checks with unknown exit count
Even if the exact exit count is unknown, we can still prove that this
exit will not be taken. If we can prove that the predicate is monotonic,
fulfilled on first & last iteration, and no overflow happened in between,
then the check can be removed.

Differential Revision: https://reviews.llvm.org/D87832
Reviewed By: apilipenko
2020-10-27 11:35:16 +07:00
Arthur Eubanks 42f76e193b Reland [AlwaysInliner] Pass callee AAResults to InlineFunction()
Test copied from noalias-calls.ll with small changes.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89609
2020-10-26 20:40:46 -07:00
Arthur Eubanks e5766f25c6 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-26 20:24:04 -07:00
Arthur Eubanks 4af5ba1726 Revert "[AlwaysInliner] Pass callee AAResults to InlineFunction()"
This reverts commit 504fbec7a6.

Test failure.
2020-10-26 20:23:38 -07:00
Arthur Eubanks 504fbec7a6 [AlwaysInliner] Pass callee AAResults to InlineFunction()
Test copied from noalias-calls.ll with small changes.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89609
2020-10-26 20:10:09 -07:00
Arthur Eubanks 3dd1c72458 Port -objc-arc-expand to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90182
2020-10-26 20:05:10 -07:00
Arthur Eubanks 90c0b0d3d6 Port -objc-arc-apelim to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90181
2020-10-26 20:01:46 -07:00
Chen Zheng 00e573cadb [LSR] fix typo in comments and rename for a new added hook. 2020-10-26 22:29:22 -04:00
TaWeiTu 0efbfa38ae [NPM] Port -slsr to NPM
`-separate-const-offset-from-gep` has not yet be ported, so some tests are not updated.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D90149
2020-10-27 09:21:40 +08:00
Sriraman Tallam ad1b9daa4b Prepend "__uniq" to symbol names hash with -funique-internal-linkage-names.
Prepend the module name hash with a fixed string ".__uniq." which helps tools
that consume sampled profiles and attribute it to functions to understand
that this symbol belongs to a unique internal linkage type symbol.

Symbols with suffixes can result from various optimizations in the compiler.
Function Multiversioning, function splitting, parameter constant propogation,
unique internal linkage names.

External tools like sampled profile aggregators combine profiles from multiple
runs of a binary. They use various heuristics with symbols that have suffixes
to try and attribute the profile to the right function instance. For instance
multi-versioned symbols like foo.avx, foo.sse4.2, etc even though different
should be attributed to the same source function if a single function is
versioned, using attribute target_clones (supported in GCC but yet to land in
LLVM). Similarly, functions that are split (split part having a .cold suffix)
could have profiles for both the original and split symbols but would be
aggregated and attributed to the original function that was split.

Unique internal linkage functions however have different source instances and
the aggregator must not put them together but attribute it to the appropriate
function instance. To be sure that we are dealing with a symbol of a unique
internal linkage function, we would like to prepend the hash with a known
string ".__uniq." which these tools can check to understand the suffix type.

Differential Revision: https://reviews.llvm.org/D89617
2020-10-26 14:24:28 -07:00
Sanjay Patel 5a6e66ec72 [InstCombine] add folds for icmp+ctpop
https://alive2.llvm.org/ce/z/XjFPQJ

  define void @src(i64 %value) {
    %t0 = call i64 @llvm.ctpop.i64(i64 %value)
    %gt = icmp ugt i64 %t0, 63
    %lt = icmp ult i64 %t0, 64
    call void @use(i1 %gt, i1 %lt)
    ret void
  }

  define void @tgt(i64 %value) {
    %eq = icmp eq i64 %value, -1
    %ne = icmp ne i64 %value, -1
    call void @use(i1 %eq, i1 %ne)
    ret void
  }

  declare i64 @llvm.ctpop.i64(i64) #1
  declare void @use(i1, i1)
2020-10-26 16:48:56 -04:00
Sanjay Patel 437d7551c5 [InstCombine] reduce code duplication in icmp intrinsic folds; NFC 2020-10-26 16:48:56 -04:00
Stanislav Mekhanoshin 00928a1956 Fix SROA with a PHI mergig values from a same block
This fixes the bug 47945. It is legal to have a PHI with values
from from the same block, but values must stay the same. In this
case it is illegal to merge different values.

Differential Revision: https://reviews.llvm.org/D89978
2020-10-26 12:58:27 -07:00