Commit Graph

19299 Commits

Author SHA1 Message Date
Mikael Holmen 9f047795fb Bail out of a SimplifyCFG switch table opt at undef values.
Summary:
A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This patch fixes that by bailing out if we encounter undef.

The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization.

Patch by: JesperAntonsson

Reviewers: spatel, eeckstein, hans

Reviewed By: hans

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40639

llvm-svn: 319537
2017-12-01 12:30:49 +00:00
Dinar Temirbulatov 29e86584c6 [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:

    void add1(int * __restrict dst, const int * __restrict src) {
      *dst++ = *src++;
      *dst++ = *src++ + 1;
      *dst++ = *src++ + 2;
      *dst++ = *src++ + 3;
    }

This allows vectorization even if the very first operation is not a binary add, but just a load.

Fixed issues related to previous commit.

Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev

Reviewed By: ABataev, RKSimon

Subscribers: llvm-commits, RKSimon

Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 319531
2017-12-01 11:10:47 +00:00
Hiroshi Inoue 48e4c7aae6 Recommit rL319407: [SROA] enable splitting for non-whole-alloca loads and stores
Recommitting the once-reverted patch rL319407 after adding a check for bit vector size to avoid failures on some build bots.

llvm-svn: 319522
2017-12-01 06:05:05 +00:00
Zachary Turner 8065f0b975 Mark all library options as hidden.
These command line options are not intended for public use, and often
don't even make sense in the context of a particular tool anyway. About
90% of them are already hidden, but when people add new options they
forget to hide them, so if you were to make a brand new tool today, link
against one of LLVM's libraries, and run tool -help you would get a
bunch of junk that doesn't make sense for the tool you're writing.

This patch hides these options. The real solution is to not have
libraries defining command line options, but that's a much larger effort
and not something I'm prepared to take on.

Differential Revision: https://reviews.llvm.org/D40674

llvm-svn: 319505
2017-12-01 00:53:10 +00:00
Peter Collingbourne 1f03422610 ThinLTOBitcodeWriter: Try harder to discard unused references to the merged module.
If the thin module has no references to an internal global in the
merged module, we need to make sure to preserve that property if the
global is a member of a comdat group, as otherwise promotion can end
up adding global symbols to the comdat, which is not allowed.

This situation can arise if the external global in the thin module
has dead constant users, which would cause use_empty() to return
false and would cause us to try to promote it. To prevent this from
happening, discard the dead constant users before asking whether a
global is empty.

Differential Revision: https://reviews.llvm.org/D40593

llvm-svn: 319494
2017-11-30 23:05:52 +00:00
Dan Gohman 59e4c0b938 [memcpyopt] Teach memcpyopt to optimize across basic blocks
This teaches memcpyopt to make a non-local memdep query when a local query
indicates that the dependency is non-local. This notably allows it to
eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.

Fixes PR28958.

Differential Revision: https://reviews.llvm.org/D38374

llvm-svn: 319482
2017-11-30 22:10:53 +00:00
Xinliang David Li c23d2c6883 [PGO] Skip counter promotion for infinite loops
Differential Revision: http://reviews.llvm.org/D40662

llvm-svn: 319462
2017-11-30 19:16:25 +00:00
Hiroshi Inoue 21e8ded4d2 Revert rL319407: [SROA] enable splitting for non-whole-alloca loads and stores
This reverts commit rL319407 due to failures on some buildbots.

llvm-svn: 319410
2017-11-30 08:29:51 +00:00
Hiroshi Inoue 422e80aee2 [SROA] enable splitting for non-whole-alloca loads and stores
Currently, SROA splits loads and stores only when they are accessing the whole alloca.
This patch relaxes this limitation to allow splitting a load/store if all other loads and stores to the alloca are disjoint from or fully included in the current load/store. If there is no other load or store that crosses the boundary of the current load/store, the current splitting implementation works as is.
The whole-alloca loads and stores meet this new condition and so they are still splittable.

Here is a simplified motivating example.

struct record {
    long long a;
    int b;
    int c;
};

int func(struct record r) {
    for (int i = 0; i < r.c; i++)
        r.b++;
    return r.b;
}

When updating r.b (or r.c as well), LLVM generates redundant instructions on some platforms (such as x86_64, ppc64); here, r.b and r.c are packed into one 64-bit GPR when the struct is passed as a method argument.

With this patch, the above example is compiled into only a few instructions without a loop.
Without the patch, an unnecessary loop-carried dependency is introduced by SROA and the loop cannot be eliminated by the later optimizers.

Differential Revision: https://reviews.llvm.org/D32998

llvm-svn: 319407
2017-11-30 07:44:46 +00:00
Graham Yiu 70293fa27a - Removed unused lambda (IsReturnBlock) causing build bots to fail for r319398
- Added lit testcases that were supposed to be part of r319398

llvm-svn: 319399
2017-11-30 03:36:57 +00:00
Graham Yiu 8b1882c186 With PGO information, we can do more aggressive outlining of cold regions in the inline candidate function. This contrasts with the scheme of keeping only the 'early return' portion of the inline candidate and outlining the rest of the function as a single function call.
Support for outlining multiple regions of each function is added, as well as some basic heuristics to determine which regions are good to outline. Outline candidates are limited to regions that are single-entry & single-exit. We also avoid outlining regions that produce live-exit variables, which may inhibit some forms of code motion (like commoning).

Fallback to the regular partial inlining scheme is retained when either i) no regions are identified for outlining in the function, or ii) the outlined function could not be inlined in any of its callers.

Differential Revision: https://reviews.llvm.org/D38190

llvm-svn: 319398
2017-11-30 02:41:36 +00:00
Peter Collingbourne 9e3175bb6b LowerTypeTests: Deduplicate code. NFC.
llvm-svn: 319390
2017-11-30 00:27:08 +00:00
Peter Collingbourne 943aca3c27 LowerTypeTests: Remove unnecessary cast. NFC.
llvm-svn: 319387
2017-11-30 00:02:55 +00:00
Adam Nemet 2e92289014 Demote this opt remark to DEBUG.
From a random opt-stat output:

Top 10 remarks:
  tailcallelim/tailcall          53%
  inline/AlwaysInline            13%
  gvn/LoadClobbered              13%
  inline/Inlined                  8%
  inline/TooCostly                2%
  inline/NoDefinition             2%
  licm/LoadWithLoopInvariantAddressInvalidated  2%
  licm/Hoisted                    1%
  asm-printer/InstructionCount    1%
  prologepilog/StackSize          1%

llvm-svn: 319235
2017-11-28 22:11:00 +00:00
Adrian Prantl 77d90b0c39 SROA: Don't create variable fragments that are outside of the variable.
An alloca may be larger than a variable that is described to be stored
there. Don't create a dbg.value for fragments that are outside of the
variable.

This fixes PR35447.
https://bugs.llvm.org/show_bug.cgi?id=35447

llvm-svn: 319230
2017-11-28 21:30:38 +00:00
Hans Wennborg ca46db957d EntryExitInstrumenter: set DebugLocs on the inserted call instructions (PR35412)
Apparently the verifier requires that inlineable calls in a function
with debug info have debug locations.

llvm-svn: 319199
2017-11-28 18:44:26 +00:00
Jonas Paulsson f0ff20f1f0 Use getStoreSize() in various places instead of 'BitSize >> 3'.
This is needed for cases when the memory access is not as big as the width of
the data type. For instance, storing i1 (1 bit) would be done in a byte (8
bits).

Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a
size of 0, which for instance makes alias analysis return NoAlias even when
it shouldn't.

There are no tests as this was done as a follow-up to the bugfix for the case
where this was discovered (r318824). This handles more similar cases.
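The size arithmetic described above can be illustrated with a minimal sketch. Both helper names are hypothetical and are not LLVM's API; `storeSizeInBytes` only mirrors the round-up-to-whole-bytes behaviour that the message attributes to getStoreSize().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: truncating conversion, as in 'BitSize >> 3'.
constexpr std::uint64_t truncatedBytes(std::uint64_t BitSize) {
  return BitSize >> 3;
}

// Hypothetical helper: round the bit width up to whole bytes.
constexpr std::uint64_t storeSizeInBytes(std::uint64_t BitSize) {
  return (BitSize + 7) / 8;
}

// An i1 truncates to a zero-sized access but is stored in one byte.
static_assert(truncatedBytes(1) == 0, "i1 truncates to 0 bytes");
static_assert(storeSizeInBytes(1) == 1, "i1 occupies 1 byte when stored");
```

For widths that are already a multiple of 8 the two helpers agree, so only types such as i1 or i9 are affected.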

Review: Björn Petterson
https://reviews.llvm.org/D40339

llvm-svn: 319173
2017-11-28 14:44:32 +00:00
Chandler Carruth c34f789e38 Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable.
The core idea is to (re-)introduce some redundancies where their cost is
hidden by the cost of materializing immediates for constant operands of
PHI nodes. When the cost of the redundancies is covered by this,
avoiding materializing the immediate has numerous benefits:
1) Less register pressure
2) Potential for further folding / combining
3) Potential for more efficient instructions due to immediate operand

As a motivating example, consider the remarkably different cost on x86
of a SHL instruction with an immediate operand versus a register
operand.
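The shape of the transform can be sketched at source level. This is only an illustration under assumptions: the function names and the constants 3 and 5 are made up, and a real compiler may already produce identical code for both functions.

```cpp
#include <cassert>
#include <cstdint>

// Before: the shift amount is merged from two constants (a PHI in IR),
// so the shift itself takes a variable (register) operand.
inline std::uint32_t shiftMerged(bool Cond, std::uint32_t V) {
  std::uint32_t Amount = Cond ? 3u : 5u;
  return V << Amount;
}

// After: the shift is duplicated into each incoming path, so each copy
// can use an immediate operand and no constant needs to be materialized.
inline std::uint32_t shiftSpeculated(bool Cond, std::uint32_t V) {
  return Cond ? (V << 3) : (V << 5);
}
```

Both functions compute the same value; the second form trades one redundant instruction for the removal of the merged constant.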

This pattern turns up surprisingly frequently, but is somewhat rarely
obvious as a significant performance problem.

The pass is entirely target independent, but it does rely on the target
cost model in TTI to decide when to speculate things around the PHI
node. I've included x86-focused tests, but any target that sets up its
immediate cost model should benefit from this pass.

There is probably more that can be done in this space, but the pass
as-is is enough to get some important performance on our internal
benchmarks, and should be generally performance neutral, but help with
more extensive benchmarking is always welcome.

One awkward part is that this pass has to be scheduled after
*everything* that can eliminate these kinds of redundancies. This
includes SimplifyCFG, GVN, etc. I'm open to suggestions about better
places to put this. We could in theory make it part of the codegen pass
pipeline, but there doesn't really seem to be a good reason for that --
it isn't "lowering" in any sense and only relies on pretty standard cost
model based TTI queries, so it seems to fit well with the "optimization"
pipeline model. Still, further thoughts on the pipeline position are
welcome.

I've also only implemented this in the new pass manager. If folks are
very interested, I can try to add it to the old PM as well, but I didn't
really see much point (my use case is already switched over to the new
PM).

I've tested this pretty heavily without issue. A wide range of
benchmarks internally show no change outside the noise, and I don't see
any significant changes in SPEC either. However, the size class
computation in tcmalloc is substantially improved by this, which turns
into a 2% to 4% win on the hottest path through tcmalloc for us, so
there are definitely important cases where this is going to make
a substantial difference.

Differential revision: https://reviews.llvm.org/D37467

llvm-svn: 319164
2017-11-28 11:32:31 +00:00
Florian Hahn 25ea91a838 [TailRecursionElimination] Skip debug intrinsics.
Summary:
I think we do not need to analyze debug intrinsics here, as they should
not impact codegen. This has 2 benefits: 1) slightly less work to do and
2) avoiding generating optimization remarks for converting calls to
debug intrinsics to tail calls, which are not really helpful for users.

Based on work by Sander de Smalen.

Reviewers: davide, trentxintong, aprantl

Reviewed By: aprantl

Subscribers: llvm-commits, JDevlieghere

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D40440

llvm-svn: 319158
2017-11-28 09:32:25 +00:00
Max Kazantsev 115607226a [GVN] Prevent ScalarPRE from hoisting across instructions that don't pass control flow to successors
This is to address a problem similar to those in D37460 for Scalar PRE. We should not
PRE across an instruction that may not pass execution to its successor unless it is safe
to speculatively execute it.

Differential Revision: https://reviews.llvm.org/D38619

llvm-svn: 319147
2017-11-28 07:07:55 +00:00
Rafael Espindola c06f55e1e8 This reverts commit r319096 and r319097.
Revert "[SROA] Propagate !range metadata when moving loads."
Revert "[Mem2Reg] Clang-format unformatted parts of this file. NFCI."

Davide says they broke a bot.

llvm-svn: 319131
2017-11-28 01:25:38 +00:00
Adrian Prantl d7f6f1636d SROA: Avoid creating a fragment expression that covers the entire variable.
Fixes PR35416.

https://bugs.llvm.org/show_bug.cgi?id=35416

llvm-svn: 319126
2017-11-28 00:57:53 +00:00
Davide Italiano 824d71a9c5 [Mem2Reg] Clang-format unformatted parts of this file. NFCI.
llvm-svn: 319097
2017-11-27 21:25:52 +00:00
Davide Italiano b5d59e73ee [SROA] Propagate !range metadata when moving loads.
This tries to propagate !range metadata to a pre-existing load
when a load is optimized out. This is done instead of adding an
assume because converting loads to and from assumes creates a
lot of IR.

Patch by Ariel Ben-Yehuda.

Differential Revision:  https://reviews.llvm.org/D37216

llvm-svn: 319096
2017-11-27 21:25:13 +00:00
Sanjay Patel 0de1a4bc2d [PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend on arg rather than result
This should fix PR31455:
https://bugs.llvm.org/show_bug.cgi?id=31455

Differential Revision: https://reviews.llvm.org/D28314

llvm-svn: 319094
2017-11-27 21:15:43 +00:00
Arnold Schwaighofer d9e710984d Inliner: Don't mark notail calls with the 'tail' attribute
enum TailCallKind { TCK_None = 0, TCK_Tail = 1, TCK_MustTail = 2,
                    TCK_NoTail = 3 };

TCK_NoTail is greater than TCK_Tail so taking the min does not do the
correct thing.
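A small sketch of why the numeric minimum goes wrong. The enum is copied from the message; the two combining functions are hypothetical and the second is only one plausible way to special-case notail, not necessarily what the inliner does.

```cpp
#include <algorithm>
#include <cassert>

enum TailCallKind { TCK_None = 0, TCK_Tail = 1, TCK_MustTail = 2,
                    TCK_NoTail = 3 };

// Naive combination: take the numerically smaller kind.
inline TailCallKind combineWithMin(TailCallKind Call, TailCallKind Limit) {
  return std::min(Call, Limit);
}

// Hypothetical alternative: a notail call always stays notail.
inline TailCallKind combineKeepingNoTail(TailCallKind Call,
                                         TailCallKind Limit) {
  if (Call == TCK_NoTail)
    return TCK_NoTail;
  return std::min(Call, Limit);
}
```

With the naive version, combining `TCK_NoTail` with `TCK_Tail` yields `TCK_Tail`, which marks a notail call as a tail call.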

rdar://35639547

llvm-svn: 319075
2017-11-27 19:03:40 +00:00
Sanjay Patel 863d494730 [InstCombine] use 'auto' with 'dyn_cast'; NFC
llvm-svn: 319067
2017-11-27 18:19:32 +00:00
Benjamin Kramer 51ebcaaf25 Make helpers static. NFC.
llvm-svn: 318953
2017-11-24 14:55:41 +00:00
Alexander Potapenko 9e5477f473 MSan: remove an unnecessary cast. NFC for userspace instrumentation.
llvm-svn: 318923
2017-11-23 15:06:51 +00:00
Alexander Potapenko 391804f54b [MSan] Move the access address check before the shadow access for that address
MSan used to insert the shadow check of the store pointer operand
_after_ the shadow of the value operand has been written.
This happens to work in the userspace, as the whole shadow range is
always mapped. However in the kernel the shadow page may not exist, so
the bug may cause a crash.

This patch moves the address check in front of the shadow access.

llvm-svn: 318901
2017-11-23 08:34:32 +00:00
Max Kazantsev 716e647d74 [IRCE][NFC] Add no wrap flags to no-wrapping SCEV calculation
In a lambda where we expect the result to be within bounds, add the respective `nsw/nuw` flags to
help SCEV in case it fails to figure them out on its own.

Differential Revision: https://reviews.llvm.org/D40168

llvm-svn: 318898
2017-11-23 06:14:39 +00:00
Davide Italiano b480b5c2ee [SCCP] Pick the right lattice value for constants.
After the dataflow algorithm proves that an argument is constant,
it replaces its value with the integer constant and drops the lattice
value associated with the DEF.

e.g. in the example we have @f() that's called twice:
call @f(undef, ...)
call @f(2, ...)

`undef` MEET 2 = 2 so we replace the argument and all its uses with
the constant 2.
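The MEET step can be sketched with a toy three-level lattice. This is an assumption-laden model, not the SCCP solver: the type and function names are hypothetical, and `undef` is modelled as the `Unknown` bottom element so that `undef` MEET 2 = 2 as stated above.

```cpp
#include <cassert>

// Toy lattice value: Unknown (models undef here), a single constant,
// or Overdefined (conflicting values seen).
struct LatticeValue {
  enum Kind { Unknown, Constant, Overdefined } State;
  int Value; // meaningful only when State == Constant
};

inline LatticeValue meet(LatticeValue A, LatticeValue B) {
  if (A.State == LatticeValue::Unknown)
    return B;
  if (B.State == LatticeValue::Unknown)
    return A;
  if (A.State == LatticeValue::Constant &&
      B.State == LatticeValue::Constant && A.Value == B.Value)
    return A;
  return LatticeValue{LatticeValue::Overdefined, 0};
}
```

Meeting the values of the two call sites, `Unknown` and the constant 2, gives the constant 2, which is why the argument is replaced.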

Shortly after, tryToReplaceWithConstantRange() tries to get the lattice
value for the argument we just replaced, causing an assertion.
This function is a little peculiar as it runs when we're doing replacement
and not as part of the solver but still queries the solver.

The fix is to check whether we have already replaced the value and, if so,
get a temporary lattice value for the constant.

Thanks to Zhendong Su for the report!

Fixes PR35357.

llvm-svn: 318817
2017-11-22 03:04:55 +00:00
Hans Wennborg 37cbf28e79 EntryExitInstrumenter: support __cyg_profile_func_enter_bare
It works just like __cyg_profile_func_enter but takes no arguments.

llvm-svn: 318783
2017-11-21 17:22:19 +00:00
Alina Sbirlea ff8b8aea2e Add MemorySSA as loop dependency, disabled by default [NFC].
Summary:
First step in adding MemorySSA as dependency for loop pass manager.
Adding the dependency under a flag.

New pass manager: MSSA pointer in LoopStandardAnalysisResults can be null.
Legacy and new pass manager: Use cl::opt EnableMSSALoopDependency. Disabled by default.

Reviewers: sanjoy, davide, gberry

Subscribers: mehdi_amini, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D40274

llvm-svn: 318772
2017-11-21 15:45:46 +00:00
NAKAMURA Takumi 519ea284af SLPVectorizer.cpp: Avoid std::stable_sort(properlyDominates()).
properlyDominates() shouldn't be used as sort key. It causes different output between libstdc++ and libc++.
Instead, I introduced RPOT. In most cases, it works for CSE.
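The underlying problem is that std::stable_sort requires a strict weak ordering, in which "neither element precedes the other" must be transitive, and a dominance relation is only a partial order. The toy tree below is a hypothetical illustration; the `Parent` encoding and both function names are assumptions, not LLVM's DominatorTree API.

```cpp
#include <cassert>
#include <vector>

// Toy dominator tree: Parent[N] is the immediate dominator of node N,
// and -1 marks the root.
inline bool properlyDominates(const std::vector<int> &Parent, int A, int B) {
  for (int N = Parent[B]; N != -1; N = Parent[N])
    if (N == A)
      return true;
  return false;
}

// Two nodes are "equivalent" for a sort if neither precedes the other.
inline bool incomparable(const std::vector<int> &Parent, int A, int B) {
  return !properlyDominates(Parent, A, B) && !properlyDominates(Parent, B, A);
}
```

For the tree `{-1, 0, 0, 2}` (nodes 1 and 2 are children of 0, node 3 is a child of 2), nodes 2 and 1 are incomparable and so are 1 and 3, yet 2 properly dominates 3. A sort is then free to order such elements differently from one standard library to another, whereas a total key such as a traversal number does not have this problem.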

llvm-svn: 318743
2017-11-21 09:41:01 +00:00
Davide Italiano 5df8080011 [SCCP] If we replace with a constant, we can't replace with a range.
This microoptimization is NFC.

llvm-svn: 318711
2017-11-21 00:21:52 +00:00
Vitaly Buka 8000f228b3 [msan] Don't sanitize "nosanitize" instructions
Reviewers: eugenis

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40205

llvm-svn: 318708
2017-11-20 23:37:56 +00:00
Hiroshi Yamauchi c94d4d70d8 Add heuristics for irreducible loop metadata under PGO
Summary:
Add the following heuristics for irreducible loop metadata:

- When an irreducible loop header is missing the loop header weight metadata,
  give it the minimum weight seen among other headers.
- Annotate indirectbr targets with the loop header weight metadata (as they are
  likely to become irreducible loop headers after indirectbr tail duplication.)
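The first heuristic can be sketched as follows. The struct and function names are hypothetical, and how the real pass stores header weights is not shown in the message, so this is only a model of "give it the minimum weight seen among other headers".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record for one irreducible loop header.
struct HeaderWeight {
  bool Known;           // whether weight metadata was present
  std::uint64_t Weight; // meaningful only when Known is true
};

// Give every header without a weight the minimum weight seen among
// the headers that do have one. Does nothing if no weight is known.
inline void fillMissingHeaderWeights(std::vector<HeaderWeight> &Headers) {
  bool AnyKnown = false;
  std::uint64_t Min = 0;
  for (const HeaderWeight &H : Headers)
    if (H.Known && (!AnyKnown || H.Weight < Min)) {
      Min = H.Weight;
      AnyKnown = true;
    }
  if (!AnyKnown)
    return;
  for (HeaderWeight &H : Headers)
    if (!H.Known) {
      H.Weight = Min;
      H.Known = true;
    }
}
```

With headers weighted 40, missing and 10, the missing entry would receive 10.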

These greatly improve the accuracy of the block frequency info of the Python
interpreter loop (eg. from ~3-16x off down to ~40-55% off) and the Python
performance (eg. unpack_sequence from ~50% slower to ~8% faster than GCC) due to
better register allocation under PGO.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39980

llvm-svn: 318693
2017-11-20 21:03:38 +00:00
Teresa Johnson 3309002a86 [SROA] Correctly invalidate analyses when dead instructions deleted
Summary:
SROA can fail in rewriting alloca but still rewrite a phi resulting
in dead instruction elimination. The Changed flag was not being set
correctly, resulting in downstream passes using stale analyses.
The included test case will assert during the second BDCE pass as a
result.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39921

llvm-svn: 318677
2017-11-20 18:33:38 +00:00
Evgeniy Stepanov 8e7018d92f [asan] Use dynamic shadow on 32-bit Android, try 2.
Summary:
This change reverts r318575 and changes FindDynamicShadowStart() to
keep the memory range it found mapped PROT_NONE to make sure it is
not reused. We also skip MemoryRangeIsAvailable() check, because it
is (a) unnecessary, and (b) would fail anyway.

Reviewers: pcc, vitalybuka, kcc

Subscribers: srhines, kubamracek, mgorny, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D40203

llvm-svn: 318666
2017-11-20 17:41:57 +00:00
Gil Rapaport 8b9d1f3c5b [LV] Model masking in VPlan, introducing VPInstructions
This patch adds a new abstraction layer to VPlan and leverages it to model the planned
instructions that manipulate masks (AND, OR, NOT), introduced during predication.

The new VPValue and VPUser classes model how data flows into, through and out
of a VPlan, forming the vertices of a planned Def-Use graph. The new
VPInstruction class is a generic single-instruction Recipe that models a
planned instruction along with its opcode, operands and users. See
VectorizationPlan.rst for more details.

Differential Revision: https://reviews.llvm.org/D38676

llvm-svn: 318645
2017-11-20 12:01:47 +00:00
Max Kazantsev 268467869b [IRCE] Smart range intersection
In rL316552, we ban intersection of unsigned latch range with signed range check and vice
versa, unless the entire range check iteration space is known positive. It was a correct
functional fix that saved us from dealing with ambiguous values, but it also appeared
to be a very restrictive limitation. In particular, in the following case:

  loop:
    %iv = phi i32 [ 0, %preheader ], [ %iv.next, %latch]
    %iv.offset = add i32 %iv, 10
    %rc = icmp slt i32 %iv.offset, %len
    br i1 %rc, label %latch, label %deopt

  latch:
    %iv.next = add i32 %iv, 11
    %cond = icmp ult i32 %iv.next, 100
    br i1 %cond, label %loop, label %exit

Here, the unsigned iteration range is `[0, 100)`, and the safe range for range
check is `[-10, %len - 10)`. For unsigned iteration spaces, we use unsigned
min/max functions for range intersection. Given this, we wanted to avoid dealing
with `-10` because it is interpreted as a very big unsigned value. Semantically, range
check's safe range goes through unsigned border, so in fact it is two disjoint
ranges in IV's iteration space. Intersection of such ranges is not trivial, so we prohibited
this case saying that we are not allowed to intersect such ranges.

What the semantics of this safe range actually mean is that we can start from `-10` and go
up increasing the `%iv` by one until we reach `%len - 10` (for simplicity let's assume that
`%len - 10`  is a reasonably big positive value).

In particular, this safe iteration space includes `0, 1, 2, ..., %len - 11`. So if we were able to return
safe iteration space `[0, %len - 10)`, we could safely intersect it with IV's iteration space. All
values in this range are non-negative, so using signed/unsigned min/max for them is unambiguous.
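The idea can be sketched with plain 64-bit integers. This is a simplified model under assumptions: the `Range` type and both functions are hypothetical, and the SCEV-based arithmetic of the real pass is not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Half-open range [Begin, End).
struct Range {
  std::int64_t Begin;
  std::int64_t End;
};

// Keep only the non-negative part of a safe range, so that signed and
// unsigned interpretations of its bounds agree.
inline Range clampToNonNegative(Range R) {
  return Range{std::max<std::int64_t>(R.Begin, 0),
               std::max<std::int64_t>(R.End, 0)};
}

// Intersect two half-open ranges; an empty result has Begin == End.
inline Range intersect(Range A, Range B) {
  Range Out{std::max(A.Begin, B.Begin), std::min(A.End, B.End)};
  if (Out.End < Out.Begin)
    Out.End = Out.Begin;
  return Out;
}
```

Taking `%len` as 50 for illustration, the safe range `[-10, 40)` clamps to `[0, 40)`, and intersecting that with the IV range `[0, 100)` gives `[0, 40)`.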

In this patch, we alter the algorithm of safe range calculation so that it returns a subset of the
original safe space which is represented by one continuous range that does not go through wrap.
To achieve this, we use a modified SCEV subtraction function. It can be imagined as a function
that subtracts by `1` (or `-1`) as long as the further subtraction does not cause a wrap in IV iteration
space. This allows us to perform IRCE in many situations when we deal with IV space and range check
of different types (in terms of signed/unsigned).

We apply this approach for both matching and not matching types of IV iteration space and the
range check. One implication of this is that now IRCE became smarter in detection of empty safe
ranges. For example, in this case:
  loop:
    %iv = phi i32 [ %begin, %preheader ], [ %iv.next, %latch]
    %iv.offset = sub i32 %iv, 10
    %rc = icmp ult i32 %iv.offset, %len
    br i1 %rc, label %latch, label %deopt

  latch:
    %iv.next = add i32 %iv, 11
    %cond = icmp ult i32 %iv.next, 100
    br i1 %cond, label %loop, label %exit

If `%len` was less than 10 but SCEV failed to trivially prove that `%begin - 10 >u %len - 10`,
we could end up executing the entire loop in the safe preloop while the main loop was still generated,
but never executed. Now we cut the ranges so that if both `%begin - 10` and `%len - 10` overflow,
we have a trivially empty range of `[0, 0)`. In some cases this prevents us from performing a meaningless optimization.

Differential Revision: https://reviews.llvm.org/D39954

llvm-svn: 318639
2017-11-20 06:07:57 +00:00
Sanjay Patel 9771a96f6e [LibCallSimplifier] allow splat vectors for pow(x, 0.5) -> sqrt() transforms
llvm-svn: 318629
2017-11-19 16:42:27 +00:00
Sanjay Patel fbd3e66b9a [LibCallSimplifier] partly fix pow(x, 0.5) -> sqrt() transforms
As the first test shows, we could transform an llvm intrinsic which never sets errno 
into a libcall which could set errno (even though it's marked readnone?), so that's 
not ideal.

It's possible that we can also transform a libcall which could set errno to an
intrinsic given the fast-math-flags constraint, but that's deferred to determine
exactly which set of FMF are needed.

Differential Revision: https://reviews.llvm.org/D40150

llvm-svn: 318628
2017-11-19 16:13:14 +00:00
Florian Hahn 2a266a343f [CallSiteSplitting] Remove some indirection (NFC).
Summary:
With this patch I tried to reduce the complexity of the code slightly, by
removing some indirection. Please let me know what you think.

Reviewers: junbuml, mcrosier, davidxl

Reviewed By: junbuml

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40037

llvm-svn: 318593
2017-11-18 18:14:13 +00:00
Walter Lee 9abeecc07c [asan] Add a full redzone after every stack variable
We were not doing that for large shadow granularity.  Also add more
stack frame layout tests for large shadow granularity.

Differential Revision: https://reviews.llvm.org/D39475

llvm-svn: 318581
2017-11-18 01:13:18 +00:00
Evgeniy Stepanov 9d564cdcb0 Revert "[asan] Use dynamic shadow on 32-bit Android" and 3 more.
Revert the following commits:
  r318369 [asan] Fallback to non-ifunc dynamic shadow on android<22.
  r318235 [asan] Prevent rematerialization of &__asan_shadow.
  r317948 [sanitizer] Remove unnecessary attribute hidden.
  r317943 [asan] Use dynamic shadow on 32-bit Android.

MemoryRangeIsAvailable() reads /proc/$PID/maps into an mmap-ed buffer
that may overlap with the address range that we plan to use for the
dynamic shadow mapping. This is causing random startup crashes.

llvm-svn: 318575
2017-11-18 00:22:34 +00:00
Jun Bum Lim 0f90672ae9 [LICM] Fix PR35342
Summary: This change fixes PR35342 by replacing only the current use with undef in unreachable blocks.

Reviewers: efriedma, mcrosier, igor-laevsky

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40184

llvm-svn: 318551
2017-11-17 20:38:25 +00:00
Chandler Carruth 693eedb138 [PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching,
making it no longer even remotely simple.

The pass will now be more of a "full loop unswitching" pass rather than
anything substantively simpler than any other approach. I plan to rename
it accordingly once the dust settles.

The key ideas of the new loop unswitcher are carried over for
non-trivial unswitching:
1) Fully unswitch a branch or switch instruction from inside of a loop to
   outside of it.
2) Update the CFG and IR. This avoids needing to "remember" the
   unswitched branches as well as avoiding excessively cloning and
   reliance on complex parts of simplify-cfg to cleanup the cfg.
3) Update the analyses (where we can) rather than just blowing them away
   or relying on something else updating them.

Sadly, #3 is somewhat compromised here as the dominator tree updates
were too complex for me to want to reason about. I will need to make
another attempt to do this now that we have a nice dynamic update API
for dominators. However, we do adhere to #3 w.r.t. LoopInfo.

This approach also adds an important principle specific to non-trivial
unswitching: not *all* of the loop will be duplicated when unswitching.
This fact allows us to compute the cost in terms of how much *duplicate*
code is inserted rather than just on raw size. Unswitching conditions
which essentially partition loops will work regardless of the total loop
size.

Some remaining issues that I will be addressing in subsequent commits:
- Handling unstructured control flow.
- Unswitching 'switch' cases instead of just branches.
- Moving to the dynamic update API for dominators.

Some high-level, interesting limitations that folks might want to push
on as follow-ups but that I don't have any immediate plans around:
- We could be much more clever about not cloning things that will be
  deleted. In fact, we should be able to delete *nothing* and do
  a minimal number of clones.
- There are many more interesting selection criteria for which branch to
  unswitch that we might want to look at. One that I'm particularly
  interested in is a set of conditions which all exit the loop and which
  can be merged into a single unswitched test of them.

Differential revision: https://reviews.llvm.org/D34200

llvm-svn: 318549
2017-11-17 19:58:36 +00:00
Max Kazantsev 1ac6e8ae61 [IRCE] Remove folding of two range checks into RANGE_CHECK_BOTH
The logic of replacing a pair of `RANGE_CHECK_LOWER + RANGE_CHECK_UPPER` checks
with `RANGE_CHECK_BOTH` in fact duplicates the logic of range intersection which
happens when we calculate safe iteration space. Effectively, the result of intersecting
these ranges doesn't differ from the range of the merged range check.

We chose to remove duplicating logic in favor of code simplicity.

Differential Revision: https://reviews.llvm.org/D39589

llvm-svn: 318508
2017-11-17 06:49:26 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Mandeep Singh Grang e6bb66357c [PredicateInfo] Add comment about why we require stable sort
llvm-svn: 318487
2017-11-17 00:43:24 +00:00
Walter Lee 8f1545c629 [asan] Fix small X86_64 ShadowOffset for non-default shadow scale
The requirement is that shadow memory must be aligned to page
boundaries (4k in this case).  Use a closed form equation that always
satisfies this requirement.

Differential Revision: https://reviews.llvm.org/D39471

llvm-svn: 318421
2017-11-16 17:03:00 +00:00
Sanjay Patel b3fa94586f [InstCombine] include 'sub' in the list of narrow-able binops
// trunc (binop X, C) --> binop (trunc X, C')
// trunc (binop (ext X), Y) --> binop X, (trunc Y)

I'm grouping sub with the other binops  because that makes the code simpler
and the transforms are valid:
https://rise4fun.com/Alive/UeF
...so even though we don't expect a sub with constant Op1 or any of the
other opcodes with constant Op0 due to canonicalization rules, we might as
well handle those situations if non-canonical code somehow reaches this
point (it should just make instcombine more efficient in reaching its
end goal).
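The validity of narrowing a subtraction can be checked with ordinary wrapping integer arithmetic. The sketch below is an illustration with hypothetical function names, using a 32-bit to 8-bit truncation as an assumed example width.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: trunc (sub X, C).
inline std::uint8_t truncOfWideSub(std::uint32_t X, std::uint32_t C) {
  return static_cast<std::uint8_t>(X - C);
}

// Narrow form: sub (trunc X), (trunc C). The final cast wraps the
// promoted int result back to 8 bits.
inline std::uint8_t narrowSub(std::uint32_t X, std::uint32_t C) {
  return static_cast<std::uint8_t>(static_cast<std::uint8_t>(X) -
                                   static_cast<std::uint8_t>(C));
}
```

Both forms agree because the low 8 bits of a difference depend only on the low 8 bits of the operands, for example with X = 300 and C = 45.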

This should solve the problem that later manifests in the vectorizers in 
PR35295:
https://bugs.llvm.org/show_bug.cgi?id=35295

llvm-svn: 318404
2017-11-16 14:40:51 +00:00
Walter Lee 2a2b69e9c7 [asan] Fix size/alignment issues with non-default shadow scale
Fix a couple places where the minimum alignment/size should be a
function of the shadow granularity:
- alignment of AllGlobals
- the minimum left redzone size on the stack

Added a test to verify that the metadata_array is properly aligned
for shadow scale of 5, to be enabled when we add build support
for testing shadow scale of 5.

Differential Revision: https://reviews.llvm.org/D39470

llvm-svn: 318395
2017-11-16 12:57:19 +00:00
Max Kazantsev b1b8aff2e7 [IRCE] Fix SCEVExpander's usage in IRCE
When expanding exit conditions for pre- and postloops, we may end up expanding a
recurrence from the loop in its loop's preheader. This produces incorrect IR.

This patch ensures that IRCE uses SCEVExpander correctly and only expands code which
is safe to expand in this particular location.

Differential Revision: https://reviews.llvm.org/D39234

llvm-svn: 318381
2017-11-16 06:06:27 +00:00
Evgeniy Stepanov 396ed67950 [asan] Fallback to non-ifunc dynamic shadow on android<22.
Summary: Android < 22 does not support ifunc.

Reviewers: pcc

Subscribers: srhines, kubamracek, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40116

llvm-svn: 318369
2017-11-16 02:52:19 +00:00
Craig Topper 062bcf30b1 [GVNHoist] Fix a signed/unsigned comparison warning that occurs in 32-bit builds with gcc.
std::distance returns ptrdiff_t which is signed. 64-bit builds don't notice because type promotion widens the unsigned first.
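A minimal sketch of the kind of comparison involved. The function name is hypothetical and the explicit cast is just one way to avoid the warning; the message does not say which fix the commit used.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// The difference type returned by std::distance is signed.
static_assert(
    std::is_signed<std::iterator_traits<
        std::vector<int>::const_iterator>::difference_type>::value,
    "std::distance yields a signed type");

// Comparing the distance with an unsigned count directly can trigger a
// signed/unsigned comparison warning when both types have the same width.
// Casting the (non-negative) distance makes both sides unsigned.
inline bool hasAtLeast(const std::vector<int> &V, unsigned N) {
  return static_cast<std::size_t>(std::distance(V.begin(), V.end())) >= N;
}
```

The cast is safe here because the distance from `begin()` to `end()` is never negative.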

llvm-svn: 318354
2017-11-16 00:19:59 +00:00
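As a sketch of the class of warning addressed by r318354 above: `std::distance` returns a signed `difference_type` (`ptrdiff_t` for pointers), so comparing it directly against an `unsigned` value triggers a sign-compare warning on targets where both are 32 bits wide. The helper name `exceedsLimit` is hypothetical and is not taken from GVNHoist; it only shows one sign-safe way to write such a comparison.

```cpp
#include <cstddef>

// Hypothetical helper, not the GVNHoist code. Count stands in for the
// result of std::distance(First, Last), which is a signed ptrdiff_t.
constexpr bool exceedsLimit(std::ptrdiff_t Count, unsigned Limit) {
  // Reject negative counts first, then compare two unsigned values so that
  // no implicit signed-to-unsigned conversion happens in the comparison.
  return Count >= 0 && static_cast<std::size_t>(Count) > Limit;
}
```

With this shape, `exceedsLimit(-1, 3)` is false rather than silently comparing a huge unsigned value against the limit.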
Sanjay Patel 03d0cd6a81 [InstCombine] trunc (binop X, C) --> binop (trunc X, C')
Note that one-use and shouldChangeType() are checked ahead of the switch.

Without the narrowing folds, we can produce inferior vector code as shown in PR35299:
https://bugs.llvm.org/show_bug.cgi?id=35299

llvm-svn: 318323
2017-11-15 19:12:01 +00:00
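The narrowing fold in r318323 above relies on the fact that, for wrapping integer arithmetic, truncating after the operation gives the same low bits as truncating the operands first. The following is a minimal illustration of that identity for `add` on 64-bit to 32-bit truncation; the function names are made up for this sketch and it is not the InstCombine implementation.

```cpp
#include <cstdint>

// trunc (add X, C): do the add in 64 bits, then truncate to 32 bits.
constexpr std::uint32_t addThenTrunc(std::uint64_t X, std::uint64_t C) {
  return static_cast<std::uint32_t>(X + C);
}

// add (trunc X), C': truncate both operands first, then add in 32 bits.
constexpr std::uint32_t truncThenAdd(std::uint64_t X, std::uint64_t C) {
  return static_cast<std::uint32_t>(X) + static_cast<std::uint32_t>(C);
}
```

Both forms agree for every input because unsigned addition wraps modulo 2^32 in the narrow type, which is exactly what truncation of the wide result computes.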
Reid Kleckner 72b819b8ee [InstCombine] Salvage debug info during initial DCE
InstCombine salvages debug info for every instruction it erases from its
worklist, but it wasn't doing it during its initial DCE when populating
its worklist. This fixes that.

This should help improve availability of 'this' in optimized debug info
when casts are necessary.

llvm-svn: 318320
2017-11-15 18:51:12 +00:00
Adam Nemet 572a87c76f [SLP] Added more missed optimization remarks
Summary:
Added more remarks to SLP pass, in particular "missed" optimization remarks.
Also proposed several tests for new functionality.

Patch by Vladimir Miloserdov!

For reference you may look at: https://reviews.llvm.org/rL302811

Reviewers: anemet, fhahn

Reviewed By: anemet

Subscribers: javed.absar, lattner, petecoup, yakush, llvm-commits

Differential Revision: https://reviews.llvm.org/D38367

llvm-svn: 318307
2017-11-15 17:04:53 +00:00
Sanjay Patel d1becd082a [Reassociate] simplify code; NFCI
llvm-svn: 318298
2017-11-15 16:19:17 +00:00
Craig Topper f7b86728fa [InstCombine] Simplify binops that are only used by a select and are fed by a select with the same condition.
Summary:
This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know it's only used by the select we can propagate the appropriate input value from the earlier select.

As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero?

Reviewers: spatel, majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39999

llvm-svn: 318267
2017-11-15 05:23:02 +00:00
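The select/binop/select pattern from r318267 above can be sketched in C++ terms as follows. This is only an illustration of the equivalence, assuming C++14 `constexpr`; the function names are hypothetical, and `xor` is used deliberately because the commit message itself leaves open whether division is safe to handle this way.

```cpp
// Before: the binop's input comes from a select, and the binop's only user
// is a second select on the same condition.
constexpr int beforeFold(bool Cond, int A, int B, int D) {
  int T = Cond ? A : B; // first select
  int U = T ^ 1;        // binop, used only by the select below
  return Cond ? U : D;  // second select, same condition
}

// After: U is only chosen when Cond is true, and in that case T is A,
// so the first select can be bypassed.
constexpr int afterFold(bool Cond, int A, int /*B*/, int D) {
  return Cond ? (A ^ 1) : D;
}
```

The `B` operand drops out entirely in the folded form, which is what lets the earlier select become dead if it has no other users.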
Hans Wennborg 45cabacd2f Revert r318193 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
It crashes building sqlite; see reply on the llvm-commits thread.

> [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
>
>         Patch tries to improve vectorization of the following code:
>
>         void add1(int * __restrict dst, const int * __restrict src) {
>           *dst++ = *src++;
>           *dst++ = *src++ + 1;
>           *dst++ = *src++ + 2;
>           *dst++ = *src++ + 3;
>         }
>         Allows to vectorize even if the very first operation is not a binary add, but just a load.
>
>         Fixed issues related to previous commit.
>
>         Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
>
>         Reviewed By: ABataev, RKSimon
>
>         Subscribers: llvm-commits, RKSimon
>
>         Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 318239
2017-11-15 00:38:13 +00:00
Craig Topper bf6495fbcb [LoopRotate] processLoop should return true even if it just simplified the loop latch without making any other changes
Simplifying a loop latch changes the IR and we need to make sure the pass manager knows to invalidate analysis passes if that happened.

PR35210 discovered a case where we failed to invalidate the post dominator tree after this simplification because we made no changes other than simplifying the loop latch.

Fixes PR35210.

Differential Revision: https://reviews.llvm.org/D40035

llvm-svn: 318237
2017-11-15 00:22:42 +00:00
Evgeniy Stepanov cff19ee233 [asan] Prevent rematerialization of &__asan_shadow.
Summary:
In the mode when ASan shadow base is computed as the address of an
external global (__asan_shadow, currently on android/arm32 only),
regalloc prefers to rematerialize this value to save register spills.
Even in -Os. On arm32 it is rather expensive (2 loads + 1 constant
pool entry).

This change adds an inline asm in the function prologue to suppress
this behavior. It reduces AsanTest binary size by 7%.

Reviewers: pcc, vitalybuka

Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40048

llvm-svn: 318235
2017-11-15 00:11:51 +00:00
Davide Italiano 1380cb8055 [EntryExitInstrumenter] Placate GCC, the semicolon is redundant. NFCI.
llvm-svn: 318217
2017-11-14 23:13:38 +00:00
Sanjay Patel 64fd333304 [Reassociate] use dyn_cast instead of isa+cast; NFCI
llvm-svn: 318212
2017-11-14 23:03:56 +00:00
Reid Kleckner 29a5c03cc2 Make salvageDebugInfo of casts work for dbg.declare and dbg.addr
Summary:
Instcombine (and probably other passes) sometimes want to change the
type of an alloca. To do this, they generally create a new alloca with
the desired type, create a bitcast to make the new pointer type match
the old pointer type, replace all uses with the cast, and then simplify
the casts. We already knew how to salvage dbg.value instructions when
removing casts, but we can extend it to cover dbg.addr and dbg.declare.

Fixes a debug info quality issue uncovered in Chromium in
http://crbug.com/784609

Reviewers: aprantl, vsk

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40042

llvm-svn: 318203
2017-11-14 21:49:06 +00:00
Hans Wennborg e1ecd61b98 Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.

This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)

LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.

Differential Revision: https://reviews.llvm.org/D39287

llvm-svn: 318195
2017-11-14 21:09:45 +00:00
Dinar Temirbulatov 2bd1836520 [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:
    
        void add1(int * __restrict dst, const int * __restrict src) {
          *dst++ = *src++;
          *dst++ = *src++ + 1;
          *dst++ = *src++ + 2;
          *dst++ = *src++ + 3;
        }
        Allows to vectorize even if the very first operation is not a binary add, but just a load.
    
        Fixed issues related to previous commit.
    
        Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
    
        Reviewed By: ABataev, RKSimon
    
        Subscribers: llvm-commits, RKSimon
    
        Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 318193
2017-11-14 20:55:08 +00:00
Mandeep Singh Grang b8a11bbcf1 [PredicateInfo] Stable sort ValueDFS to remove non-deterministic ordering
Summary: This fixes failure in Transforms/Util/PredicateInfo/testandor.ll uncovered by D39245.

Reviewers: dberlin

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39630

llvm-svn: 318165
2017-11-14 18:22:50 +00:00
Gil Rapaport 848581cadb [LV] Introduce VPBlendRecipe, VPWidenMemoryInstructionRecipe
This patch is part of D38676.

The patch introduces two new Recipes to handle instructions whose vectorization
involves masking. These Recipes take VPlan-level masks in D38676, but still rely
on ILV's existing createEdgeMask(), createBlockInMask() in this patch.

VPBlendRecipe handles intra-loop phi nodes, which are vectorized as a sequence
of SELECTs. Its execute() code is refactored out of ILV::widenPHIInstruction(),
which now handles only loop-header phi nodes.

VPWidenMemoryInstructionRecipe handles load/store which are to be widened
(but are not part of an Interleave Group). In this patch it simply calls
ILV::vectorizeMemoryInstruction on execute().

Differential Revision: https://reviews.llvm.org/D39068

llvm-svn: 318149
2017-11-14 12:09:30 +00:00
Chandler Carruth 00a301d568 [PM] Port BoundsChecking to the new PM.
Registers it and everything, updates all the references, etc.

Next patch will add support to Clang's `-fexperimental-new-pass-manager`
path to actually enable BoundsChecking correctly.

Differential Revision: https://reviews.llvm.org/D39084

llvm-svn: 318128
2017-11-14 01:30:04 +00:00
Chandler Carruth 1594feea94 [PM] Refactor BoundsChecking further to prepare it to be exposed both as
a legacy and new PM pass.

This essentially moves the class state to parameters and re-shuffles the
code to make that reasonable. It also does some minor cleanups along the
way and leaves some comments.

Differential Revision: https://reviews.llvm.org/D39081

llvm-svn: 318124
2017-11-14 01:13:59 +00:00
Hans Wennborg 08b34a017a Update some code.google.com links
llvm-svn: 318115
2017-11-13 23:47:58 +00:00
Jatin Bhateja c61ade1ca0 [SCEV] Handling for ICmp occurring in the evolution chain.
Summary:
 If a compare instruction is same or inverse of the compare in the
 branch of the loop latch, then return a constant evolution node.
 This shall facilitate computations of loop exit counts in cases
 where compare appears in the evolution chain of induction variables.

 Will fix PR 34538

Reviewers: sanjoy, hfinkel, junryoungju

Reviewed By: sanjoy, junryoungju

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38494

llvm-svn: 318050
2017-11-13 16:43:24 +00:00
Bill Seurer 44156a0efb [PowerPC][msan] Update msan to handle changed memory layouts in newer kernels
In more recent Linux kernels (including those with 47 bit VMAs) the layout of
virtual memory for powerpc64 changed causing the memory sanitizer to not
work properly. This patch adjusts a bit mask in the memory sanitizer to work
on the newer kernels while continuing to work on the older ones as well.

This is the non-runtime part of the patch and finishes it. ref: r317802

Tested on several 4.x and 3.x kernel releases.

llvm-svn: 318045
2017-11-13 15:43:19 +00:00
Florian Hahn 7114755913 [CodeExtractor] Add missing AllowVarArgs initialization.
llvm-svn: 318029
2017-11-13 11:08:47 +00:00
Florian Hahn 0e9dec672d [PartialInliner] Inline vararg functions that forward varargs.
Summary:
This patch extends the partial inliner to support inlining parts of
vararg functions, if the vararg handling is done in the outlined part.

It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is
non-null, all varargs passed to the inlined function will be added to
all calls to `ForwardVarArgsTo`.

The partial inliner takes care to only pass `ForwardVarArgsTo` if the
varargs handling is done in the outlined function. It checks that vastart
is not part of the function to be inlined.

`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part
of the repo) checks we do not do partial inlining if vastart is used in
a basic block that will be inlined.

Reviewers: davide, davidxl, grosser

Reviewed By: davide, davidxl, grosser

Subscribers: gyiu, grosser, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D39607

llvm-svn: 318028
2017-11-13 10:35:52 +00:00
Craig Topper d3e5781e53 [InstCombine] Teach visitICmpInst to not break integer absolute value idioms
Summary:
This patch adds an early out to visitICmpInst if we are looking at a compare as part of an integer absolute value idiom. Similar is already done for min/max.

In the particular case I observed in a benchmark we had an absolute value of a load from an indexed global. We simplified the compare using foldCmpLoadFromIndexedGlobal into a magic bit vector, a shift, and an and. But the load result was still used for the select and the negate part of the absolute value idiom. So we overcomplicated the code and lost the ability to recognize it as an absolute value.

I've chosen a simpler case for the test here.

Reviewers: spatel, davide, majnemer

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39766

llvm-svn: 317994
2017-11-12 02:28:21 +00:00
Evgeniy Stepanov 989299c42b [asan] Use dynamic shadow on 32-bit Android.
Summary:
The following kernel change has moved ET_DYN base to 0x4000000 on arm32:
https://marc.info/?l=linux-kernel&m=149825162606848&w=2

Switch to dynamic shadow base to avoid such conflicts in the future.

Reserve shadow memory in an ifunc resolver, but don't use it in the instrumentation
until PR35221 is fixed. This will eventually let us save one load per function.

Reviewers: kcc

Subscribers: aemerson, srhines, kubamracek, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D39393

llvm-svn: 317943
2017-11-10 22:27:48 +00:00
Davide Italiano acf6065183 [SimplifyCFG] Use auto * when the type is obvious. NFCI.
llvm-svn: 317923
2017-11-10 20:46:21 +00:00
Daniel Neilson 6e4aa1e481 Expand IRBuilder interface for atomic memcpy to require pointer alignments. (NFC)
Summary:
 The specification of the @llvm.memcpy.element.unordered.atomic intrinsic requires
that the pointer arguments have alignments of at least the element size. The existing
IRBuilder interface to create a call to this intrinsic does not allow for providing
the alignment of these pointer args. Having an interface that makes it easy to
construct invalid intrinsic calls doesn't seem sensible, so this patch simply
adds the requirement that one provide the argument alignments when using IRBuilder
to create atomic memcpy calls.

llvm-svn: 317918
2017-11-10 19:38:12 +00:00
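The constraint that r317918 above makes explicit (both pointer arguments of the atomic memcpy intrinsic must be aligned to at least the element size) can be written as a one-line predicate. The function below is a hypothetical stand-alone check for illustration only; it is not the IRBuilder interface and its name and parameters are assumptions.

```cpp
#include <cstdint>

// Hypothetical validity check, not part of IRBuilder: both pointer
// alignments must be at least the element size of the atomic memcpy.
constexpr bool hasValidAtomicMemcpyAlignment(std::uint32_t DstAlign,
                                             std::uint32_t SrcAlign,
                                             std::uint32_t ElementSize) {
  return DstAlign >= ElementSize && SrcAlign >= ElementSize;
}
```

Requiring callers to pass both alignments up front means a call with, say, a 2-byte-aligned source and 4-byte elements can be rejected at construction time rather than producing an invalid intrinsic call.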
Sanjoy Das 6fabb90765 [CVP] Remove some {s|u}add.with.overflow checks.
Summary:
This adds logic to CVP to remove some overflow checks.  It uses LVI to remove
operations with at least one constant.  Specifically, this can remove many
overflow intrinsics immediately following an overflow check in the source code,
such as:

if (x < INT_MAX)
    ... x + 1 ...

Patch by Joel Galenson!

Reviewers: sanjoy, regehr

Reviewed By: sanjoy

Subscribers: fhahn, pirama, srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D39483

llvm-svn: 317911
2017-11-10 19:13:35 +00:00
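The source pattern quoted in r317911 above can be expanded into a small self-contained sketch. The helper `signedAddOverflows` is a hand-written stand-in for the overflow bit of an `sadd.with.overflow` style check, and both function names are assumptions made for this example; the point is only that the guard makes the overflow result provably false.

```cpp
#include <climits>

// Reports whether the signed add X + C would overflow, without performing it.
constexpr bool signedAddOverflows(int X, int C) {
  return C > 0 ? X > INT_MAX - C : X < INT_MIN - C;
}

// Mirrors the pattern from the commit message: the guard bounds X below
// INT_MAX, so an overflow check on X + 1 inside the branch is always false.
constexpr int incrementIfSafe(int X) {
  return X < INT_MAX ? X + 1 : X;
}
```

Inside the guarded branch the range of `X` excludes `INT_MAX`, which is the kind of range fact the commit says LVI supplies to CVP.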
Easwaran Raman 0a0913def2 Add a wrapper function to set branch weights metadata.
Summary:
This wrapper checks if there is at least one non-zero weight before
setting the metadata.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39872

llvm-svn: 317845
2017-11-09 22:52:20 +00:00
Paul Robinson b46256b0b4 Fix out-of-order stepping behavior in programs with hoisted constants.
When the Constant Hoisting pass moves expensive constants into a
common block, it would assign a debug location equal to the last use
of that constant. While this is certainly intuitive, it places the
constant in an out-of-order location, according to the debug location
information. This produces out-of-order stepping when debugging
programs affected by this pass.

This patch creates in-order stepping behavior by merging the debug
locations for hoisted constants, and the new insertion point.

Patch by Matthew Voss!

Differential Revision: https://reviews.llvm.org/D38088

llvm-svn: 317827
2017-11-09 20:01:31 +00:00
Alexey Bataev 0bd9004425 [SLP] Fix PR23510: Try to find best possible vectorizable stores.
Summary:
The analysis of the store sequence goes in straight order - from the
first store to the last. But the best opportunity for vectorization will
happen if we're going to use reverse order - from last store to the
first. It may be best because usually users have some initialization
part + further processing and this first initialization may confuse
SLP vectorizer.

Reviewers: RKSimon, hfinkel, mkuper, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39606

llvm-svn: 317821
2017-11-09 19:07:16 +00:00
Sanjay Patel 0d66010454 [Reassociate] don't name values "tmp"; NFCI
The toxic stew of created values named 'tmp' and tests that already have
values named 'tmp' and CHECK lines looking for values named 'tmp' causes
bad things to happen in our test line auto-generation scripts because it
wants to use 'TMP' as a prefix for unnamed values. Use less 'tmp' to 
avoid that.

llvm-svn: 317818
2017-11-09 18:14:24 +00:00
Serguei Katkov 722339e405 [GVN PRE] Patch the source for Phi node in PRE
We must patch all existing incoming values of Phi node,
otherwise it is possible that we can see poison
where program does not expect to see it.

This is similar to what GVN does.

The added test test/Transforms/GVN/PRE/pre-jt-add.ll shows an
example of a wrong optimization done by jump threading because
GVN PRE did not patch the existing incoming value.

Reviewers: mkazantsev, wmi, dberlin, davide
Reviewed By: dberlin
Subscribers: efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D39637

llvm-svn: 317768
2017-11-09 06:02:18 +00:00
Dan Gohman 2c74fe977d Add an @llvm.sideeffect intrinsic
This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].

Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.

As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.

[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html

Differential Revision: https://reviews.llvm.org/D38336

llvm-svn: 317729
2017-11-08 21:59:51 +00:00
Teresa Johnson 07ec7d59c2 [ThinLTO] Ensure sanitizer passes are run
Summary:
In ThinLTO compilation, we exit populateModulePassManager early and
were not adding PM extension passes meant to run at the end of the
pipeline. This includes sanitizer passes. Add these passes before
the early exit.

A test will be added to projects/compiler-rt.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D39565

llvm-svn: 317714
2017-11-08 19:45:52 +00:00
Mitch Phillips 0222224da6 Revert rL317618
The implemented pass fails and is breaking a large number of unit tests.
Example:
http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/5777/steps/build-stage3-compiler/logs/stdio

This reverts commit rL317618

llvm-svn: 317641
2017-11-08 00:20:53 +00:00
Dinar Temirbulatov b9a2832874 [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:

    void add1(int * __restrict dst, const int * __restrict src) {
      *dst++ = *src++;
      *dst++ = *src++ + 1;
      *dst++ = *src++ + 2;
      *dst++ = *src++ + 3;
    }
    Allows to vectorize even if the very first operation is not a binary add, but just a load.

    Fixed PR34619 and other issues related to previous commit.

    Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev

    Reviewed By: ABataev, RKSimon

    Subscribers: llvm-commits, RKSimon

    Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 317618
2017-11-07 21:25:34 +00:00
Craig Topper 7dd4d32431 Recommit r317510 "[InstCombine] Pull shifts through a select plus binop with constant"
The hexagon test should be fixed now.

Original commit message:

This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.

This can allow us to get the select closer to other selects to enable removing one.

Differential Revision: https://reviews.llvm.org/D39222

llvm-svn: 317600
2017-11-07 18:47:24 +00:00
Craig Topper 386fc2516c [InstCombine] Update stale comment. NFC
Datalayout is no longer optional so the comment didn't match what the code currently does.

llvm-svn: 317594
2017-11-07 17:37:32 +00:00
Adrian Prantl 25a09dd408 Make DIExpression::createFragmentExpression() return an Optional.
We can't safely split arithmetic into multiple fragments because we
can't express carry-over between fragments.

llvm-svn: 317534
2017-11-07 00:45:34 +00:00
Davide Italiano 1a46affb45 [IPO/LowerTypesTest] Skip blockaddress(es) when replacing uses.
Blockaddresses refer to the function itself, therefore replacing them
would cause an assertion in doRAUW.

Fixes https://bugs.llvm.org/show_bug.cgi?id=35201

This was found when trying CFI on a proprietary kernel by Dmitry Mikulin.

Differential Revision:  https://reviews.llvm.org/D39695

llvm-svn: 317527
2017-11-07 00:09:25 +00:00
Adrian Prantl 182f9fea37 InstCombine: salvage the debug info of DCE'ed add instructions.
rdar://problem/31209283

llvm-svn: 317522
2017-11-06 22:49:39 +00:00
Hans Wennborg 8c4b10e84a Revert r317510 "[InstCombine] Pull shifts through a select plus binop with constant"
This broke the CodeGen/Hexagon/loop-idiom/pmpy-mod.ll test on a bunch of buildbots.

> This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
>
> This can allow us to get the select closer to other selects to enable removing one.
>
> Differential Revision: https://reviews.llvm.org/D39222
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317510 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-svn: 317518
2017-11-06 22:28:02 +00:00
Xinliang David Li a531f189fc Fix comment /NFC
llvm-svn: 317514
2017-11-06 21:57:51 +00:00
Craig Topper 8917647333 [InstCombine] Pull shifts through a select plus binop with constant
This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.

This can allow us to get the select closer to other selects to enable removing one.

Differential Revision: https://reviews.llvm.org/D39222

llvm-svn: 317510
2017-11-06 21:07:22 +00:00
Dehao Chen 5d2a1a5045 Include already promoted counts when computing SUM for VP.
Summary: When computing the SUM for indirect call promotion, if the callsite is already promoted in the profile, it will be promoted before ICP. In the current implementation, ICP only sees remaining counts in SUM. This may cause extra indirect call targets being promoted. This patch updates the SUM to include the counts already promoted earlier. This way we do not end up promoting too many indirect call targets.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D38763

llvm-svn: 317502
2017-11-06 19:52:49 +00:00
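A small numeric sketch may help show why the SUM in r317502 above matters. The percentage-based test and the names `worthPromoting` and `totalCount` are assumptions made for illustration; the actual promotion heuristics in the indirect call promotion pass are not described in the commit message.

```cpp
#include <cstdint>

// Hypothetical promotion test: a target qualifies when its count is at least
// PercentThreshold percent of the call site's total count.
constexpr bool worthPromoting(std::uint64_t TargetCount, std::uint64_t Sum,
                              unsigned PercentThreshold) {
  return Sum != 0 && TargetCount * 100 >= Sum * PercentThreshold;
}

// Total count for a call site, including counts that were already promoted
// before ICP ran.
constexpr std::uint64_t totalCount(std::uint64_t Remaining,
                                   std::uint64_t AlreadyPromoted) {
  return Remaining + AlreadyPromoted;
}
```

With 60 counts already promoted and 40 remaining, a target with 30 counts looks like 75% of the call site if only the remaining counts are summed, but only 30% once the promoted counts are included, so under a 50% threshold it would no longer be promoted.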
Sanjay Patel 629c411538 [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.

As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic 
reassociation - 'AllowReassoc'.

We're also adding a bit to allow approximations for library functions called 'ApproxFunc' 
(this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did 
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), 
but that's apparently already used for other purposes. Also, I don't think we can just 
add a field to FPMathOperator because Operator is not intended to be instantiated. 
We'll defer movement of FMF to another day.

We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.

Finally, this change is binary incompatible with existing IR as seen in the 
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile 
them. For example, if nsw is ever replaced with something else, dropping it would be 
a valid way to upgrade the IR." 
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR 
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will 
fail to optimize some previously 'fast' code because it's no longer recognized as 
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.

Note: an inter-dependent clang commit to use the new API name should closely follow 
commit.

Differential Revision: https://reviews.llvm.org/D39304

llvm-svn: 317488
2017-11-06 16:27:15 +00:00
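The flag scheme described in r317488 above (seven independent bits, with 'fast' meaning that all of them are set) can be sketched as a plain bitmask. The bit positions and the exact enumerator spellings below are assumptions made for illustration; only the seven flag keywords and the names 'AllowReassoc' and 'ApproxFunc' come from the commit message.

```cpp
// Illustrative flag layout; the bit positions are an assumption.
enum FastMathFlag : unsigned {
  AllowReassoc    = 1u << 0, // 'reassoc': replaces the old UnsafeAlgebra bit
  NoNaNs          = 1u << 1, // 'nnan'
  NoInfs          = 1u << 2, // 'ninf'
  NoSignedZeros   = 1u << 3, // 'nsz'
  AllowReciprocal = 1u << 4, // 'arcp'
  AllowContract   = 1u << 5, // 'contract'
  ApproxFunc      = 1u << 6  // 'afn': the new approximate-functions bit
};

// All seven flags together.
constexpr unsigned AllFlags = AllowReassoc | NoNaNs | NoInfs | NoSignedZeros |
                              AllowReciprocal | AllowContract | ApproxFunc;

// 'fast' is shorthand for "every flag is set", not a flag of its own.
constexpr bool isFast(unsigned Flags) {
  return (Flags & AllFlags) == AllFlags;
}
```

Under this model, IR that previously carried only the umbrella bit would no longer satisfy `isFast`, which matches the commit's note that some previously 'fast' code may fail to be optimized until the `isFast()` uses are audited.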
Sean Fertile 4595a915f6 [LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local.
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compiler's view on what symbols will end up being local.

Originally committed as r317374, but reverted in r317395 to update some missed
tests.

Differential Revision: https://reviews.llvm.org/D35702

llvm-svn: 317408
2017-11-04 17:04:39 +00:00
Sean Fertile 39770ca0a1 Revert "[LTO][ThinLTO] Use the linker resolutions to mark global values ..."
Changes more tests than expected on one of the build bots.
Reverting to investigate.

This reverts https://llvm.org/svn/llvm-project/llvm/trunk@317374

llvm-svn: 317395
2017-11-04 01:54:20 +00:00
Davide Italiano c7c05ae4be [CallSiteSplitting] clang-format my last commit. NFCI.
Thanks to Rui for pointing out.

llvm-svn: 317393
2017-11-04 00:44:01 +00:00
Davide Italiano 91b4790b33 [CallSiteSplitting] Silence GCC's -Wparentheses. NFCI.
llvm-svn: 317385
2017-11-03 23:03:38 +00:00
Adrian Prantl 261ac8b23c Invoke salvageDebugInfo from CodeGenPrepare's SinkCast()
This preserves the debug info for the cast operation in the original location.

rdar://problem/33460652

Reapplied r317340 with the test moved into an ARM-specific directory.

llvm-svn: 317375
2017-11-03 21:55:03 +00:00
Sean Fertile 36528c2a9b [LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local.
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compiler's view on what symbols will end up being local.

Differential Revision: https://reviews.llvm.org/D35702

llvm-svn: 317374
2017-11-03 21:45:55 +00:00
Craig Topper 12463779d3 [SimplifyCFG] When merging conditional stores, don't count the store we're merging against the PHINodeFoldingThreshold
Merging conditional stores tries to check to see if the code is if convertible after the store is moved. But the store hasn't been moved yet so it's being counted against the threshold.

The patch adds 1 to the threshold comparison to make sure we don't count the store. I've adjusted a test to use a lower threshold to ensure we still do that conversion with the lower threshold.

Differential Revision: https://reviews.llvm.org/D39570

llvm-svn: 317368
2017-11-03 21:08:13 +00:00
Jun Bum Lim 0c99007db1 Recommit r317351 : Add CallSiteSplitting pass
This recommits r317351 after fixing a buildbot failure.

Original commit message:

    Summary:
    This change adds a pass which tries to split a call-site to pass
    more constrained arguments if its argument is predicated in the control flow
    so that we can expose better context to the later passes (e.g, inliner, jump
    threading, or IPA-CP based function cloning, etc.).
    As of now we support two cases :

    1) If a call site is dominated by an OR condition and if any of its arguments
    are predicated on this OR condition, try to split the condition with more
    constrained arguments. For example, in the code below, we try to split the
    call site since we can predicate the argument (ptr) based on the OR condition.

    Split from :
          if (!ptr || c)
            callee(ptr);
    to :
          if (!ptr)
            callee(null ptr)  // set the known constant value
          else if (c)
            callee(nonnull ptr)  // set non-null attribute in the argument

    2) We can also split a call-site based on constant incoming values of a PHI
    For example,
    from :
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2, label %BB1
          BB1:
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
           call void @bar(i32 %p)
    to
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2-split0, label %BB1
          BB1:
           br label %BB2-split1
          BB2-split0:
           call void @bar(i32 0)
           br label %BB2
          BB2-split1:
           call void @bar(i32 1)
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

llvm-svn: 317362
2017-11-03 20:41:16 +00:00
Aaron Ballman ecf0e95267 Add llvm::for_each as a range-based extension to <algorithm> and make use of it in some cases where it is a clearer alternative to std::for_each.
llvm-svn: 317356
2017-11-03 20:01:25 +00:00
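For r317356 above, a range-based `for_each` can be sketched as a thin wrapper that takes a whole range instead of an iterator pair. The version below lives in a made-up `sketch` namespace, assumes C++14 `constexpr`, and is not the `llvm::for_each` source; `Summer` and `sumOfThree` exist only to demonstrate a call.

```cpp
namespace sketch {
// Range-based counterpart of std::for_each: applies F to every element of R
// and returns the functor, as std::for_each does.
template <typename Range, typename Fn>
constexpr Fn for_each(Range &&R, Fn F) {
  for (auto &&Elem : R)
    F(Elem);
  return F;
}
} // namespace sketch

// Small functor used only to demonstrate the call.
struct Summer {
  int Total = 0;
  constexpr void operator()(int V) { Total += V; }
};

// Sums a local array through the range-based wrapper.
constexpr int sumOfThree() {
  int Values[] = {1, 2, 3};
  return sketch::for_each(Values, Summer{}).Total;
}
```

Compared with `std::for_each(std::begin(Values), std::end(Values), ...)`, the call site names the range once, which is the readability gain the commit message refers to.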
Jun Bum Lim 0eb1c2d63a Revert "Add CallSiteSplitting pass"
Revert due to Buildbot failure.

This reverts commit r317351.

llvm-svn: 317353
2017-11-03 19:17:11 +00:00
Jun Bum Lim 2a58933519 Add CallSiteSplitting pass
Summary:
This change adds a pass which tries to split a call-site to pass
more constrained arguments if its argument is predicated in the control flow
so that we can expose better context to the later passes (e.g, inliner, jump
threading, or IPA-CP based function cloning, etc.).
As of now we support two cases :

1) If a call site is dominated by an OR condition and if any of its arguments
are predicated on this OR condition, try to split the condition with more
constrained arguments. For example, in the code below, we try to split the
call site since we can predicate the argument (ptr) based on the OR condition.

Split from :
      if (!ptr || c)
        callee(ptr);
to :
      if (!ptr)
        callee(null ptr)  // set the known constant value
      else if (c)
        callee(nonnull ptr)  // set non-null attribute in the argument

2) We can also split a call-site based on constant incoming values of a PHI
For example,
from :
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2, label %BB1
      BB1:
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
       call void @bar(i32 %p)
to
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2-split0, label %BB1
      BB1:
       br label %BB2-split1
      BB2-split0:
       call void @bar(i32 0)
       br label %BB2
      BB2-split1:
       call void @bar(i32 1)
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]
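
For illustration only, case 1 above can be modelled in plain C++. This is a sketch, not the pass itself: `callee`, `beforeSplit` and `afterSplit` are hypothetical names, and the returned strings stand in for what each call site knows about its argument.

```cpp
#include <string>

// Hypothetical stand-in for the callee: reports what it observed about Ptr.
std::string callee(const int *Ptr) { return Ptr ? "nonnull" : "null"; }

// Original shape: a single call site dominated by an OR condition.
std::string beforeSplit(const int *Ptr, bool C) {
  if (!Ptr || C)
    return callee(Ptr);
  return "not called";
}

// Split shape: each call site sees a more constrained argument.
std::string afterSplit(const int *Ptr, bool C) {
  if (!Ptr)
    return callee(nullptr); // argument is the known constant null
  else if (C)
    return callee(Ptr);     // argument is known to be non-null on this path
  return "not called";
}
```

Both shapes call `callee` under the same conditions; the split form only makes the per-path facts about the argument explicit.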

Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide

Reviewed By: davidxl

Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D39137

llvm-svn: 317351
2017-11-03 19:01:57 +00:00
Evgeny Stupachenko d699de2b50 The patch fixes PR35131
Summary:

Fix a misprint which led to false CTLZ recognition.

Reviewers: craig.topper

Differential Revision: https://reviews.llvm.org/D39585

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 317348
2017-11-03 18:50:03 +00:00
Adrian Prantl 8fe9fb0ae5 Revert "Invoke salvageDebugInfo from CodeGenPrepare's SinkCast()"
This reverts commit 317342 while investigating bot breakage.

llvm-svn: 317345
2017-11-03 18:26:36 +00:00
Adrian Prantl 58e9a0bb16 Invoke salvageDebugInfo from CodeGenPrepare's SinkCast()
This preserves the debug info for the cast operation in the original location.

rdar://problem/33460652

llvm-svn: 317340
2017-11-03 18:00:02 +00:00
Jun Bum Lim f5fb3d745d [LICM] sink through non-trivially replicable PHI
Summary:
The current LICM allows sinking an instruction only when it is exposed to exit
blocks through a trivially replaceable PHI of which all incoming values are the
same instruction. This change enhances LICM to sink a sinkable instruction
through non-trivially replaceable PHIs by splitting predecessors of loop
exits.

Reviewers: hfinkel, majnemer, davidxl, bmakam, mcrosier, danielcdh, efriedma, jtony

Reviewed By: efriedma

Subscribers: nemanjai, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D37163

llvm-svn: 317335
2017-11-03 16:24:53 +00:00
Anna Thomas 6879721453 [LoopPredication] NFC: Refactored code to separate out functions being reused
Summary:
Refactored the code to separate out common functions that are being
reused.
This is to reduce the size of the upcoming changes for loop
predication with reverse loops.

This refactoring is what we have in our downstream code.

llvm-svn: 317324
2017-11-03 14:25:39 +00:00
Mikael Holmen 6018104d5e [ADCE] Use MapVector for BlockInfo to make iteration order deterministic
Summary:
Also added a reserve() method to MapVector since we want to use that from
ADCE.

DenseMap does not provide deterministic iteration order so with that
we will handle the members of BlockInfo in random order, eventually
leading to random order of the blocks in the predecessor lists.

Without this change, I get the same predecessor order about 90% of the
time when I compile a certain reproducer, and a different one the other 10%.
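
As a rough sketch of why an insertion-ordered map gives a deterministic iteration order, the following simplified container pairs a hash index with a vector of entries. It is an assumption-laden stand-in written for this note (the name `OrderedMap` is hypothetical), not the actual llvm::MapVector implementation.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified, hypothetical stand-in for an insertion-ordered map.
template <typename K, typename V> class OrderedMap {
  std::unordered_map<K, std::size_t> Index; // key -> position in Entries
  std::vector<std::pair<K, V>> Entries;     // entries in insertion order

public:
  typedef typename std::vector<std::pair<K, V>>::const_iterator const_iterator;

  // Pre-size both containers, in the spirit of the reserve() mentioned above.
  void reserve(std::size_t N) {
    Index.reserve(N);
    Entries.reserve(N);
  }

  // Look up Key, appending a default-constructed value on first use.
  V &operator[](const K &Key) {
    auto It = Index.find(Key);
    if (It == Index.end()) {
      It = Index.emplace(Key, Entries.size()).first;
      Entries.emplace_back(Key, V());
    }
    return Entries[It->second].second;
  }

  // Iteration walks the vector, so the order never depends on hashing.
  const_iterator begin() const { return Entries.begin(); }
  const_iterator end() const { return Entries.end(); }
};
```

Iterating such a container visits keys in the order they were first inserted, regardless of hash values or pointer addresses.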

No idea how to make a proper test case for this.

Reviewers: kuhar, david2050

Reviewed By: kuhar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39593

llvm-svn: 317323
2017-11-03 14:15:08 +00:00
Florian Hahn 41e32bfd68 [PartialInliner] Skip call sites where inlining fails.
Summary:
InlineFunction can fail, for example when trying to inline vararg
functions. In those cases, we do not want to bump partial inlining
counters or set AnyInlined to true, because this could leave an unused
function hanging around.
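
A minimal sketch of the bookkeeping described above, assuming a hypothetical `TryInline` callback in place of InlineFunction and integer ids in place of call sites:

```cpp
#include <functional>
#include <vector>

// Hypothetical counters mirroring the ones named in the summary.
struct InlineStats {
  unsigned NumInlined = 0;
  bool AnyInlined = false;
};

// TryInline stands in for InlineFunction and returns false on failure.
InlineStats inlineCallSites(const std::vector<int> &CallSites,
                            const std::function<bool(int)> &TryInline) {
  InlineStats Stats;
  for (int CS : CallSites) {
    if (!TryInline(CS))
      continue; // inlining failed: leave the counters and the flag untouched
    ++Stats.NumInlined;
    Stats.AnyInlined = true;
  }
  return Stats;
}
```

Only successful attempts are counted, so a run in which every attempt fails reports that nothing was inlined.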

Reviewers: davidxl, davide, gyiu

Reviewed By: davide

Subscribers: llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D39581

llvm-svn: 317314
2017-11-03 11:29:00 +00:00
Vedant Kumar 9196ed1be1 [LSR] Clarify a comment. NFC.
llvm-svn: 317295
2017-11-03 01:01:28 +00:00
Adrian Prantl fbb6fbf709 IndVarSimplify: preserve debug information attached to widened PHI nodes.
This fixes PR35015.

https://bugs.llvm.org/show_bug.cgi?id=35015

Differential Revision: https://reviews.llvm.org/D39345

llvm-svn: 317282
2017-11-02 23:17:06 +00:00
Hiroshi Yamauchi dce9def3dd Irreducible loop metadata for more accurate block frequency under PGO.
Summary:
Currently the block frequency analysis is an approximation for irreducible
loops.

The new irreducible loop metadata is used to annotate the irreducible loop
headers with their header weights based on the PGO profile (currently this is
approximated to be evenly weighted) and to help improve the accuracy of the
block frequency analysis for irreducible loops.

This patch is a basic support for this.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: mehdi_amini, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D39028

llvm-svn: 317278
2017-11-02 22:26:51 +00:00
Anna Thomas 1d02b13eb7 [LoopPredication] Enable predication when latchCheckIV is wider than rangeCheck
Summary:
This patch allows us to predicate range checks that have a type narrower than
the latch check type. We leverage SCEV analysis to identify a truncate for the
latchLimit and latchStart.
There are also safety checks in place which require the start and limit to be
known at compile time. We require this to make sure that the SCEV truncate expr
for the IV corresponding to the latch does not cause us to lose information
about the IV range.
Added tests show the loop predication over range checks that are of various
types and are narrower than the latch type.
This enhancement has been in our downstream tree for a while.

Reviewers: apilipenko, sanjoy, mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39500

llvm-svn: 317269
2017-11-02 21:21:02 +00:00
Anna Thomas 729dafc16b Strip off invariant.start because memory locations aren't invariant
The original change was reverted in rL317217 because of the failure in
the RS4GC testcase. I couldn't reproduce the failure on my local machine
(macbook) but could reproduce it on a linux box.

The failure was around removing the uses of invariant.start. The fix
here is to just RAUW undef (which was the first implementation in D39388).
This is perfectly valid IR as discussed in the review.

llvm-svn: 317225
2017-11-02 18:24:04 +00:00
Anna Thomas ebe429d99f Revert "[RS4GC] Strip off invariant.start because memory locations aren't invariant"
This reverts commit r317215, investigating the test failure.

llvm-svn: 317217
2017-11-02 16:45:51 +00:00
Anna Thomas 486a7aaa31 [RS4GC] Strip off invariant.start because memory locations aren't invariant
Summary:
Invariant.start on memory locations has the property that the memory
location is unchanging. However, this is not true in the face of
rewriting statepoints for GC.
Teach RS4GC about removing invariant.start so that optimizations after
RS4GC do not incorrectly sink a load from the memory location past a
statepoint.

Added test showcasing the issue.

Reviewers: reames, apilipenko, dneilson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39388

llvm-svn: 317215
2017-11-02 16:23:31 +00:00
Clement Courbet 82bade615b Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
undefined reference to `llvm::TargetPassConfig::ID' on
clang-ppc64le-linux-multistage

This reverts commit eea333c33fa73ad225ef28607795984829f65688.

llvm-svn: 317213
2017-11-02 15:53:10 +00:00
Clement Courbet 1dc37b9c3b [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.
Summary:
This is mostly a noop (most of the test diffs are renamed blocks).
There are a few temporary register renames (eax<->ecx) and a few blocks are
shuffled around.

See the discussion in PR33325 for more details.

Reviewers: spatel

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D39456

llvm-svn: 317211
2017-11-02 15:02:51 +00:00
Bjorn Pettersson e73b85d1ab [SimplifyCFG] Discard speculated dbg intrinsics
Summary:
SpeculativelyExecuteBB can flatten the CFG by doing
speculative execution followed by a select instruction.
When the speculatively executed BB contained dbg intrinsics
the result could be a little bit weird, since those dbg
intrinsics were inserted before the select in the flattened
CFG. So when single stepping in the debugger, printing the
value of the variable referenced in the dbg intrinsic, it
could happen that it looked like the variable had values
that never actually were assigned to the variable.

This patch simply discards all dbg intrinsics that were found
in the speculatively executed BB.

Reviewers: aprantl, chandlerc, craig.topper

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39494

llvm-svn: 317198
2017-11-02 11:55:14 +00:00
Adrian Prantl bfa77c4c85 loop-unroll: teach remapInstruction to update dbg.value intrinsics.
Fixes PR35112.

https://bugs.llvm.org/show_bug.cgi?id=35112

llvm-svn: 317138
2017-11-01 23:12:35 +00:00
Adrian Prantl 98c6549e4a loop-rotate: avoid duplicating dbg.value intrinsics in the entry block.
This fixes the second half of PR35113.

This reapplies r317106 without modifications.

llvm-svn: 317121
2017-11-01 20:53:22 +00:00
Adrian Prantl d60f34c20a loop-rotate: eliminate duplicate debug intrinsics after splicing.
Fixes part of PR35113.

This reapplies r317105 with an additional check for isa<Instruction>
as found by the bots.

llvm-svn: 317120
2017-11-01 20:43:30 +00:00
Dehao Chen c6c051f2ea Include GUIDs from the same module when computing GUIDs that need to be imported.
Summary: In the compile phase of SamplePGO+ThinLTO, ICP is not invoked. Instead, indirect call targets will be included as function metadata for ThinIndex to build the call graph. This should include not only functions defined in other modules, but also functions defined in the same module; otherwise ThinIndex may find the callee dead and eliminate it, while ICP in the backend will revive the symbol, which leads to an undefined symbol.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D39480

llvm-svn: 317118
2017-11-01 20:26:47 +00:00
Philip Reames 7b861f08cd Revert 317016 and 317048
The former appears to have introduced a miscompile in a stage2 clang build.  Revert so I can investigate offline.

llvm-svn: 317116
2017-11-01 19:49:20 +00:00
Adrian Prantl c8516346e4 Revert r317105 to investigate bot breakage.
llvm-svn: 317110
2017-11-01 18:06:38 +00:00
Adrian Prantl 40a0ea5f29 Revert r317106 to facilitate reverting r317105.
llvm-svn: 317109
2017-11-01 18:06:35 +00:00
Peter Collingbourne 9fb6e1a037 LTO: Apply global DCE to ThinLTO modules at LTO opt level 0.
This is necessary because DCE is applied to full LTO modules. Without
this change, a reference from a dead ThinLTO global to a dead full
LTO global will result in an undefined reference at link time.

This problem is only observable when --gc-sections is disabled, or
when targeting COFF, as the COFF port of lld requires all symbols to
have a definition even if all references are dead (this is consistent
with link.exe).

This change also adds an EliminateAvailableExternally pass at -O0. This
is necessary to handle the situation on Windows where a non-prevailing
copy of a linkonce_odr function has an SEH filter function; any
such filters must be DCE'd because they will contain a call to the
llvm.localrecover intrinsic, passing as an argument the address of the
function that the filter belongs to, and llvm.localrecover requires
this function to be defined locally.

Fixes PR35142.

Differential Revision: https://reviews.llvm.org/D39484

llvm-svn: 317108
2017-11-01 17:58:39 +00:00
Adrian Prantl 9259f21604 loop-rotate: avoid duplicating dbg.value intrinsics in the entry block.
This fixes the second half of PR35113.

llvm-svn: 317106
2017-11-01 17:28:50 +00:00
Adrian Prantl b627acd0ce loop-rotate: eliminate duplicate debug intrinsics after splicing.
Fixes part of PR35113.

llvm-svn: 317105
2017-11-01 17:28:47 +00:00
Max Kazantsev 6f5229d7da Revert rL311205 "[IRCE] Fix buggy behavior in Clamp"
This patch reverts rL311205 that was initially a wrong fix. The real problem
was in intersection of signed and unsigned ranges (see rL316552), and the
patch being reverted masked the problem instead of fixing it.

By now, the test against which rL311205 was made works OK even without this
code. This revert patch also contains a test case that demonstrates incorrect
behavior caused by rL311205: it is caused by an incorrect choice of signed max
instead of unsigned.

llvm-svn: 317088
2017-11-01 13:21:56 +00:00
Florian Hahn b93c06331e [CodeExtractor] Fix iterator invalidation in findOrCreateBlockForHoisting.
Summary:
By replacing branches to CommonExitBlock, we remove the node from
CommonExitBlock's predecessors, invalidating the iterator. The problem
is exposed when the common exit block has multiple predecessors and
needs to sink lifetime info. The modification in the test case triggers
the issue.
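
The general hazard can be sketched outside LLVM as follows. The `Block` type and both helpers are hypothetical; the snapshot shown here is one common way to avoid the invalidation and is not necessarily how the patch itself does it.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, minimal CFG node: only the predecessor list is modelled.
struct Block {
  std::vector<Block *> Preds;
};

// Retarget the edge Pred -> From so that it becomes Pred -> To.
// This edits From.Preds, which is what invalidates iterators into it.
void redirect(Block *Pred, Block &From, Block &To) {
  From.Preds.erase(std::remove(From.Preds.begin(), From.Preds.end(), Pred),
                   From.Preds.end());
  To.Preds.push_back(Pred);
}

// Redirect every predecessor of From to To.
void redirectAllPreds(Block &From, Block &To) {
  // Iterate over a snapshot, never over the list that redirect() mutates.
  std::vector<Block *> Snapshot(From.Preds.begin(), From.Preds.end());
  for (Block *Pred : Snapshot)
    redirect(Pred, From, To);
}
```

With a single predecessor the problem tends to stay hidden, because the loop ends right after the first mutation; with several, iterating the live list directly would walk invalidated iterators.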

Reviewers: davidxl, davide, wmi

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39112

llvm-svn: 317084
2017-11-01 09:48:12 +00:00
Philip Reames 357cd3289e [SimplifyIndVar] Inline makIVComparisonInvariant to eliminate code duplication [NFC]
This formulation might be slightly slower since I eagerly compute the cheap replacements.  If anyone sees this having a compile time impact, let me know and I'll use lazy population instead.

llvm-svn: 317048
2017-10-31 22:56:16 +00:00
Adrian Prantl deb437b038 loop-rotate: simplify code by using llvm::findDbgValues(). (NFC)
llvm-svn: 317037
2017-10-31 21:03:22 +00:00
Benjamin Kramer 992fc4ea2d [coro] Make Spill a proper struct instead of deriving from pair.
No functionality change.

llvm-svn: 317027
2017-10-31 19:22:55 +00:00
Craig Topper 7c7fcabd3f [SimplifyCFG] Use a more generic name for the selects created by SpeculativelyExecuteBB to prevent long names from being created
Currently the selects are created with the names of their inputs concatenated together. It's possible to get cases that chain these selects together resulting in long names due to multiple levels of concatenation. Our internal branch of llvm managed to generate names over 100000 characters in length on a particular test due to an extreme compounding of the names.

This patch changes the name to a generic name that is not dependent on its inputs.
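
To get a feel for the compounding, here is a small sketch. It assumes each chained select takes the previous select as both inputs and that the names are joined with a "." separator; the generic name "select" is a placeholder chosen for this note, not the name used by the patch.

```cpp
#include <cstddef>
#include <string>

// Length of a value name after Levels rounds of chained selects.
// Generic == false models concatenating the input names (assumed "." separator);
// Generic == true models a fixed, input-independent placeholder name.
std::size_t chainedNameLength(unsigned Levels, bool Generic) {
  std::string Name = "v";
  for (unsigned I = 0; I < Levels; ++I)
    Name = Generic ? std::string("select") : Name + "." + Name;
  return Name.size();
}
```

Under these assumptions the concatenated length roughly doubles per level, so 16 levels already exceed 100000 characters, while the generic name stays constant.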

Differential Revision: https://reviews.llvm.org/D39440

llvm-svn: 317024
2017-10-31 19:03:51 +00:00
Philip Reames dc417a9819 [IndVarSimplify] Extract wrapper around SE->isLoopInvariantPredicate [NFC]
This is an intermediate state; the next patch will re-inline the markLoopInvariantPredicate function to reduce code duplication.

llvm-svn: 317016
2017-10-31 18:04:57 +00:00
Philip Reames cd0a5bb96c [IndVarSimplify] Simplify code using a dictionary
Possibly very slightly slower, but this code is not performance critical and the readability benefit alone is huge.

llvm-svn: 317012
2017-10-31 17:06:32 +00:00
Reid Kleckner c212cc88e2 [asan] Upgrade private linkage globals to internal linkage on COFF
COFF comdats require symbol table entries, which means the comdat leader
cannot have private linkage.

llvm-svn: 317009
2017-10-31 16:16:08 +00:00
Benjamin Kramer 3f3d5be759 [LoopVectorize] Replace manual VPlan memory management with unique_ptr.
No functionality change intended.

llvm-svn: 317003
2017-10-31 14:58:22 +00:00
Matthew Simpson b6915fbfa2 [InstCombine] Simplify selects that test cmpxchg instructions
If a select instruction tests the returned flag of a cmpxchg instruction and
selects between the returned value of the cmpxchg instruction and its compare
operand, the result of the select will always be equal to its false value.
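
The reasoning can be checked with a small non-atomic model. This is only a sketch: `cmpxchg` here is a hypothetical single-threaded stand-in that returns the old value and a success flag, as the IR instruction does.

```cpp
#include <utility>

// Single-threaded model of cmpxchg: returns {old value, success flag}.
std::pair<int, bool> cmpxchg(int &Mem, int Compare, int New) {
  int Old = Mem;
  bool Success = (Old == Compare);
  if (Success)
    Mem = New;
  return {Old, Success};
}

// select(success, returned value, compare operand)
int selectReturnedElseCompare(int &Mem, int Compare, int New) {
  std::pair<int, bool> R = cmpxchg(Mem, Compare, New);
  return R.second ? R.first : Compare;
}

// select(success, compare operand, returned value)
int selectCompareElseReturned(int &Mem, int Compare, int New) {
  std::pair<int, bool> R = cmpxchg(Mem, Compare, New);
  return R.second ? Compare : R.first;
}
```

On success the returned value equals the compare operand, so the true arm and the false arm agree; on failure the false arm is selected anyway. Either way the select yields its false value.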

Differential Revision: https://reviews.llvm.org/D39383

llvm-svn: 316994
2017-10-31 12:34:02 +00:00
David Green 64f53b4214 [LoopUnroll] Clean up remarks for unroll remainder
The optimisation remarks for loop unrolling with an unrolled remainder look something like:

test.c:7:18: remark: completely unrolled loop with 3 iterations [-Rpass=loop-unroll]
            C[i] += A[i*N+j];
                 ^
test.c:6:9: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll]
        for(int j = 0; j < N; j++)
        ^
This removes the first of the two messages.

Differential revision: https://reviews.llvm.org/D38725

llvm-svn: 316986
2017-10-31 10:47:46 +00:00
Max Kazantsev 84286ce5dd [IRCE][NFC] Rename fields of InductiveRangeCheck
Rename `Offset`, `Scale`, `Length` into `Begin`, `Step`, `End` respectively
to make naming of similar entities for Ranges and Range Checks more
consistent.

Differential Revision: https://reviews.llvm.org/D39414

llvm-svn: 316979
2017-10-31 06:19:05 +00:00
Max Kazantsev 21e7b53490 [NFC] Get rid of variables used in assert only
llvm-svn: 316977
2017-10-31 05:33:58 +00:00
Philip Reames 59bf1e0548 [IndVarSimplify] Simplify code using preheader assumption
As noted in the nice block comment, the previous code didn't actually handle multi-entry loops correctly, it just assumed SCEV didn't analyze such loops.  Given SCEV has comments to the contrary, that seems a bit suspect.  More importantly, the pass actually requires loopsimplify form which ensures a loop-preheader is available.  Remove the excessive generality and shorten the code greatly.

Note that we do successfully analyze many multi-entry loops, but we do so by converting them to single entry loops.  See the added test case.

llvm-svn: 316976
2017-10-31 05:16:46 +00:00
Max Kazantsev 488ec975bb Reapply "[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors"
This patch fixes the miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
from hoisting across such instructions because there is no guarantee that a load standing after such
an instruction is still valid before it. For example, a load from under a guard may be
invalid before the guard in the following case:
  int array[LEN];
  ...
  guard(0 <= index && index < LEN);
  use(array[index]);

Differential Revision: https://reviews.llvm.org/D37460

llvm-svn: 316975
2017-10-31 05:07:56 +00:00
Philip Reames 39a8dbff87 [SimplifyIndVar] Extract out invariant expression handling
Previously, the code returned early from the *function* when it couldn't find a free expansion, it should be returning from the *transform*.  I don't have a test case, noticed this via inspection.

As a follow up, I'm going to revisit the logic in the extract function.  I think that essentially the whole helper routine can be replaced with SCEVExpander, but I wanted to do that in a series of separate commits.

llvm-svn: 316974
2017-10-31 04:19:06 +00:00
Philip Reames 5552f503d5 Undo accidental commit
These files shouldn't have been submitted in 316967

llvm-svn: 316968
2017-10-31 00:04:09 +00:00
Philip Reames 9c3cbeea39 [CGP] Fix crash on i96 bit multiply
Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725

If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area.  There's a bunch of obviously wrong code in the same function.  I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like.

llvm-svn: 316967
2017-10-30 23:59:51 +00:00
Yaxun Liu d23f23d81c InferAddressSpaces: Fix bug about replacing addrspacecast
InferAddressSpaces assumes the pointee type of addrspacecast
is the same as the operand, which is not always true and causes
invalid IR.

This bug causes a build failure in HCC.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D39432

llvm-svn: 316957
2017-10-30 21:19:41 +00:00
Davide Italiano 834b45129b [NewGVN] Stop assuming PHI args ordering when looking at phi-of-ops.
It's not guaranteed. There's a bug open to sort them in predecessor
order, but it won't happen anytime soon. In the meanwhile, passes
will have to do an O(#preds) scan. Such is life.

llvm-svn: 316953
2017-10-30 20:20:16 +00:00
Daniel Neilson f9c7d29c77 Create instruction classes for identifying any atomicity of memory intrinsic. (NFC)
Summary:
For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html

This patch fleshes out the instruction class hierarchy with respect to atomic and
non-atomic memory intrinsics. With this change, the relevant part of the class
hierarchy becomes:

IntrinsicInst
  -> MemIntrinsicBase (methods-only class)
    -> MemIntrinsic (non-atomic intrinsics)
      -> MemSetInst
      -> MemTransferInst
        -> MemCpyInst
        -> MemMoveInst
    -> AtomicMemIntrinsic (atomic intrinsics)
      -> AtomicMemSetInst
      -> AtomicMemTransferInst
        -> AtomicMemCpyInst
        -> AtomicMemMoveInst
    -> AnyMemIntrinsic (both atomicities)
      -> AnyMemSetInst
      -> AnyMemTransferInst
        -> AnyMemCpyInst
        -> AnyMemMoveInst

This involves some class renaming:
    ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
    ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
    ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
A script for doing this renaming in downstream trees is included below.

An example of where the Any* classes should be used in LLVM is when reasoning
about the effects of an instruction (ex: aliasing).

---
Script for renaming AtomicMem* classes:
PREFIXES="[<,([:space:]]"
CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
SUFFIXES="[;)>,[:space:]]"

REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
REGEX2="visitElementUnorderedAtomic(${CLASSES})"

FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )

SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"

for f in $FILES; do
    echo "Processing: $f"
    sed  -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
done

Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev

Reviewed By: sanjoy

Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38419

llvm-svn: 316950
2017-10-30 19:51:48 +00:00
Mandeep Singh Grang f83268bd9e [GVNHoist] Fix non-deterministic sort order of PHIs for identical instructions
Summary: This fixes failure in Transforms/GVNHoist/hoist.ll uncovered by D39245.

Reviewers: hiraditya, spop, dberlin

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39410

llvm-svn: 316949
2017-10-30 19:42:41 +00:00
Clement Courbet b2c3eb8cf1 [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of
   supported load sizes.
 - Expansion codegen does not assume that all power-of-two load sizes
   smaller than the max load size are valid. For example, this is not the
   case for x86(32bit)+sse2.

Fixes PR34887.

llvm-svn: 316905
2017-10-30 14:19:33 +00:00
Florian Hahn d0208b4b1c Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP.
This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addresses some additional nits.

Original commit message:
This updates the SCCP solver to use the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to ret i32 2 with
this change

    source_filename = "sccp.c"
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-unknown-linux-gnu"

    ; Function Attrs: norecurse nounwind readnone uwtable
    define i32 @main() local_unnamed_addr #0 {
    entry:
      %call = tail call fastcc i32 @f(i32 1)
      %call1 = tail call fastcc i32 @f(i32 47)
      %add3 = add nsw i32 %call, %call1
      ret i32 %add3
    }

    ; Function Attrs: noinline norecurse nounwind readnone uwtable
    define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
    entry:
      %c1 = icmp sle i32 %x, 100

      %cmp = icmp sgt i32 %x, 300
      %. = select i1 %cmp, i32 1, i32 2
      ret i32 %.
    }

    attributes #1 = { noinline }
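
As a rough illustration of the range reasoning, the sketch below joins the argument values seen at the two call sites into one inclusive interval and evaluates the signed comparison over it. The `Range` type and helper names are hypothetical simplifications, not the solver's actual lattice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical inclusive integer interval [Lo, Hi].
struct Range {
  int64_t Lo, Hi;
};

// Smallest interval covering both inputs (merging facts from two call sites).
Range join(Range A, Range B) {
  assert(A.Lo <= A.Hi && B.Lo <= B.Hi && "malformed interval");
  return {std::min(A.Lo, B.Lo), std::max(A.Hi, B.Hi)};
}

enum class Tri { False, True, Unknown };

// Evaluate "x sgt C" for every x in R.
Tri evalSGT(Range R, int64_t C) {
  if (R.Lo > C)
    return Tri::True;
  if (R.Hi <= C)
    return Tri::False;
  return Tri::Unknown;
}

// Mirrors f() above: select(x sgt 300, 1, 2).
// Returns the folded constant, or 0 when the comparison is not decided.
int foldF(Range X) {
  switch (evalSGT(X, 300)) {
  case Tri::True:
    return 1;
  case Tri::False:
    return 2;
  case Tri::Unknown:
    break;
  }
  return 0;
}
```

With the arguments 1 and 47 from @main, the joined interval is [1, 47], the comparison against 300 is always false, and the select folds to 2.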

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 316891
2017-10-30 10:07:42 +00:00
Max Kazantsev 390fc57771 [IRCE][NFC] Store Length as SCEV in RangeCheck instead of Value
llvm-svn: 316889
2017-10-30 09:35:16 +00:00
Florian Hahn d18443edad Revert r316887 to fix buildbot failures.
llvm-svn: 316888
2017-10-30 09:21:50 +00:00
Florian Hahn 925d3e4a98 Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP.
This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addresses some additional nits.

Original commit message:
This updates the SCCP solver to use the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to ret i32 2 with
this change

    source_filename = "sccp.c"
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-unknown-linux-gnu"

    ; Function Attrs: norecurse nounwind readnone uwtable
    define i32 @main() local_unnamed_addr #0 {
    entry:
      %call = tail call fastcc i32 @f(i32 1)
      %call1 = tail call fastcc i32 @f(i32 47)
      %add3 = add nsw i32 %call, %call1
      ret i32 %add3
    }

    ; Function Attrs: noinline norecurse nounwind readnone uwtable
    define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
    entry:
      %c1 = icmp sle i32 %x, 100

      %cmp = icmp sgt i32 %x, 300
      %. = select i1 %cmp, i32 1, i32 2
      ret i32 %.
    }

    attributes #1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 316887
2017-10-30 09:04:18 +00:00
Max Kazantsev 1d7c0439b9 [GVN][NFC] Mark instruction for deletion instead of immediate erasing in LoadPRE
It is done to uniformly handle instructions removal.

Differential Revision: https://reviews.llvm.org/D39369

llvm-svn: 316884
2017-10-30 04:48:34 +00:00
Sanjay Patel b049173157 [SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.

This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and 
D35411 (defer folding unconditional branches) with pass parameters rather than a named
"latesimplifycfg" pass. Now that we have individual options to control the functionality,
we could decouple when these fire (but that's an independent patch if desired). 

The next planned step would be to add another option bit to disable the sinking transform
mentioned in D38566. This should also make it clear that the new pass manager needs to
be updated to limit simplifycfg in the same way as the old pass manager.

Differential Revision: https://reviews.llvm.org/D38631

llvm-svn: 316835
2017-10-28 18:43:07 +00:00
Craig Topper 49687104d6 [PartialInlineLibCalls] Teach PartialInlineLibCalls to honor nobuiltin, properly check the function signature, and check TLI::has
Summary:
We shouldn't do this transformation if the function is marked nobuiltin.

We were only checking that the return type is floating point; we really should be checking the argument types and argument count as well. This can be accomplished by using the other version of getLibFunc that takes the Function and not just the name.

We should also be checking TLI::has since sqrtf is a macro on Windows.

Fixes PR32559.

Reviewers: hfinkel, spatel, davide, efriedma

Reviewed By: davide, efriedma

Subscribers: efriedma, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D39381

llvm-svn: 316819
2017-10-28 00:36:58 +00:00
Artur Pilipenko 8aadc643cf [LoopPredication] Handle the case when the guard and the latch IV have different offsets
This is a follow up change for D37569.

Currently the transformation is limited to the case when:
 * The loop has a single latch with the condition of the form: ++i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=.
 * The step of the IV used in the latch condition is 1.
 * The IV of the latch condition is the same as the post increment IV of the guard condition.
 * The guard condition is of the form i u< guardLimit.

This patch enables the transform in the case when the latch is

 latchStart + i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=.

And the guard is

 guardStart + i u< guardLimit

Reviewed By: anna

Differential Revision: https://reviews.llvm.org/D39097

llvm-svn: 316768
2017-10-27 14:46:17 +00:00
Max Kazantsev 665907c3c2 [GVN][NFC] Refactor loop iteration with foreach
llvm-svn: 316748
2017-10-27 08:19:35 +00:00
Eugene Zelenko 57bd5a0274 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316724
2017-10-27 01:09:08 +00:00
Philip Reames 29dd40b38e [SimplifyIndVars] Shorten code by using SCEV helper [NFC]
llvm-svn: 316709
2017-10-26 22:02:16 +00:00
Dehao Chen ed2d5402cb Do not add discriminator encoding for debug intrinsics.
Summary: There are certain requirements for debug location of debug intrinsics, e.g. the scope of the DILocalVariable should be the same as the scope of its debug location. As a result, we should not add discriminator encoding for debug intrinsics.

Reviewers: dblaikie, aprantl

Reviewed By: aprantl

Subscribers: JDevlieghere, aprantl, bjope, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D39343

llvm-svn: 316703
2017-10-26 21:20:52 +00:00
Philip Reames 21cc2fa3f6 [LICM] Restructure implicit exit handling to be more clear [NFCI]
When going to explain this to someone else, I got tripped up by the complicated meaning of IsKnownNonEscapingObject in load-store promotion.  Extract a helper routine and clarify naming/scopes to make this a bit more obvious.

llvm-svn: 316699
2017-10-26 21:00:15 +00:00
Balaram Makam 9ee942f481 Reapply r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.
Summary: This reverts r316612 to reapply r316582. The buildbot failure was unrelated to this commit.

Reviewers:

Subscribers:

llvm-svn: 316669
2017-10-26 15:04:53 +00:00
Bjorn Pettersson 86db068e39 [LSV] Avoid adding vectors of pointers as candidates
Summary:
We no longer add vectors of pointers as candidates for
load/store vectorization. It does not seem to work anyway,
but without this patch we can end up in asserts when trying
to create casts between an integer type and the vector of
pointers type.

The test case I've added used to assert like this when trying to
cast between i64 and <2 x i16*>:
opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
#0 PrintStackTraceSignalHandler(void*)
#1 SignalHandler(int)
#2 __restore_rt
#3 __GI_raise
#4 __GI_abort
#5 __GI___assert_fail
#6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::Instruction*)
#7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value*, llvm::Type*, llvm::Twine const&)
#8 Vectorizer::vectorizeStoreChain(llvm::ArrayRef<llvm::Instruction*>, llvm::SmallPtrSet<llvm::Instruction*, 16u>*)

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D39296

llvm-svn: 316665
2017-10-26 13:59:15 +00:00
Bjorn Pettersson 22a2282da1 [LSV] Skip all non-byte sizes, not only less than eight bits
Summary:
The code comments indicate that no effort has been spent on
handling load/stores when the size isn't a multiple of the
byte size correctly. However, the code only avoided types
smaller than 8 bits. So for example a load of an i28 could
still be considered as a candidate for vectorization.

This patch adjusts the code to behave according to the code
comment.
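
The difference between the old and the new candidate filter can be sketched as follows. This is only an illustration of the two conditions described above, not the vectorizer's actual code; the helper names `skippedBefore` and `skippedAfter` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helpers, named for illustration only.

// Old behaviour as described above: only types narrower than 8 bits
// were rejected as vectorization candidates.
inline bool skippedBefore(std::uint64_t SizeInBits) {
  return SizeInBits < 8;
}

// New behaviour: any size that is not a whole number of bytes is rejected.
inline bool skippedAfter(std::uint64_t SizeInBits) {
  return SizeInBits % 8 != 0;
}
```

With these definitions an i28 passes the old filter (28 is not less than 8) but is rejected by the new one (28 is not a multiple of 8), while an i32 is accepted by both.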

The test case used to hit the following assert when
trying to cast an i32 to i28 using CreateBitOrPointerCast:

opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
#0 PrintStackTraceSignalHandler(void*)
#1 SignalHandler(int)
#2 __restore_rt
#3 __GI_raise
#4 __GI_abort
#5 __GI___assert_fail
#6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::Instruction*)
#7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value*, llvm::Type*, llvm::Twine const&)
#8 (anonymous namespace)::Vectorizer::vectorizeLoadChain(llvm::ArrayRef<llvm::Instruction*>, llvm::SmallPtrSet<llvm::Instruction*, 16u>*)

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39295

llvm-svn: 316663
2017-10-26 13:42:55 +00:00
Eugene Zelenko 5c2aecef78 [Transforms] Revert r316630 changes in Scalar/MergeICmps.cpp to fix broken build bots (NFC).
llvm-svn: 316634
2017-10-26 01:25:14 +00:00
Eugene Zelenko 5adb96cc92 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316630
2017-10-26 00:55:39 +00:00
Matthew Simpson 99f57933ba Attempt to unbreak the expensive-checks-win bot
llvm-svn: 316625
2017-10-25 22:46:34 +00:00
Balaram Makam 52252fe20d Revert r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.
Summary: This reverts commit r316582. It looks like this commit broke tests on one buildbot:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/5719

. . .
Failing Tests (1):
    LLVM :: Transforms/CalledValuePropagation/simple-arguments.ll

Reviewers:

Subscribers:

llvm-svn: 316612
2017-10-25 21:32:54 +00:00
Balaram Makam 925ddf1a93 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.
Summary: For some irreducible CFGs the domtree nodes might be dead; do not update the domtree for dead nodes.

Reviewers: kuhar, dberlin, hfinkel

Reviewed By: kuhar

Subscribers: llvm-commits, mcrosier

Differential Revision: https://reviews.llvm.org/D38960

llvm-svn: 316582
2017-10-25 14:55:48 +00:00
Matthew Simpson cb58558c2f Add CalledValuePropagation pass
This patch adds a new pass for attaching !callees metadata to indirect call
sites. The pass propagates values to call sites by performing an IPSCCP-like
analysis using the generic sparse propagation solver. For indirect call sites
having a small set of possible callees, the attached metadata indicates what
those callees are. The metadata can be used to facilitate optimizations like
intersecting the function attributes of the possible callees, refining the call
graph, performing indirect call promotion, etc.

Differential Revision: https://reviews.llvm.org/D37355

llvm-svn: 316576
2017-10-25 13:40:08 +00:00
Max Kazantsev 9ac7021a25 [IRCE] Fix intersection between signed and unsigned ranges
IRCE for unsigned latch conditions was temporarily disabled by rL314881. The motivating
example contained an unsigned latch condition and a signed range check. One of the safe
iteration ranges was `[1, SINT_MAX + 1]`. Its right border was incorrectly interpreted as a negative
value in the `IntersectRange` function, which led to a miscompile in which we deleted a range check
without inserting a postloop where it was needed.

This patch brings back IRCE for unsigned latch conditions. Now we treat range intersection more
carefully. If the latch condition was unsigned, we only try to consider a range check for deletion if:
1. The range check is also unsigned, or
2. Safe iteration range of the range check lies within `[0, SINT_MAX]`.
The same is done for signed latch.

Values from `[0, SINT_MAX]` are unambiguous: they are non-negative under any interpretation,
and all values of a range intersected with such range are also non-negative.

We also use signed/unsigned min/max functions for range intersection depending on type of the
latch condition.
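
The ambiguity can be sketched with 32-bit values. This is a simplified model with concrete integers, not the IRCE code itself; `asSigned` and `isUnambiguous` are hypothetical names, and the 32-bit width is an assumption made for the example.

```cpp
#include <cstdint>
#include <limits>

// SINT_MAX for a 32-bit induction variable (width assumed for illustration).
const std::int64_t SIntMax = std::numeric_limits<std::int32_t>::max();

// Value of a 32-bit pattern under the signed interpretation, computed in
// 64 bits so that no implementation-defined narrowing is involved.
inline std::int64_t asSigned(std::uint32_t Bits) {
  std::int64_t Wide = Bits;
  return Wide <= SIntMax ? Wide : Wide - (std::int64_t(1) << 32);
}

// True if both borders lie in [0, SINT_MAX], where the signed and the
// unsigned interpretation agree.
inline bool isUnambiguous(std::uint32_t Begin, std::uint32_t End) {
  return Begin <= SIntMax && End <= SIntMax;
}
```

Here the right border `SINT_MAX + 1` is the bit pattern `0x80000000`, which `asSigned` maps to a negative number, so a range ending there is not unambiguous.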

Differential Revision: https://reviews.llvm.org/D38581

llvm-svn: 316552
2017-10-25 06:47:39 +00:00
Max Kazantsev 4332a943bc [IRCE] Smarter detection of empty ranges using SCEV
This patch replaces the naive emptiness check for SCEV ranges, which
looks like `Begin == End`, with a SCEV-based check. A range is guaranteed to be
empty if `Begin >= End`. We should filter such ranges out and not try to perform
IRCE for them.

For example, we can get such range when intersecting range `[A, B)` and `[C, D)`
where `A < B < C < D`. The resulting range is `[max(A, C), min(B, D)) = [C, B)`.
This range is empty, but its `Begin` does not match with `End`.

Making IRCE for an empty range is basically safe but unprofitable because we
never actually get into the main loop where the range checks are supposed to
be eliminated. This patch uses SCEV mechanisms to treat loops with proved
`Begin >= End` as empty.
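
The intersection example above can be sketched with concrete integers. The patch itself reasons about symbolic SCEV expressions; the `Range` struct and function names below are hypothetical and only model the arithmetic.

```cpp
#include <algorithm>
#include <cstdint>

// Half-open range [Begin, End); a stand-in for illustration only.
struct Range {
  std::int64_t Begin;
  std::int64_t End;
};

// [max(A, C), min(B, D)) for the ranges [A, B) and [C, D).
inline Range intersect(Range X, Range Y) {
  return Range{std::max(X.Begin, Y.Begin), std::min(X.End, Y.End)};
}

// The naive check only recognises ranges whose borders coincide.
inline bool isEmptyNaive(Range R) { return R.Begin == R.End; }

// The stronger check treats every range with Begin >= End as empty.
inline bool isEmpty(Range R) { return R.Begin >= R.End; }
```

Intersecting `[0, 10)` with `[20, 30)` gives `[20, 10)`: the naive check does not flag it because the borders differ, while the `Begin >= End` check does.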

Differential Revision: https://reviews.llvm.org/D39082

llvm-svn: 316550
2017-10-25 06:10:02 +00:00
Eugene Zelenko 7f0f9bc5ab [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316503
2017-10-24 21:24:53 +00:00
Artem Belevich cb8f6328dc [NVPTX] allow address space inference for volatile loads/stores.
If a particular target supports volatile memory access operations, we can
avoid AS casting to generic AS. Currently it's only enabled in NVPTX for
loads and stores that access global & shared AS.

Differential Revision: https://reviews.llvm.org/D39026

llvm-svn: 316495
2017-10-24 20:31:44 +00:00
Adrian Prantl d20442d383 Delete unused instantiations of DIBuilder. NFC
llvm-svn: 316494
2017-10-24 20:26:17 +00:00
Marek Olsak ce76ea0394 AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1)
Summary:
Kill the thread if operand 0 == false.
llvm.amdgcn.wqm.vote can be applied to the operand.

Also allow kill in all shader stages.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38544

llvm-svn: 316427
2017-10-24 10:27:13 +00:00
Marek Olsak 2114fc3bcb AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D38543

llvm-svn: 316426
2017-10-24 10:26:59 +00:00
Saleem Abdulrasool 619b3269fd ObjCARC: do not increment past the end of the BB
The `BasicBlock::getFirstInsertionPt` call may return `std::end` for the
BB.  Dereferencing the end iterator results in an assertion failure
"(!NodePtr->isKnownSentinel()), function operator*".  Ensure that the
returned iterator is valid before dereferencing it.  If the end is
returned, move one position backward to get a valid insertion point.
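
The guard can be sketched with a standard container standing in for a basic block. This is an analogy under stated assumptions, not the ObjCARC code: `Block` and `validInsertionPoint` are hypothetical names, and the block is assumed to be non-empty.

```cpp
#include <iterator>
#include <list>

// Stand-in for a basic block: a non-empty list of instruction ids.
using Block = std::list<int>;

// Returns an iterator that is safe to dereference: It itself, or the last
// element when It is the end iterator. Precondition: BB is not empty.
inline Block::iterator validInsertionPoint(Block &BB, Block::iterator It) {
  return It == BB.end() ? std::prev(It) : It;
}
```

Calling `validInsertionPoint(BB, BB.end())` on a block holding `1, 2, 3` yields an iterator to `3` rather than the sentinel.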

llvm-svn: 316401
2017-10-24 00:09:10 +00:00
Mandeep Singh Grang 9ed81c66ce [GVNSink] Fix failing GVNSink tests in the reverse iteration bot
Summary:

The elements of ActivePreds, which is defined as a SmallPtrSet, are copied
into Blocks using std::copy. This makes the resultant order of Blocks
non-deterministic. We cannot simply sort Blocks as they need to match
the corresponding Values. So a better approach is to define ActivePreds
as SmallSetVector.

This fixes the following failures in
http://lab.llvm.org:8011/builders/reverse-iteration:
  LLVM :: Transforms/GVNSink/indirect-call.ll
  LLVM :: Transforms/GVNSink/sink-common-code.ll
  LLVM :: Transforms/GVNSink/struct.ll
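
Why a set-vector gives a deterministic order can be sketched with standard containers. This is a minimal model of the idea, not LLVM's SmallSetVector; the class name `InsertionOrderedSet` is hypothetical.

```cpp
#include <unordered_set>
#include <vector>

// A set that remembers insertion order: membership is tracked in a hash set,
// iteration goes over a vector, so the order never depends on hashing or on
// pointer values.
template <typename T> class InsertionOrderedSet {
  std::unordered_set<T> Seen;
  std::vector<T> Order;

public:
  // Returns true if V was newly inserted, false if it was already present.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    return true;
  }

  // Elements in the order they were first inserted.
  const std::vector<T> &elements() const { return Order; }
};
```

Copying from `elements()` yields the same sequence on every run, so a parallel array of values stays aligned with it.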

Reviewers: dberlin, jmolloy, bkramer, efriedma

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39025

llvm-svn: 316369
2017-10-23 19:56:52 +00:00
Sanjay Patel b80daf0b48 [SimplifyCFG] delay switch condition forwarding to -latesimplifycfg
As discussed in D39011:
https://reviews.llvm.org/D39011
...replacing constants with a variable is inverting the transform done
by other IR passes, so we definitely don't want to do this early. 
In fact, it's questionable whether this transform belongs in SimplifyCFG 
at all. I'll look at moving this to codegen as a follow-up step.

llvm-svn: 316298
2017-10-22 19:10:07 +00:00
Sanjay Patel 24226504a7 [SimplifyCFG] try harder to forward switch condition to phi (PR34471)
The missed canonicalization/optimization in the motivating test from PR34471 leads to very different codegen:

  int switcher(int x) {
    switch(x) {
    case 17: return 17;
    case 19: return 19;
    case 42: return 42;
    default: break;
    }
    return 0;
  }

  int comparator(int x) {
    if (x == 17) return 17;
    if (x == 19) return 19;
    if (x == 42) return 42;
    return 0;
  }

For the first example, we use a bit-test optimization to avoid a series of compare-and-branch:
https://godbolt.org/g/BivDsw

Differential Revision: https://reviews.llvm.org/D39011

llvm-svn: 316293
2017-10-22 16:51:03 +00:00
David Green 907b60fbba [LoopInterchange] Fix phi node ordering miscompile.
The way that splitInnerLoopHeader splits blocks requires that
the induction PHI will be the first PHI in the inner loop
header. This makes sure that is actually the case when there
are both IV and reduction phis.

Differential Revision: https://reviews.llvm.org/D38682

llvm-svn: 316261
2017-10-21 13:58:37 +00:00
Eugene Zelenko fce435764e [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316253
2017-10-21 00:57:46 +00:00
Eugene Zelenko 99241d75c1 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316241
2017-10-20 21:47:29 +00:00
Eugene Zelenko bff0ef0324 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316190
2017-10-19 22:07:16 +00:00
Eugene Zelenko f27d161bf0 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316187
2017-10-19 21:21:30 +00:00
Simon Pilgrim 0444e4fcd4 Fix MSVC signed/unsigned comparison warning
llvm-svn: 316161
2017-10-19 15:00:31 +00:00
Max Kazantsev 3612d4b4f9 [NFC][IRCE] Filter out empty ranges early
llvm-svn: 316146
2017-10-19 05:33:28 +00:00
whitequark a99ecf1bbb [MergeFunctions] Don't blindly RAUW a GlobalValue with a ConstantExpr.
MergeFunctions uses (through FunctionComparator) a map of GlobalValues
to identifiers because it needs to compare functions and globals
do not have an inherent total order. Thus, FunctionComparator
(through GlobalNumberState) has a ValueMap<GlobalValue *>.

r315852 added a RAUW on globals that may have been previously
encountered by the FunctionComparator, which would replace
a GlobalValue * key with a ConstantExpr *, which is illegal.

This commit adjusts that code path to remove the function being
replaced from the ValueMap as well.

llvm-svn: 316145
2017-10-19 04:47:48 +00:00
Chandler Carruth 3f0e056df4 [PM] Refactor the bounds checking pass to remove a method only called in
one place.

llvm-svn: 316135
2017-10-18 22:42:36 +00:00
Sanjoy Das 2f27456c82 Revert "[ScalarEvolution] Handling for ICmp occurring in the evolution chain."
This reverts commit r316054.  There was some confusion over the review process:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html

llvm-svn: 316129
2017-10-18 22:00:57 +00:00
Eugene Zelenko 306d29977d [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316128
2017-10-18 21:46:47 +00:00
Jatin Bhateja 1fc49627e4 [ScalarEvolution] Handling for ICmp occurring in the evolution chain.
Summary:
 If a compare instruction is same or inverse of the compare in the
 branch of the loop latch, then return a constant evolution node.
 Currently scope of evaluation is limited to SCEV computation for
 PHI nodes.

 This shall facilitate computations of loop exit counts in cases
 where compare appears in the evolution chain of induction variables.

 Will fix PR 34538
Reviewers: sanjoy, hfinkel, junryoungju

Reviewed By: junryoungju

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38494

llvm-svn: 316054
2017-10-18 01:36:16 +00:00
Michael Zolotukhin c4fcc189d2 [GlobalDCE] Use DenseMap instead of unordered_multimap for GVDependencies.
Summary:
std::unordered_multimap happens to be very slow when the number of elements
grows large. On one of our internal applications we observed a 17x compile time
improvement from changing it to DenseMap.

Reviewers: mehdi_amini, serge-sans-paille, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38916

llvm-svn: 316045
2017-10-17 23:47:06 +00:00
Eugene Zelenko 6cadde7f40 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316034
2017-10-17 21:27:42 +00:00
Vitaly Buka 524c0a639d Fix signed overflow detected by ubsan
This overflow does not affect the algorithm, so just suppress it.

llvm-svn: 316018
2017-10-17 18:33:15 +00:00
Philip Reames 6a7bbfb2e2 Revert 315440 on behalf of mkazantsev
This patch reverts rL315440 because of the bug described at
https://bugs.llvm.org/show_bug.cgi?id=34937

The fix for the bug is on review as D38944, but not yet ready.  Given this is a regression, reverting until a fix is ready is called for.

Max would have done the revert himself, but is having trouble doing a build of fresh LLVM for some reason.  I did the build and test to ensure the revert worked as expected on his behalf.

llvm-svn: 315974
2017-10-17 06:21:07 +00:00
Craig Topper 91259e2681 [JumpThreading] Move two PredValueInfoTy vectors to a scope closer to their usage. NFCI
llvm-svn: 315941
2017-10-16 21:54:13 +00:00
Eugene Zelenko dd40f5e7c1 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315940
2017-10-16 21:34:24 +00:00
Akira Hatanaka e8c1a54c07 [ObjCARC] Do not move a release that has the clang.imprecise_release tag
above PHI instructions.

ARC optimizer has an optimization that moves a call to an ObjC runtime
function above a phi instruction when the phi has a null operand and is
an argument passed to the function call. This optimization should not
kick in when the runtime function is an objc_release that releases an
object with precise lifetime semantics.

rdar://problem/34959669

llvm-svn: 315914
2017-10-16 16:46:59 +00:00
Sanjay Patel 42135beac8 [InstCombine] don't unnecessarily generate a constant; NFCI
llvm-svn: 315910
2017-10-16 14:47:24 +00:00
NAKAMURA Takumi 414151a47e Revert rL315894, "SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586)"
llvm-svn: 315896
2017-10-16 09:50:01 +00:00
Nikolai Bozhenov 0e7ebbccc7 Move folding of icmp with zero after checking for min/max idioms.
Summary:
The following transformation for cmp instruction:

  icmp smin(x, PositiveValue), 0 -> icmp x, 0

should only be done after checking for min/max to prevent infinite
looping caused by a reverse canonicalization. That is why this
transformation was moved after the mentioned check.
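
The arithmetic behind the fold can be checked with a small brute-force sketch. It only verifies that comparing `smin(x, PositiveValue)` with zero agrees with comparing `x` with zero; it says nothing about how InstCombine orders its checks. The function names are hypothetical.

```cpp
#include <algorithm>

// True if smin(X, Pos) and X compare the same way against zero
// for the "less than", "equal" and "greater than" predicates.
inline bool foldHolds(int X, int Pos) {
  int M = std::min(X, Pos);
  return (M < 0) == (X < 0) && (M == 0) == (X == 0) && (M > 0) == (X > 0);
}

// Checks the fold for every X in [Lo, Hi].
inline bool foldHoldsForRange(int Lo, int Hi, int Pos) {
  for (int X = Lo; X <= Hi; ++X)
    if (!foldHolds(X, Pos))
      return false;
  return true;
}
```

The fold holds for any positive second operand, but not for a negative one: with `X = 3` and a second operand of `-1`, the minimum is negative although `X` is not.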

Reviewers: spatel, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38934

Patch by: Artur Gainullin <artur.gainullin@intel.com>

llvm-svn: 315895
2017-10-16 09:19:21 +00:00
NAKAMURA Takumi 4543affa98 SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586)
llvm-svn: 315894
2017-10-16 09:15:23 +00:00
Sanjay Patel 934738a3da revert r314984: revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1)
Recommitting r314698. The bug exposed by this change should be fixed with:
https://reviews.llvm.org/rL315579 

llvm-svn: 315857
2017-10-15 15:39:15 +00:00
Sanjay Patel 30f30d37fb [SimplifyCFG] use range-for-loops, tidy; NFCI
There seems to be something missing here as shown in PR34471:
https://bugs.llvm.org/show_bug.cgi?id=34471 

llvm-svn: 315855
2017-10-15 14:43:39 +00:00
Aaron Ballman 615eb47035 Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.
Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1

llvm-svn: 315854
2017-10-15 14:32:27 +00:00
whitequark ae12efab20 [MergeFunctions] Merge small functions if possible without a thunk.
This can result in significant code size savings in some cases,
e.g. an interrupt table all filled with the same assembly stub
in a certain Cortex-M BSP results in code blowup by a factor of 2.5.

Differential Revision: https://reviews.llvm.org/D34806

llvm-svn: 315853
2017-10-15 12:29:09 +00:00
whitequark b2ce9ffede [MergeFunctions] Replace all uses of unnamed_addr functions.
This reduces code size for constructs like vtables or interrupt
tables that refer to functions in global initializers.

Differential Revision: https://reviews.llvm.org/D34805

llvm-svn: 315852
2017-10-15 12:29:01 +00:00
Hongbin Zheng 73f650435b [LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC.
This avoids code duplication and allows us to add the disable unroll metadata elsewhere.

Differential Revision: https://reviews.llvm.org/D38928

llvm-svn: 315850
2017-10-15 07:31:02 +00:00
Sanjay Patel b869f76d85 [InstCombine] use m_Neg() to reduce code; NFCI
llvm-svn: 315762
2017-10-13 21:28:50 +00:00
Eugene Zelenko 3b87939604 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315760
2017-10-13 21:17:07 +00:00
Peter Collingbourne 868783e855 LowerTypeTests: Give imported symbols a type with size 0 so that they are not assumed not to alias.
It is possible for both a base and a derived class to be satisfied
with a unique vtable. If a program contains casts of the same pointer
to both of those types, the CFI checks will be lowered to this
(with ThinLTO):

if (p != &__typeid_base_global_addr)
  trap();
if (p != &__typeid_derived_global_addr)
  trap();

The optimizer may then use the first condition combined
with the assumption that __typeid_base_global_addr and
__typeid_derived_global_addr may not alias to optimize away the second
comparison, resulting in an unconditional trap.

This patch fixes the bug by giving imported globals the type [0 x i8]*,
which prevents the optimizer from assuming that they do not alias.

Differential Revision: https://reviews.llvm.org/D38873

llvm-svn: 315753
2017-10-13 21:02:16 +00:00
Sanjay Patel f0242de143 [InstCombine] move code to remove repeated constant check; NFCI
Also, consolidate tests for this fold in one place.

llvm-svn: 315745
2017-10-13 20:29:11 +00:00
Sanjay Patel 28b3aa3663 [InstCombine] recycle adds for better efficiency
Also, clean up unnecessary matcher capture variable initializations.

llvm-svn: 315743
2017-10-13 20:12:21 +00:00
Sanjay Patel 2118952162 [InstCombine] use local var to reduce code duplication; NFCI
llvm-svn: 315728
2017-10-13 18:32:53 +00:00
Matthew Simpson 2284937bbc [IPSCCP] Move common functions to ValueLatticeUtils (NFC)
This patch moves some common utility functions out of IPSCCP and makes them
available globally. The functions determine if interprocedural data-flow
analyses can propagate information through function returns, arguments, and
global variables.

Differential Revision: https://reviews.llvm.org/D37638

llvm-svn: 315719
2017-10-13 17:53:44 +00:00
Sanjay Patel c419c9f640 [InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing instructions
llvm-svn: 315718
2017-10-13 17:47:25 +00:00
Sanjay Patel 76ed9eab29 [InstCombine] use AddOne helper to reduce code; NFC
llvm-svn: 315709
2017-10-13 17:00:47 +00:00
Sanjay Patel 8d810fee43 [InstCombine] rearrange code to remove repeated constant check; NFCI
llvm-svn: 315703
2017-10-13 16:43:58 +00:00
Sanjay Patel 2150651ac3 [InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector types
The backend should be prepared for this transform after:
https://reviews.llvm.org/rL311731
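
The identity behind the transform can be sketched for a single lane; for vector types it applies to each element independently. This is a scalar model using wrapping unsigned arithmetic, with hypothetical function names, not InstCombine code.

```cpp
#include <cstdint>

// zext(bool) + C, computed with wrapping 32-bit arithmetic.
inline std::uint32_t zextAdd(bool B, std::uint32_t C) {
  return static_cast<std::uint32_t>(B) + C;
}

// select bool, C+1, C
inline std::uint32_t selectForm(bool B, std::uint32_t C) {
  return B ? C + 1 : C;
}
```

Both forms agree for every input, including the wrap-around case where `C` is the maximum value.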

llvm-svn: 315701
2017-10-13 16:29:38 +00:00
Daniel Neilson fa14ebd138 [RS4GC] Look through vector bitcasts when looking for base pointer
Summary:
In RS4GC it is possible that a base pointer is contained in a vector that
has undergone a bitcast from one element pointer type to another. We teach
RS4GC how to look through bitcasts of vector types when looking for a base
pointer.

Reviewers: anna

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38849

llvm-svn: 315694
2017-10-13 15:59:13 +00:00
Daniel Jasper 3344a21236 Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode"
Significantly reduces performance (~30%) of gipfeli
(https://github.com/google/gipfeli)

I have not yet managed to reproduce this regression with the open-source
version of the benchmark on github, but will work with others to get a
reproducer to you later today.

llvm-svn: 315680
2017-10-13 14:04:21 +00:00
Marco Castelluccio 0dcf64ad20 Disable gcov instrumentation of functions using funclet-based exception handling
Summary: This patch fixes the crash from https://bugs.llvm.org/show_bug.cgi?id=34659 and https://bugs.llvm.org/show_bug.cgi?id=34833.

Reviewers: rnk, majnemer

Reviewed By: rnk, majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D38223

llvm-svn: 315677
2017-10-13 13:49:15 +00:00
Eugene Zelenko 5323550e9a [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315640
2017-10-12 23:30:03 +00:00
Anna Thomas 61aec18d46 [CVP] Process binary operations even when def is local
Summary:
This patch adds processing of binary operations when the defs of the operands are in
the same block (i.e. local processing).

Earlier we bailed out in such cases (the bail out was introduced in rL252032)
because LVI at that time was more precise about context at the end of basic
blocks, which implied local def and use analysis didn't benefit CVP.

Since then we've added support for LVI in presence of assumes and guards. The
test cases added show how local def processing in CVP helps adding more
information to the ashr, sdiv, srem and add operators.

Note: processCmp which suffers from the same problem will
be handled in a later patch.

Reviewers: philip, apilipenko, SjoerdMeijer, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38766

llvm-svn: 315634
2017-10-12 22:39:52 +00:00
Artur Pilipenko ead69ee4bd [LoopPredication] Check whether the loop is already guarded by the first iteration check condition
llvm-svn: 315623
2017-10-12 21:21:17 +00:00
Bruno Cardoso Lopes 993d2e67d8 Revert "Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP.""
This reverts commit r315593, which still affects two bots:

http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/5308
http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21751/

llvm-svn: 315618
2017-10-12 20:52:34 +00:00
Artur Pilipenko b4527e1ce2 [LoopPredication] Support ule, sle latch predicates
This is a follow up for the loop predication change 313981 to support ule, sle latch predicates.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D38177

llvm-svn: 315616
2017-10-12 20:40:27 +00:00
Bruno Cardoso Lopes 326fdcbff8 Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP."
This is r315288 & r315294, which were reverted due to stage2 bot
failures.

Summary:
This updates the SCCP solver to use the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to `ret i32 2` with
this change:

  source_filename = "sccp.c"
  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
  target triple = "x86_64-unknown-linux-gnu"

  ; Function Attrs: norecurse nounwind readnone uwtable
  define i32 @main() local_unnamed_addr #0 {
  entry:
    %call = tail call fastcc i32 @f(i32 1)
    %call1 = tail call fastcc i32 @f(i32 47)
    %add3 = add nsw i32 %call, %call1
    ret i32 %add3
  }

  ; Function Attrs: noinline norecurse nounwind readnone uwtable
  define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
  entry:
    %c1 = icmp sle i32 %x, 100

    %cmp = icmp sgt i32 %x, 300
    %. = select i1 %cmp, i32 1, i32 2
    ret i32 %.
  }

  attributes #1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 315593
2017-10-12 16:54:11 +00:00
Don Hinton 3e0199f7eb [dump] Remove NDEBUG from test to enable dump methods [NFC]
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590
2017-10-12 16:16:06 +00:00
Hongbin Zheng d36f2030e2 [SimplifyIndVar] Replace IVUsers with loop invariant whenever possible
Differential Revision: https://reviews.llvm.org/D38415

llvm-svn: 315551
2017-10-12 02:54:11 +00:00