Commit Graph

28196 Commits

Author SHA1 Message Date
Adam Nemet 546675cc7f [OptDiag] Fix function comment
Function is not passed unlike in the original of this
(llvm::emitOptimizationRemarkMissed).

llvm-svn: 276150
2016-07-20 18:16:45 +00:00
Sanjay Patel 683170bf56 move decomposeBitTestICmp() to Transforms/Utils; NFC
As noted in https://reviews.llvm.org/D22537 , we can use this functionality in 
visitSelectInstWithICmp() and InstSimplify, but currently we have duplicated
code.

llvm-svn: 276140
2016-07-20 17:18:45 +00:00
Wei Mi db80c0c77f Use ValueOffsetPair to enhance value reuse during SCEV expansion.
In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 276136
2016-07-20 16:40:33 +00:00
Sanjay Patel be53c65fab fix documentation comments; NFC
llvm-svn: 276135
2016-07-20 16:30:55 +00:00
Adam Nemet 67c8929a2c [LV] Add hotness attribute to missed-optimization remarks
The new OptimizationRemarkEmitter analysis pass is hooked up to both new
and old PM passes.

llvm-svn: 276080
2016-07-20 04:03:43 +00:00
Matthias Braun 5b9722d6c7 Revert "RegScavenging: Add scavengeRegisterBackwards()"
Reverting this commit for now as it seems to be causing failures on
test-suite tests on the clang-ppc64le-linux-lnt bot.

This reverts commit r276044.

llvm-svn: 276068
2016-07-20 00:21:32 +00:00
Sean Silva e3c18a5ae8 [PM] Port LoopUnroll.
We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.

llvm-svn: 276063
2016-07-19 23:54:23 +00:00
Kyle Butt 9e52c064c2 Codegen: Factor out canTailDuplicate
canTailDuplicate accepts two blocks and returns true if the first can be
duplicated into the second successfully. Use this function to
encapsulate the heuristic.

llvm-svn: 276062
2016-07-19 23:54:21 +00:00
Justin Lebar 7ab570ec3a [ADT] Warn on unused results from ArrayRef and StringRef functions that read like they might mutate.
Summary:
Functions like "slice" and "drop_front" sound like they might mutate the
underlying object, but they don't.  Warning on unused results would have
saved me an hour yesterday, and I'm sure I'm not the only one.

LLVM and Clang are clean wrt this warning after D22540.

Reviewers: majnemer

Subscribers: sanjoy, chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D22541

llvm-svn: 276058
2016-07-19 23:19:25 +00:00
Daniel Berlin 5c46b943db Make MemorySSA::dominates/locallydominates constant time
Summary: Make MemorySSA::dominates/locallydominates constant time

Reviewers: george.burgess.iv, gberry

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22527

llvm-svn: 276046
2016-07-19 22:49:43 +00:00
Chandler Carruth 2aff750cb8 Add AIX support to Path.inc, Host.h, and CMake.
Patch by Andrew Paprocki!

Differential Revision: https://reviews.llvm.org/D18359

llvm-svn: 276045
2016-07-19 22:46:39 +00:00
Matthias Braun 84fd4bee6c RegScavenging: Add scavengeRegisterBackwards()
This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.

This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.

Differential Revision: http://reviews.llvm.org/D21885

llvm-svn: 276044
2016-07-19 22:37:09 +00:00
Matthias Braun 4cb68e1048 RegisterScavenger: Introduce backward() mode.
This adds two pieces:
- RegisterScavenger:::enterBasicBlockEnd() which behaves similar to
  enterBasicBlock() but starts tracking at the end of the basic block.
- A RegisterScavenger::backward() method. It is subtly different
  from the existing unprocess() method which only considers uses with
  the kill flag set: If a value is dead at the end of a basic block with
  a last use inside the basic block, unprocess() will fail to mark it as
  live. However we cannot change/fix this behaviour because unprocess()
  needs to perform the exact reverse operation of forward().

Differential Revision: http://reviews.llvm.org/D21873

llvm-svn: 276043
2016-07-19 22:37:02 +00:00
Kevin Enderby 6524bd8c00 Next step along the way to getting good error messages for bad archives.
This step builds on Lang Hames work to change Archive::child_iterator
for better interoperation with Error/Expected.  Building on that it is now
possible to return an error message when the size field of an archive
contains non-decimal characters.

llvm-svn: 276025
2016-07-19 20:47:07 +00:00
Rafael Espindola 3816c53f04 Use posix_fallocate instead of ftruncate.
This makes sure that space is actually available. With this change
running lld on a full file system causes it to exit with

failed to open foo: No space left on device

instead of crashing with a sigbus.

llvm-svn: 276017
2016-07-19 20:19:56 +00:00
David Majnemer 938a6c7ce0 [RegionInfo] Some cleanups
- Use unique_ptr instead of managing a container of new'd pointers.
- Use range based for loops.

No functional change is intended.

llvm-svn: 276001
2016-07-19 17:50:30 +00:00
Simon Pilgrim 0ea8d275cc [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.

It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).

This patch changes both scalar and packed versions back to using x86-specific builtins.

It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.

A companion clang patch is at D22105

Differential Revision: https://reviews.llvm.org/D22106

llvm-svn: 275981
2016-07-19 15:07:43 +00:00
Simon Pilgrim 766345e331 Get rid of VS2015 operator precedence warning. NFCI.
llvm-svn: 275971
2016-07-19 12:26:51 +00:00
Daniel Sanders 2cb55d7dfd [mips] Recognise the triple used by Debian stretch for mips64el.
Summary:
The triple used for this distribution is mips64el-linux-gnuabi64.

Reviewers: sdardis

Subscribers: sdardis, llvm-commits

Differential Revision: https://reviews.llvm.org/D22406

llvm-svn: 275966
2016-07-19 10:22:19 +00:00
Tobias Grosser 3a49a8e13c Style: drop some unnecessary ';' [NFC]
llvm-svn: 275963
2016-07-19 09:01:46 +00:00
George Burgess IV 5f30897b7b [MemorySSA] Update to the new shiny walker.
This patch updates MemorySSA's use-optimizing walker to be more
accurate and, in some cases, faster.

Essentially, this changed our core walking algorithm from a
cache-as-you-go DFS to an iteratively expanded DFS, with all of the
caching happening at the end. Said expansion happens when we hit a Phi,
P; we'll try to do the smallest amount of work possible to see if
optimizing above that Phi is legal in the first place. If so, we'll
expand the search to see if we can optimize to the next phi, etc.

An iteratively expanded DFS lets us potentially quit earlier (because we
don't assume that we can optimize above all phis) than our old walker.
Additionally, because we don't cache as we go, we can now optimize above
loops.

As an added bonus, this patch adds a ton of verification (if
EXPENSIVE_CHECKS are enabled), so finding bugs is easier.

Differential Revision: https://reviews.llvm.org/D21777

llvm-svn: 275940
2016-07-19 01:29:15 +00:00
Vedant Kumar e3a0bf5048 Retry: [llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads used.
Auto-detect NumThreads when it isn't specified, and avoid spawning threads when
they wouldn't be beneficial.

I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:

  No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
  With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total

Changes since the initial commit:

  - When handling odd-length inputs, call ThreadPool::wait() before merging the
    last profile. Should fix a race/off-by-one (see r275937).

Differential Revision: https://reviews.llvm.org/D22438

llvm-svn: 275938
2016-07-19 01:17:20 +00:00
Vedant Kumar 21ab20e005 Revert "[llvm-profdata] Speed up merging by using a thread pool"
This reverts commit r275921. It broke the ppc64be bot:

  http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/3537

I'm not sure why it broke, but based on the output, it looks like an
off-by-one (one profile left un-merged).

llvm-svn: 275937
2016-07-19 00:57:09 +00:00
Matt Arsenault 4cb438b93c TableGen: Allow custom register operand decoder method
This is for a situation where the encoding for a register may be
different depending on the specific operand. For some instructions,
we want to apply additional restrictions beyond the encoding's
constraints.

In AMDGPU some operands are VSrc_32, using the VS_32 pseudo register
class which accept VGPRs, SGPRs, or immediates in the encoding.
Some specific instructions with the same encoding operand do not want
to allow immediates or SGPRs, but the encoding format is different
in this case than a regular VGPR_32 operand.

This allows specifying the encoding should be treated the same
without introducing yet another dummy register class.

llvm-svn: 275929
2016-07-18 23:20:46 +00:00
Vedant Kumar 0bd9907581 [llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads
used. Auto-detect NumThreads when it isn't specified, and avoid spawning
threads when they wouldn't be beneficial.

I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:

  No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
  With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total

Differential Revision: https://reviews.llvm.org/D22438

llvm-svn: 275921
2016-07-18 22:02:39 +00:00
Dehao Chen 6132ee8502 [PM] Convert Loop Strength Reduce pass to new PM
Summary: Convert Loop String Reduce pass to new PM

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22468

llvm-svn: 275919
2016-07-18 21:41:50 +00:00
Mehdi Amini 4d74631ea4 Update doxygen description for `WriteBitcodeToFile()` API (NFC)
llvm-svn: 275917
2016-07-18 21:29:24 +00:00
Teresa Johnson 2124157102 [PM] Port FunctionImport Pass to new PM
Summary: Port FunctionImport Pass to new PM.

Reviewers: mehdi_amini, davide

Subscribers: davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D22475

llvm-svn: 275916
2016-07-18 21:22:24 +00:00
Justin Lebar 4133584504 Write isUInt using template specializations to work around an incorrect MSVC warning.
Summary:
Per D22441, MSVC warns on our old implementation of isUInt<64>.  It sees
uint64_t(1) << 64 and doesn't realize that it's not going to be
executed.  Writing as a template specialization is ugly, but prevents
the warning.

Reviewers: RKSimon

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22472

llvm-svn: 275909
2016-07-18 20:40:35 +00:00
Matt Arsenault c96e1deffa AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32
llvm-svn: 275871
2016-07-18 18:35:05 +00:00
Matt Arsenault 4c519d3518 AMDGPU/R600: Replace barrier intrinsics
llvm-svn: 275870
2016-07-18 18:34:59 +00:00
David Majnemer a2a218fbd4 [MathExtras] Fix UB in minIntN
We negated a value with a signed type which invited problems when that
value was the most negative signed number.  Use an unsigned type
for the value instead.  It will compute the same twos complement
result without the UB.

llvm-svn: 275815
2016-07-18 17:03:09 +00:00
Adam Nemet b2593f78ca [LoopDist] Port to new PM
Summary:
The direct motivation for the port is to ensure that the OptRemarkEmitter
tests work with the new PM.

This remains a function pass because we not only create multiple loops
but could also version the original loop.

In the test I need to invoke opt
with -passes='require<aa>,loop-distribute'.  LoopDistribute does not
directly depend on AA however LAA does.  LAA uses getCachedResult so
I *think* we need manually pull in 'aa'.

Reviewers: davidxl, silvas

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22437

llvm-svn: 275811
2016-07-18 16:29:27 +00:00
Adam Nemet 79ac42a5c9 [OptRemarkEmitter] Port to new PM
Summary:
The main goal is to able to start using the new OptRemarkEmitter
analysis from the LoopVectorizer.  Since the vectorizer was recently
converted to the new PM, it makes sense to convert this analysis as
well.

This pass is currently tested through the LoopDistribution pass, so I am
also porting LoopDistribution to get coverage for this analysis with the
new PM.

Reviewers: davidxl, silvas

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22436

llvm-svn: 275810
2016-07-18 16:29:21 +00:00
Simon Dardis d32a2d30cb [inlineasm] Propagate operand constraints to the backend
When SelectionDAGISel transforms a node representing an inline asm
block, memory constraint information is not preserved. This can cause
constraints to be broken when a memory offset is of the form:

offset + frame index

when the frame is resolved.

By propagating the constraints all the way to the backend, targets can
enforce memory operands of inline assembly to conform to their constraints.

For MIPSR6, some instructions had their offsets reduced to 9 bits from
16 bits such as ll/sc. This becomes problematic when using inline assembly
to perform atomic operations, as an offset can generated that is too big to
encode in the instruction.

Reviewers: dsanders, vkalintris

Differential Review: https://reviews.llvm.org/D21615

llvm-svn: 275786
2016-07-18 13:17:31 +00:00
Diana Picus 774d157a5d [ARM] Honour ABI for rem under -O0 for EABI, GNUEABI, Android and Musl
At higher optimization levels, we generate the libcall for DIVREM_Ix, which is
fine: aeabi_{u|i}divmod. At -O0 we generate the one for REM_Ix, which is the
default {u}mod{q|h|s|d}i3.

This commit makes sure that we don't generate REM_Ix calls for ABIs that
don't support them (i.e. where we need to use DIVREM_Ix instead). This is
achieved by bailing out of FastISel, which can't handle non-double multi-reg
returns, and letting the legalization infrastructure expand the REM_Ix calls.

It also updates the divmod-eabi.ll test to run under -O0 as well, and adds some
Windows checks to it to make sure we don't break things for it.

Fixes PR27068

Differential Revision: https://reviews.llvm.org/D21926

llvm-svn: 275773
2016-07-18 06:48:25 +00:00
Justin Lebar b59c1dd5cf Avoid UB in maxIntN(64).
Summary:
Previously we were relying on 2's complement underflow in an int64_t.
Now we cast to a uint64_t so we explicitly get the behavior we want.

Reviewers: rnk

Subscribers: dylanmckay, llvm-commits

Differential Revision: https://reviews.llvm.org/D22445

llvm-svn: 275722
2016-07-17 18:19:26 +00:00
Justin Lebar 6df6bde694 Clean up some comments in MathExtras.h.
Reviewers: rnk

Subscribers: llvm-commits, dylanmckay

Differential Revision: https://reviews.llvm.org/D22444

llvm-svn: 275721
2016-07-17 18:19:25 +00:00
Justin Lebar ab549c8187 Add assertions checking SignExtend{32,64}'s bit width.
Summary:
The bit width must be greater than zero, otherwise we shift by the
integer's width, which is UB.  Also (more obviously) the width must be
less than or equal to the integer's width, otherwise we shift by a
negative number, which is also UB.

Reviewers: rnk

Subscribers: llvm-commits, dylanmckay

Differential Revision: https://reviews.llvm.org/D22442

llvm-svn: 275720
2016-07-17 18:19:23 +00:00
Justin Lebar cbba3c4aef Fix isShiftedInt and isShiftedUint for widths > 32.
Summary:
Previously we were doing 1 << S.  "1" is an int, so this doesn't work
when S >= 32.

This patch also adds some static_asserts to these functions to ensure
that we don't hit UB by shifting left too much.

Reviewers: rnk

Subscribers: llvm-commits, dylanmckay

Differential Revision: https://reviews.llvm.org/D22441

llvm-svn: 275719
2016-07-17 18:19:21 +00:00
Justin Lebar f2d0066af7 Use a faster implementation of maxUIntN.
Summary:
On x86-64 with clang 3.8, before:

   mov     edx, 1
   mov     cl, dil
   shl     rdx, cl
   cmp     rdi, 64
   mov     rax, -1
   cmovne  rax, rdx
   ret

after:

  mov     ecx, 64
  sub     ecx, edi
  mov     rax, -1
  shr     rax, cl
  ret

Reviewers: rnk

Subscribers: dylanmckay, mkuper, llvm-commits

Differential Revision: https://reviews.llvm.org/D22440

llvm-svn: 275718
2016-07-17 18:19:19 +00:00
Teresa Johnson cd21a646f6 [ThinLTO] Perform profile-guided indirect call promotion
Summary:
To enable profile-guided indirect call promotion in ThinLTO mode, we
simply add call graph edges for each profitable target from the profile
to the summaries, then the summary-guided importing will consider the
callee for importing as usual.

Also we need to enable the indirect call promotion pass creation in the
PassManagerBuilder when PerformThinLTO=true (we are in the ThinLTO
backend), so that the newly imported functions are considered for
promotion in the backends.

The IC promotion profiles refer to callees by GUID, which required
adding GUIDs to the per-module VST in bitcode (and assigning them
valueIds similar to how they are assigned valueIds in the combined
index).

Reviewers: mehdi_amini, xur

Subscribers: mehdi_amini, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21932

llvm-svn: 275707
2016-07-17 14:47:01 +00:00
Dehao Chen 1a44452b11 [PM] Convert IVUsers analysis to new pass manager.
Summary: Convert IVUsers analysis to new pass manager.

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22434

llvm-svn: 275698
2016-07-16 22:51:33 +00:00
Eric Christopher 7a0b7dfd05 Reword comment to be more clear.
llvm-svn: 275659
2016-07-16 01:55:45 +00:00
Richard Smith 3bfed96632 Fix modules buildbot after r275633.
llvm-svn: 275657
2016-07-16 01:05:39 +00:00
Justin Lebar 8d56f47cfe Don't do uint64_t(1) << 64 in maxUIntN.
Summary:
This shift is undefined behavior (and, as compiled by clang, gives the
wrong answer for maxUIntN(64)).

Reviewers: mkuper

Subscribers: llvm-commits, jroelofs, rsmith

Differential Revision: https://reviews.llvm.org/D22430

llvm-svn: 275656
2016-07-16 00:59:41 +00:00
Vedant Kumar 7a4bd83c6c [Support] Fix a doxygen comment (NFC)
There was a missing "<" on a line, so its contents wrapped around into
the description of the next argument.

llvm-svn: 275638
2016-07-15 22:44:52 +00:00
Alexei Starovoitov cfb51f54ba BPF: Use official ELF e_machine value
The same value for EM_BPF is being propagated to glibc,
elfutils, and binutils.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 275633
2016-07-15 22:27:55 +00:00
Zachary Turner b927e02e1b [pdb] Teach MsfBuilder and other classes about the Free Page Map.
Block 1 and 2 of an MSF file are bit vectors that represent the
list of blocks allocated and free in the file.  We had been using
these blocks to write stream data and other data, so we mark them
as the free page map now.  We don't yet serialize these pages to
the disk, but at least we make a note of what it is, and avoid
writing random data to them.

Doing this also necessitated cleaning up some of the tests to be
more general and hardcode fewer values, which is nice.

llvm-svn: 275629
2016-07-15 22:17:19 +00:00
Zachary Turner 5e534c7fb3 [pdb] Round trip the NameMap data structure to YAML.
llvm-svn: 275628
2016-07-15 22:17:08 +00:00