Commit Graph

29911 Commits

Author SHA1 Message Date
Abhilash Bhandari a8d45de6ce [ADT] Fix for compilation error when operator++(int) (post-increment function) of SmallPtrSetIterator is used.
The bug was introduced in r289619.

Reviewers: Mehdi Amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28134

llvm-svn: 290749
2016-12-30 12:34:36 +00:00
Dehao Chen cc76344ef5 Use continuous boosting factor for complete unroll.
Summary:
The current loop complete unroll algorithm checks if unrolling complete will reduce the runtime by a certain percentage. If yes, it will apply a fixed boosting factor to the threshold (by discounting cost). The problem for this approach is that the threshold abruptly. This patch makes the boosting factor a function of runtime reduction percentage, capped by a fixed threshold. In this way, the threshold changes continuously.

The patch also simplified the code by reducing one parameter in UP.

The patch only affects code-gen of two speccpu2006 benchmark:

445.gobmk binary size decreases 0.08%, no performance change.
464.h264ref binary size increases 0.24%, no performance change.

Reviewers: mzolotukhin, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26989

llvm-svn: 290737
2016-12-30 00:50:28 +00:00
Craig Topper ea03513332 [Analysis] Remove repeated text from a comment. NFC
llvm-svn: 290723
2016-12-29 21:48:28 +00:00
Justin Lebar 7cc6059058 [ADT] Rewrite IntrusiveRefCntPtr's comments. NFC
Edit for voice, and also add examples.  In particular, add an
explanation for why you might want to specialize IntrusiveRefCntPtrInfo,
which is not obvious.

llvm-svn: 290720
2016-12-29 19:59:38 +00:00
Justin Lebar 2d5622596a [ADT] Rename RefCountedBase::ref_cnt to RefCount. NFC
This makes it comply with the LLVM style guide, and also makes it
consistent with ThreadSafeRefCountedBase below.

llvm-svn: 290719
2016-12-29 19:59:34 +00:00
Justin Lebar a27accfe03 [ADT] clang-format IntrusiveRefCntrPtr.h. NFC
This file had some strange indentation.

Also remove some unnecessary whitespace between one-line member
functions.

llvm-svn: 290718
2016-12-29 19:59:30 +00:00
Justin Lebar 175ab74dc5 [ADT] Delete RefCountedBaseVPTR.
Summary:
This class is unnecessary.

Its comment indicated that it was a compile error to allocate an
instance of a class that inherits from RefCountedBaseVPTR on the stack.
This may have been true at one point, but it's not today.

Moreover you really do not want to allocate *any* refcounted object on
the stack, vptrs or not, so if we did have a way to prevent these
objects from being stack-allocated, we'd want to apply it to regular
RefCountedBase too, obviating the need for a separate RefCountedBaseVPTR
class.

It seems that the main way RefCountedBaseVPTR provides safety is by
making its subclass's destructor virtual.  This may have been helpful at
one point, but these days clang will emit an error if you define a class
with virtual functions that inherits from RefCountedBase but doesn't
have a virtual destructor.

Reviewers: compnerd, dblaikie

Subscribers: cfe-commits, klimek, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D28162

llvm-svn: 290717
2016-12-29 19:59:26 +00:00
Justin Lebar 25eeb38acc [ADT] Use memcpy for type punning in MathExtras.
Summary: Previously we type-punned through a union, which is not safe.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28161

llvm-svn: 290715
2016-12-29 18:15:34 +00:00
Reid Kleckner cd46c1df80 Revert "[COFF] Use 32-bit jump table entries in .rdata for Win64"
This reverts commit r290694. It broke sanitizer tests on Win64. I'll
probably bring this back, but the jump tables will just live in .text
like they do for MSVC.

llvm-svn: 290714
2016-12-29 17:07:10 +00:00
Sanjoy Das 600d2a5a6b [TBAAVerifier] Make things const-consistent; NFC
llvm-svn: 290712
2016-12-29 15:47:01 +00:00
Sanjoy Das 55f12d9de9 [TBAAVerifier] Memoize validity of scalar tbaa nodes; NFCI
llvm-svn: 290711
2016-12-29 15:46:57 +00:00
Igor Laevsky 4f31e52f94 Introduce element-wise atomic memcpy intrinsic
This change adds a new intrinsic which is intended to provide memcpy functionality
with additional atomicity guarantees. Please refer to the review thread
or language reference for further details.

Differential Revision: https://reviews.llvm.org/D27133

llvm-svn: 290708
2016-12-29 14:31:07 +00:00
Mehdi Amini fce3af0192 Remove BitstreamWriter::Emit64(), it was never called (NFC)
llvm-svn: 290701
2016-12-29 01:40:53 +00:00
Reid Kleckner 32f171fec4 Fix mingw build by moving the static const data member before the bitfields
Apparently GCC targeting Windows breaks bitfields on static data members:
  struct Foo {
    unsigned X : 16;
    static const int M = 42;
    unsigned Y : 16;
  };
  static_assert(sizeof(Foo) == 4, "asdf"); // fails

Who knew.

llvm-svn: 290700
2016-12-29 01:14:41 +00:00
Justin Lebar ddece375a1 [GlobalValue] Move HasLLVMReservedName into existing bitfield. NFC
Summary:
Follow-up to r290691, where I introduced HasLLVMReservedName.  rnk
pointed out that that patch added an extra word to GlobalValue on MSVC,
because it doesn't pack bitfields with different types.

This patch moves HasLLVMReservedName into the existing bitfield, where
we appear to have plenty of bits to spare.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28149

llvm-svn: 290696
2016-12-29 00:30:46 +00:00
Justin Lebar 23a53501a4 [IR] Clarify that Value::getName() is not actually cheap.
It involves a hashtable lookup when the Value has a name.

llvm-svn: 290695
2016-12-29 00:30:42 +00:00
Reid Kleckner c9e0a153cf [COFF] Use 32-bit jump table entries in .rdata for Win64
Summary:
We were already using 32-bit jump table entries, but this was a
consequence of the default PIC model on Win64, and not an intentional
design decision. This patch ensures that we always use 32-bit label
difference jump table entries on Win64 regardless of the PIC model. This
is a good idea because it saves executable size and object file size.

Moving the jump tables to .rdata cleans up the disassembled object code
and reduces the available ROP targets, but it requires adding one more
RIP-relative lea to the code.  COFF doesn't have relocations to express
the difference between two arbitrary symbols, so we can't use the jump
table label in the label difference like we do elsewhere.

Fixes PR31488

Reviewers: majnemer, compnerd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28141

llvm-svn: 290694
2016-12-29 00:12:39 +00:00
Mehdi Amini 5022bb7238 Change Metadata Index emission in the bitcode to use 2x32 bits for the placeholder
The Bitstream reader and writer are limited to handle a "size_t" at
most, which means that we can't backpatch and read back a 64bits
value on 32 bits platform.

llvm-svn: 290693
2016-12-28 23:45:54 +00:00
Justin Lebar 291abd3ebb Speed up Function::isIntrinsic() by adding a bit to GlobalValue. NFC
Summary:
Previously isIntrinsic() called getName().  This involves a hashtable
lookup, so is nontrivially expensive.  And isIntrinsic() is called
frequently, particularly by dyn_cast<IntrinsicInstr>.

This patch steals a bit of IntID and uses that to store whether or not
getName() starts with "llvm."

Reviewers: bogner, arsenm, joker-eph

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D22949

llvm-svn: 290691
2016-12-28 22:59:45 +00:00
Mehdi Amini e98f925834 Add an index for Module Metadata record in the bitcode
This index record the position for each metadata record in
the bitcode, so that the reader will be able to lazy-load
on demand each individual record.

We also make sure that every abbrev is emitted upfront so
that the block can be skipped while reading.

I don't plan to commit this before having the reader
counterpart, but I figured this can be reviewed mostly
independently.

Recommit r290684 (was reverted in r290686 because a test
was broken) after adding a threshold to avoid emitting
the index when unnecessary (little amount of metadata).
This optimization "hides" a limitation of the ability
to backpatch in the bitstream: we can only backpatch
safely when the position has been flushed. So if we emit
an index for one metadata, it is possible that (part of)
the offset placeholder hasn't been flushed and the backpatch
will fail.

Differential Revision: https://reviews.llvm.org/D28083

llvm-svn: 290690
2016-12-28 22:30:28 +00:00
Saleem Abdulrasool 2b59eca1f7 Revert "Add an index for Module Metadata record in the bitcode"
This reverts commit a0ca6ae2d38339e4ede0dfa588086fc23d87e836.  Revert at
Mehdi's request as it is breaking bots.

llvm-svn: 290686
2016-12-28 20:37:22 +00:00
Mehdi Amini 32ca148198 Add an index for Module Metadata record in the bitcode
Summary:
This index record the position for each metadata record in
the bitcode, so that the reader will be able to lazy-load
on demand each individual record.

We also make sure that every abbrev is emitted upfront so
that the block can be skipped while reading.

I don't plan to commit this before having the reader
counterpart, but I figured this can be reviewed mostly
independently.

Reviewers: pcc, tejohnson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28083

llvm-svn: 290684
2016-12-28 19:44:19 +00:00
Mehdi Amini cc7fbf718d [ThinLTO] Honor -O{0,1,2,4} passed through the libLTO interface for ThinLTO
This was hardcoded to be O3 till now, without any way to change it
without changing the code.

llvm-svn: 290682
2016-12-28 19:37:16 +00:00
Chandler Carruth 05ca5acc9e [PM] Introduce a devirtualization iteration layer for the new PM.
This is an orthogonal and separated layer instead of being embedded
inside the pass manager. While it adds a small amount of complexity, it
is fairly minimal and the composability and control seems worth the
cost.

The logic for this ends up being nicely isolated and targeted. It should
be easy to experiment with different iteration strategies wrapped around
the CGSCC bottom-up walk using this kind of facility.

The mechanism used to track devirtualization is the simplest one I came
up with. I think it handles most of the cases the existing iteration
machinery handles, but I haven't done a *very* in depth analysis. It
does however match the basic intended semantics, and we can tweak or
tune its exact behavior incrementally as necessary. One thing that we
may want to revisit is freshly building the value handle set on each
iteration. While I don't think this will be a significant cost (it is
strictly fewer value handles but more churn of value handes than the old
call graph), it is conceivable that we'll want a somewhat more clever
tracking mechanism. My hope is to layer that on as a follow up patch
with data supporting any implementation complexity it adds.

This code also provides for a basic count heuristic: if the number of
indirect calls decreases and the number of direct calls increases for
a given function in the SCC, we assume devirtualization is responsible.
This matches the heuristics currently used in the legacy pass manager.

Differential Revision: https://reviews.llvm.org/D23114

llvm-svn: 290665
2016-12-28 11:07:33 +00:00
Chandler Carruth 443e57e01d [PM] Teach the CGSCC's CG update utility to more carefully invalidate
analyses when we're about to break apart an SCC.

We can't wait until after breaking apart the SCC to invalidate things:
1) Which SCC do we then invalidate? All of them?
2) Even if we invalidate all of them, a newly created SCC may not have
   a proxy that will convey the invalidation to functions!

Previously we only invalidated one of the SCCs and too late. This led to
stale analyses remaining in the cache. And because the caching strategy
actually works, they would get used and chaos would ensue.

Doing invalidation early is somewhat pessimizing though if we *know*
that the SCC structure won't change. So it turns out that the design to
make the mutation API force the caller to know the *kind* of mutation in
advance was indeed 100% correct and we didn't do enough of it. So this
change also splits two cases of switching a call edge to a ref edge into
two separate APIs so that callers can clearly test for this and take the
easy path without invalidating when appropriate. This is particularly
important in this case as we expect most inlines to be between functions
in separate SCCs and so the common case is that we don't have to so
aggressively invalidate analyses.

The LCG API change in turn needed some basic cleanups and better testing
in its unittest. No interesting functionality changed there other than
more coverage of the returned sequence of SCCs.

While this seems like an obvious improvement over the current state, I'd
like to revisit the core concept of invalidating within the CG-update
layer at all. I'm wondering if we would be better served forcing the
callers to handle the invalidation beforehand in the cases that they
can handle it. An interesting example is when we want to teach the
inliner to *update and preserve* analyses. But we can cross that bridge
when we get there.

With this patch, the new pass manager an build all of the LLVM test
suite at -O3 and everything passes. =D I haven't bootstrapped yet and
I'm sure there are still plenty of bugs, but this gives a nice baseline
so I'm going to increasingly focus on fleshing out the missing
functionality, especially the bits that are just turned off right now in
order to let us establish this baseline.

llvm-svn: 290664
2016-12-28 10:34:50 +00:00
Gadi Haber 19c4fc5e62 This is a large patch for X86 AVX-512 of an optimization for reducing code size by encoding EVEX AVX-512 instructions using the shorter VEX encoding when possible.
There are cases of AVX-512 instructions that have two possible encodings. This is the case with instructions that use vector registers with low indexes of 0 - 15 and do not use the zmm registers or the mask k registers.
The EVEX encoding prefix requires 4 bytes whereas the VEX prefix can take only up to 3 bytes. Consequently, using the VEX encoding for these instructions results in a code size reduction of ~2 bytes even though it is compiled with the AVX-512 features enabled.

Reviewers: Craig Topper, Zvi Rackoover, Elena Demikhovsky 
Differential Revision: https://reviews.llvm.org/D27901

llvm-svn: 290663
2016-12-28 10:12:48 +00:00
Hemant Kulkarni e60b294be8 llvm-readobj: ELF: Make DT tags machine aware
llvm-svn: 290623
2016-12-27 19:59:29 +00:00
Chandler Carruth e14524ca30 [PM] Teach MemDep to invalidate its result object when its cached
analysis handles become invalid.

Add a test case for its invalidation logic.

llvm-svn: 290620
2016-12-27 19:33:04 +00:00
Chandler Carruth aa35167578 [PM] Teach BasicAA how to invalidate its result object.
This requires custom handling because BasicAA caches handles to other
analyses and so it needs to trigger indirect invalidation.

This fixes one of the common crashes when using the new PM in real
pipelines. I've also tweaked a regression test to check that we are at
least handling the most immediate case.

I'm going to work at re-structuring this test some to both scale better
(rather than all being in one file) and check more invalidation paths in
a follow-up commit, but I wanted to get the basic bug fix in place.

llvm-svn: 290603
2016-12-27 10:30:45 +00:00
Eugene Leviant c089e406b9 Allow setting multiple debug types
Differential revision: https://reviews.llvm.org/D28109

llvm-svn: 290597
2016-12-27 09:31:20 +00:00
Chandler Carruth 17c630a09c [PM] Teach the AAManager and AAResults layer (the worst offender for
inter-analysis dependencies) to use the new invalidation infrastructure.

This teaches it to invalidate itself when any of the peer function
AA results that it uses become invalid. We do this by just tracking the
originating IDs. I've kept it in a somewhat clunky API since some users
of AAResults are outside the new PM right now. We can clean this API up
if/when those users go away.

Secondly, it uses the registration on the outer analysis manager proxy
to trigger deferred invalidation when a module analysis result becomes
invalid.

I've included test cases that specifically try to trigger use-after-free
in both of these cases and they would crash or hang pretty horribly for
me even without ASan. Now they work nicely.

The `InvalidateAnalysis` utility pass required some tweaking to be
useful in this context and it still is pretty garbage. I'd like to
switch it back to the previous implementation and teach the explicit
invalidate method on the AnalysisManager to take care of correctly
triggering indirect invalidation, but I wanted to go ahead and send this
out so folks could see how all of this stuff works together in practice.
And, you know, that it does actually work. =]

Differential Revision: https://reviews.llvm.org/D27205

llvm-svn: 290595
2016-12-27 08:44:39 +00:00
Chandler Carruth ba90ae969c [PM] Introduce the facilities for registering cross-IR-unit dependencies
that require deferred invalidation.

This handles the other real-world invalidation scenario that we have
cases of: a function analysis which caches references to a module
analysis. We currently do this in the AA aggregation layer and might
well do this in other places as well.

Since this is relative rare, the technique is somewhat more cumbersome.
Analyses need to register themselves when accessing the outer analysis
manager's proxy. This proxy is already necessarily present to allow
access to the outer IR unit's analyses. By registering here we can track
and trigger invalidation when that outer analysis goes away.

To make this work we need to enhance the PreservedAnalyses
infrastructure to support a (slightly) more explicit model for "sets" of
analyses, and allow abandoning a single specific analyses even when
a set covering that analysis is preserved. That allows us to describe
the scenario of preserving all Function analyses *except* for the one
where deferred invalidation has triggered.

We also need to teach the invalidator API to support direct ID calls
instead of always going through a template to dispatch so that we can
just record the ID mapping.

I've introduced testing of all of this both for simple module<->function
cases as well as for more complex cases involving a CGSCC layer.

Much like the previous patch I've not tried to fully update the loop
pass management layer because that layer is due to be heavily reworked
to use similar techniques to the CGSCC to handle updates. As that
happens, we'll have a better testing basis for adding support like this.

Many thanks to both Justin and Sean for the extensive reviews on this to
help bring the API design and documentation into a better state.

Differential Revision: https://reviews.llvm.org/D27198

llvm-svn: 290594
2016-12-27 08:40:39 +00:00
Craig Topper 2da265b7bf [AVX-512] Remove masked pmuldq and pmuludq intrinsics and autoupgrade them to unmasked intrinsics plus a select.
llvm-svn: 290583
2016-12-27 05:30:14 +00:00
Chandler Carruth 162504578b [LCG] Teach the LazyCallGraph to handle visiting the blockaddress
constant expression and to correctly form function reference edges
through them without crashing because one of the operands (the
`BasicBlock` isn't actually a constant despite being an operand of
a constant).

llvm-svn: 290581
2016-12-27 05:00:45 +00:00
Craig Topper 89b3e0223f [AVX-512] Add 512-bit unmasked intrinsics for pmuldq and pmuludq so we can add them to InstCombine with the 128 and 256 bit versions.
The 128 and 256 bit masked intrinsics are currently unused by clang. The sse and avx2 unmasked intrinsics are used instead. The new 512-bit intrinsic will be used to do the same. Then all masked versions will removed and autoupgraded.

llvm-svn: 290573
2016-12-27 03:46:05 +00:00
Chandler Carruth 6e9bb7e064 [PM] Teach the always inliner in the new pass manager to support
removing fully-dead comdats without removing dead entries in comdats
with live members.

This factors the core logic out of the current inliner's internals to
a reusable utility and leverages that in both places. The factored out
code should also be (minorly) more efficient in cases where we have very
few dead functions or dead comdats to consider.

I've added a test case to cover this behavior of the always inliner.
This is the last significant bug in the new PM's always inliner I've
found (so far).

llvm-svn: 290557
2016-12-26 23:43:27 +00:00
Chandler Carruth cc44ab63b6 [ADT] Add an llvm::erase_if utility to make the standard erase+remove_if
pattern easier to write.

Differential Revision: https://reviews.llvm.org/D28120

llvm-svn: 290555
2016-12-26 23:30:44 +00:00
Chandler Carruth d9eaa54ef4 [ADT] Add a boring std::partition wrapper similar to our std::remove_if
wrapper.

llvm-svn: 290553
2016-12-26 23:10:40 +00:00
Daniel Berlin 85f91b0ec3 clang-format NewGVN files
llvm-svn: 290551
2016-12-26 20:06:58 +00:00
Daniel Berlin 85cbc8c097 Misc cleanups and simplifications for NewGVN.
Mostly use a bit more idiomatic C++ where we can,
so we can combine some things later.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28111

llvm-svn: 290550
2016-12-26 19:57:25 +00:00
Davide Italiano fe7a3ee51e [NewGVN] Add a flag to enable the pass via `-mllvm`.
NewGVN can be tested passing `-mllvm -enable-newgvn` to clang.

Differential Revision:  https://reviews.llvm.org/D28059

llvm-svn: 290548
2016-12-26 18:26:19 +00:00
Simon Pilgrim cd9d729461 Wdocumentation fix
llvm-svn: 290545
2016-12-26 17:48:19 +00:00
Chandler Carruth 0cf829c171 Fix some bad indentation that I or another introduced somehow.
llvm-svn: 290531
2016-12-26 01:20:59 +00:00
Chandler Carruth cb22b89f3f [ADT] Add a generic concatenating iterator and range (take 2).
This recommits r290512 that was reverted when MSVC failed to compile it. Since
then I've played with various approaches using rextester.com (where I was able
to reproduce the failure) and think that I have a solution thanks in part to
the help of Dave Blaikie! It seems MSVC just has a defective `decltype` in this
version. Manually writing out the type seems to do the trick, even though it is
.... quite complicated.

Original commit message:
This allows both defining convenience iterator/range accessors on types
which walk across N different independent ranges within the object, and
more direct and simple usages with range based for loops such as shown
in the unittest. The same facilities are used for both. They end up
quite small and simple as it happens.

I've also switched an iterator on `Module` to use this. I would like to
add another convenience iterator that includes even more sequences as
part of it and seeing this one already present motivated me to actually
abstract it away and introduce a general utility.

Differential Revision: https://reviews.llvm.org/D28093

llvm-svn: 290528
2016-12-25 23:41:14 +00:00
Bryant Wong 4213d94142 [MemorySSA] Define a restricted upward AccessList splice.
Differential Revision: https://reviews.llvm.org/D26661

llvm-svn: 290527
2016-12-25 23:34:07 +00:00
Daniel Berlin 65f5f0d728 Rename GVNExpression *ops_ members to *op_* to match conventions in the rest of LLVM
llvm-svn: 290524
2016-12-25 22:10:37 +00:00
Lang Hames c9d0ff1302 [Orc][RPC] Add a ParallelCallGroup utility for dispatching and waiting on
multiple asynchronous RPC calls.

ParallelCallGroup allows multiple asynchronous calls to be dispatched,
and provides a wait method that blocks until all asynchronous calls have
been executed on the remote and all return value handlers run on the
local machine.

This will allow, for example, the JIT client to issue memory allocation calls
for all sections in parallel, then block until all memory has been allocated
on the remote and the allocated addresses registered with the client, at which
point the JIT client can proceed to applying relocations.

llvm-svn: 290523
2016-12-25 21:55:05 +00:00
Lang Hames aac390ee85 [Orc][RPC] Clang-format RPCUtils header.
Some of the recent RPC call type-checking changes weren't formatted prior to
commit.

llvm-svn: 290520
2016-12-25 19:55:59 +00:00
Greg Clayton 1eb0bca178 Add newline to end of file to quiet warnings.
llvm-svn: 290519
2016-12-25 18:41:47 +00:00
Amjad Aboud 7faeecc8f7 [DebugInfo] Added support for Checksum debug info feature.
Differential Revision: https://reviews.llvm.org/D27642

llvm-svn: 290514
2016-12-25 10:12:09 +00:00