Commit Graph

86557 Commits

Author SHA1 Message Date
Evgeniy Stepanov 9fb70f53ce Fix PR26152.
Fix the condition for when the new global takes over the name of
the existing one to be the negation of the condition for the new
global to get internal linkage.

llvm-svn: 258355
2016-01-20 22:05:50 +00:00
Evgeniy Stepanov b640415f9b Fix build warning.
error: field 'CCMgr' will be initialized after field 'IndirectStubsMgr' [-Werror,-Wreorder]
    : DL(TM.createDataLayout()), CCMgr(std::move(CCMgr)),

llvm-svn: 258354
2016-01-20 22:02:07 +00:00
Tom Stellard d1efda8e9e AMDGPU/SI: Promote i1 SETCC operations
Summary:
While working on uniform branching, I've hit a few cases where we emit
i1 SETCC operations.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16233

llvm-svn: 258352
2016-01-20 21:48:24 +00:00
Matt Arsenault 7836f895fe AMDGPU: Fix old comments that mention AMDIL
llvm-svn: 258350
2016-01-20 21:22:21 +00:00
Matt Arsenault 7ba334a7d9 AMDGPU: Remove AMDGPU.trunc intrinsic
llvm-svn: 258348
2016-01-20 21:05:53 +00:00
Matt Arsenault 15fbe49daf AMDGPU: Remove AMDIL.fraction intrinsic
llvm-svn: 258347
2016-01-20 21:05:49 +00:00
Matt Arsenault 7cccd2672e AMDGPU: Remove AMDIL.round.nearest intrinsic
llvm-svn: 258346
2016-01-20 21:05:40 +00:00
Quentin Colombet 105cf2b179 [GlobalISel] Add the proper cmake plumbing.
This patch adds the necessary plumbing to cmake to build the sources related to
GlobalISel.

To build the sources related to GlobalISel, we need to add -DBUILD_GLOBAL_ISEL=ON.
By default, this is OFF, thus GlobalISel sources will not impact people that do
not explicitly opt-in.

Differential Revision: http://reviews.llvm.org/D15983

llvm-svn: 258344
2016-01-20 20:58:56 +00:00
Matt Arsenault 1c9e4ef0df AMDGPU: Remove abs intrinsic
llvm-svn: 258343
2016-01-20 20:58:29 +00:00
Matt Arsenault f7e6e89718 AMDGPU: Remove min/max intrinsics
This removes support for mesa 11.0.x

llvm-svn: 258342
2016-01-20 20:50:19 +00:00
Sanjoy Das a34ce95b60 Add a "gc-transition" operand bundle
Summary:
This adds a new kind of operand bundle to LLVM denoted by the
`"gc-transition"` tag.  Inputs to `"gc-transition"` operand bundle are
lowered into the "transition args" section of `gc.statepoint` by
`RewriteStatepointsForGC`.

This removes the last bit of functionality that was unsupported in the
deopt bundle based code path in `RewriteStatepointsForGC`.

Reviewers: pgavlin, JosephTremoulet, reames

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16342

llvm-svn: 258338
2016-01-20 19:50:25 +00:00
Simon Atanasyan 2d0d8530e3 [llvm-readobj][ELF] Teach llvm-readobj to show arch specific ELF section's flags
Some architecture specific ELF section flags might have the same value
(for example SHF_X86_64_LARGE and SHF_HEX_GPREL) and we have to check
machine architectures to select an appropriate set of possible flags.

The patch selects architecture specific flags into separate arrays
`ElfxxxSectionFlags` and combines `ElfSectionFlags` and `ElfxxxSectionFlags`
before pass to the `StreamWriter::printFlags()` method.

Differential Revision: http://reviews.llvm.org/D16269

llvm-svn: 258334
2016-01-20 19:15:18 +00:00
Sanjay Patel f44bd38092 fix typo; NFC
llvm-svn: 258332
2016-01-20 18:59:48 +00:00
Sanjay Patel 545a456235 fix formatting; NFC
llvm-svn: 258330
2016-01-20 18:59:16 +00:00
Rafael Espindola b718237dfc Accept subtractions involving a weak symbol.
When a symbol S shows up in an expression in assembly there are two
possible interpretations
* The expression is referring to the value of S in this file.
* The expression is referring to the value after symbol resolution.

In the first case the assembler can reason about the value and try to
produce a relocation.
In the second case, that is only possible if the symbol cannot be
preempted.

Assemblers are not very consistent about which interpretation gets used.
This changes MC to agree with GAS in the case of an expression of the
form "Sym - WeakSym".

llvm-svn: 258329
2016-01-20 18:57:48 +00:00
Sanjay Patel bd2dc67142 [LibCallSimplifier] don't get fooled by a fake sqrt()
The test case will crash without this patch because the subsequent call to
hasUnsafeAlgebra() assumes that the call instruction is an FPMathOperator
(ie, returns an FP type).

This part of the function signature check was omitted for the sqrt() case, 
but seems to be in place for all other transforms.

Before:
http://reviews.llvm.org/rL257400
...we would have needlessly continued execution in optimizeSqrt(), but the
bug was harmless because we'd eventually fail some other check and return
without damage.

This should fix:
https://llvm.org/bugs/show_bug.cgi?id=26211

Differential Revision: http://reviews.llvm.org/D16198

llvm-svn: 258325
2016-01-20 17:41:14 +00:00
Lang Hames 6c3e790e78 [Orc] Fix a use-after-move bug in the Orc C-bindings stack.
llvm-svn: 258324
2016-01-20 17:39:52 +00:00
Sanjay Patel 1c600c6e83 80-cols; NFC
llvm-svn: 258323
2016-01-20 16:41:43 +00:00
Keith Walker 8c44bf1b89 Write AArch64 big endian data fixup entries as BE.
There was support for writing the AArch64 big endian data fixup entries in
the .eh_frame section in BE.    This is changed to write all such fixup
entries in BE with no restriction on the section.  This is similar to
the existing support for fixup entries for ARM.

A test is added to check the length field in the .debug_line section as
this is an example of where such a fixup occurs.

Differential Revision: http://reviews.llvm.org/D16064

llvm-svn: 258320
2016-01-20 15:59:14 +00:00
Tom Stellard 77a177722f Correctly initialize SIAnnotateControlFlow
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16304

llvm-svn: 258319
2016-01-20 15:48:27 +00:00
Michael Zuckerman 65c40afb03 [AVX512] Adding VPERMB Intrinsics
Differential Revision: http://reviews.llvm.org/D16296

llvm-svn: 258316
2016-01-20 15:24:56 +00:00
Marina Yatsina 701938d64e Fixing bug in rL258132: [X86] Adding support for missing variations of X86 string related instructions
There was a bug in my rL258132 because there's an overloading of the "movsd" and "cmpsd" instructions, e.g. movsd can be either "Move Data from String to String" (the case I wanted to handle) or "Move or Merge Scalar Double-Precision Floating-Point Value" (the case that causes the asserts).
Added  code for escaping the unfamiliar scenarios and falling back to old behviour.
Also changed the asserts to llvm_unreachable. 

llvm-svn: 258312
2016-01-20 14:03:47 +00:00
Krzysztof Parzyszek 2451c4835a Proper handling of diamond-like cases in if-conversion
If converter was somewhat careless about "diamond" cases, where there
was no join block, or in other words, where the true/false blocks did
not have analyzable branches. In such cases, it was possible for it to
remove (needed) branches, resulting in a loss of entire basic blocks.

Differential Revision: http://reviews.llvm.org/D16156

llvm-svn: 258310
2016-01-20 13:14:52 +00:00
Igor Breger d3341f5021 AVX512: Store (MOVNTPD, MOVNTPS, MOVNTDQ) using non-temporal hint intrinsic implementation.
Differential Revision: http://reviews.llvm.org/D16350

llvm-svn: 258309
2016-01-20 13:11:47 +00:00
Oliver Stannard f7696f8267 [AArch64] Fix two bugs in the .inst directive
The AArch64 .inst directive was implemented using EmitIntValue, which resulted
in both $x and $d (code and data) mapping symbols being emitted at the same
address. This fixes it to only emit the $x mapping symbol.

EmitIntValue also emits the value in big-endian order when targeting big-endian
systems, but instructions are always emitted in little-endian order for
AArch64.

Differential Revision: http://reviews.llvm.org/D16349

llvm-svn: 258308
2016-01-20 12:54:31 +00:00
Petr Pavlu eba3039238 [LTO] Fix error reporting when a file passed to libLTO is invalid or non-existent
This addresses PR26060 where function lto_module_create() could return nullptr
but lto_get_error_message() returned an empty string.

The error() call after LTOModule::createFromFile() in llvm-lto is then removed
because any error from this function should go through the diagnostic handler in
llvm-lto which will exit the program. The error() call was added because this
previously did not happen when the file was non-existent. This is fixed by the
patch. (The situation that llvm-lto reports an error when the input file does
not exist is tested by llvm/tools/llvm-lto/error.ll).

Differential Revision: http://reviews.llvm.org/D16106

llvm-svn: 258298
2016-01-20 09:03:42 +00:00
Ivan Krasin 3b1c260d22 [Verifier] Fix performance regression for LTO builds
Summary:
Fix a significant performance regression by introducing GlobalValueVisited field and reusing the map.
This is a follow up to r257823 that slowed down linking Chrome with LTO by 2.5x.

If you revert this commit, please, also revert r257823.

BUG=https://llvm.org/bugs/show_bug.cgi?id=26214

Reviewers: pcc, loladiro, joker.eph

Subscribers: krasin1, joker.eph, loladiro, pcc

Differential Revision: http://reviews.llvm.org/D16338

llvm-svn: 258297
2016-01-20 08:41:22 +00:00
Dan Gohman edf98c5682 [SelectionDAG] Fold more offsets into GlobalAddresses
SelectionDAG previously missed opportunities to fold constants into
GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it
would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to
create `(add GA+c, y)`. This isn't often visible on targets such as X86 which
effectively reassociate adds in their complex address-mode folding logic,
however it is currently visible on WebAssembly since it currently has very
simple address mode folding code that doesn't reassociate anything.

This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses
at the same times that it folds constants together, so that it doesn't miss any
opportunities to perform such folding.

Differential Revision: http://reviews.llvm.org/D16090

llvm-svn: 258296
2016-01-20 07:03:08 +00:00
Dan Gohman 8394756937 [WebAssembly] Minor code cleanups. NFC.
llvm-svn: 258294
2016-01-20 05:54:22 +00:00
Dan Gohman 26cf4f3689 [WebAssembly] Remove the Relooper code, as it is not currently being used.
llvm-svn: 258293
2016-01-20 05:50:29 +00:00
Dan Gohman 7e64917fd1 [WebAssembly] Don't stackify stores across instructions with side effects.
llvm-svn: 258285
2016-01-20 04:21:16 +00:00
Joseph Tremoulet b41632bf0f [Inliner/WinEH] Honor implicit nounwinds
Summary:
Funclet EH tables require that a given funclet have only one unwind
destination for exceptional exits.  The verifier will therefore reject
e.g. two cleanuprets with different unwind dests for the same cleanup, or
two invokes exiting the same funclet but to different unwind dests.
Because catchswitch has no 'nounwind' variant, and because IR producers
are not *required* to annotate calls which will not unwind as 'nounwind',
it is legal to nest a call or an "unwind to caller" catchswitch within a
funclet pad that has an unwind destination other than caller; it is
undefined behavior for such a call or catchswitch to unwind.

Normally when inlining an invoke, calls in the inlined sequence are
rewritten to invokes that unwind to the callsite invoke's unwind
destination, and "unwind to caller" catchswitches in the inlined sequence
are rewritten to unwind to the callsite invoke's unwind destination.
However, if such a call or "unwind to caller" catchswitch is located in a
callee funclet that has another exceptional exit with an unwind
destination within the callee, applying the normal transformation would
give that callee funclet multiple unwind destinations for its exceptional
exits.  There would be no way for EH table generation to determine which
is the "true" exit, and the verifier would reject the function
accordingly.

Add logic to the inliner to detect these cases and leave such calls and
"unwind to caller" catchswitches as calls and "unwind to caller"
catchswitches in the inlined sequence.

This fixes PR26147.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: alexcrichton, llvm-commits

Differential Revision: http://reviews.llvm.org/D16319

llvm-svn: 258273
2016-01-20 02:15:15 +00:00
Xinliang David Li 59411db520 [PGO] Add a new interface to be used by Indirect Call Promotion
llvm-svn: 258271
2016-01-20 01:26:34 +00:00
Eduard Burtescu 23c4d83aa3 [NFC] Replace several manual GEP loops with gep_type_iterator.
Reviewers: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16335

llvm-svn: 258262
2016-01-20 00:26:52 +00:00
Xinliang David Li 440cd7027b Function name change /NFC
llvm-svn: 258260
2016-01-20 00:24:36 +00:00
Matthias Braun d4f6409dff MachineScheduler: Allow independent scheduling of sub register defs
Note that this is disabled by default and still requires a patch to
handleMove() which is not upstreamed yet.

If the TrackLaneMasks policy/strategy is enabled the MachineScheduler
will build a schedule graph where definitions of independent
subregisters are no longer serialised.

Implementation comments:
- Without lane mask tracking a sub register def also counts as a use
  (except for the first one with the read-undef flag set), with lane
  mask tracking enabled this is no longer the case.
- Pressure Diffs where previously maintained per definition of a
  vreg with the help of the SSA information contained in the
  LiveIntervals.  With lanemask tracking enabled we cannot do this
  anymore and instead change the pressure diffs for all uses of the vreg
  as it becomes live/dead.  For this changed style to work correctly we
  ignore uses of instructions that define the same register again: They
  won't affect register pressure.
- With lanemask tracking we remove all read-undef flags from
  sub register defs when building the graph and re-add them later when
  all vreg lanes have become dead.

Differential Revision: http://reviews.llvm.org/D14969

llvm-svn: 258259
2016-01-20 00:23:32 +00:00
Matthias Braun 5d458617aa RegisterPressure: Make liveness tracking subregister aware
Differential Revision: http://reviews.llvm.org/D14968

llvm-svn: 258258
2016-01-20 00:23:26 +00:00
Matthias Braun 3907fded1b LiveInterval: Add utility class to rename independent subregister usage
This renaming is necessary to avoid a subregister aware scheduler
accidentally creating liveness "holes" which are rejected by the
MachineVerifier.

Explanation as found in this patch:

Helper class that can divide MachineOperands of a virtual register into
equivalence classes of connected components.
MachineOperands belong to the same equivalence class when they are part of
the same SubRange segment or adjacent segments (adjacent in control
flow); Different subranges affected by the same MachineOperand belong to
the same equivalence class.

Example:
  vreg0:sub0 = ...
  vreg0:sub1 = ...
  vreg0:sub2 = ...
  ...
  xxx        = op vreg0:sub1
  vreg0:sub1 = ...
  store vreg0:sub0_sub1

The example contains 3 different equivalence classes:
  - One for the (dead) vreg0:sub2 definition
  - One containing the first vreg0:sub1 definition and its use,
    but not the second definition!
  - The remaining class contains all other operands involving vreg0.

We provide a utility function here to rename disjunct classes to different
virtual registers.

Differential Revision: http://reviews.llvm.org/D16126

llvm-svn: 258257
2016-01-20 00:23:21 +00:00
Tom Stellard 2e045bbc5f AMDGPU/SI: Prevent the DAGCombiner from creating setcc with i1 inputs
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15035

llvm-svn: 258256
2016-01-20 00:13:22 +00:00
Sanjoy Das 16901a3e20 [MachineSink] Don't break ImplicitNulls
Summary:
This teaches MachineSink to not sink instructions that might break the
implicit null check optimization that runs later.  This should not
affect frontends that do not use implicit null checks.

Reviewers: aadg, reames, hfinkel, atrick

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D14632

llvm-svn: 258254
2016-01-20 00:06:14 +00:00
Quentin Colombet 4cf56917ea [X86] Do not run shrink-wrapping on function with split-stack attribute or HiPE
calling convention.
The implementation of the related callbacks in the x86 backend for such
functions are not ready to deal with a prologue block that is not the entry
block of the function.

This fixes PR26107, but the longer term solution would be to fix those callbacks.

llvm-svn: 258221
2016-01-19 23:29:03 +00:00
David Majnemer ce10842036 [MC, COFF] Add .reloc support for WinCOFF
This adds rudimentary support for a few relocations that we will use for
the CodeView debug format.

llvm-svn: 258216
2016-01-19 23:05:27 +00:00
Simon Pilgrim 4b919b2ab3 [X86][SSE] Add VZEXT_MOVL target shuffle decoding.
Add support for decoding VZEXT_MOVL target shuffle masks, allowing it to be used as a source in target shuffle combines.

llvm-svn: 258215
2016-01-19 23:04:56 +00:00
Quentin Colombet 2c49e2e664 [MachineFunction] Constify getter. NFC.
llvm-svn: 258207
2016-01-19 22:31:12 +00:00
Simon Pilgrim e74653b67a [X86][SSE] Add INSERTPS target shuffle combines.
As vector shuffles can only reference two inputs many (V)INSERTPS patterns end up being split over two targets shuffles.

This patch adds combines to attempt to combine (V)INSERTPS nodes with input/output nodes that are just zeroing out these additional vector elements.

Differential Revision: http://reviews.llvm.org/D16072

llvm-svn: 258205
2016-01-19 22:24:12 +00:00
Lang Hames 00b7bef269 [Orc] #undef a MACRO after I'm done with it.
Suggested by Philip Reames in review of r257951.

Thanks Philip!

llvm-svn: 258203
2016-01-19 22:20:21 +00:00
Chad Rosier 5c72966ea3 [AArch64] Remove a bunch of useless FIXME comments.
llvm-svn: 258193
2016-01-19 21:47:24 +00:00
Dan Gohman cff798386e [WebAssembly] Remove an unused data member. NFC.
llvm-svn: 258192
2016-01-19 21:31:41 +00:00
Chad Rosier b11c82d3e2 [AArch64] Remove more dead code after r258093.
llvm-svn: 258191
2016-01-19 21:27:05 +00:00
Xinliang David Li 0a83b1b994 Fix a coverage reading bug
function record pointer is not advanced when
duplicate entry is found.

Test case to be added.

llvm-svn: 258188
2016-01-19 21:18:12 +00:00
Lang Hames 2fe7acb773 [Orc] Refactor ObjectLinkingLayer::addObjectSet to defer loading objects until
they're needed.

Prior to this patch objects were loaded (via RuntimeDyld::loadObject) when they
were added to the ObjectLinkingLayer, but were not relocated and finalized until
a symbol address was requested. In the interim, another object could be loaded
and finalized with the same memory manager, causing relocation/finalization of
the first object to fail (as the first finalization call may have marked the
allocated memory for the first object read-only).

By deferring the loadObject call (and subsequent memory allocations) until an
object file is needed we can avoid prematurely finalizing memory.

llvm-svn: 258185
2016-01-19 21:06:38 +00:00
Sanjoy Das 29a4b5dc0d [SCEV] Fix PR26207
In some cases, the max backedge taken count can be more conservative
than the exact backedge taken count (for instance, because
ScalarEvolution::getRange is not control-flow sensitive whereas
computeExitLimitFromICmp can be).  In these cases,
computeExitLimitFromCond (specifically the bit that deals with `and` and
`or` instructions) can create an ExitLimit instance with a
`SCEVCouldNotCompute` max backedge count expression, but a computable
exact backedge count expression.  This violates an implicit SCEV
assumption: a computable exact BE count should imply a computable max BE
count.

This change

 - Makes the above implicit invariant explicit by adding an assert to
   ExitLimit's constructor

 - Changes `computeExitLimitFromCond` to be more robust around
   conservative max backedge counts

llvm-svn: 258184
2016-01-19 20:53:51 +00:00
Sanjoy Das 0ff078736f [SCEV] Use range-for; NFC
llvm-svn: 258183
2016-01-19 20:53:46 +00:00
JF Bastien 17999f20fa WebAssembly: mark known failure caused by r258125
The following test program triggers the assertion:
https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/gcc.c-torture/execute/20030916-1.c

llvm-svn: 258182
2016-01-19 20:53:12 +00:00
Kostya Serebryany 311f27c0a8 [libFuzzer] use std::mt19937 for generating random numbers by default. Fix MyStoll to handle negative values. Use std::any_of instead of std::find_if
llvm-svn: 258178
2016-01-19 20:33:57 +00:00
Sanjay Patel d4af297df1 getParent()->getParent() == getModule() ; NFC
llvm-svn: 258176
2016-01-19 19:58:49 +00:00
Sanjay Patel d3112a5bcc function names start with a lowercase letter; NFC
Note: There are no uses of these functions outside of
SimplifyLibCalls, so they could be static functions in
that file.

llvm-svn: 258172
2016-01-19 19:46:10 +00:00
Sanjay Patel b50325e276 fix formatting; NFC
llvm-svn: 258167
2016-01-19 19:17:47 +00:00
Sanjay Patel 4e86036733 don't repeat documentation comments in implementation file; NFC
llvm-svn: 258166
2016-01-19 19:16:10 +00:00
Manuel Jacob 3f49f654a2 Move part of an if condition into an assertion. NFC.
llvm-svn: 258163
2016-01-19 19:04:49 +00:00
Michael Zuckerman 4582bdab12 [AVX512] Adding VPERMT2B and VPERMI2B instruction .
Differential Revision: http://reviews.llvm.org/D16297

llvm-svn: 258161
2016-01-19 18:47:02 +00:00
Philip Reames 1a196f7daf Revert 258157
According the build bots, clang is using the Registry class somewhere as well. Will reapply with appropriate clang changes at a later point.

llvm-svn: 258159
2016-01-19 18:41:10 +00:00
Sanjay Patel d1f4f03f5e [LibCallSimplifier] use instruction-level fast-math-flags to shrink calls
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555

llvm-svn: 258158
2016-01-19 18:38:52 +00:00
Philip Reames 0f6650e8e8 [GC] Registry initialization and linkage interactions
The Registry class constructs a linked list of nodes whose storage is inside static variables and nodes are added via static initializers. The trick is that those static initializers are in both the LLVM code base, and some random plugin that might get loaded in at runtime. The existing code tries to use C++ templates and their ODR rules to get a single definition of the registry for each type, but, experimentally, this doesn't quite work as designed. (Well, the entire structure doesn't. It might not actually be an ODR problem.)

Previously, when I tried moving the GCStrategy class (along with it's registry) from CodeGen to IR, I ran into a problem where asking the GCStrategyRegistry a question would return inconsistent results depending on whether you asked from CodeGen (where the static initializers still were) or Transforms. My best guess is that this is a result of either a) an order of initialization error, or b) we ended up with two copies of the registry being created. I remember at the time having convinced myself it was probably (b), but I don't have any of my notes around from that investigation any more.

See http://reviews.llvm.org/rL226311 for the original patch in question.

This patch tries to remove the possibility of (b) above. (a) was already fixed in change 258109.

Differential Revision: http://reviews.llvm.org/D16170

llvm-svn: 258157
2016-01-19 18:34:27 +00:00
Rong Xu 294572f116 [PGO] Create the profile data variable before the lowering
This patch creates the profile data variable before lowering the profile intrinsics.

Reviewers: davidxl, silvas

Differential Revision: http://reviews.llvm.org/D16015

llvm-svn: 258156
2016-01-19 18:29:54 +00:00
Sanjay Patel 81a63cd11f [LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, [small integer]) calls
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555

As with D15937, the intent of the patch is to preserve the current behavior of the transform
except that we use the pow call's 'fast' attribute as a trigger rather than a function-level
attribute.

The TODO comment notes a potential follow-on patch that would propagate FMF to the new
instructions.

Differential Revision: http://reviews.llvm.org/D16122

llvm-svn: 258153
2016-01-19 18:15:12 +00:00
Chris Ray b541a3488f NFC Test Commit whitespace change in a comment
Changed whitespace so comments line up.

llvm-svn: 258151
2016-01-19 18:01:20 +00:00
Rafael Espindola a39d305ded Use larger write sizes for MCFillFragment.
This brings the pr26208 testcase down to 3.2 seconds. Not checking it in
since it does create a 4GB .o file.

llvm-svn: 258149
2016-01-19 17:47:48 +00:00
Sanjay Patel 142c49bc42 remove outdated comment; NFC
llvm-svn: 258147
2016-01-19 17:29:22 +00:00
Eduard Burtescu 19eb03106d [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.

GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16275

llvm-svn: 258145
2016-01-19 17:28:00 +00:00
Michael Zuckerman d9cac592f4 [AVX512] Adding VPERMB instruction
Differential Revision: http://reviews.llvm.org/D16294

llvm-svn: 258144
2016-01-19 17:07:43 +00:00
Dan Gohman b6fd39a3a7 [WebAssembly] Rematerialize constants rather than hold them live in registers.
Teach the register stackifier to rematerialize constants that have multiple
uses instead of leaving them in registers. In the WebAssembly encoding, it's
the same code size to materialize most constants as it is to read a value
from a register.

llvm-svn: 258142
2016-01-19 16:59:23 +00:00
Rafael Espindola 1a7e8b4bc1 Simplify MCFillFragment.
The value size was always 1 or 0, so we don't need to store it.

In a no asserts build this takes the testcase of pr26208 from 11 to 10
seconds.

llvm-svn: 258141
2016-01-19 16:57:08 +00:00
Chad Rosier 401a4ab8d8 Typo.
llvm-svn: 258137
2016-01-19 16:50:45 +00:00
Marina Yatsina d9658d16fd [X86] Add support for "xlat m8"
According to x86 spec "xlat m8" is a legal instruction and it is equivalent to "xlatb".

Differential Revision: http://reviews.llvm.org/D15150

llvm-svn: 258135
2016-01-19 16:35:38 +00:00
Manuel Jacob c784e6acd9 Fix constant folding of constant vector GEPs with undef or null as pointer argument.
Reviewers: eddyb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16321

llvm-svn: 258134
2016-01-19 16:34:31 +00:00
Marina Yatsina b9f4f62cfe [X86] Adding support for missing variations of X86 string related instructions
The following are legal according to X86 spec:
ins mem, DX
outs DX, mem
lods mem
stos mem
scas mem
cmps mem, mem
movs mem, mem

Differential Revision: http://reviews.llvm.org/D14827

llvm-svn: 258132
2016-01-19 15:37:56 +00:00
Manuel Jacob 6a4761e384 Rename Variable `Ptr` to `PtrTy`. NFC.
llvm-svn: 258130
2016-01-19 15:21:15 +00:00
Rafael Espindola 5568c83a60 Handle 64 bit offsets.
No tests since llvm-mc takes 14 seconds on it. I will try to improve it
and then test.

Part of pr26208.

llvm-svn: 258129
2016-01-19 15:19:08 +00:00
Dan Gohman b13c91f159 [WebAssembly] Disable some WebAssembly-specific optimization passes at -O0.
llvm-svn: 258127
2016-01-19 14:55:02 +00:00
Dan Gohman 3196650bf3 [WebAssembly] Use the templated form of MachineFunction::getSubtarget(). NFC.
llvm-svn: 258126
2016-01-19 14:53:19 +00:00
Dan Gohman 0553299586 [WebAssembly] Re-enable loop idiom recognition for memcpy et al.
llvm-svn: 258125
2016-01-19 14:49:23 +00:00
Asaf Badouh d4a0d9a78c [X86][AVX512]fix dag & add intrinsics for fixupimm
cover all width and types (pd/ps/sd/ss) of fixupimm instruction and inrtinsics

Differential Revision: http://reviews.llvm.org/D16313

llvm-svn: 258124
2016-01-19 14:21:39 +00:00
Philip Reames b336bca07e [GC] Lower vectors-of-pointers directly by default
This commit changes the default on our lowering of vectors-of-pointers from splitting in RS4GC to reporting them in the final stack map.  All of the changes to do so are already in place and tested.  Assuming no problems are unearthed in the next week, we will be deleting the old code entirely next Monday.

llvm-svn: 258111
2016-01-19 04:18:24 +00:00
Philip Reames 3195500297 [GC] Consolidate all built in GCs into a single file [NFC]
Combine a bunch of small files into a single, still rather small, file.  The primary purpose of this is to get all of the static initializers into a single file so as to have a well defined order of initialization.  

llvm-svn: 258109
2016-01-19 03:57:18 +00:00
Kelvin Li 510498c0d3 parseArch() supports more variations of arch names for PowerPC builds
llvm-svn: 258103
2016-01-19 00:04:41 +00:00
Tobias Edler von Koch 3f4f6f3ed6 Add a change accidentally left out from r258100
Also remove an executable bit introduced by r258083.

llvm-svn: 258101
2016-01-18 23:35:24 +00:00
Tobias Edler von Koch 8ecaf69291 [LTO] Restore original linkage of externals prior to splitting
Summary:
This is a companion patch for http://reviews.llvm.org/D16124.

Internalized symbols increase the size of strongly-connected components in
SCC-based module splitting and thus reduce the amount of parallelism. This
patch records the original linkage of non-local symbols prior to
internalization and then restores it just before splitting/CodeGen. This is
also useful for cases where the linker requires symbols to remain external, for
instance, so they can be placed according to linker script rules.

It's currently under its own flag (-restore-globals) but should eventually
share a common flag with D16124.

Reviewers: joker.eph, pcc

Subscribers: slarin, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D16229

llvm-svn: 258100
2016-01-18 23:24:54 +00:00
Simon Pilgrim c4d519d340 Fixed MSVC warning that not all control paths return a value.
llvm-svn: 258099
2016-01-18 22:54:46 +00:00
Matt Arsenault 33e3ecee0c AMDGPU: Reduce 64-bit SRAs
llvm-svn: 258096
2016-01-18 22:09:04 +00:00
Matt Arsenault 6e3a45193a AMDGPU: Split 64-bit and of constant up
This breaks the tests that were meant for testing
64-bit inline immediates, so move those to shl where
they won't be broken up.

This should be repeated for the other related bit ops.

llvm-svn: 258095
2016-01-18 22:01:13 +00:00
Chad Rosier 234bf6fe5c [AArch64] Remove unused arguments. NFC.
AFAICT, these have been unused since the initial backend import.

llvm-svn: 258093
2016-01-18 21:56:40 +00:00
Matt Arsenault 3cbbc10488 AMDGPU: Generalize shl combine
Reduce 64-bit shl with constant > 32. We already special cased
this for the == 32 case, but this also works for any >= 32 constant.

llvm-svn: 258092
2016-01-18 21:55:14 +00:00
Matt Arsenault 80edab99ff AMDGPU: Reduce 64-bit lshr by constant to 32-bit
64-bit shifts are very slow on some subtargets.

llvm-svn: 258090
2016-01-18 21:43:36 +00:00
Adam Nemet d8968f0945 [LAA] Include function name in debug output
llvm-svn: 258088
2016-01-18 21:16:33 +00:00
Matt Arsenault e83690c1cc AMDGPU: Add subtarget feature for instruction rates
llvm-svn: 258085
2016-01-18 21:13:50 +00:00
Simon Pilgrim 99c6c29c0c Fixed MSVC Win64 warning of implicit conversion of 32-bit shift to 64-bits.
llvm-svn: 258084
2016-01-18 21:11:19 +00:00
Sergei Larin d19d4d30d8 Add to the split module utility an SCC based method which allows not to globalize any local variables.
Summary:
    Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios.
    This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols.
    Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module).
    
    Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org)
    
    Subscribers: llvm-commits, joker.eph
    
    Differential Revision: http://reviews.llvm.org/D16124

llvm-svn: 258083
2016-01-18 21:07:13 +00:00
Simon Pilgrim 3e5fb61978 [X86][AVX2] Broadcast subvectors
AVX2 can only broadcast from the zero'th element of a vector, but if the broadcastable element is the zero'th element of a 128-bit subvector its advantageous to extract the subvector, broadcast from that and avoid the loading of shuffle mask data that would be needed for VPERMPS/VPERMD. The only exception being when the source type is 4f64 or 4i64 which can directly use the immediate shuffle VPERMPD/VPERMQ directly.

Differential Revision: http://reviews.llvm.org/D16050

llvm-svn: 258081
2016-01-18 20:59:04 +00:00
Krzysztof Parzyszek 7aae9b3782 [Hexagon] Recognize more copy-equivalents in RDF optimizations
llvm-svn: 258076
2016-01-18 20:45:51 +00:00
Krzysztof Parzyszek adc64b7df0 [RDF] Improvements to copy propagation
- Allow any instruction to define equality between registers.
- Keep the DFG updated.

llvm-svn: 258075
2016-01-18 20:43:57 +00:00
Krzysztof Parzyszek e6b0662092 [RDF] Improve compile-time performance of dead code elimination
llvm-svn: 258074
2016-01-18 20:42:47 +00:00
Krzysztof Parzyszek 69e670d5f9 [RDF] Allow unlinking ref nodes from data-flow chains only
llvm-svn: 258073
2016-01-18 20:41:34 +00:00
Craig Topper 5e46adb09a [TableGen] Use FoldingSets instead of DenseMaps to unique UnOpInit, BinOpInit and TernOpInit. This remove the memory needed to store the key for the DenseMap. NFC
llvm-svn: 258071
2016-01-18 20:36:06 +00:00
Craig Topper 7dcb1a5c89 [TableGen] Fix an assert I missed in r258063.
llvm-svn: 258068
2016-01-18 19:59:05 +00:00
Tom Stellard ccdc5391ea TargetLowering: Improve handling of (setcc ([sz]ext x) 0, cc) in SimplifySetCC
Summary:
When SimplifySetCC sees a setcc node that compares the result of a
value extension operation with a constant, it tries to simplify the
setcc node by eliminating the extension and shrinking the constant.

If shrinking the inputs to setcc is deemed not desirable by the target
(e.g. the target does not want a setcc comparing i1 values), then it
is still possible to optimize this sequence in some cases.

This patch adds the following combines to SimplifySetCC when shrinking setcc
inputs is not desirable:

(setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc))
(setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, Y, !cc))

There are no tests for this yet, but once AMDGPU correctly implements
TargetLowering::isTypeDesirableForOp(), this new combine will be
exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test.

Reviewers: resistor, arsenm

Subscribers: jroelofs, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15034

llvm-svn: 258067
2016-01-18 19:55:21 +00:00
Craig Topper 0e41d0b963 [TableGen] Merge the SuperClass Record and SMRange vector into a single vector. This removes the state needed to manage the extra vector thus reducing the size of the Record class. NFC
llvm-svn: 258065
2016-01-18 19:52:37 +00:00
Craig Topper fbfd578056 [TableGen] Allocate the Init pointer array for BitsInit/ListInit after the BitsInit/ListInit object itself. Saves a bit of memory. NFC
llvm-svn: 258063
2016-01-18 19:52:24 +00:00
Sanjay Patel c2ceb8b2d8 combine clauses with same output ; NFCI
llvm-svn: 258062
2016-01-18 19:17:58 +00:00
Sanjay Patel 7b7eec11c0 use m_OneUse ; NFCI
llvm-svn: 258059
2016-01-18 18:36:38 +00:00
Sanjay Patel 3b8dcc731e fix variable names, typos ; NFC
llvm-svn: 258058
2016-01-18 18:28:09 +00:00
Sanjay Patel d09b44a752 fix typo; NFC
llvm-svn: 258057
2016-01-18 17:50:23 +00:00
Igor Breger 239fda676c AVX512: Masked store intrinsic implementation.
Implemented intrinsic for the follow instructions (store) : VMOVDQU8/16/32/64, VMOVDQA32/64, VMOVAPS/PD, VMOVUPS/PD.

Differential Revision: http://reviews.llvm.org/D16271

llvm-svn: 258047
2016-01-18 13:52:57 +00:00
Elena Demikhovsky 9242ea87d6 Added Cannonlake processor to X86 Target
Differential Revision: http://reviews.llvm.org/D16289

llvm-svn: 258046
2016-01-18 13:00:31 +00:00
Igor Breger dd6522c653 AVX512 : Change v8i1 bitconvert GR8 pattern, remove unnecessary movzbl instruction.
code example , previous implementation.
    movzbl  %dil, %eax
    kmovw  %eax, %k0
  new code
    kmovw  %edi, %k0

Differential Revision: http://reviews.llvm.org/D16287

llvm-svn: 258045
2016-01-18 12:02:45 +00:00
Oliver Stannard 9f68749eba [ARM] Operands for PKHTB alias should be swapped
When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of
the input operands needs to be swapped.

Differential Revision: http://reviews.llvm.org/D16288

llvm-svn: 258044
2016-01-18 11:56:35 +00:00
Michael Zuckerman 9c47e0681c [AVX512] adding AVXVBMI feature flag
Fixing wrong typo (avx515) → (avx512) 
Review over the shoulder by asaf . 

Differential Revision: http://reviews.llvm.org/D16190

llvm-svn: 258041
2016-01-18 11:12:47 +00:00
Xinliang David Li 42a13308a1 [Coverage] move a local var to be BinaryCoverageReader's member
The symtab is logically referenced beyond the call to the create
method. This changes makes sure its lifetime matches that of
the reader.

llvm-svn: 258036
2016-01-18 06:48:01 +00:00
Junmo Park 3347e7823a Remove extra whitespace. NFC.
llvm-svn: 258035
2016-01-18 06:42:51 +00:00
Eduard Burtescu 6007e0dd02 Revert assert added in rL258028 as the alloca and OtherPtr types may differ in address space.
llvm-svn: 258029
2016-01-18 00:20:34 +00:00
Eduard Burtescu 90c4449128 [opaque pointer types] Alloca: use getAllocatedType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16272

llvm-svn: 258028
2016-01-18 00:10:01 +00:00
Sanjay Patel 6435c6ede0 fix variable names; NFC
llvm-svn: 258027
2016-01-17 23:18:05 +00:00
Sanjay Patel 9613b29927 fix typos; NFC
llvm-svn: 258026
2016-01-17 23:13:48 +00:00
Manuel Jacob 20c6d5bcb8 [opaque pointer types] [breaking-change] [NFC] SimplifyGEPInst: take the source element type of the GEP as an argument.
Patch by Eduard Burtescu.

Reviewers: dblaikie, mjacob

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16281

llvm-svn: 258024
2016-01-17 22:46:43 +00:00
Manuel Jacob 190577ac81 [opaque pointer types] [NFC] CallSite: use getFunctionType() instead of going through PointerType::getElementType.
Patch by Eduard Burtescu.

Reviewers: dblaikie, mjacob

Subscribers: dsanders, llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16273

llvm-svn: 258023
2016-01-17 22:37:39 +00:00
Manuel Jacob da2c9baa07 [NFC] Remove one dead PointerType::getElementType() call.
Reviewers: dblaikie, mjacob

Subscribers: llvm-commits, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16274

llvm-svn: 258022
2016-01-17 22:28:28 +00:00
Sanjoy Das de47590589 [IndVars] Fix PR25576
`LCSSASafePhiForRAUW` as computed was incorrect -- in cases like
these (this exact example does not actually trigger the bug):

define i32 @f(i32 %n, i1* %c) {
entry:
  br label %outer.loop

outer.loop:
  br label %inner.loop

inner.loop:
  %iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ]
  %iv.inc = add nuw nsw i32 %iv, 1
  %tc = udiv i32 %n, 13
  %be.cond = icmp ult i32 %iv, %tc
  br i1 %be.cond, label %inner.loop, label %inner.exit

inner.exit:
  %iv.lcssa = phi i32 [ %iv, %inner.loop ]
  %outer.be.cond = load volatile i1, i1* %c
  br i1 %outer.be.cond, label %outer.loop, label %leave

leave:
  %iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ]
  ret i32 %iv.lcssa.lcssa
}

`LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit
value of `%iv` for `%inner.loop` to `%tc` (this can happen due to
`SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA.

To fix this, instead of computing `SafePhi` with special logic, decide
the safety of RAUW directly via `replacementPreservesLCSSAForm`.

llvm-svn: 258016
2016-01-17 18:12:52 +00:00
Sanjoy Das 7a8a705c9d [IndVars] Use emplace_back; NFC
llvm-svn: 258015
2016-01-17 18:12:48 +00:00
Michael Zuckerman 97b6a6923e [AVX512] adding AVXVBMI feature flag
The feature flag is for VPERMB,VPERMI2B,VPERMT2B and VPMULTISHIFTQB instructions. 
More about the instruction can be found in:
hattps://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf

Differential Revision: http://reviews.llvm.org/D16190

llvm-svn: 258012
2016-01-17 13:42:12 +00:00
Artur Pilipenko aba8fdc480 Fix buildbot failure introduced by 258010. Remove local variables became unused.
llvm-svn: 258011
2016-01-17 12:59:40 +00:00
Artur Pilipenko f84dc06e5b Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionally
Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D16226

llvm-svn: 258010
2016-01-17 12:35:29 +00:00
Igor Breger e1f273d900 AVX512: Use MemIntrinsicSDNode to implement load/store intrinsic.
Differential Revision: http://reviews.llvm.org/D16184

llvm-svn: 258009
2016-01-17 12:10:24 +00:00
Michael Zuckerman ac1b238b0a [AVX512] Adding VPERMW/D/Q VPERMPS/D Intrinsics
Differential Revision: http://reviews.llvm.org/D16189

llvm-svn: 258008
2016-01-17 11:33:29 +00:00
Michael Zuckerman ede597c753 [AVX512] Adding VPERMQ VPERMPD Intrinsics
Differential Revision: http://reviews.llvm.org/D16194

llvm-svn: 258006
2016-01-17 08:32:14 +00:00
Simon Pilgrim 20f31fa31a [X86][AVX] Enable extraction of upper 128-bit subvectors for 'half undef' shuffle lowering
Added support for the extraction of the upper 128-bit subvectors for lower/upper half undef shuffles if it would reduce the number of extractions/insertions or avoid loads of AVX2 permps/permd shuffle masks.

Minor follow up to D15477.

llvm-svn: 258000
2016-01-16 22:30:20 +00:00
Manuel Jacob 5f6eaac611 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260

llvm-svn: 257999
2016-01-16 20:30:46 +00:00
Manman Ren 53a54c41d7 CXX_FAST_TLS calling convention: fix issue on x86-64.
%RBP can't be handled explicitly. We generate the following code:
    pushq %rbp
    movq  %rsp, %rbp
    ...
    movq  %rbx, (%rbp)  ## 8-byte Spill
where %rbp will be overwritten by the spilled value.

The fix is to let PEI handle %RBP.
PR26136

llvm-svn: 257997
2016-01-16 16:39:46 +00:00
Igor Laevsky 28eeb3f66c [BasicAliasAnalysis] Take into account operand bundles in the getModRefInfo function
Differential Revision: http://reviews.llvm.org/D16225

llvm-svn: 257991
2016-01-16 12:15:53 +00:00
Kostya Serebryany 476f0ce31a [libFuzzer] replace vector with a simpler data structure in the Dictionaries to avoid memory allocations on hot path
llvm-svn: 257985
2016-01-16 03:53:32 +00:00
NAKAMURA Takumi 33ff1dda6a [Cygwin] Use -femulated-tls by default since r257718 introduced the new pass.
FIXME: Add more targets to use emutls into clang/test/Driver/emulated-tls.cpp.
FIXME: Add cygwin tests into llvm/test/CodeGen/X86. Working in progress.
llvm-svn: 257984
2016-01-16 03:44:52 +00:00
Kostya Serebryany aca7696f4d [libFuzzer] introduce LLVMFuzzerInitialize
llvm-svn: 257980
2016-01-16 01:23:12 +00:00
Keno Fischer bc0cb11eb2 [DwarfDebug] Don't merge DebugLocEntries if their pieces overlap
Summary:
Later in DWARF emission we check that DebugLocEntries have
non-overlapping pieces, so we should create any such entries
by merging here.

Fixes PR26163.

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16249

llvm-svn: 257979
2016-01-16 01:15:32 +00:00
Keno Fischer f8eb6a1414 [DwarfDebug] Move MergeValues to .cpp, NFC
llvm-svn: 257977
2016-01-16 01:11:33 +00:00
Peter Collingbourne f0f5e87083 Introduce sanstats tool and llvm::CreateSanitizerStatReport function.
This is part of a new statistics gathering feature for the sanitizers.
See clang/docs/SanitizerStats.rst for further info and docs.

Differential Revision: http://reviews.llvm.org/D16174

llvm-svn: 257970
2016-01-16 00:31:11 +00:00
Dan Gohman 7f86ca1803 [WebAssembly] Add some more README.txt entries.
llvm-svn: 257969
2016-01-16 00:20:03 +00:00
Reid Kleckner 9533af4f8a [codeview] Remove custom line info struct in favor of DebugLoc
The only functional change would be that we might emit multiple filename
segments on code like this:

  void f() {
  #include "p1/../t.h"
  #include "p2/../t.h"
  }

I believe these get separate DIFile metadata nodes, but will have the
same canonicalized absolute path. Previously by computing the path up
front and comparing it we would merge the line info segments.

llvm-svn: 257966
2016-01-16 00:09:09 +00:00
Kevin B. Smith c831a08fbf [X86]: Make param names in header and body match for isCalleePop.
Differential Revision: http://reviews.llvm.org/D16246

llvm-svn: 257965
2016-01-16 00:08:36 +00:00
Kostya Serebryany 628bc3ec00 [libFuzzer] move some code from public interface header to a non-public header. NFC
llvm-svn: 257963
2016-01-16 00:04:36 +00:00
Dan Gohman 2f301f3e92 [WebAssembly] Don't create a needless .note.GNU-stack section
WebAssembly's stack will never be executable by default, so it isn't
necessary to declare .note.GNU-stack sections to request a non-executable
stack.

Differential Revision: http://reviews.llvm.org/D15969

llvm-svn: 257962
2016-01-15 23:59:13 +00:00
Artem Belevich 5be0706ebe [NVPTX] Do not emit .hidden or .protected directives as they are not allowed by PTX.
llvm-svn: 257961
2016-01-15 23:57:53 +00:00
Lang Hames c715ebbb99 [Orc] Replace switch cases with a macro.
The cases of this switch are all perfectly regular (except for the first case).
A macro is more readable here.

Thanks to Dave Blaikie for the suggestion. 

llvm-svn: 257951
2016-01-15 23:19:06 +00:00
David Blaikie ab105bbf0c [opaque pointer types] Remove an unnecessary extra explicit value type in Function
Now that this is up in GlobalValue, just use the value there.

llvm-svn: 257949
2016-01-15 23:07:58 +00:00
Matthias Braun feb81bc682 ValueTracking: Put DataLayout reference into the Query structure, NFC.
It looks nicer and improves the compiletime of a typical
clang -O3 -emit-llvm run by ~0.6% for me.

Differential Revision: http://reviews.llvm.org/D16205

llvm-svn: 257944
2016-01-15 22:22:04 +00:00
Dan Gohman 4e9b2a60ab [SelectionDAG] CSE nodes with differing SDNodeFlags
In the optimizer (GVN etc.) when eliminating redundant nodes with different
flags, the flags are ignored for the purposes of testing for congruence, and
then intersected for the purposes of producing a result that supports the union
of all the uses. This commit makes SelectionDAG's CSE do the same thing,
allowing it to CSE nodes in more cases. This fixes PR26063.

Differential Revision: http://reviews.llvm.org/D15957

llvm-svn: 257940
2016-01-15 21:56:40 +00:00
Justin Bogner fd757648a4 PM: Fix an inverted condition in simplifyFunctionCFG
I mentioned the issue here in code review way back in September and
was sure we'd fixed it, but apparently we forgot:

  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150921/301850.html

In any case, as soon as you try to use this pass in anything but the
most basic pipeline everything falls apart. Fix the condition.

llvm-svn: 257935
2016-01-15 21:21:39 +00:00
Joseph Tremoulet 44b3f961e1 [WinEH] Rename CatchReturnInst::getParentPad, NFC
Summary:
Rename to getCatchSwitchParentPad, to make it more clear which ancestor
the "parent" in question is.  Add a comment pointing out the key feature
that the returned pad indicates which funclet contains the successor
block.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16222

llvm-svn: 257933
2016-01-15 21:16:19 +00:00
Manman Ren e5f807f928 CXX_FAST_TLS calling convention: fix issue on ARM.
When we have a single basic block, the explicit copy-back instructions should
be inserted right before the terminator. Before this fix, they were wrongly
placed at the beginning of the basic block.

PR26136

llvm-svn: 257930
2016-01-15 20:24:11 +00:00
Manman Ren 4632e8e625 CXX_FAST_TLS calling convention: fix issue on AArch64.
When we have a single basic block, the explicit copy-back instructions should
be inserted right before the terminator. Before this fix, they were wrongly
placed at the beginning of the basic block.

I will commit fixes to other platforms as well.

PR26136

llvm-svn: 257929
2016-01-15 20:13:28 +00:00
Manman Ren 4fe01bd8f9 CXX_FAST_TLS calling convention: fix issue on X86-64.
When we have a single basic block, the explicit copy-back instructions should
be inserted right before the terminator. Before this fix, they were wrongly
placed at the beginning of the basic block.

I will commit fixes to other platforms as well.

PR26136

llvm-svn: 257925
2016-01-15 19:35:42 +00:00
Xinliang David Li 285d7bd4e7 Fix -Wmismatched-tags warning/error
llvm-svn: 257924
2016-01-15 19:22:41 +00:00
Kyle Butt 132bf36161 Codegen: [PPC] Silence false-positive initialization warning. NFC
Some compilers don't do exhaustive switch checking. For those compilers,
add an initialization to prevent un-initialized variable warnings from
firing. For compilers with exhaustive switch checking, we still get a
guarantee that the switch is exhaustive, and hence the initializations
are redundant, and a non-functional change.

llvm-svn: 257923
2016-01-15 19:20:06 +00:00
Xinliang David Li b606638526 [PGO] Commonize (more) index profile file and buffer writer.
The file and buffer writer code are mostly shared except for the
stream back-patching. This is because raw_string_ostream does not
support seek like interface. The result is that the data patching
code needs to be pushed to the caller which is not quite readable 
(passing around offset, value etc). This also makes future enhancement
(which needs more patching) more difficult (and can make impl messy).

In this patch, two types of streams needed by the writer are now
unified with same set of interfaces under ProfOStream class. The patch
method is added so that common implementation becomes cleaner. It
also enables future enhancement. Should be NFC.

llvm-svn: 257921
2016-01-15 19:01:04 +00:00
Rafael Espindola 257a35368f Bring back "Assert that we have all use/users in the getters."
This reverts commit r257751, bringing back r256105.

The problem the assert found was fixed in r257915.

Original commit message:

Assert that we have all use/users in the getters.

An error that is pretty easy to make is to use the lazy bitcode reader
and then do something like

if (V.use_empty())

The problem is that uses in unmaterialized functions are not accounted
for.

This patch adds asserts that all uses are known.

llvm-svn: 257920
2016-01-15 19:00:20 +00:00
Reid Kleckner d4a0d18899 Revert "[ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline"
This reverts commit r257883.

Somehow this didn't make it into r257916.

llvm-svn: 257919
2016-01-15 18:55:12 +00:00
Matthew Simpson 57fe1b10db Reapply r257800 with fix
The fix uniques the bundle of getelementptr indices we are about to vectorize
since it's possible for the same index to be used by multiple instructions.
The original commit message is below.

[SLP] Vectorize the index computations of getelementptr instructions.

This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.

llvm-svn: 257918
2016-01-15 18:51:51 +00:00
Reid Kleckner 47f2452da8 # This is a combination of 2 commits.
# The first commit's message is:

Revert "[ARM] Add DSP build attribute and extension targeting"

This reverts commit b11cc50c0b4a7c8cdb628abc50b7dc226ff583dc.

# This is the 2nd commit message:

Revert "[ARM] Add new system registers to ARMv8-M Baseline/Mainline"

This reverts commit 837d08454e3e5beb8581951ac26b22fa07df3cd5.

llvm-svn: 257916
2016-01-15 18:31:29 +00:00
Rafael Espindola 79db917139 Don't try to check all uses if lazy loading.
This means that LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN will not be set
in a few cases.

This should have no impact in ld64 since it doesn't use lazy loading
when merging modules and that is when it checks
LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN.

llvm-svn: 257915
2016-01-15 18:23:46 +00:00
James Y Knight ac03dca412 Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)

Differential Revision: http://reviews.llvm.org/D16145

llvm-svn: 257902
2016-01-15 16:33:06 +00:00
Silviu Baranga f29dfd36bb Re-commit r257064, after it was reverted in r257340.
This contains a fix for the issue that caused the revert:
we no longer assume that we can insert instructions after the
instruction that produces the base pointer. We previously
assumed that this would be ok, because the instruction produces
a value and therefore is not a terminator. This is false for invoke
instructions. We will now insert these new instruction directly
at the location of the users.

Original commit message:

[InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs

Summary:
When comparing two GEP instructions which have the same base pointer
and one of them has a constant index, it is possible to only compare
indices, transforming it to a compare with a constant. This removes
one use for the GEP instruction with the constant index, can reduce
register pressure and can sometimes lead to removing the comparisson
entirely.

InstCombine was already doing this when comparing two GEPs if the base
pointers were the same. However, in the case where we have complex
pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to
or from integers, etc) the value of the original base pointer will be
hidden to the optimizer and this transformation will be disabled.

This change detects when the two sides of the comparison can be
expressed as GEPs with the same base pointer, even if they don't
appear as such in the IR. The transformation will convert all the
pointer arithmetic to arithmetic done on indices and all the relevant
uses of GEPs to GEPs with a common base pointer. The GEP comparison
will be converted to a comparison done on indices.

Reviewers: majnemer, jmolloy

Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits

Differential Revision: http://reviews.llvm.org/D15146

llvm-svn: 257897
2016-01-15 15:52:05 +00:00
Artur Pilipenko 6dd6969cee Change isSafeToLoadUnconditionally arguments order. Separated from http://reviews.llvm.org/D10920.
llvm-svn: 257894
2016-01-15 15:27:46 +00:00
Krzysztof Parzyszek 2a3b2f9841 [Hexagon] Generate CONST64 when optimizing for size in copy-to-combine
llvm-svn: 257891
2016-01-15 14:08:31 +00:00
Krzysztof Parzyszek 9b7320e621 [Hexagon] Handle DBG_VALUE instructions in copy-to-combine
llvm-svn: 257890
2016-01-15 13:55:57 +00:00
Matthew Simpson 9258e013a2 Revert "[SLP] Vectorize the index computations of getelementptr instructions."
This reverts commit r257800.

llvm-svn: 257888
2016-01-15 13:10:46 +00:00
James Molloy 3ef84c4cbb [CodeGenPrepare] Try and appease sanitizers
dupRetToEnableTailCallOpts(BB) can invalidate BB. It must run *after* we iterate across BB!

llvm-svn: 257886
2016-01-15 10:36:01 +00:00
Bradley Smith 48b93e1f21 [ARM] Add DSP build attribute and extension targeting
llvm-svn: 257885
2016-01-15 10:28:25 +00:00
Bradley Smith 42f6e90a43 [ARM] Add new system registers to ARMv8-M Baseline/Mainline
llvm-svn: 257884
2016-01-15 10:28:03 +00:00
Bradley Smith 618712df04 [ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline
llvm-svn: 257883
2016-01-15 10:27:14 +00:00
Bradley Smith 433c22e35c [ARM] Add ARMv8-A semaphore/atomic instructions to ARMv8-M Baseline/Mainline
llvm-svn: 257882
2016-01-15 10:26:51 +00:00
Bradley Smith a1189106d5 [ARM] Add B.W and CBZ instructions to ARMv8-M Baseline
llvm-svn: 257881
2016-01-15 10:26:17 +00:00
Bradley Smith 519563e371 [ARM] Add SDIV/UDIV instructions to ARMv8-M Baseline
llvm-svn: 257880
2016-01-15 10:25:35 +00:00
Bradley Smith d9a99ce53d [ARM] Add MOVW/MOVT instructions to ARMv8-M Baseline/Mainline
llvm-svn: 257879
2016-01-15 10:25:14 +00:00
Bradley Smith e26f799422 [ARM] Add ARMv8-M Baseline/Mainline LLVM targeting
llvm-svn: 257878
2016-01-15 10:24:39 +00:00
Bradley Smith 4c21cba72b [ARM] Split out ARMv8-A semaphores and atomics and ARMv7 clrex as separate features
llvm-svn: 257877
2016-01-15 10:23:46 +00:00
James Molloy f01488e2bc [InstCombine] Rewrite bswap/bitreverse handling completely.
There are several requirements that ended up with this design;
  1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early.
  2. Bitreversals and byteswaps are very related in their matching logic.
  3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses.
  4. Bswaps are best matched early in InstCombine.

The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals.

We can then extend the matching logic in one place only.

llvm-svn: 257875
2016-01-15 09:20:19 +00:00
Jonas Paulsson 5b29e096ac [SystemZ] Fix bad instruction name
SLGBR -> SLBGR

Reviewed by Ulrich Weigand

llvm-svn: 257874
2016-01-15 07:12:09 +00:00
Kostya Serebryany ae5b9567bc [libFuzzer] do mutations based on memcmp/strcmp interceptors under a separate flag (-use_memcmp, default=1)
llvm-svn: 257873
2016-01-15 06:24:05 +00:00
Pete Cooper 835594e627 Delete MCRelocationInfo::createExprForRelocation.
This method has no callers.

Also remove X86ELFRelocationInfo.cpp and X86MachORelocationInfo.cpp
which only existed to provide an implementation of that method.

Ok'd by Rafael and Jim.

llvm-svn: 257859
2016-01-15 02:24:12 +00:00
Keno Fischer 253a7bd4da Once again revert debug info verifier changes
Yet another wave of buildbot failures (though fewer this time).
I'm only reverting the Verifier changes, as the test cases
will be fine without them as well, and touching them as often
just introduces unnecessary churn.

llvm-svn: 257855
2016-01-15 02:12:38 +00:00
Keno Fischer 81e2e9ef86 Reapply r257105 "[Verifier] Check that debug values have proper size"
I originally reapplied this in 257550, but had to revert again due to bot
breakage. The only change in this version is to allow either the TypeSize
or the TypeAllocSize of the variable to be the one represented in debug info
(hopefully in the future we can figure out how to encode the difference).
Additionally, several bot failures following r257550, were due to
optimizer bugs now fixed in r257787 and r257795.

r257550 commit message was:

```
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
``
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
``

```

llvm-svn: 257850
2016-01-15 00:46:17 +00:00
Amaury Sechet 74f4ce6193 LLVMRunStaticConstructors can be called before object is finalized, #24028
Summary: Since you cannot call finalizeObject manually through the C-API and other functions from the C-API automatically call it, LLVMRunStaticConstructors should also call it or otherwise you cannot call it without first calling a workaround function (or call any other function from the C-API which implicitly finalizes the object).

Reviewers: dnovillo, spatel, bkramer, deadalnix, joker.eph, echristo, lhames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16188

llvm-svn: 257849
2016-01-15 00:23:34 +00:00
Kostya Serebryany 4282d30516 [libFuzzer] use custom stol; also introduce __libfuzzer_is_present so that users can check for its presence.
llvm-svn: 257848
2016-01-15 00:17:37 +00:00
Sanjay Patel 960e5349af rangify; NFCI
llvm-svn: 257845
2016-01-15 00:08:10 +00:00
Weiming Zhao 038393bba0 Fix AArch64ConditionOptimizer
Summary:
This pass may modify the Cmp operands. However, the flag reg may be used by both the branch and CSEL.
Modifying CMP will have side effect on CSEL.

Reviewers: t.p.northover

Subscribers: llvm-commits, aemerson, rengolin

Differential Revision: http://reviews.llvm.org/D16147

llvm-svn: 257844
2016-01-15 00:06:58 +00:00
Sanjay Patel 784b5e3ff0 remove duplicate documentation comments (already in the header file) ; NFC
llvm-svn: 257835
2016-01-14 23:23:04 +00:00
Easwaran Raman f4bb2f0dc3 Refactor threshold computation for inline cost analysis
Differential Revision: http://reviews.llvm.org/D15401

llvm-svn: 257832
2016-01-14 23:16:29 +00:00
Keno Fischer f6d17b953c [Verifier] Check parentage of GVs in dbg metadata
Summary:
Before this the Verifier didn't complain if the GlobalVariable
referenced from a DIGlobalVariable was not in fact in the correct
module (it would crash while writing bitcode though). Fix this by
always checking parantage of GlobalValues while walking constant
expressions and changing the DIGlobalVariable visitor to also
visit the constant it contains.

Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D16059

llvm-svn: 257825
2016-01-14 22:42:02 +00:00
Keno Fischer 60f82a269f [Verifier] Verify that a GlobalValue is only used in this Module
Summary:
We already have the inverse verification that we only use globals
that are defined in this module. This essentially catches the
same mistake, but when verifying the module that contains the
definition.

Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D15272

llvm-svn: 257823
2016-01-14 22:20:56 +00:00
Xinliang David Li 565b301380 [PGO] Move profile summary interface/impl into InstrProf.[*] /NFC
llvm-svn: 257819
2016-01-14 22:10:49 +00:00
Lang Hames 52c4724165 [Orc] Add support for EH-frame registration to the Orc Remote Target utility
classes.

OrcRemoteTargetClient::RCMemoryManager will now register EH frames with the
server automatically. This allows remote-execution of code that uses exceptions.

llvm-svn: 257816
2016-01-14 22:02:03 +00:00
Krzysztof Parzyszek 0d11212f00 [Hexagon] Use S2_lsr_i_r instead of S2_extractu to obtain upper halfword
llvm-svn: 257815
2016-01-14 21:59:22 +00:00