Commit Graph

36091 Commits

Author SHA1 Message Date
Zachary Turner 0a43efea95 Resubmit "Refactor raw pdb dumper into library"
This fixes a number of endianness issues as well as an ODR
violation that hopefully causes everything to be happy.

llvm-svn: 267431
2016-04-25 17:38:08 +00:00
Chad Rosier e2cbd13e56 [ValueTracking] Improve isImpliedCondition when the dominating cond is false.
llvm-svn: 267430
2016-04-25 17:23:36 +00:00
Adrian Prantl 2c0b0ab624 dsymutil: Only warn about clang module DWO id mismatches in verbose mode.
Until PR27449 (https://llvm.org/bugs/show_bug.cgi?id=27449) is fixed in
clang this warning is pointless, since ASTFileSignatures will change
randomly when a module is rebuilt.

rdar://problem/25610919

llvm-svn: 267427
2016-04-25 17:04:32 +00:00
Sanjay Patel 43c7af6889 add tests for potential CGP transform (PR27344)
llvm-svn: 267426
2016-04-25 16:56:52 +00:00
Marcin Koscielnicki 1c1af6ef77 [PR27390] [CodeGen] Reject indexed loads in CombinerDAG.
visitAND, when folding and (load) forgets to check which output of
an indexed load is involved, happily folding the updated address
output on the following testcase:

target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%typ = type { i32, i32 }

define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) {
  %b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1
  %1 = load i32, i32* %b, align 4
  %2 = ptrtoint i32* %b to i64
  %3 = and i64 %2, -35184372088833
  %4 = inttoptr i64 %3 to i32*
  %_msld = load i32, i32* %4, align 4
  %zzz = add i32 %1,  %_msld
  ret i32 %zzz
}

Fix this by checking ResNo.

I've found a few more places that currently neglect to check for
indexed load, and tightened them up as well, but I don't have test
cases for them.  In fact, they might not be triggerable at all,
at least with current targets.  Still, better safe than sorry.

Differential Revision: http://reviews.llvm.org/D19202

llvm-svn: 267420
2016-04-25 15:43:44 +00:00
Hrvoje Varga c2dd5d223a [mips][microMIPS] Revert commit r267137
Commit r267137 was the reason for failing tests in LLVM test suite.

llvm-svn: 267419
2016-04-25 15:40:08 +00:00
Zlatko Buljan b43d4bcbd5 [mips][microMIPS] Revert commit r266977
Commit r266977 was reason for failing LLVM test suite with error message: fatal error: error in backend: Cannot select: t17: i32 = rotr t2, t11 ...

llvm-svn: 267418
2016-04-25 15:34:57 +00:00
Sanjay Patel 0fc4137065 [x86] auto-generate checks for cmov tests
llvm-svn: 267417
2016-04-25 15:26:57 +00:00
David Majnemer dd21523653 [WinEH] Update SplitAnalysis::computeLastSplitPoint to cope with multiple EH successors
We didn't have logic to correctly handle CFGs where there was more than
one EH-pad successor (these are novel with WinEH).
There were situations where a register was live in one exceptional
successor but not another but the code as written would only consider
the first exceptional successor it found.

This resulted in split points which were insufficiently early if an
invoke was present.

This fixes PR27501.

N.B.  This removes getLandingPadSuccessor.

llvm-svn: 267412
2016-04-25 14:31:32 +00:00
Silviu Baranga 82d04260b7 [ARM] Add support for the X asm constraint
Summary:
This patch adds support for the X asm constraint.

To do this, we lower the constraint to either a "w" or "r" constraint
depending on the operand type (both constraints are supported on ARM).

Fixes PR26493

Reviewers: t.p.northover, echristo, rengolin

Subscribers: joker.eph, jgreenhalgh, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19061

llvm-svn: 267411
2016-04-25 14:29:18 +00:00
Artem Tamazov d6468666b5 [AMDGPU][llvm-mc] s_getreg/setreg* - Add hwreg(...) syntax.
Added hwreg(reg[,offset,width]) syntax.
Default offset = 0, default width = 32.
Possibility to specify 16-bit immediate kept.
Added out-of-range checks.
Disassembling is always to hwreg(...) format.
Tests updated/added.

Differential Revision: http://reviews.llvm.org/D19329

llvm-svn: 267410
2016-04-25 14:13:51 +00:00
Krzysztof Parzyszek e6ee481bdf [Hexagon] Correctly set "Flags" in ELF header
llvm-svn: 267397
2016-04-25 12:49:47 +00:00
James Molloy eb040cc55f [GlobalOpt] Allow constant globals to be SRA'd
The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!).

There seems to be no reason why we can't SRA constants too, so let's do it.

llvm-svn: 267393
2016-04-25 10:48:29 +00:00
Igor Kudrin ed99a96f06 [Coverage] Restore the correct count value after processing a nested region in case of combined regions.
If several regions cover the same area of code, we have to restore
the combined value for that area when return from a nested region.

This patch achieves that by combining regions before calling buildSegments.

Differential Revision: http://reviews.llvm.org/D18610

llvm-svn: 267390
2016-04-25 09:43:37 +00:00
Silviu Baranga 795c629ec9 [SCEV] Improve the run-time checking of the NoWrap predicate
Summary:
This implements a new method of run-time checking the NoWrap
SCEV predicates, which should be easier to optimize and nicer
for targets that don't correctly handle multiplication/addition
of large integer types (like i128).

If the AddRec is {a,+,b} and the backedge taken count is c,
the idea is to check that |b| * c doesn't have unsigned overflow,
and depending on the sign of b, that:

   a + |b| * c >= a (b >= 0) or
   a - |b| * c <= a (b <= 0)

where the comparisons above are signed or unsigned, depending on
the flag that we're checking.

The advantage of doing this is that we avoid extending to a larger
type and we avoid the multiplication of large types (multiplying
i128 can be expensive).

Reviewers: sanjoy

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D19266

llvm-svn: 267389
2016-04-25 09:27:16 +00:00
Marcin Koscielnicki a44d44cb2e [PowerPC] [PR27387] Disallow r0 for ADD8TLS.
ADD8TLS, a variant of add instruction used for initial-exec TLS,
currently accepts r0 as a source register.  While add itself supports
r0 just fine, linker can relax it to a local-exec sequence, converting
it to addi - which doesn't support r0.

Differential Revision: http://reviews.llvm.org/D19193

llvm-svn: 267388
2016-04-25 09:24:34 +00:00
Michael Zuckerman 1bd66dd1c2 Fixing wrong mask size error. From __mmask8 to __mmask16.
Was reviewed over the shoulder by AsafBadouh.
Connected to review http://reviews.llvm.org/D19195.

llvm-svn: 267379
2016-04-25 05:27:51 +00:00
Craig Topper 61a14911b2 [X86] Add a complete set of tests for all operand sizes of cttz/ctlz with and without zero undef being lowered to bsf/bsr.
llvm-svn: 267373
2016-04-25 01:01:15 +00:00
Adrian Prantl 93035c8f47 Verifier: Verify that each inlinable callsite of a debug-info-bearing function
in a debug-info-bearing function has a debug location attached to it. Failure to
do so causes an "!dbg attachment points at wrong subprogram for function"
assertion failure when the inliner sets up inline scope info.

rdar://problem/25878916

This reaplies r267320 without changes after fixing an issue in the OpenMP IR
generator in clang.

llvm-svn: 267370
2016-04-24 22:23:13 +00:00
Rafael Espindola 468d327b34 Also check the IR.
llvm-svn: 267367
2016-04-24 21:42:56 +00:00
Rafael Espindola 2adb06a820 Add a test for how we handle protected visibility.
llvm-svn: 267366
2016-04-24 21:30:18 +00:00
Simon Pilgrim 646c2a5569 [X86][AVX] Added PR24935 test case
llvm-svn: 267362
2016-04-24 20:30:48 +00:00
Saleem Abdulrasool 9611518646 ARM: fix __chkstk Frame Setup on WoA
This corrects the MI annotations for the stack adjustment following the __chkstk
invocation.  We were marking the original SP usage as a Def rather than Kill.
The (new) assigned value is the definition, the original reference is killed.

Adjust the ISelLowering to mark Kills and FrameSetup as well.

This partially resolves PR27480.

llvm-svn: 267361
2016-04-24 20:12:48 +00:00
Simon Pilgrim 4b5462f119 [InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required
As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.

llvm-svn: 267359
2016-04-24 18:35:59 +00:00
Simon Pilgrim 83020942d3 [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2)
Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).

2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements

3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly

We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).

Differential Revision: http://reviews.llvm.org/D19318

llvm-svn: 267357
2016-04-24 18:23:14 +00:00
Simon Pilgrim 424da1637a [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2)
This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)

2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input

Differential Revision: http://reviews.llvm.org/D17490

llvm-svn: 267356
2016-04-24 18:12:42 +00:00
Simon Pilgrim 2d0104cc47 [X86][SSE] Added SSSE3/AVX/AVX2 BITREVERSE tests
Codegen is pretty bad at the moment but could use PSHUFB quite efficiently 

llvm-svn: 267347
2016-04-24 15:45:06 +00:00
Simon Pilgrim f379a6c684 [X86][XOP] Fixed VPPERM permute op decoding (PR27472).
Fixed issue with VPPERM target shuffle mask decoding that was incorrectly masking off the 3-bit permute op with a 2-bit mask.

llvm-svn: 267346
2016-04-24 15:05:04 +00:00
Simon Pilgrim 9f5697ef68 [X86][SSE] Improved support for decoding target shuffle masks through bitcasts
Reused the ability to split constants of a type wider than the shuffle mask to work with masks generated from scalar constants transfered to xmm.

This fixes an issue preventing PSHUFB target shuffle masks decoding rematerialized scalar constants and also exposes the XOP VPPERM bug described in PR27472.

llvm-svn: 267343
2016-04-24 14:53:54 +00:00
Marcin Koscielnicki aef3b5b5e2 [SystemZ] [SSP] Add support for LOAD_STACK_GUARD.
This fixes PR22248 on s390x.  The previous attempt at this was D19101,
which was before LOAD_STACK_GUARD existed.  Compared to the previous
version, this always emits a rather ugly block of 4 instructions, involving
a thread pointer load that can't be shared with other potential users.
However, this is necessary for SSP - spilling the guard value (or thread
pointer used to load it) is counter to the goal, since it could be
overwritten along with the frame it protects.

Differential Revision: http://reviews.llvm.org/D19363

llvm-svn: 267340
2016-04-24 13:57:49 +00:00
Simon Pilgrim 7eedee938f [X86][SSE] Demonstrate issue with decoding shuffle masks that have been lowered as rematerialized constants on scalar unit
Found whilst investigating PR27472

llvm-svn: 267339
2016-04-24 13:45:30 +00:00
NAKAMURA Takumi d53ab701bf llvm/test/tools/gold/X86/thinlto.ll: Possible fix corresponding to r267318.
llvm-svn: 267334
2016-04-24 08:02:00 +00:00
Duncan P. N. Exon Smith 5892c6b94b BitcodeReader: Fix some holes in upgrade from r267296
Add tests for some missing cases to bitcode upgrade in r267296.

  - DICompositeType with an 'elements:' field, which will cause it to be
    involved in a cycle after the upgrade.

  - A DIDerivedType that references a class in 'extraData:'.

I updated test/Bitcode/dityperefs-3.8.ll with the missing cases and
regenerated test/Bitcode/dityperefs-3.8.ll.bc.

llvm-svn: 267332
2016-04-24 06:52:01 +00:00
Mehdi Amini ca2c54e04e Add "hasSection" flag in the Summary
Reviewers: tejohnson

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19405

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267329
2016-04-24 05:31:43 +00:00
Gerolf Hoflehner 01b3a6184a [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)
The original patch caused crashes because it could derefence a null pointer
for SelectionDAGTargetInfo for targets that do not define it.

Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267328
2016-04-24 05:14:01 +00:00
Adrian Prantl da99dced5c Revert "Verifier: Verify that each inlinable callsite of a debug-info-bearing function"
This reverts commit r267320 while investigating an OpenMP buildbot failure.

llvm-svn: 267322
2016-04-24 03:47:37 +00:00
Adrian Prantl 971481db98 Verifier: Verify that each inlinable callsite of a debug-info-bearing function
in a debug-info-bearing function has a debug location attached to it. Failure to
do so causes an "!dbg attachment points at wrong subprogram for function"
assertion failure when the inliner sets up inline scope info.

rdar://problem/25878916

llvm-svn: 267320
2016-04-24 03:23:02 +00:00
Mehdi Amini c3ed48c1bd Reorganize GlobalValueSummary with a "Flags" bitfield.
Right now it only contains the LinkageType, but will be extended
with "hasSection", "isOptSize", "hasInlineAssembly", etc.

Differential Revision: http://reviews.llvm.org/D19404

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267319
2016-04-24 03:18:18 +00:00
Mehdi Amini 8fe6936e18 Add a version field in the bitcode for the summary
Differential Revision: http://reviews.llvm.org/D19456

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267318
2016-04-24 03:18:11 +00:00
Mehdi Amini 059464fe36 Add an internalization step to the ThinLTOCodeGenerator
Keeping as much as possible internal/private is
known to help the optimizer. Let's try to benefit from
this in ThinLTO.
Note: this is early work, but is enough to build clang (and
all the LLVM tools). I still need to write some lit-tests...

Differential Revision: http://reviews.llvm.org/D19103

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267317
2016-04-24 03:18:01 +00:00
Craig Topper 601b6c69bc [X86] Fix patterns that turn cmove/cmovne+ctlz/cttz into lzcnt/tzcnt instructions. Only one of the conditions should be valid for each pattern, not both. Update tests accordingly.
llvm-svn: 267311
2016-04-24 02:01:22 +00:00
Davide Italiano cefdf238e9 [RuntimeDyldELF] Handle GOTPCRELX/REX_GOTPCRELX.
llvm-svn: 267309
2016-04-24 01:36:37 +00:00
Davide Italiano 327b88c36d [MC/ELF] Make the relaxation test more interesting.
Add a case where we can't relax.

llvm-svn: 267308
2016-04-24 01:08:35 +00:00
Davide Italiano f59b0da654 [MC/ELF] Implement support for GOTPCRELX/REX_GOTPCRELX.
The option to control the emission of the new relocations
is -relax-relocations (blatantly copied from GNU as).
It can't be enabled by default because it breaks relatively
recent versions of ld.bfd/ld.gold (late 2015).

llvm-svn: 267307
2016-04-24 01:03:57 +00:00
Mehdi Amini 02fa65209c Relax test using CHECK-DAG instead of CHECK-NEXT
It seems we still have some ordering issue in the combined index
emission, but I can't figure out why right now.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267306
2016-04-24 00:25:15 +00:00
Mehdi Amini 363d83f808 Fix test stability (was sensitive to the path)
This is a fixup for r267304.
The test was sensitive to the path in a subtle way:
the index in memory is sorted by GUID, which are hashes
that include the source filename for local globals.
Teresa recently added a directive at the IR level, so
we can specify it here to make the test independent of
the path.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267305
2016-04-24 00:03:57 +00:00
Mehdi Amini ae64eafd31 Store and emit original name in combined index
Summary:
As discussed in D18298, some local globals can't
be renamed/promoted (because they have a section, or because
they are referenced from inline assembly).
To be able to detect naming collision, we need to keep around
the "GUID" using their original name without taking the linkage
into account.

Reviewers: tejohnson

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19454

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267304
2016-04-23 23:38:17 +00:00
Mehdi Amini cb87494f4c Always traverse GlobalVariable initializer when computing the export list
Summary:
We are always importing the initializer for a GlobalVariable.
So if a GlobalVariable is in the export-list, we pull in any
refs as well.

Reviewers: tejohnson

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19102

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267303
2016-04-23 23:29:24 +00:00
Duncan P. N. Exon Smith a59d3e5af8 DebugInfo: Remove MDString-based type references
Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around
DIType*.  It is no longer legal to refer to a DICompositeType by its
'identifier:', and DIBuilder no longer retains all types with an
'identifier:' automatically.

Aside from the bitcode upgrade, this is mainly removing logic to resolve
an MDString-based reference to an actualy DIType.  The commits leading
up to this have made the implicit type map in DICompileUnit's
'retainedTypes:' field superfluous.

This does not remove DITypeRef, DIScopeRef, DINodeRef, and
DITypeRefArray, or stop using them in DI-related metadata.  Although as
of this commit they aren't serving a useful purpose, there are patchces
under review to reuse them for CodeView support.

The tests in LLVM were updated with deref-typerefs.sh, which is attached
to the thread "[RFC] Lazy-loading of debug info metadata":

  http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html

llvm-svn: 267296
2016-04-23 21:08:00 +00:00
Renato Golin 179d1f5dad Revert "[AArch64] Fix optimizeCondBranch logic."
This reverts commit r267206, as it broke self-hosting on AArch64.

llvm-svn: 267294
2016-04-23 19:30:52 +00:00
Simon Pilgrim 9448b112de [X86][XOP] Added VPPERM -> BLEND-WITH-ZERO Test
Currently failing due to poor blend matching, found whilst investigating PR27472

llvm-svn: 267282
2016-04-23 11:14:18 +00:00
Benjamin Kramer 45b336646e Use %T instead of cd'ing to Output directly.
%T expands to Output if not configured differently.

llvm-svn: 267281
2016-04-23 11:01:36 +00:00
Craig Topper 7e5fad66f3 [CodeGen] When promoting CTTZ operations to larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part.
llvm-svn: 267280
2016-04-23 05:20:47 +00:00
Teresa Johnson 723cd05f83 [gold] Gate value name discarding under save-temps
Summary:
This removes a couple of flags added to control this behavior, and
simply keeps all value names when save-temps is specified.

Reviewers: rafael

Subscribers: llvm-commits, pcc, davide

Differential Revision: http://reviews.llvm.org/D19384

llvm-svn: 267279
2016-04-23 05:15:59 +00:00
Duncan P. N. Exon Smith 30805b2417 BitcodeWriter: Emit uniqued subgraphs after all distinct nodes
Since forward references for uniqued node operands are expensive (and
those for distinct node operands are cheap due to
DistinctMDOperandPlaceholder), minimize forward references in uniqued
node operands.

Moreover, guarantee that when a cycle is broken by a distinct node, none
of the uniqued nodes have any forward references.  In
ValueEnumerator::EnumerateMetadata, enumerate uniqued node subgraphs
first, delaying distinct nodes until all uniqued nodes have been
handled.  This guarantees that uniqued nodes only have forward
references when there is a uniquing cycle (since r267276 changed
ValueEnumerator::organizeMetadata to partition distinct nodes in front
of uniqued nodes as a post-pass).

Note that a single uniqued subgraph can hit multiple distinct nodes at
its leaves.  Ideally these would themselves be emitted in post-order,
but this commit doesn't attempt that; I think it requires an extra pass
through the edges, which I'm not convinced is worth it (since
DistinctMDOperandPlaceholder makes forward references quite cheap
between distinct nodes).

I've added two testcases:

  - test/Bitcode/mdnodes-distinct-in-post-order.ll is just like
    test/Bitcode/mdnodes-in-post-order.ll, except with distinct nodes
    instead of uniqued ones.  This confirms that, in the absence of
    uniqued nodes, distinct nodes are still emitted in post-order.

  - test/Bitcode/mdnodes-distinct-nodes-break-cycles.ll is the minimal
    example where a naive post-order traversal would cause one uniqued
    node to forward-reference another.  IOW, it's the motivating test.

llvm-svn: 267278
2016-04-23 04:59:22 +00:00
Duncan P. N. Exon Smith 1483fff271 BitcodeWriter: Emit distinct nodes before uniqued nodes
When an operand of a distinct node hasn't been read yet, the reader can
use a DistinctMDOperandPlaceholder.  This is much cheaper than forward
referencing from a uniqued node.  Change
ValueEnumerator::organizeMetadata to partition distinct nodes and
uniqued nodes to reduce the overhead of cycles broken by distinct nodes.

Mehdi measured this for me; this removes most of the RAUW from the
importing step of -flto=thin, even after a WIP patch that removes
string-based DITypeRefs (introducing many more cycles to the metadata
graph).

llvm-svn: 267276
2016-04-23 04:42:39 +00:00
Tim Northover 09ca33ebc4 llvm-objdump: deal with invalid ARM encodings slightly better.
Before we printed a warning to stderr and left the actual output stream in a
mess. This tries to print a .long or .short representation of what we saw (as
if there was a data-in-code directive).

This isn't guaranteed to restore synchronization in Thumb-mode (if the invalid
instruction was supposed to be 32-bits, we may be off-by-16 for the rest of the
function). But there's no certain way to deal with that, and it's invalid code
anyway (if the data really wasn't an instruction, the user can add proper
.data_in_code directives if they care)

llvm-svn: 267250
2016-04-22 23:23:31 +00:00
Tim Northover 9e8eb418e5 MachO: remove weird ARM/Thumb interface from MachOObjectFile
Only one consumer (llvm-objdump) actually cared about the fact that there were
two triples. Others were actively working around the fact that the Triple
returned by getArch might have been invalid. As for llvm-objdump, it needs to
be acutely aware of both Triples anyway, so being generic in the exposed API is
no benefit.

Also rename the version of getArch returning a Triple. Users were having to
pass an unwanted nullptr to disambiguate the two, which was nasty.

The only functional change here is that armv7m and armv7em object files no
longer crash llvm-objdump.

llvm-svn: 267249
2016-04-22 23:21:13 +00:00
Matt Arsenault 7e8de01f84 AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.Size
llvm-svn: 267244
2016-04-22 22:59:16 +00:00
NAKAMURA Takumi 4aec5fda93 Fix llvm/test/CodeGen/ARM/Windows/dbzchk.ll not to check mixed output, take #2.
llvm-svn: 267242
2016-04-22 22:51:48 +00:00
David Blaikie e438cff475 llvm-symbolizer: Avoid infinite recursion walking dwos where the dwo contains a dwo_name attribute
The dwo_name was added to dwo files to improve diagnostics in dwp, but
it confuses tools that attempt to load any dwo named by a dwo_name, even
ones inside dwos. Avoid this by keeping track of whether a unit is
already a dwo unit, and if so, not loading further dwos.

llvm-svn: 267241
2016-04-22 22:50:56 +00:00
Matt Arsenault efa3fe14d1 AMDGPU: Re-visit nodes in performAndCombine
This fixes test regressions when i64 loads/stores are made promote.

llvm-svn: 267240
2016-04-22 22:48:38 +00:00
Nico Weber 0aa9845d15 Revert r267210, it makes clang assert (PR27490).
llvm-svn: 267232
2016-04-22 22:08:42 +00:00
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Sriraman Tallam 3cb773431d Differential Revision: http://reviews.llvm.org/D19040
llvm-svn: 267229
2016-04-22 21:41:58 +00:00
David Blaikie 9a4f3cb275 llvm-symbolizer: prefer .dwo contents over fission-gmlt-like-data when .dwo file is present
Rather than relying on the gmlt-like data emitted into the .o/executable
which only contains the simple name of any inlined functions, use the
.dwo file if present.

Test symbolication with/without a .dwo, and the old test that was
testing behavior when no gmlt-like data was present. (I haven't included
a test of non-gmlt-like data + no .dwo (that would be akin to
symbolication with no debug info) but we could add one for completeness)

The test was simplified a bit to be a little clearer (unoptimized, force
inline, using a function call as the inlined entity) and regenerated
with ToT clang. For the no-gmlt-like-data case, I modified Clang back to
its old behavior temporarily & the .dwo file is identical so it is
shared between the two executables.

llvm-svn: 267227
2016-04-22 21:32:59 +00:00
Peter Collingbourne 7dd8dbf486 Introduce llvm.load.relative intrinsic.
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.

Differential Revision: http://reviews.llvm.org/D18367

llvm-svn: 267223
2016-04-22 21:18:02 +00:00
Matt Arsenault 3b748d76f6 DAGCombiner: Relax alignment restriction when changing store type
If the target allows the alignment, this should be OK.

llvm-svn: 267217
2016-04-22 21:01:41 +00:00
Philip Reames 5f0e36947b [unordered] sink unordered stores at end of blocks
The existing code turned out to be completely correct when auditted.  Thus, only minor code changes and adding a couple of tests.

llvm-svn: 267215
2016-04-22 20:53:32 +00:00
Sanjoy Das f97229d6ba Fold compares for distinct allocations
Summary:
We can fold compares to false when two distinct allocations within a
function are compared for equality.

Patch by Anna Thomas!

Reviewers: majnemer, reames, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19390

llvm-svn: 267214
2016-04-22 20:52:25 +00:00
Peter Collingbourne 265ebd7d70 CodeGen: Use PLT relocations for relative references to unnamed_addr functions.
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.

Also includes a bonus feature: addends for COFF image-relative references.

Differential Revision: http://reviews.llvm.org/D17938

llvm-svn: 267211
2016-04-22 20:40:10 +00:00
Philip Reames eedef73b63 [unordered] Extend load/store type canonicalization to handle unordered operations
Extend the type canonicalization logic to work for unordered atomic loads and stores.  Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before.  Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered.  If you see problems, feel free to revert this change, but please make sure you collect a test case.  

llvm-svn: 267210
2016-04-22 20:33:48 +00:00
Matt Arsenault 629d12de70 DAGCombiner: Relax alignment restriction when changing load type
If the target allows the alignment, this should still be OK.

llvm-svn: 267209
2016-04-22 20:21:36 +00:00
Quentin Colombet 10768ab09e [AArch64] Fix optimizeCondBranch logic.
The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!

This fixes the last make check verifier issues for AArch64: PR27479.

llvm-svn: 267206
2016-04-22 20:09:58 +00:00
Justin Bogner b93949089e PM: Port SinkingPass to the new pass manager
llvm-svn: 267199
2016-04-22 19:54:10 +00:00
Jun Bum Lim d29a24e4fd [DeadStoreElimination] Shorten beginning of memset overwritten by later stores
Summary: This change will shorten memset if the beginning of memset is overwritten by later stores.

Reviewers: hfinkel, eeckstein, dberlin, mcrosier

Subscribers: mgrang, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18906

llvm-svn: 267197
2016-04-22 19:51:29 +00:00
Justin Bogner 395c2127ed PM: Port DCE to the new pass manager
Also add a very basic test, since apparently there aren't any tests
for DCE whatsoever to add the new pass version to.

llvm-svn: 267196
2016-04-22 19:40:41 +00:00
Matthias Braun 6493bc2b97 MachineScheduler: Limit the size of the ready list.
Avoid quadratic complexity in unusually large basic blocks by limiting
the size of the ready lists.

Differential Revision: http://reviews.llvm.org/D19349

llvm-svn: 267189
2016-04-22 19:09:17 +00:00
Quentin Colombet 658d9dbe56 [AArch64] When creating MRS instruction, make sure the destination register is
declared as a definition.

This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll.

llvm-svn: 267185
2016-04-22 18:46:17 +00:00
Adam Nemet 54053a518e [LoopVersioningLICM] Add test coverage for llvm.loop.licm_versioning.disable
In the next change, I am generalizing the function
findStringMetadataForLoop and I want to make sure I don't break this.
Looks like there was no coverage for this so far.

llvm-svn: 267182
2016-04-22 18:34:50 +00:00
Quentin Colombet 9598f10104 [AArch64][AdvSIMDScalar] Update the kill flags correctly.
We used to simply set the kill flags to true when transforming a scalar
instruction to a vector one.
SrcScalar1 = copy SrcVector1
... = opScalar SrcScalar1
=>
SrcScalar1 = copy SrcVector1
... = opVector SrcVector1<kill>

This is obviously wrong. The proper update consists in:
1. Propagate the kill status from the copy to the new opVector
2. Reset the kill status on the copy, since the live-range of
   SrcVector1 got extended.

This fixes some of the machine verifier errors for AArch64 with make check.

llvm-svn: 267180
2016-04-22 18:09:14 +00:00
Saleem Abdulrasool 8237008897 test: split test into two runs
Rather than checking both stdout and stderr simultaneously, split it into two
tests.  This apparently breaks on Windows where MSVCRT does not buffer output
correctly.  NFC.

Thanks to chapuni for bringing the issue to my attention!

llvm-svn: 267179
2016-04-22 18:06:51 +00:00
Chad Rosier 1a60159064 [SimplifyCFG] Add final missing implications to isImpliedTrueByMatchingCmp.
Summary: eq imply [u|s]ge and [u|s]le are true.

Remove redundant logic by implementing isImpliedFalseByMatchingCmp(Pred1, Pred2)
as isImpliedTrueByMatchingCmp(Pred1, getInversePredicate(Pred2)).

llvm-svn: 267177
2016-04-22 17:57:34 +00:00
Sanjoy Das a6155b659a Have isKnownNotFullPoison be smarter around control flow
Summary:
(... while still not using a PostDomTree)

The way we use isKnownNotFullPoison from SCEV today, the new CFG walking
logic will not trigger for any realistic cases -- it will kick in only
for situations where we could have merged the contiguous basic blocks
anyway[0], since the poison generating instruction dominates all of its
non-PHI uses (which are the only uses we consider right now).

However, having this change in place will allow a later bugfix to break
fewer llvm-lit tests.

[0]: i.e. cases where block A branches to block B and B is A's only
successor and A is B's only predecessor.

Reviewers: broune, bjarke.roune

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19212

llvm-svn: 267175
2016-04-22 17:41:06 +00:00
Krzysztof Parzyszek 8c6fb415fd [Hexagon] Properly close live range in HexagonBlockRanges ---add testcase
llvm-svn: 267174
2016-04-22 17:30:13 +00:00
Chad Rosier 3456cb5672 [SimplifyCFG] Add missing implications to isImpliedTrueByMatchingCmp.
Summary: [u|s]gt and [u|s]lt imply [u|s]ge and [u|s]le are true, respectively.
I've simplified the existing tests and added additional tests to cover the new
cases mentioned above.  I've also added tests for all the cases where the
first compare doesn't imply anything about the second compare.

llvm-svn: 267171
2016-04-22 17:14:12 +00:00
Chad Rosier 1960d13e29 [SimplifyCFG] Simplify code review by temporarily removing this test file.
A followup commit will replace these tests with simplified and more inclusive
tests.  The diff is unreadable if this were to be done in a single commit.

llvm-svn: 267170
2016-04-22 17:14:08 +00:00
Konstantin Zhuravlyov a40d8358e7 [AMDGPU] Insert nop pass: take care of outstanding feedback
- Switch few loops to range-based for loops
- Fix nop insertion at the end of BB
- Fix formatting
- Check for endpgm

Differential Revision: http://reviews.llvm.org/D19380

llvm-svn: 267167
2016-04-22 17:04:51 +00:00
Zoran Jovanovic f6344ff295 [mips][microMIPS] Revert commit r266861.
Commit r266861 was the reason for failing tests in LLVM test suite.

llvm-svn: 267166
2016-04-22 16:53:15 +00:00
Krzysztof Parzyszek 9062b75a93 [Hexagon] Teach mux expansion how to deal with undef predicates
llvm-svn: 267165
2016-04-22 16:47:01 +00:00
Krzysztof Parzyszek e2c6405708 [Hexagon] Add definitions for trap/pause instructions
Also add tests for other instructions from HexagonSystemInst.td.

llvm-svn: 267162
2016-04-22 16:25:00 +00:00
David Majnemer bfd695d591 [EarlyCSE] Don't add the overflow flags to the hash
We take the intersection of overflow flags while CSE'ing.
This permits us to consider two instructions with different overflow
behavior to be replaceable.

llvm-svn: 267153
2016-04-22 14:12:50 +00:00
Nirav Dave 9a878c4930 Emit code16 in assembly in 16-bit mode
Summary:
When generating assembly using -m16 we must explicitly mark it as
16-bit. Emit .code16 at beginning of file. Fixes wrong results when
using -fno-integrated-as.

Reviewers: dwmw2

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19392

llvm-svn: 267152
2016-04-22 13:36:11 +00:00
Simon Dardis 5676d06aef [mips] Fix select patterns for MIPS64
When targetting MIPS64R6 some of the patterns for select were guarded by a
broken predicate. The predicate was supposed to test if a constant value
could fit in a 16 bit zero-extended field. Instead the value was tested to
fit in a 16 bit sign-extended field. For negative constants of native word
width this resulted in wrong code generation.

Reviewers: vkalintiris, dsanders

Differential Review: http://reviews.llvm.org/D19378

llvm-svn: 267151
2016-04-22 13:19:22 +00:00
Daniel Sanders d41718e8af Revert r267049, r26706[16789], r267071 - Refactor raw pdb dumper into library
r267049 broke multiple buildbots (e.g. clang-cmake-mips, and clang-x86_64-linux-selfhost-modules) which the follow-ups have not yet resolved and this is preventing subsequent committers from being notified about additional failures on the affected buildbots.

llvm-svn: 267148
2016-04-22 12:04:42 +00:00
Nikolay Haustov b1bfd5039e AMDGPU/SI: Add test missed in rL266865
llvm-svn: 267144
2016-04-22 11:39:43 +00:00
Silviu Baranga e985c76b90 [InstCombine] Preserve fast math flags when combining PHIs
Summary:
When optimizing PHIs which have inputs floating point binary
operators, we preserve all IR flags except the fast math
flags.

This change removes the logic which tracked some of the IR flags
(no wrap, exact) and replaces it by doing an and on the IR flags of
all inputs to the PHI - which will also handle the fast math
flags.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19370

llvm-svn: 267139
2016-04-22 11:21:36 +00:00
Hrvoje Varga 5560998250 [mips][microMIPS] Implement SLT, SLTI, SLTIU, SLTU microMIPS32r6 instructions
Differential Revision: http://reviews.llvm.org/D19354

llvm-svn: 267137
2016-04-22 11:18:40 +00:00
Zoran Jovanovic 8e366822c2 [mips][microMIPS] Add R_MICROMIPS_PC18_S3 relocation
Differential Revision: http://reviews.llvm.org/D15026

llvm-svn: 267130
2016-04-22 10:15:12 +00:00
Daniel Sanders 591c379563 Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64
It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others.

llvm-svn: 267127
2016-04-22 09:37:26 +00:00
Ashutosh Nema 468558a061 [X86]: Changing cost for “TRUNCATE v16i32 to v16i8” in SSE4.1 mode.
Summary:
rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS
operations during DAG combine. This generates better code for truncate, so cost
of truncate needs to be changed but looks like it got changed only in SSE2 table
Whereas this change is also applicable for SSE4.1, so the cost of truncate needs
to be changed for that as well. Cost of “TRUNCATE v16i32 to v16i8” & “TRUNCATE 
v16i16 to v16i8” should be same in SSE4.1 & SSE2 table. Removing their cost from
SSE4.1, so it will fall back to SSE2.

Reviewers: Simon Pilgrim
llvm-svn: 267123
2016-04-22 08:34:05 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
Zlatko Buljan ae720dbbb6 [mips][microMIPS] Implement DVP, EVP and JALRC.HB instructions
Differential Revision: http://reviews.llvm.org/D18687

llvm-svn: 267114
2016-04-22 06:44:34 +00:00
David Majnemer d0ce8f1485 [GVN] Respect fast-math-flags on fcmps
We assumed that flags were only present on binary operators.  This is
not true, they may also be present on calls and fcmps.

llvm-svn: 267113
2016-04-22 06:37:51 +00:00
David Majnemer 9554c1339c [EarlyCSE] Take the intersection of flags on instructions
EarlyCSE had inconsistent behavior with regards to flag'd instructions:
- In some cases, it would pessimize if the available instruction had
  different flags by not performing CSE.
- In other cases, it would miscompile if it replaced an instruction
  which had no flags with an instruction which has flags.

Fix this by being more consistent with our flag handling by utilizing
andIRFlags.

llvm-svn: 267111
2016-04-22 06:37:45 +00:00
Nicolai Haehnle b0c9748709 AMDGPU/SI: add llvm.amdgcn.ps.live intrinsic
Summary:
This intrinsic returns true if the current thread belongs to a live pixel
and false if it belongs to a pixel that we are executing only for derivative
computation. It will be used by Mesa to implement gl_HelperInvocation.

Note that for pixels that are killed during the shader, this implementation
also returns true, but it doesn't matter because those pixels are always
disabled in the EXEC mask.

This unearthed a corner case in the instruction verifier, which complained
about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but
correct code, so make the verifier accept it as such.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19191

llvm-svn: 267102
2016-04-22 04:04:08 +00:00
Craig Topper 59479e7208 [AVX512] Teach lowering to use vplzcntd/q to implement 128/256-bit CTTZ_ZERO_UNDEF even without VLX support. We can just extend to 512-bits and extract like we do for CTLZ.
llvm-svn: 267100
2016-04-22 03:22:38 +00:00
Gerolf Hoflehner b32f11fc62 [MachineCombiner] Support for floating-point FMA on ARM64
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267098
2016-04-22 02:15:19 +00:00
Nico Weber f3fc748308 Try to fix UNRESOLVED: LLVM :: CodeGen/AArch64/arm64-regress-opt-cmp.s on bots.
This test used to write a .s file until r266971 fixed that.  But on most bots,
the .s file still exists.  Add an rm statement to clean up the bots.  In a few
days, this statement can go away again.

llvm-svn: 267095
2016-04-22 01:08:56 +00:00
Saleem Abdulrasool 12b87facf4 ARM: fix test for Windows division
This was meant to be part of SVN r267080.  cbz cannot use a high register, which
would be silently truncated.  This has now been fixed.

llvm-svn: 267092
2016-04-22 01:03:38 +00:00
Dan Gohman 04e7fb778d [WebAssembly] Limit alignment hints to natural alignment.
This follows the current binary format rules.

llvm-svn: 267082
2016-04-21 23:59:48 +00:00
Saleem Abdulrasool a028853540 ARM: restrict register class for WIN__DBZCHK
WIN__DBZCHK will insert a CBZ instruction into the stream.  This instruction
reserves 3 bits for the condition register (rn).  As such, we must ensure that
we restrict the register to a low register.  Use the tGPR class instead of GPR
to ensure that this is properly constrained.  In debug builds, we would attempt
to use lr as a condition register which would silently get truncated with no
hint that the register selection was incorrect.

llvm-svn: 267080
2016-04-21 23:53:19 +00:00
Mike Aizatsky c89755e4cb [sancov] using normalized filenames for blacklist checks.
Differential Revision: http://reviews.llvm.org/D19395

llvm-svn: 267078
2016-04-21 23:38:45 +00:00
Tim Northover c52c74efdf MachO: enable .data_region directives everywhere
We'd disabled them on x86 because back in the early days some host tools
couldn't handle the new load commands. This no longer holds: anyone capable of
deploying Clang should be able to deploy its copies of ar/ranlib/etc.

rdar://25254790

llvm-svn: 267075
2016-04-21 23:00:17 +00:00
Zachary Turner ad817e8266 Fix pdbdump-headers.test after guid format change.
llvm-svn: 267067
2016-04-21 22:13:25 +00:00
Derek Bruening d862c178b0 [esan] EfficiencySanitizer instrumentation pass
Summary:
Adds an instrumentation pass for the new EfficiencySanitizer ("esan")
performance tuning family of tools.  Multiple tools will be supported
within the same framework.  Preliminary support for a cache fragmentation
tool is included here.

The shared instrumentation includes:
+ Turn mem{set,cpy,move} instrinsics into library calls.
+ Slowpath instrumentation of loads and stores via callouts to
  the runtime library.
+ Fastpath instrumentation will be per-tool.
+ Which memory accesses to ignore will be per-tool.

Reviewers: eugenis, vitalybuka, aizatsky, filcab

Subscribers: filcab, vkalintiris, pcc, silvas, llvm-commits, zhaoqin, kcc

Differential Revision: http://reviews.llvm.org/D19167

llvm-svn: 267058
2016-04-21 21:30:22 +00:00
Kevin Enderby 6e295f2304 Fix a typo in an error message. Caught by Sean Silva!
llvm-svn: 267056
2016-04-21 21:20:40 +00:00
Sanjay Patel 1725bde4cc add tests for disguised fabs/fneg
llvm-svn: 267053
2016-04-21 21:02:25 +00:00
Sanjay Patel 95590f4b7b use FileCheck; add test for disguised fabs
llvm-svn: 267051
2016-04-21 20:58:58 +00:00
Krzysztof Parzyszek adf02ae540 [Hexagon] Properly recognize register alt names
llvm-svn: 267038
2016-04-21 19:49:53 +00:00
Sanjoy Das a085cfc150 Folding compares with unescaped allocations
Summary:
If we know that the pointer allocated within a function does not escape,
we can fold away comparisons that are done with global pointers

Patch by Anna Thomas!

Reviewers: reames, majnemer, sanjoy

Subscribers: mgrang, mcrosier, majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D19276

llvm-svn: 267035
2016-04-21 19:26:45 +00:00
Krzysztof Parzyszek 5de5910d7d [Hexagon] Expand handling of the small-data/bss section
llvm-svn: 267034
2016-04-21 18:56:45 +00:00
Matt Arsenault 8d1052f55c DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit component
If the extracted bits are restricted to the upper half or lower half,
this can be truncated.

llvm-svn: 267024
2016-04-21 18:03:06 +00:00
Philip Reames a98c7ead30 [instcombine][unordered] Extend load(select) transform to handle unordered loads
llvm-svn: 267023
2016-04-21 17:59:40 +00:00
Andrew Kaylor f0f279291c Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
Philip Reames 3ac0718423 [unordered] unordered loads from null are still unreachable
llvm-svn: 267019
2016-04-21 17:45:05 +00:00
Marcin Koscielnicki 48d72342ff [PowerPC] [SSP] Fix stack guard load for 32-bit.
r266809 incorrectly used LD to load the stack guard, it should be LWZ.

Differential Revision: http://reviews.llvm.org/D19358

llvm-svn: 267017
2016-04-21 17:36:05 +00:00
Philip Reames ac55090e96 [instcombine][unordered] Implement *-load forwarding for unordered atomics
This builds on 266999 which made FindAvailableValue do the right thing.  Tests included show the newly enabled transforms and those which disabled either due to conservatism or correctness requirements.

llvm-svn: 267006
2016-04-21 17:03:33 +00:00
Amjad Aboud a5ba99140c Fixed Dwarf debug info emission to skip DILexicalBlockFile entries.
Before this fix, DILexicalBlockFile entries were skipped only in some cases and were not in other cases.

Differential Revision: http://reviews.llvm.org/D18724

llvm-svn: 267004
2016-04-21 16:58:49 +00:00
Philip Reames 92c43699bc [unordered] Add tests and conservative handling in support of future changes [NFCI]
This change adds a couple of test cases to make sure FindAvailableLoadedValue does the right thing.  At the moment, the code added is dead, but separating it makes follow on changes far more obvious.

llvm-svn: 266999
2016-04-21 16:51:08 +00:00
Rafael Espindola 15ca14c0b9 Fix recursive -only-needed.
We were assuming that only linkonce_odr GVs were lazy linked.

llvm-svn: 266995
2016-04-21 14:56:33 +00:00
Zoran Jovanovic 9360c10a88 [mips][microMIPS] Implement ldpc instruction
Differential Revision: http://reviews.llvm.org/D15009

llvm-svn: 266990
2016-04-21 14:32:12 +00:00
Zoran Jovanovic 6764fa7840 [mips][microMIPS] Add R_MICROMIPS_PC19_S2 relocation
Differential Revision: http://reviews.llvm.org/D14915

llvm-svn: 266988
2016-04-21 14:09:35 +00:00
Zoran Jovanovic 02b7003068 [mips][microMIPS] Add R_MICROMIPS_PC26_S1 relocation
Differential Revision: http://reviews.llvm.org/D14822

llvm-svn: 266985
2016-04-21 13:43:26 +00:00
Zlatko Buljan dd4151504a [mips][microMIPS] Implement TLBP, TLBR, TLBWI and TLBWR instructions
Differential Revision: http://reviews.llvm.org/D18855

llvm-svn: 266980
2016-04-21 11:32:40 +00:00
Zlatko Buljan d370f440e2 [mips][microMIPS] Implement LL, SC, MOVEP, ROTR, ROTRV and SYSCALL instructions and add tests for LWM32 and SWM32
Differential Revision: http://reviews.llvm.org/D19150

llvm-svn: 266977
2016-04-21 11:01:51 +00:00
Evgeny Astigeevich fc972f1451 Updated a test not to produce an empty s-file.
llvm-svn: 266971
2016-04-21 09:36:49 +00:00
Evgeny Astigeevich fd89fe0dd3 [AArch64][CodeGen] Fix of PR27158: incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr
AArch64InstrInfo::optimizeCompareInstr has bug PR27158 which causes generation of incorrect code.
A compare instruction is substituted with another instruction which does not
produce the same flags as the original compare instruction.
This patch contains:
1. Fix of the bug.
2. A regression test in MIR.
3. A new test to check that SUBS is replaced by SUB.

Differential Revision: http://reviews.llvm.org/D18838

llvm-svn: 266969
2016-04-21 08:54:08 +00:00
Craig Topper 21690db05a [AVX512] Add CTTZ support for v8i64 and v16i32 vectors.
llvm-svn: 266968
2016-04-21 07:30:06 +00:00
Craig Topper 89d7a76d88 [X86] Fix vector-tzcnt-512 test to disable CDI while enabling BWI for one of the runs. Update check patterns accordingly.
llvm-svn: 266967
2016-04-21 07:30:03 +00:00
Craig Topper 4c07b0f896 Fix test command line to explicitly disable CDI instructions for one test.
llvm-svn: 266966
2016-04-21 07:29:59 +00:00
Craig Topper 340ad0a0c9 [AVX512] Add support for lowering CTTZ v64i8 and v32i16 with BWI instructions.
llvm-svn: 266963
2016-04-21 06:39:34 +00:00
Mehdi Amini a71a5a6289 ThinLTO: Resolve linkonce_odr aliases just like functions
This help to streamline the process of handling importing since
we don't need to special case alias everywhere: just like
linkonce_odr function, make sure at least one alias is emitted
by turning it weak.

Differential Revision: http://reviews.llvm.org/D19308

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266958
2016-04-21 05:47:17 +00:00
Sanjoy Das 54a3a006ca [SimplifyCFG] Fold `llvm.guard(false)` to unreachable
Summary:
`llvm.guard(false)` always bails out of the current compilation unit, so
we can prune any control flow following it.

Reviewers: hfinkel, pcc, reames

Subscribers: majnemer, reames, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19245

llvm-svn: 266955
2016-04-21 05:09:12 +00:00
Craig Topper 3dd625ce79 [AVX512] Add support for popcount of v8i64 and v16i32 with and without BWI instructions.
Without BWI we have to split the vectors into 256-bit vectors so we can use AVX2 pshufb and then concatenate the results.

llvm-svn: 266950
2016-04-21 03:57:24 +00:00
Mehdi Amini bda3c97c16 ThinLTO/ModuleLinker: add a flag to not always pull-in linkonce when performing importing
Summary:
The function importer already decided what symbols need to be pulled
in. Also these magically added ones will not be in the export list
for the source module, which can confuse the internalizer for
instance.

Reviewers: tejohnson, rafael

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19096

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266948
2016-04-21 01:59:39 +00:00
Duncan P. N. Exon Smith c196531ef3 BitcodeWriter: Emit metadata in post-order (again)
Emit metadata nodes in post-order.  The iterative algorithm from r266709
failed to maintain this property.  After understanding my mistake, it
wasn't too hard to write a test with llvm-bcanalyzer (and I've actually
made this change once before: see r220340).

This also reverts the "noisy" testcase change from r266709.  That should
have been more of a red flag :/.

Note: The same bug crept into the ValueMapper in r265456.  I'm still
working on the fix.

llvm-svn: 266947
2016-04-21 01:55:12 +00:00
Nick Lewycky 762f8a8549 Add optimization for 'icmp slt (or A, B), A' and some related idioms based on knowledge of the sign bit for A and B.
No matter what value you OR in to A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value closer to -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table:

```
A is
+, -, +/-
F  F   F   +    B is
T  F   ?   -
?  F   ?   +/-
```

The related optimizations are flipping the 'slt' for 'sge' which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate.

There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers.

llvm-svn: 266939
2016-04-21 00:53:14 +00:00
Vedant Kumar 932866bfe7 [test/PGOProfile] Make tests independent of the raw profile version (NFC)
Differential Revision: http://reviews.llvm.org/D19290

llvm-svn: 266928
2016-04-20 22:24:01 +00:00
Kevin Enderby 81e8b7d949 Thread Expected<...> up from libObject’s getName() for symbols to allow llvm-objdump to produce a good error message.
Produce another specific error message for a malformed Mach-O file when a symbol’s
string index is past the end of the string table.  The existing test case in test/Object/macho-invalid.test
for macho-invalid-symbol-name-past-eof now reports the error with the message indicating
that a symbol at a specific index has a bad sting index and that bad string index value.
 
Again converting interfaces to Expected<> from ErrorOr<> does involve
touching a number of places. Where the existing code reported the error with a
string message or an error code it was converted to do the same.  There is some
code for this that could be factored into a routine but I would like to leave that for
the code owners post-commit to do as they want for handling an llvm::Error.  An
example of how this could be done is shown in the diff in
lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h which had a Check() routine
already for std::error_code so I added one like it for llvm::Error .

Also there some were bugs in the existing code that did not deal with the
old ErrorOr<> return values.  So now with Expected<> since they must be
checked and the error handled, I added a TODO and a comment:
“// TODO: Actually report errors helpfully” and a call something like
consumeError(NameOrErr.takeError()) so the buggy code will not crash
since needed to deal with the Error.

Note there fixes needed to lld that goes along with this that I will commit right after this.
So expect lld not to built after this commit and before the next one.

llvm-svn: 266919
2016-04-20 21:24:34 +00:00
Kostya Serebryany a83bfeac9d Rename asan-check-lifetime into asan-stack-use-after-scope
Summary:
This is done for consistency with asan-use-after-return.
I see no other users than tests.

Reviewers: aizatsky, kcc

Differential Revision: http://reviews.llvm.org/D19306

llvm-svn: 266906
2016-04-20 20:02:58 +00:00
Jacques Pienaar d96f8a3e82 [lanai] Add subword scheduling itineraries.
Differentiate between word and subword memory operations as they take different
amount of cycles to complete. This just adds a basic model of the subword
latency to the scheduler.

llvm-svn: 266898
2016-04-20 18:28:55 +00:00
Krzysztof Parzyszek 5626703837 [Hexagon] Fix handling of lcomm directive
Patch by Colin LeMahieu.

llvm-svn: 266882
2016-04-20 15:54:13 +00:00
Teresa Johnson 143d15bc29 Re-enable "[gold-plugin] Disable name for values other than GlobalValue"
This restores r266871 with a fix for gold tests relying on the value
names, when using a release compiler, by adding a way to disable the
default discarding. Update affected tests to use the new mechanism so
that value names are preserved as expected, regardless of how the
compiler was built.

llvm-svn: 266881
2016-04-20 15:16:57 +00:00
Teresa Johnson b35cc691ea [ThinLTO] Prevent importing of "llvm.used" values
Summary:
This patch prevents importing from (and therefore exporting from) any
module with a "llvm.used" local value. Local values need to be promoted
and renamed when importing, and their presense on the llvm.used variable
indicates that there are opaque uses that won't see the rename. One such
example is a use in inline assembly.

See also the discussion at:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html

As part of this, move collectUsedGlobalVariables out of Transforms/Utils
and into IR/Module so that it can be used more widely. There are several
other places in LLVM that used copies of this code that can be cleaned
up as a follow on NFC patch.

Reviewers: joker.eph

Subscribers: pcc, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18986

llvm-svn: 266877
2016-04-20 14:39:45 +00:00
Krzysztof Parzyszek 16331f0aa0 [RDF] Consider register as live if any alias is live
This only affects the recomputation of kill flags.

llvm-svn: 266875
2016-04-20 14:33:23 +00:00
Zoran Jovanovic fdbd0a37c1 [mips][microMIPS] Implement BGEC, BGEUC, BLTC, BLTUC, BEQC and BNEC instructions
Differential Revision: http://reviews.llvm.org/D14206

llvm-svn: 266873
2016-04-20 14:07:46 +00:00
Teresa Johnson f0bedf5343 Revert "[gold-plugin] Disable name for values other than GlobalValue"
This reverts commit r266871. Setting the default based on the NDEBUG
flag is causing test failures. Need to figure out whether to change this
approach or update tests.

llvm-svn: 266872
2016-04-20 13:18:47 +00:00
Teresa Johnson d3ded4a441 [gold-plugin] Disable name for values other than GlobalValue
Summary:
Applies Mehdi's optimization (r263086) to disable value names other than
for GlobalValues to LTO/ThinLTO performed via the gold-plugin, in the
same manner as it is applied in libLTO.

Reviewers: rafael, joker-eph

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19269

llvm-svn: 266871
2016-04-20 13:01:37 +00:00
Nikolay Haustov fb5c307ccd AMDGPU/SI: Assembler: improvements to support trap handlers.
Add ParseAMDGPURegister which can be invoked recursively for parsing lists.
Rename getRegForName to getSpecialRegForName.
Support legacy SP3 register list syntax: [s2,s3,s4,s5] or [flat_scratch_lo,flat_scratch_hi].
Add 64-bit registers TBA, TMA where missing.
Add some tests.

Differential Revision: http://reviews.llvm.org/D19163

llvm-svn: 266865
2016-04-20 09:34:48 +00:00
Asaf Badouh 89406d1815 [X86] enable PIE for functions
Call locally defined function directly for PIE/fPIE

Differential Revision: http://reviews.llvm.org/D19226

llvm-svn: 266863
2016-04-20 08:32:57 +00:00
Hrvoje Varga 117625aaf3 [mips][microMIPS]Implement CFC*, CTC* and LDC* instructions
Differential Revision: http://reviews.llvm.org/D18640

llvm-svn: 266861
2016-04-20 06:34:48 +00:00
Craig Topper ddf022a337 [AVX512] Add avx512cd+vl runs to vector-tzcnt-128/256 tests to show using the vplzcntd/q instructions.
llvm-svn: 266860
2016-04-20 05:19:01 +00:00
Craig Topper 7e37746011 [AVX512] Update vector-tzcnt-512 test to show how bad v32i16 and v64i8 is with avx512bw enabled.
llvm-svn: 266859
2016-04-20 05:18:58 +00:00
Craig Topper 99e60e9f1f [AVX512] Add popcount support for v32i16 and v64i8.
llvm-svn: 266858
2016-04-20 05:18:55 +00:00
Mehdi Amini bb3a1d92f3 ThinLTO: never promote as external weak
This linkage is *not* intended to express that a declaration refers
to a weak symbol, but that the symbol might not be present at link
time. I don't believe it was the intent.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266856
2016-04-20 04:18:11 +00:00
Mehdi Amini 2c719cc117 FunctionImport: make sure we always select the right callee in presence of alias
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266854
2016-04-20 04:17:36 +00:00
Marcin Koscielnicki f12609c9ed [SystemZ] Add support for llvm.thread.pointer intrinsic.
Differential Revision: http://reviews.llvm.org/D19054

llvm-svn: 266844
2016-04-20 01:03:48 +00:00
Mandeep Singh Grang 029a0567fa [LLVM] Remove unwanted --check-prefix=CHECK from unit tests. NFC.
Summary: Removed unwanted --check-prefix=CHECK from numerous unit tests.

Reviewers: t.p.northover, dblaikie, uweigand, MatzeB, tstellarAMD, mcrosier

Subscribers: mcrosier, dsanders

Differential Revision: http://reviews.llvm.org/D19279

llvm-svn: 266834
2016-04-19 23:51:52 +00:00
Tim Northover 1ee27c74cb ARM: fix assertion failure on -O0 cmpxchg.
Because lowering of CMP_SWAP_64 occurs during type legalization, there can be
i64 types produced by more than just a BUILD_PAIR or similar. My initial tests
used just incoming function args.

llvm-svn: 266828
2016-04-19 22:25:02 +00:00
Nicolai Haehnle b48275f134 Add IntrWrite[Arg]Mem intrinsic property
Summary:
This property is used to mark an intrinsic that only writes to memory, but
neither reads from memory nor has other side effects.

An example where this is useful is the llvm.amdgcn.buffer.store.format.*
intrinsic, which corresponds to a store instruction that goes through a special
buffer descriptor rather than through a plain pointer.

With this property, the intrinsic should still be handled as having side
effects at the LLVM IR level, but machine scheduling can make smarter
decisions.

Reviewers: tstellarAMD, arsenm, joker.eph, reames

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18291

llvm-svn: 266826
2016-04-19 21:58:33 +00:00
Nicolai Haehnle e2dda4f750 AMDGPU: Guard VOPC instructions against incorrect commute
Summary:
The added testcase, which triggered this, was derived from a shader-db case
via bugpoint. A separate question is why scalar branching wasn't used.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19208

llvm-svn: 266825
2016-04-19 21:58:22 +00:00
Krzysztof Parzyszek 3af70c126d [Hexagon] Fix operand swapping in HexagonPeephole
Also, disable zero- and size-extend optimizations for now.

llvm-svn: 266821
2016-04-19 21:36:24 +00:00
Marcin Koscielnicki 3fdc257d6a [AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic.
Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that
just return the thread pointer.  I have a pending patch that does the same
for SystemZ (D19054), and there are many more targets that could benefit
from one.

This patch merges the ARM and AArch64 intrinsics into a single target
independent one that will also be used by subsequent targets.

Differential Revision: http://reviews.llvm.org/D19098

llvm-svn: 266818
2016-04-19 20:51:05 +00:00
Krzysztof Parzyszek 5ffee8d829 [Hexagon] Fix printing the address operand of S2_storerinewabs
llvm-svn: 266811
2016-04-19 20:20:33 +00:00
Tim Shen a1d8bc5597 [PPC, SSP] Support PowerPC Linux stack protection.
llvm-svn: 266809
2016-04-19 20:14:52 +00:00
Tim Shen e885d5e4d3 [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD
With this change, ideally IR pass can always generate llvm.stackguard
call to get the stack guard; but for now there are still IR form stack
guard customizations around (see getIRStackGuard()). Future SSP
customization should go through LOAD_STACK_GUARD.

There is a behavior change: stack guard values are not CSEed anymore,
since we should never reuse the value in case that it has been spilled (and
corrupted). See ssp-guard-spill.ll. This also cause the change of stack
size and codegen in X86 and AArch64 test cases.

Ideally we'd like to know if the guard created in llvm.stackprotector() gets
spilled or not. If the value is spilled, discard the value and reload
stack guard; otherwise reuse the value. This can be done by teaching
register allocator to know how to rematerialize LOAD_STACK_GUARD and
force a rematerialization (which seems hard), or check for spilling in
expandPostRAPseudo. It only makes sense when the stack guard is a global
variable, which requires more instructions to load. Anyway, this seems to go out
of the scope of the current patch.

llvm-svn: 266806
2016-04-19 19:40:37 +00:00
Jacques Pienaar 50d4e98905 [lanai] Add lowering for SETCCE i32.
* Add lowering for SETCCE i32.
* Add test to check lowering of i64 compares uses SETCCE expansion (outside of EQ and NE).
* Fix select.ll test and immediate form selection for RI operations.

llvm-svn: 266802
2016-04-19 19:15:25 +00:00
Duncan P. N. Exon Smith 9738602869 IR: Enable debug info type ODR uniquing for forward decls
Add a new method, DICompositeType::buildODRType, that will create or
mutate the DICompositeType for a given ODR identifier, and use it in
LLParser and BitcodeReader instead of DICompositeType::getODRType.

The logic is as follows:

  - If there's no node, create one with the given arguments.
  - Else, if the current node is a forward declaration and the new
    arguments would create a definition, mutate the node to match the
    new arguments.
  - Else, return the old node.

This adds a missing feature supported by the current DITypeIdentifierMap
(which I'm slowly making redudant).  The only remaining difference is
that the DITypeIdentifierMap has a "the-last-one-wins" rule, whereas
DICompositeType::buildODRType has a "the-first-one-wins" rule.

For now I'm leaving behind DICompositeType::getODRType since it has
obvious, low-level semantics that are convenient for unit testing.

llvm-svn: 266786
2016-04-19 18:00:19 +00:00
Duncan P. N. Exon Smith b84b4164ec Linker: Simplify test/Linker/dicompositetype-unique.ll, NFC
Simplify the test logic a little, sharing logic between the two linking
directions by specifying -check-prefix multiple times.  Now it's more
obvious what's hte same and different between the two directions, and
there is less CHECK duplication.  This is a prep for expanding the test.

llvm-svn: 266773
2016-04-19 17:43:43 +00:00
Chad Rosier b7dfbb40a3 [ValueTracking] Improve isImpliedCondition for conditions with matching operands.
This patch improves SimplifyCFG to catch cases like:

  if (a < b) {
    if (a > b) <- known to be false
      unreachable;
  }

Phabricator Revision: http://reviews.llvm.org/D18905

llvm-svn: 266767
2016-04-19 17:19:14 +00:00
Mehdi Amini f12cbe7dd6 Fix Gold test after r266750 (ModuleLinker: Do not import linkonce/weak as "external_weak")
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266752
2016-04-19 16:21:37 +00:00
Mehdi Amini 113adde594 ModuleLinker: Do not import linkonce/weak as "external_weak"
Summary:
There is no reason to have a weak reference because the external
definition will be weak.

Reviewers: rafael

Subscribers: llvm-commits, tejohnson

Differential Revision: http://reviews.llvm.org/D19267

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266750
2016-04-19 16:11:05 +00:00
Duncan P. N. Exon Smith 0b0271ef97 IR: getOrInsertODRUniquedType => DICompositeType::getODRType, NFC
Lift the API for debug info ODR type uniquing up a layer.  Instead of
clients managing the map directly on the LLVMContext, add a static
method to DICompositeType called getODRType and handle the map in the
background.  Also adds DICompositeType::getODRTypeIfExists, so far just
for convenience in the unit tests.

This simplifies the logic in LLParser and BitcodeReader.  Because of
argument spam there are actually a few more lines of code now; I'll see
if I come up with a reasonable way to clean that up.

llvm-svn: 266742
2016-04-19 14:55:09 +00:00
Simon Pilgrim 998cffa6b9 [InstCombine][X86] Added extra tests introduced for D17490
llvm-svn: 266732
2016-04-19 12:59:52 +00:00
Simon Pilgrim 74b3bfdf71 [InstCombine][X86] Regenerate SSE combine tests as part of setup for D17490
Regenerated with utils/update_test_checks.py 

llvm-svn: 266731
2016-04-19 12:56:46 +00:00
Simon Pilgrim 32b1c9fe7f [X86][AVX2] Prefer VPERMQ/VPERMPD over VINSERTI128/VINSERTF128 for unary shuffles
Using VPERMQ/VPERMPD allows memory folding of the (repeated) input where VINSERTI128/VINSERTF128 can not.

Differential Revision: http://reviews.llvm.org/D19228

llvm-svn: 266728
2016-04-19 12:26:40 +00:00
Sanjoy Das c0441c29df Introduce a "patchable-function" function attribute
Summary:
The `"patchable-function"` attribute can be used by an LLVM client to
influence LLVM's code generation in ways that makes the generated code
easily patchable at runtime (for instance, to redirect control).
Right now only one patchability scheme is supported,
`"prologue-short-redirect"`, but this can be expanded in the future.

Reviewers: joker.eph, rnk, echristo, dberris

Subscribers: joker.eph, echristo, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19046

llvm-svn: 266715
2016-04-19 05:24:47 +00:00
Duncan P. N. Exon Smith 9695eb3239 BitcodeWriter: Break recursion when enumerating Metadata, almost NFC
Use a worklist instead of recursing through MDNode operands in
ValueEnumerator.  The actual record output order has changed slightly,
but otherwise there's no functionality change.

I had to update test/Bitcode/metadata-function-blocks.ll.  I renumbered
nodes so they continue to match the implicit record ids.

llvm-svn: 266709
2016-04-19 03:46:51 +00:00
Michael Kuperstein de16b44f74 Port DemandedBits to the new pass manager.
Differential Revision: http://reviews.llvm.org/D18679

llvm-svn: 266699
2016-04-18 23:55:01 +00:00
Paul Robinson 43d1e45347 [DWARF] Force a linkage_name on an inlined subprogram's abstract origin.
When we suppress linkage names, for a non-inlined subprogram the name
can still be found in the object-file symbol table, because we have
the code address of the subprogram.  This is not necessarily the case
for an inlined subprogram, so we still want to emit the linkage name
in the DWARF.  Put this on the abstract-origin DIE because it's common
to all inlined instances.

Differential Revision: http://reviews.llvm.org/D18706

llvm-svn: 266692
2016-04-18 22:41:41 +00:00
Tim Northover b629c77692 ARM: use a pseudo-instruction for cmpxchg at -O0.
The fast register-allocator cannot cope with inter-block dependencies without
spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions
where any value produced within a block is dead by the end, but not for
cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded
after regalloc.

Fortunately this is at -O0 so we don't have to care about performance. This
simplifies the various axes of expansion considerably: we assume a strong
seq_cst operation and ensure ordering via the always-present DMB instructions
rather than v8 acquire/release instructions.

Should fix the 32-bit part of PR25526.

llvm-svn: 266679
2016-04-18 21:48:55 +00:00
Simon Pilgrim 9c7ea2dfaa [X86][SSE] Test case for PR2585
llvm-svn: 266669
2016-04-18 21:07:49 +00:00
Simon Pilgrim 647e9f80af [X86][AVX] Added extra memory folding tests for D19228
llvm-svn: 266662
2016-04-18 19:48:16 +00:00
Chad Rosier e30fed70e6 [ValueTracking] Correct lit test comments. NFC.
llvm-svn: 266657
2016-04-18 19:11:45 +00:00
Sanjoy Das 432c1c3fb3 [BPI] Consider deoptimize calls as "unreachable"
Summary:
Calls to @llvm.experimental.deoptimize are expected to "never execute",
so optimize them as such.

Reviewers: chandlerc

Subscribers: junbuml, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19095

llvm-svn: 266654
2016-04-18 19:01:28 +00:00
Xinliang David Li e6b892940f Port InstrProfiling pass to the new pass manager
Differential Revision: http://reviews.llvm.org/D18126

llvm-svn: 266637
2016-04-18 17:47:38 +00:00
Simon Pilgrim 6802410f42 [X86][AVX] Added zero+blend vs vperm2f128 optsize tests cases (PR22984)
We should be trying to use vperm2f128 instead of zero+blend (if we're only the user of zero?) when optsize is enabled.

llvm-svn: 266632
2016-04-18 17:14:04 +00:00
Konstantin Zhuravlyov 8c273ad719 [AMDGPU] Add insert nops pass based on subtarget features instead of cl::opt
Also,
- Skip pass if machine module does not have debug info
- Minor comment changes
- Added test

Differential Revision: http://reviews.llvm.org/D19079

llvm-svn: 266626
2016-04-18 16:28:23 +00:00
Simon Pilgrim ec4f40b6ee [X86][AVX] Renamed vperm2f128 test to make it quicker to review
missed one the first time round...

llvm-svn: 266623
2016-04-18 16:08:19 +00:00
Simon Pilgrim 1a87a93dff [X86][AVX] Renamed vperm2f128 tests to make it quicker to review
llvm-svn: 266621
2016-04-18 15:37:45 +00:00
Igor Kudrin 1c14dc4c5a Reapply "[Coverage] Prevent detection of false instantiations in case of macro expansion."
The root of the problem was that findMainViewFileID(File, Function)
could return some ID for any given file, even though that file
was not the main file for that function.

This patch ensures that the result of this function is conformed
with the result of findMainViewFileID(Function).

This commit reapplies r266436, which was reverted by r266458,
with the .covmapping file serialized in v1 format.

Differential Revision: http://reviews.llvm.org/D18787

llvm-svn: 266620
2016-04-18 15:36:30 +00:00
Eric Liu d09f15ea6f Revert "Replace the use of MaxFunctionCount module flag"
This reverts commit r266477.

This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData",
while "ProfileData" depends on "Object", which depends on "BitCode", which
depends on "Analysis".

llvm-svn: 266619
2016-04-18 15:31:11 +00:00
Artem Tamazov e2762423c2 [AMDGPU][llvm-mc] s_setreg* - Fix order of operands
Order should match the sp3 syntax, where destination (simm16 denoting the hwreg) is coming first.

Differential Revision: http://reviews.llvm.org/D19161

llvm-svn: 266617
2016-04-18 14:54:26 +00:00
Daniel Sanders d8c07766f3 [mips][ias] Prevent double-filling of delay slots by generating '.set noreorder' regions.
Summary:
When clang is given -save-temps or -via-file-asm, any inline assembly in
the source is parsed twice. Once by the compiler, and again by the
assembler. We must take care to ensure that this doesn't lead to
double-filling delay slots.

Reviewers: sdardis, vkalintiris

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19166

llvm-svn: 266608
2016-04-18 12:35:36 +00:00
Renato Golin 4b18a510a2 [ARM] AArch32 v8 NEON is still not IEEE-754 compliant
llvm-svn: 266603
2016-04-18 12:06:47 +00:00
Strahinja Petrovic c91b45a31a [PowerPC] add comment to test
Added comment in test for soft-float operations on ppc architecture.
Test commit.

llvm-svn: 266600
2016-04-18 11:52:14 +00:00
Craig Topper 6ff46266d1 Declare MVT::SimpleValueType as an int8_t sized enum. This removes 400 bytes from TargetLoweringBase and probably other places.
This required changing several places to print VT enums as strings instead of raw ints since the proper method to use to print became ambiguous. This is probably an improvement anyway.

This also appears to save ~8K from an x86 self host build of llc.

llvm-svn: 266562
2016-04-17 17:37:33 +00:00
Simon Pilgrim 6ccbd2bdd2 [X86][SSE] Added 16i8 -> 8i64 sext test
Shows poor codegen for AVX2

llvm-svn: 266560
2016-04-17 15:10:42 +00:00
Craig Topper 75869d5701 [AVX512] ISD::MUL v2i64/v4i64 should only be legal if DQI and VLX features are enabled.
llvm-svn: 266554
2016-04-17 07:25:39 +00:00
Duncan P. N. Exon Smith 2c7cd4afab IR: Fix type-refs in testcase from r266548
There's a hole in the verifier right now: if a module has no compile
units, it never checks that all the string-based DITypeRefs get
resolved.  As a result, this testcase didn't fail the verifier, even
there were references to `!"has-uuid"` instead of `!"uuid"` (the former
was a composite type's 'name:' field, the latter its 'identifier:'
field).

I'm currently working on removing string-based type refs entirely, and
this testcase started failing (because the upgrade script can't resolve
the type refs).  Rather than fixing the (about-to-be-removed) hole in
the verifier, I'm just going to fix the test so that my upgrade script
handles it.

llvm-svn: 266553
2016-04-17 06:42:30 +00:00
Sanjoy Das 99042473d0 Fix a typo in rL265762
I accidentally replaced `mayBeOverridden` with `!isInterposable`.
Remove the negation and add a test case that would've caught this.

Many thanks to Håkan Hjort for spotting this!

llvm-svn: 266551
2016-04-17 04:30:43 +00:00
Duncan P. N. Exon Smith 5ab2be094e IR: Use an explicit map for debug info type uniquing
Rather than relying on the structural equivalence of DICompositeType to
merge type definitions, use an explicit map on the LLVMContext that
LLParser and BitcodeReader consult when constructing new nodes.
Each non-forward-declaration DICompositeType with a non-empty
'identifier:' field is stored/loaded from the type map, and the first
definiton will "win".

This map is opt-in: clients that expect ODR types from different modules
to be merged must call LLVMContext::ensureDITypeMap.

  - Clients that just happen to load more than one Module in the same
    LLVMContext won't magically merge types.

  - Clients (like LTO) that want to continue to merge types based on ODR
    identifiers should opt-in immediately.

I have updated LTOCodeGenerator.cpp, the two "linking" spots in
gold-plugin.cpp, and llvm-link (unless -disable-debug-info-type-map) to
set this.

With this in place, it will be straightforward to remove the DITypeRef
concept (i.e., referencing types by their 'identifier:' string rather
than pointing at them directly).

llvm-svn: 266549
2016-04-17 03:58:21 +00:00
Duncan P. N. Exon Smith 05ebfd0938 IR: Use ODR to unique DICompositeType members
Merge members that are describing the same member of the same ODR type,
even if other bits differ.  If the file or line differ, we don't care;
if anything else differs, it's an ODR violation (and we still don't
really care).

For DISubprogram declarations, this looks at the LinkageName and Scope.
For DW_TAG_member instances of DIDerivedType, this looks at the Name and
Scope.  In both cases, we know that the Scope follows ODR rules if it
has a non-empty identifier.

llvm-svn: 266548
2016-04-17 02:30:20 +00:00
Duncan P. N. Exon Smith 4f6f15d2df Linker: Clarify test/Linker/type-unique-odr-a.ll, NFC
Split up the long RUN and clarify the CHECK lines:

  - Explicitly confirm there are no other subprograms inside of "A".

  - Remove checks for "bar" and "baz", which were just implicitly
    checking that there were no other subprograms inside of "A".

This prepares for adding a RUN line which links the two files in the
opposite direction.

llvm-svn: 266543
2016-04-17 00:26:17 +00:00
Simon Pilgrim b75744ceae [X86][AVX] Add shuffle combine tests for MOVDDUP/MOVSHDUP/MOVSLDUP
128, 256 and 512 bit implementations (some not yet supported by combineX86ShuffleChain)

llvm-svn: 266535
2016-04-16 20:30:59 +00:00
Simon Pilgrim fd4b9b02a3 [X86][XOP] Added VPPERM constant mask decoding and target shuffle combining support
Added additional test that peeks through bitcast to v16i8 mask

llvm-svn: 266533
2016-04-16 17:52:07 +00:00
Simon Pilgrim 4eb94a93a0 [X86][XOP] More VPPERM shuffle mask decode tests
As requested by D18441

llvm-svn: 266531
2016-04-16 16:37:21 +00:00
Mehdi Amini 2d28f7aa07 ThinLTO: Make aliases explicit in the summary
To be able to work accurately on the reference graph when taking
decision about internalizing, promoting, renaming, etc. We need
to have the alias information explicit.

Differential Revision: http://reviews.llvm.org/D18836

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266517
2016-04-16 06:56:44 +00:00
Alex Denisov d5cee4aada Replace hardcoded comment at 'lit.site.cfg.in'
At the moment almost every lit.site.cfg.in contains two lines comment:

## Autogenerated by LLVM/Clang configuration.
# Do not edit!

The patch adds variable LIT_SITE_CFG_IN_HEADER, that is replaced from 
configure_lit_site_cfg with the note and some useful information.

llvm-svn: 266515
2016-04-16 06:47:41 +00:00
Matt Arsenault c10783c42d AMDGPU: Enable LocalStackSlotAllocation pass
This resolves more frame indexes early and folds
the immediate offsets into the scratch mubuf instructions.

This cleans up a lot of the mess that's currently emitted,
such as emitting add 0s and repeatedly initializing the same
register to 0 when spilling.

llvm-svn: 266508
2016-04-16 02:13:37 +00:00
Matt Arsenault b6be202779 AMDGPU: Use s_addk_i32 / s_mulk_i32
llvm-svn: 266506
2016-04-16 01:46:49 +00:00
Wei Mi 963f2df4d2 Don't skip splitSeparateComponents in eliminateDeadDefs for HoistSpillHelper::hoistAllSpills.
Because HoistSpillHelper::hoistAllSpills is called in postOptimization, before the
patch we didn't want LiveRangeEdit::eliminateDeadDefs to call splitSeparateComponents
and generate unassigned new vregs. However, skipping splitSeparateComponents will make
verify-machineinstrs unhappy, so I remove the early return, and use
HoistSpillHelper::LRE_DidCloneVirtReg to assign physreg/stackslot for those new vregs.

In addition, some code reorganization to make class HoistSpillHelper privately inheriting
from LiveRangeEdit::Delegate possible. This is to be consistent with class RAGreedy and
class RegisterCoalescer.

Differential Revision: http://reviews.llvm.org/D19142

llvm-svn: 266489
2016-04-15 23:16:44 +00:00
Evgeniy Stepanov 40cd1514cf [cfi] Support explicit sections for functions in cfi-icall.
Allow explicit section for indirectly called functions in cfi-icall.
Jumptables for functions in the same type class must be contiguous, so they
always go to the default text section.

Fixes PR25079.

llvm-svn: 266486
2016-04-15 22:55:38 +00:00
Adrian Prantl dc75a6b517 Convert this sample-based-profiling testcase to use a NoDebug CU.
llvm-svn: 266481
2016-04-15 22:05:38 +00:00
Hans Wennborg 07f6d3a893 Switch lowering: don't add incoming PHI values from skipped bit test MBB's (PR27135)
After r245976, LLVM will skip the last bit test case if knows it will always be
true. However, we would still erroneously update PHI nodes with incoming values
from the MBB that would perform the final bit test, causing -verify-machineinstrs
to fail.

llvm-svn: 266479
2016-04-15 21:45:30 +00:00
Easwaran Raman f53baca686 Replace the use of MaxFunctionCount module flag
Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count.

Differential Revision: http://reviews.llvm.org/D18622

llvm-svn: 266477
2016-04-15 21:39:58 +00:00
Adrian Prantl 7a717c4ee3 Let the DISubprogram in this test point to the right compile unit.
llvm-svn: 266468
2016-04-15 19:38:14 +00:00
Adrian Prantl ab2398935f Update testcase to new debug metadata format.
llvm-svn: 266467
2016-04-15 19:32:22 +00:00
Tim Northover 903f81ba18 ARM: don't try to hoist constant RHS out of a division.
Divisions by a constant can be converted into multiplies which are usually
cheaper, but this isn't possible if the constant gets separated (particularly
in loops). Fix this by telling ConstantHoisting that the immediate in a DIV is
cheap.

I considered making the check generic, but neither AArch64 (strangely) nor x86
showed any benefit on the tests I had.

llvm-svn: 266464
2016-04-15 18:17:18 +00:00
Chad Rosier 1fbe9bcab4 [AArch64] Add load/store pair instructions to getMemOpBaseRegImmOfsWidth().
This improves AA in the MI schduler when reason about paired instructions.

Phabricator Revision: http://reviews.llvm.org/D17098
PR26358

llvm-svn: 266462
2016-04-15 18:09:10 +00:00
Igor Kudrin e880a06559 Revert "[Coverage] Prevent detection of false instantiations in case of macro expansion."
This reverts commit r266436 as it broke buildbot.

llvm-svn: 266458
2016-04-15 17:53:48 +00:00
Marcin Koscielnicki a344ae8f78 [SystemZ] Fix large tests broken by conditional returns.
These were broken by D17339.

Differential Revision: http://reviews.llvm.org/D19158

llvm-svn: 266454
2016-04-15 17:24:40 +00:00
David Majnemer 2e02ba78d5 [InstCombine] Don't transform compares of calls to functions named fabs{f,l,}
InstCombine wants to optimize compares of calls to fabs with zero.
However, we didn't have the necessary legality checking to verify that
the function call had the same behavior as fabs.

llvm-svn: 266452
2016-04-15 17:21:03 +00:00
Geoff Berry a6a4ab3b69 Fix test to require Asserts since it uses debug output.
llvm-svn: 266448
2016-04-15 16:09:00 +00:00
Adrian Prantl 75819aedf6 [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.

Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.

Motivation
----------

Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.

We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.

Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.

http://reviews.llvm.org/D19034
<rdar://problem/25256815>

llvm-svn: 266446
2016-04-15 15:57:41 +00:00
NAKAMURA Takumi 391dae265f llvm/test/CodeGen/AArch64/arm64-csldst-mmo.ll requires +Asserts.
llvm-svn: 266443
2016-04-15 15:37:27 +00:00
Sanjay Patel f11ab05bdb [SimplifyCFG] propagate branch metadata when creating select (PR27344)
This is almost identical to:
http://reviews.llvm.org/rL264527

This doesn't solve PR27344; it just allows the profile weights to survive. 
To solve the bug, we need to use the profile weights in the backend.

llvm-svn: 266442
2016-04-15 15:32:12 +00:00
Geoff Berry c376406669 [AArch64] Add MMOs to callee-save load/store instructions.
Summary:
Without MMOs, the callee-save load/store instructions were treated as
volatile by the MI post-RA scheduler and AArch64LoadStoreOptimizer.

Reviewers: t.p.northover, mcrosier

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17661

llvm-svn: 266439
2016-04-15 15:16:19 +00:00
Nirav Dave 1f51c334ca Fix typing on generated LXV2DX/STXV2DX instructions
[PPC] Previously when casting generic loads to LXV2DX/ST instructions we
would leave the original load return type in place allowing for an
assertion failure when we merge two equivalent LXV2DX nodes with
different types.

This fixes PR27350.

Reviewers: nemanjai

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19133

llvm-svn: 266438
2016-04-15 15:01:38 +00:00
Jun Bum Lim 4c5bd58ebe [MachineScheduler]Add support for store clustering
Perform store clustering just like load clustering. This change add
StoreClusterMutation in machine-scheduler. To control StoreClusterMutation,
added enableClusterStores() in TargetInstrInfo.h. This is enabled only on
AArch64 for now.

This change also add support for unscaled stores which were not handled in
getMemOpBaseRegImmOfs().

llvm-svn: 266437
2016-04-15 14:58:38 +00:00
Igor Kudrin 061d496c51 [Coverage] Prevent detection of false instantiations in case of macro expansion.
The root of the problem was that findMainViewFileID(File, Function)
could return some ID for any given file, even though that file
was not the main file for that function.

This patch ensures that the result of this function is conformed
with the result of findMainViewFileID(Function).

Differential Revision: http://reviews.llvm.org/D18787

llvm-svn: 266436
2016-04-15 14:56:50 +00:00
Sanjay Patel 81433e99b9 [SimplifyCFG] add metadata to show failure to propagate (PR27344)
llvm-svn: 266435
2016-04-15 14:53:35 +00:00
Nicolai Haehnle 750082d1fe AMDGPU/SI: Fix regression with no-return atomics
Summary:
In the added test-case, the atomic instruction feeds into a non-machine
CopyToReg node which hasn't been selected yet, so guard against
non-machine opcodes here.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19043

llvm-svn: 266433
2016-04-15 14:42:36 +00:00
Justin Lebar f04e678e36 Move divergent-target test into CodeGen/NVPTX because it requires an NVPTX target.
llvm-svn: 266403
2016-04-15 01:20:52 +00:00
Justin Lebar cad81cf6b3 [Speculation] Add a SpeculativeExecution mode where the pass does nothing unless TTI::hasBranchDivergence() is true.
Summary:
This lets us add this pass to the IR pass manager unconditionally; it
will simply not do anything on targets without branch divergence.

Reviewers: tra

Subscribers: llvm-commits, jingyue, rnk, chandlerc

Differential Revision: http://reviews.llvm.org/D18625

llvm-svn: 266398
2016-04-15 00:32:09 +00:00
Vedant Kumar 4960fbf391 [test] Require 'asserts' for a test which uses -debug-only
Without this line, bots which run check-all on Release compilers will
break.

llvm-svn: 266386
2016-04-14 23:32:40 +00:00
Matt Arsenault fd8ab09c0e AMDGPU: Include LDS size in printed comment
llvm-svn: 266382
2016-04-14 22:11:51 +00:00
Michael Kuperstein 16f13e252b [AliasSetTracker] Correctly handle changing the size of an entry
If the size of an AST entry changes, we also need to make sure we perform
necessary alias set merges, as the new size may overlap pointers in other sets.
We happen to run into this with memset, because memset allows an entry for a
i8* pointer to have a decidedly non-i8 size.

This fixes PR27262.

Differential Revision: http://reviews.llvm.org/D18939

llvm-svn: 266381
2016-04-14 22:00:11 +00:00
Matt Arsenault 3d1c1deb04 AMDGPU: Run SIFoldOperands after PeepholeOptimizer
PeepholeOptimizer cleans up redundant copies, which makes
the operand folding more effective.

shader-db stats:

Totals:
SGPRS: 34200 -> 34336 (0.40 %)
VGPRS: 22118 -> 21655 (-2.09 %)
Code Size: 632144 -> 633460 (0.21 %) bytes
LDS: 11 -> 11 (0.00 %) blocks
Scratch: 10240 -> 11264 (10.00 %) bytes per wave
Max Waves: 8822 -> 8918 (1.09 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 7704 -> 7840 (1.77 %)
VGPRS: 5169 -> 4706 (-8.96 %)
Code Size: 234444 -> 235760 (0.56 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Scratch: 0 -> 1024 (0.00 %) bytes per wave
Max Waves: 1188 -> 1284 (8.08 %)
Wait states: 0 -> 0 (0.00 %)

Increases:
SGPRS: 35 (0.01 %)
VGPRS: 1 (0.00 %)
Code Size: 59 (0.02 %)
LDS: 0 (0.00 %)
Scratch: 1 (0.00 %)
Max Waves: 48 (0.02 %)
Wait states: 0 (0.00 %)

Decreases:
SGPRS: 26 (0.01 %)
VGPRS: 54 (0.02 %)
Code Size: 68 (0.03 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)
Max Waves: 4 (0.00 %)
Wait states: 0 (0.00 %)

llvm-svn: 266378
2016-04-14 21:58:24 +00:00