Commit Graph

96102 Commits

Author SHA1 Message Date
Quentin Colombet f760799c40 [AArch64][RegisterBankInfo] Bump the cost of vector loads.
This does not change anything yet, because we do not offer any
alternative mapping.

llvm-svn: 284088
2016-10-13 00:11:59 +00:00
Quentin Colombet f35a8c5bdc [AArch64][RegisterBankInfo] Use a proper cost for cross regbank G_BITCASTs.
This does not change anything yet, because we do not offer any
alternative mapping.

llvm-svn: 284087
2016-10-13 00:11:57 +00:00
Quentin Colombet 27b40356f7 [AArch64][RegisterBankInfo] Provide more realistic copy costs.
llvm-svn: 284086
2016-10-13 00:11:55 +00:00
Krzysztof Parzyszek abc0662f04 Handle lane masks in LivePhysRegs when adding live-ins
Differential Revision: https://reviews.llvm.org/D25533

llvm-svn: 284076
2016-10-12 22:53:41 +00:00
Tim Northover fb8d989818 GlobalISel: support G_TRUNC selection on AArch64.
Ahmed's patch again.

llvm-svn: 284075
2016-10-12 22:49:15 +00:00
Tim Northover 69271c64d5 GlobalISel: support int <-> float conversions on AArch64.
More of Ahmed's work.

llvm-svn: 284074
2016-10-12 22:49:11 +00:00
Tim Northover 7dd378dd08 GlobalISel: select G_FCMP instructions on AArch64.
Another of Ahmed's patches.

llvm-svn: 284073
2016-10-12 22:49:07 +00:00
Tim Northover 6c02ad5e4f GlobalISel: support selection of G_ICMP on AArch64.
Patch from Ahmed Bougaca again.

llvm-svn: 284072
2016-10-12 22:49:04 +00:00
Tim Northover 5e3dbf326c GlobalISel: select G_BRCOND instructions on AArch64.
llvm-svn: 284071
2016-10-12 22:49:01 +00:00
Tim Northover 6aacd27cd7 GlobalISel: mark G_BRCOND on s1 as legal.
It's going to be a TBNZ (at -O0) anyway, so the high bits don't matter.

llvm-svn: 284070
2016-10-12 22:48:36 +00:00
Vedant Kumar 68216d7bd3 [Coverage] Factor out logic to create FunctionRecords (NFC)
llvm-svn: 284063
2016-10-12 22:27:45 +00:00
Albert Gutowski 795d7d6381 Create llvm.addressofreturnaddress intrinsic
Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows.

Reviewers: majnemer, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25293

llvm-svn: 284061
2016-10-12 22:13:19 +00:00
Reid Kleckner fb58be862c Update _MSC_VER equality checks for msdiaNNN.dll
Use inequality instead of equality to defend against minor version
increases in _MSC_VER. An _MSC_VER value of 1901 should still use
msdia140.dll, as described in this blog post:
https://blogs.msdn.microsoft.com/vcblog/2016/10/05/visual-c-compiler-version/

llvm-svn: 284058
2016-10-12 21:51:14 +00:00
Haicheng Wu 1ef17e90b2 Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049.

The original summary:

This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

llvm-svn: 284053
2016-10-12 21:29:38 +00:00
Krzysztof Parzyszek d62669d791 [MIRParser] Parse lane masks for register live-ins
Differential Revision: https://reviews.llvm.org/D25530

llvm-svn: 284052
2016-10-12 21:06:45 +00:00
Haicheng Wu 45e4ef737d Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
This reverts commit r284044.

llvm-svn: 284051
2016-10-12 21:02:22 +00:00
Haicheng Wu 6cac34fd41 [LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop
This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

Differential Revision: https://reviews.llvm.org/D24790

llvm-svn: 284044
2016-10-12 20:24:32 +00:00
Peter Collingbourne 46aafc16e8 LTO: Use the correct mangler function in LTOCodeGenerator::applyScopeRestrictions().
We need to use the overload of Mangler::getNameWithPrefix that takes a
GlobalValue in order to mangle in the stdcall stack byte count for Windows
targets.

Differential Revision: https://reviews.llvm.org/D25529

llvm-svn: 284040
2016-10-12 20:12:19 +00:00
Krzysztof Parzyszek 8271be9a1d Do not remove implicit defs in BranchFolder
Branch folder removes implicit defs if they are the only non-branching
instructions in a block, and the branches do not use the defined registers.
The problem is that in some cases these implicit defs are required for
the liveness information to be correct.

Differential Revision: https://reviews.llvm.org/D25478

llvm-svn: 284036
2016-10-12 19:50:57 +00:00
Matt Arsenault d486d3f8d1 AMDGPU: Initial implementation of VGPR indexing mode
This is the most basic handling of the indirect access
pseudos using GPR indexing mode. This currently only enables
the mode for a single v_mov_b32 and then disables it.
This is much more complicated to use than the movrel instructions,
so a new optimization pass is probably needed to fold the access
into the uses and keep the mode enabled for them.

llvm-svn: 284031
2016-10-12 18:49:05 +00:00
Teresa Johnson 4b9b379172 [ThinLTO] Don't link module level assembly when importing
Module inline asm was always being linked/concatenated
when running the IRLinker. This is correct for full LTO but not when
we are importing for ThinLTO, as it can result in multiply defined
symbols when the module asm defines a global symbol.

In order to test with llvm-lto2, I had to work around PR30396,
where a symbol that is defined in module assembly but defined in the
LLVM IR appears twice. Added workaround to llvm-lto2 with a FIXME.

Fixes PR30610.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25359

llvm-svn: 284030
2016-10-12 18:39:29 +00:00
Sanjoy Das bc357e8fa3 [SimplifyCFG] Don't create PHI nodes for constant bundle operands
Summary:
Constant bundle operands may need to retain their constant-ness for
correctness.  I'll admit that this is slightly odd, but it looks like
SimplifyCFG already does this for things like @llvm.frameaddress and
@llvm.stackmap, so I suppose adding one more case is not a big deal.

It is possible to add a mechanism to denote bundle operands that need to
remain constants, but that's probably too complicated for the time
being.

Reviewers: jmolloy

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D25502

llvm-svn: 284028
2016-10-12 18:15:33 +00:00
Matt Arsenault cc88ce36ed AMDGPU: Add instruction definitions for VGPR indexing
VI added a second method of indexing into VGPRs
besides using v_movrel*

llvm-svn: 284027
2016-10-12 18:00:51 +00:00
Tom Stellard fac248cb5f AMDGPU/SI: Change mimg intrinsic signatures
This makes more fields overridable and removes redundant bits.

Patch by: Changpeng Fang

llvm-svn: 284024
2016-10-12 16:35:29 +00:00
Artur Pilipenko c6eb6bd9cb [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers
Since this change is known to cause performance degradations in some cases it's commited under a temporary flag which is turned off by default.

Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18777

llvm-svn: 284022
2016-10-12 16:18:43 +00:00
Matt Arsenault 691efe03bd BranchRelaxation: Unique live ins when creating block
llvm-svn: 284018
2016-10-12 15:32:04 +00:00
Nirav Dave d046332679 [MC] Fix Error Location for ParseIdentifier
Prevent partial parsing of '$' or '@' of invalid identifiers and fixup
workaround points. NFC Intended.

llvm-svn: 284017
2016-10-12 13:58:07 +00:00
Simon Pilgrim 08190943cb [DAGCombiner] Update most ADD combines to support general vector combines
Add a number of helper functions to match scalar or vector equivalent constant/splat values to allow most of the combine patterns to be used by vectors.

Differential Revision: https://reviews.llvm.org/D25374

llvm-svn: 284015
2016-10-12 13:48:10 +00:00
Konstantin Zhuravlyov 081385a74e [DAGCombiner] Do not remove the load of stored values when optimizations are disabled
This combiner breaks debug experience and should not be run when optimizations are disabled.

For example:
  int main() {
    int j = 0;
    j += 2;
    if (j == 2)
      return 0;
    return 5;
  }
When debugging this code compiled in /O0, it should be valid to break at line "j+=2;" and edit the value of j. It should change the return value of the function.

Differential Revision: https://reviews.llvm.org/D19268

llvm-svn: 284014
2016-10-12 13:44:24 +00:00
Chad Rosier c215c3fd14 [CVP] Convert an AShr to a LShr if 1st operand is known to be nonnegative.
An arithmetic shift can be safely changed to a logical shift if the first
operand is known positive. This allows ComputeKnownBits (and similar analysis)
to determine the sign bit of the shifted value in some cases. In turn, this
allows InstCombine to canonicalize a signed comparison (a > 0) into an equality
check (a != 0).

PR30577

Differential Revision: https://reviews.llvm.org/D25119

llvm-svn: 284013
2016-10-12 13:41:38 +00:00
Alexey Bataev b271a58e37 NFC: The Cost Model specialization, by Andrey Tischenko
The current Cost Model implementation is very inaccurate and has to be
updated, improved, re-implemented to be able to take into account the
concrete CPU models and the concrete targets where this Cost Model is
being used. For example, the Latency Cost Model should be differ from
Code Size Cost Model, etc.
This patch is the first step to launch the developing and implementation
of a new Cost Model generation.

Differential Revision: https://reviews.llvm.org/D25186

llvm-svn: 284012
2016-10-12 13:24:13 +00:00
Simon Pilgrim fd0d7b21e0 [InstCombine] Fix constexpr issue in select combining
As discussed by Andrea on PR30486, we have an unsafe cast to an Instruction type in the select combine which doesn't take into account that it could be a ConstantExpr instead.

Differential Revision: https://reviews.llvm.org/D25466

llvm-svn: 284000
2016-10-12 10:20:15 +00:00
Alex Lorenz d680287898 [Support][CommandLine] Display subcommands in help when there are less than 3
subcommands

This commit fixes a bug where the help output doesn't display subcommands when
a tool has less than 3 subcommands.

This change doesn't include a corresponding unittest as there is no viable way
to provide a unittest for it.

Differential Revision: https://reviews.llvm.org/D25463

llvm-svn: 283998
2016-10-12 10:04:35 +00:00
Chandler Carruth 5dbc164d15 [LCG] Add the necessary functionality to the LazyCallGraph to support inlining.
The basic inlining operation makes the following changes to the call graph:
1) Add edges that were previously transitive edges. This is always trivial and
   this patch gives the LCG helper methods to make this more convenient.
2) Remove the inlined edge. We had existing support for this, but it contained
   bugs that needed to be fixed. Testing in the same pattern as the inliner
   exposes these bugs very nicely.
3) Delete a function when it becomes dead because it is internal and all calls
   have been inlined. The LCG had no support at all for this operation, so this
   adds that support.

Two unittests have been added that exercise this specific mutation pattern to
the call graph. They were extremely effective in uncovering bugs. Sadly,
a large fraction of the code here is just to implement those unit tests, but
I think they're paying for themselves. =]

This was split out of a patch that actually uses the routines to
implement inlining in the new pass manager in order to isolate (with
unit tests) the logic that was entirely within the LCG.

Many thanks for the careful review from folks! There will be a few minor
follow-up patches based on the comments in the review as well.

Differential Revision: https://reviews.llvm.org/D24225

llvm-svn: 283982
2016-10-12 07:59:56 +00:00
Daniel Jasper 90d990e034 Revert "[libFuzzer] refactoring to speed things up, NFC"
This reverts commit r283946.

This breaks when build with GCC:
lib/Fuzzer/FuzzerTracePC.cpp:169:6: error: always_inline function might not be inlinable [-Werror=attributes]
lib/Fuzzer/FuzzerTracePC.cpp:169:6: error: inlining failed in call to always_inline 'void fuzzer::TracePC::HandleCmp(void*, T, T) [with T = long unsigned int]': target specific option mismatch
lib/Fuzzer/FuzzerTracePC.cpp:198:65: error: called from here

llvm-svn: 283979
2016-10-12 07:26:46 +00:00
Quentin Colombet 9de30faeac [AArch64][InstrustionSelector] Teach the selector about G_BITCAST.
llvm-svn: 283973
2016-10-12 03:57:52 +00:00
Quentin Colombet cb629a897c [AArch64][InstructionSelector] Refactor the handling of copies.
Although Copies are not specific to preISel, we still have to assign them
a proper register class. However, given they are not constrained to
anything we do not have to handle the source register at the copy. It
will be properly mapped when reaching the related definition.

In the process, the handlong of G_ANYEXT is slightly modified as those
end up being selected as copy. The difference is that when register size
do not match on both sides, we need to insert SUBREG_TO_REG operation,
otherwise the post RA copy expansion will not be happy!

llvm-svn: 283972
2016-10-12 03:57:49 +00:00
Quentin Colombet 404e4350dc [AArch64][MachineLegalizer] Mark more bitcasts as legal.
Those are copies, we do not have to do any legalization action for them.

llvm-svn: 283970
2016-10-12 03:57:43 +00:00
Sebastian Pop d57d93c9de Memory-SSA cleanup of clobbers interface, NFC
This implements the cleanup that Danny asked to commit separately from the
previous fix to GVN-hoist in https://reviews.llvm.org/D25476#inline-219818

Tested with ninja check on x86_64-linux.

llvm-svn: 283967
2016-10-12 03:08:40 +00:00
Sebastian Pop ab12fb62ee GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)
This is a refreshed version of a patch that was reverted: it fixes
the problems reported in both PR30216 and PR30499, and
contains all the test-cases from both bugs.

To hoist stores past loads, we used to search for potential
conflicting loads on the hoisting path by following a MemorySSA
def-def link from the store to be hoisted to the previous
defining memory access, and from there we followed the def-use
chains to all the uses that occur on the hoisting path. The
problem is that the def-def link may point to a store that does
not alias with the store to be hoisted, and so the loads that are
walked may not alias with the store to be hoisted, and even as in
the testcase of PR30216, the loads that may alias with the store
to be hoisted are not visited.

The current patch visits all loads on the path from the store to
be hoisted to the hoisting position and uses the alias analysis
to ask whether the store may alias the load. I was not able to
use the MemorySSA functionality to ask for whether load and
store are clobbered: I'm not sure which function to call, so I
used a call to AA->isNoAlias().

Store past store is still working as before using a MemorySSA
query: I added an extra test to pr30216.ll to make sure store
past store does not regress.

Tested on x86_64-linux with check and a test-suite run.

Differential Revision: https://reviews.llvm.org/D25476

llvm-svn: 283965
2016-10-12 02:23:39 +00:00
Tim Shen 4ff62b187e [PPCMIPeephole] Fix splat elimination
Summary:
In PPCMIPeephole, when we see two splat instructions, we can't simply do the following transformation:
  B = Splat A
  C = Splat B
=>
  C = Splat A
because B may still be used between these two instructions. Instead, we should make the second Splat a PPC::COPY and let later passes decide whether to remove it or not:
  B = Splat A
  C = Splat B
=>
  B = Splat A
  C = COPY B

Fixes PR30663.

Reviewers: echristo, iteratee, kbarton, nemanjai

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D25493

llvm-svn: 283961
2016-10-12 00:48:25 +00:00
Michael Kuperstein 7adbf6b042 [DAG] Fix crash in build_vector -> vector_shuffle combine
Fixes a crash in the build_vector -> vector_shuffle combine
when the first vector input is twice as wide as the output,
and the second input vector is even wider.

llvm-svn: 283953
2016-10-11 22:44:31 +00:00
Tim Northover c1d8c2bf8c GlobalISel: support same-size casts on AArch64.
Mostly Ahmed's work again, I'm just sprucing things up slightly before
committing.

llvm-svn: 283952
2016-10-11 22:29:23 +00:00
Kostya Serebryany a09d11e108 [libFuzzer] refactoring to speed things up, NFC
llvm-svn: 283946
2016-10-11 21:27:37 +00:00
Reid Kleckner bdfc05ff93 Re-land "[Thumb] Save/restore high registers in Thumb1 pro/epilogues"
Reverts r283938 to reinstate r283867 with a fix.

The original change had an ArrayRef referring to a destroyed temporary
initializer list. Use plain C arrays instead.

llvm-svn: 283942
2016-10-11 21:14:03 +00:00
Kevin Enderby 68fffa8a62 Next set of additional error checks for invalid Mach-O files for the
load commands that uses the MachO::linker_option_command
type but not used in llvm libObject code but used in llvm tool code.

This includes just LC_LINKER_OPTION load command.

llvm-svn: 283939
2016-10-11 21:04:39 +00:00
Reid Kleckner f4876beb2b Revert "[Thumb] Save/restore high registers in Thumb1 pro/epilogues"
This reverts r283867.

This appears to be an infinite loop:

    while (HiRegToSave != AllHighRegs.end() && CopyReg != AllCopyRegs.end()) {
      if (HiRegsToSave.count(*HiRegToSave)) {
        ...

        CopyReg = findNextOrderedReg(++CopyReg, CopyRegs, AllCopyRegs.end());
        HiRegToSave =
            findNextOrderedReg(++HiRegToSave, HiRegsToSave, AllHighRegs.end());
      }
    }

llvm-svn: 283938
2016-10-11 20:54:41 +00:00
Tim Northover 3d38b3a4d1 GlobalISel: support selection of extend operations.
Patch mostly by Ahmed Bougaca.

llvm-svn: 283937
2016-10-11 20:50:21 +00:00
Tim Northover 60cf6fc58f MIRParser: allow types on registers with a RegBank.
This fixes some GlobalISel regression tests.

llvm-svn: 283936
2016-10-11 20:50:04 +00:00
Kyle Butt 0846e56e63 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.

Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.

Issue with early tail-duplication of blocks that branch to a fallthrough
predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll

Differential revision: https://reviews.llvm.org/D18226

llvm-svn: 283934
2016-10-11 20:36:43 +00:00
Reid Kleckner 5d0bc63d91 Avoid braced initialization for default member initializers for MSVC 2013
llvm-svn: 283928
2016-10-11 20:02:57 +00:00
Arnold Schwaighofer 9103e268cf Silence -Wunused-but-set-variable warning
llvm-svn: 283927
2016-10-11 19:49:29 +00:00
Rui Ueyama f9904043ca Re-submit r283823: Define DbiStreamBuilder::addDbgStream to add stream.
The previous commit was failing because we filled empty slots of
the debug stream index with kInvalidStreamIndex. It should've been 0.

llvm-svn: 283925
2016-10-11 19:43:12 +00:00
Kostya Serebryany 4d25ad93f3 [sanitizer-coverage] use private linkage for coverage guards, delete old commented-out code.
llvm-svn: 283924
2016-10-11 19:36:50 +00:00
Rui Ueyama 888de9b031 Fix build error on LP64 platforms.
llvm-svn: 283922
2016-10-11 19:28:56 +00:00
Zachary Turner 733be51dcd [raw_ostream] Raise some helper functions out of raw_ostream.
Low level functionality to format numbers were embedded in the
implementation of raw_ostream.  I have need to use these through
an interface other than the overloaded stream operators, so they
need to be raised to a level that they can be used from either
raw_ostream operators or other code.

llvm-svn: 283921
2016-10-11 19:24:45 +00:00
Konstantin Zhuravlyov cdd4547607 [AMDGPU] Refactor waitcnt encoding
- Refactor bit packing/unpacking
- Calculate bit mask given bit shift and bit width
- Introduce function for decoding bits of waitcnt
- Introduce function for encoding bits of waitcnt
- Introduce function for getting waitcnt mask (instead of using bare numbers)
- Introduce function fot getting max waitcnt(s) (instead of using bare numbers)

Differential Revision: https://reviews.llvm.org/D25298

llvm-svn: 283919
2016-10-11 18:58:22 +00:00
Dehao Chen e9d075233a Allow Switch instruction to have extractProfTotalWeight called as it can terminate a basic block. (NFC)
llvm-svn: 283918
2016-10-11 18:53:00 +00:00
Mehdi Amini caec5559cb Fix "static initialization order fiasco" for the XCore Target.
I fixed all the other Targets in r283702, and interestingly the
sanitizers are only now "sometimes" catching this bug on the only
one I missed.

llvm-svn: 283914
2016-10-11 18:22:41 +00:00
Zachary Turner 76e007e7e3 [Support] Fix undefined behavior in RandomNumberGenerator.
This has existed pretty much forever AFAICT, but the code was
never being exercised because nobody was using the class.  A
user of this class surfaced, and now we're breaking with UB.
The code was obviously wrong, so it's fixed here.

llvm-svn: 283912
2016-10-11 18:17:26 +00:00
NAKAMURA Takumi e9587bd771 ARMMachineFunctionInfo.cpp: Add an initializer of ARMFunctionInfo::ReturnRegsCount in the explicit ctor.
It caused crash since r283867.

llvm-svn: 283909
2016-10-11 17:38:30 +00:00
NAKAMURA Takumi e61de02016 Reformat.
llvm-svn: 283908
2016-10-11 17:38:25 +00:00
Sanjay Patel 8253e15ef3 [DAG] add fold for masked negated sign-extended bool
This enhances the fold added with:
https://reviews.llvm.org/rL283900

llvm-svn: 283905
2016-10-11 17:05:52 +00:00
Sanjay Patel 8384703d9b [DAG] add fold for masked negated extended bool
The non-obvious motivation for adding this fold (which already happens in InstCombine)
is that we want to canonicalize IR towards select instructions and canonicalize DAG 
nodes towards boolean math. So we need to recreate some folds in the DAG to handle that
change in direction. 

An interesting implementation difference for cases like this is that InstCombine
generally works top-down while the DAG goes bottom-up. That means we need to detect 
different patterns. In this case, the SimplifyDemandedBits fold prevents us from 
performing a zext to sext fold that would then be recognized as a negation of a sext. 

llvm-svn: 283900
2016-10-11 16:26:36 +00:00
Daniel Jasper b617286089 Silence unused warning in non-assert builds.
llvm-svn: 283899
2016-10-11 16:22:36 +00:00
Changpeng Fang 98317d20f4 AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11.
Differential Revision:
  http://reviews.llvm.org/D25454

Reviewers:
  tstellarAMD

llvm-svn: 283893
2016-10-11 16:00:47 +00:00
Zachary Turner 79f3333d3f [cl] Don't print subcommand help when no subcommands present.
Previously we would print

  USAGE: <exe> [subcommand] [options]

Even if no subcommands were present.  This changes the output
format to only print "[subcommand]" if there is at least one
subcommand.

Fixes llvm.org/pr30598

Patch by Serge Guelton

llvm-svn: 283892
2016-10-11 15:58:48 +00:00
Sanjay Patel 38a42e4bfa [DAG] simplify logic; NFC
llvm-svn: 283885
2016-10-11 14:14:30 +00:00
Sanjay Patel 907ae69125 [DAG] hoist DL(N) and fix formatting; NFC
llvm-svn: 283884
2016-10-11 14:04:24 +00:00
Sanjay Patel 9609f3d6c7 [DAG] fix formatting; NFC
llvm-svn: 283878
2016-10-11 13:47:43 +00:00
Igor Laevsky 04423cf785 [LCSSA] Implement linear algorithm for the isRecursivelyLCSSAForm
For each block check that it doesn't have any uses outside of it's innermost loop.

Differential Revision: https://reviews.llvm.org/D25364

llvm-svn: 283877
2016-10-11 13:37:22 +00:00
Oliver Stannard d2083fb356 [Thumb] Save/restore high registers in Thumb1 pro/epilogues
The high registers are not allocatable in Thumb1 functions, but they
could still be used by inline assembly, so we need to save and restore
the callee-saved high registers (r8-r11) in the prologue and epilogue.

This is complicated by the fact that the Thumb1 push and pop
instructions cannot access these registers. Therefore, we have to move
them down into low registers before pushing, and move them back after
popping into low registers.

In most functions, we will have low registers that are also being
pushed/popped, which we can use as the temporary registers for
saving/restoring the high registers. However, this is not guaranteed, so
we may need to push some extra low registers to ensure that the high
registers can be saved/restored. For correctness, it would be sufficient
to use just one low register, but if we have enough low registers
available then we only need one push/pop instruction, rather than one
per high register.

We can also use the argument/return registers when they are not live,
and the link register when saving (but not restoring), reducing the
number of extra registers we need to push.

There are still a few extreme edge cases where we need two push/pop
instructions, because not enough low registers can be made live in the
prologue or epilogue.

In addition to the regression tests included here, I've also tested this
using a script to generate functions which clobber different
combinations of registers, have different numbers of argument and return
registers (including variadic arguments), allocate different fixed sized
objects on the stack, and do or don't use variable sized allocas and the
__builtin_return_address intrinsic (all of which affect the available
registers in the prologue and epilogue). I ran these functions in a test
harness which verifies that all of the callee-saved registers are
correctly preserved.

Differential Revision: https://reviews.llvm.org/D24228

llvm-svn: 283867
2016-10-11 10:12:25 +00:00
Oliver Stannard 50a74393c2 [ARM] Fix registers clobbered by SjLj EH on soft-float targets
Currently, the Int_eh_sjlj_dispatchsetup intrinsic is marked as
clobbering all registers, including floating-point registers that may
not be present on the target. This is technically true, as we could get
linked against code that does use the FP registers, but that will not
actually work, as the soft-float code cannot save and restore the FP
registers. SjLj exception handling can only work correctly if either all
or none of the code is built for a target with FP registers. Therefore,
we can assume that, when Int_eh_sjlj_dispatchsetup is compiled for a
soft-float target, it is only going to be linked against other
soft-float code, and so only clobbers the general-purpose registers.
This allows us to check that no non-savable registers are clobbered when
generating the prologue/epilogue.

Differential Revision: https://reviews.llvm.org/D25180

llvm-svn: 283866
2016-10-11 10:06:59 +00:00
Diana Picus c93518db8c [AArch64] Allow label arithmetic with add/sub/cmp
Allow instructions such as 'cmp w0, #(end - start)' by folding the
expression into a constant. For ELF, we fold only if the symbols are in
the same section. For MachO, we fold if the expression contains only
symbols that are not linker visible.

Fixes https://llvm.org/bugs/show_bug.cgi?id=18920

Differential Revision: https://reviews.llvm.org/D23834

llvm-svn: 283862
2016-10-11 09:17:47 +00:00
Fraser Cormack 48d9fdc1e1 Fix formatting in findRegisterUseOperandIdx. NFC.
llvm-svn: 283860
2016-10-11 09:09:21 +00:00
Daniel Jasper 0c42dc4784 Revert "Codegen: Tail-duplicate during placement."
This reverts commit r283842.

test/CodeGen/X86/tail-dup-repeat.ll causes and llc crash with our
internal testing. I'll share a link with you.

llvm-svn: 283857
2016-10-11 07:36:11 +00:00
Mehdi Amini ea8e9795a5 Make RandomNumberGenerator compatible with <random>
LLVM's RandomNumberGenerator wasn't compatible with
the random distribution from <random>.

Fixes PR25105

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D25443

llvm-svn: 283854
2016-10-11 07:13:01 +00:00
Dehao Chen c87bc79d90 Tune isHotFunction/isColdFunction
Summary: This patch sets function as hot if function's entry count is hot/cold.

Reviewers: eraman, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25048

llvm-svn: 283852
2016-10-11 05:19:00 +00:00
Matthias Braun e67fc4a328 Fix warning; NFC
llvm-svn: 283851
2016-10-11 04:32:03 +00:00
Matthias Braun 3d85ebe5b1 MIRParser: generic register operands with types
This should fix the fallout of r283848.

llvm-svn: 283850
2016-10-11 04:22:29 +00:00
Matthias Braun 74ad41c7cd MIRParser: Rewrite register info initialization; mostly NFC
This changes MachineRegisterInfo to be initializes after parsing all
instructions. This is in preparation for upcoming commits that allow the
register class specification on the operand or deduce them from the
MCInstrDesc.

This commit removes the unused feature of having nonsequential register
numbers. This was confusing anyway as the vreg numbers would be
different after parsing when you had "holes" in your numbering.

This patch also introduces the concept of an incomplete virtual
register. An incomplete virtual register may be used during .mir parsing
to construct MachineOperands without knowing the exact register class
(or register bank) yet.

NFC except for some error messages.

Differential Revision: https://reviews.llvm.org/D22397

llvm-svn: 283848
2016-10-11 03:13:01 +00:00
Kyle Butt ae068a320c Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.

Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.

Issue with early tail-duplication of blocks that branch to a fallthrough
predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll

Differential revision: https://reviews.llvm.org/D18226

llvm-svn: 283842
2016-10-11 01:20:33 +00:00
Kostya Serebryany d19919a80e [libFuzzer] implement value profile for switch, increase the size of the PCs array, make sure we don't overflow it
llvm-svn: 283841
2016-10-11 01:14:41 +00:00
Kostya Serebryany 3e0e901a18 [libFuzzer] add switch tests
llvm-svn: 283840
2016-10-11 01:13:32 +00:00
Dylan McKay c328fe5af4 [RegAllocGreedy] Attempt to split unspillable live intervals
Summary:
Previously, when allocating unspillable live ranges, we would never
attempt to split. We would always bail out and try last ditch graph
recoloring.

This patch changes this by attempting to split all live intervals before
performing recoloring.

This fixes LLVM bug PR14879.

I can't add test cases for any backends other than AVR because none of
them have small enough register classes to trigger the bug.

Reviewers: qcolombet

Subscribers: MatzeB

Differential Revision: https://reviews.llvm.org/D25070

llvm-svn: 283838
2016-10-11 01:04:36 +00:00
David Majnemer 80dca0c78f [InstCombine] Transform !range metadata to !nonnull when combining loads
When combining an integer load with !range metadata that does not include 0 to a pointer load, make sure emit !nonnull metadata on the newly-created pointer load. This prevents the !nonnull metadata from being dropped during a ptrtoint/inttoptr pair.

This fixes PR30597.

Patch by Ariel Ben-Yehuda!

Differential Revision: https://reviews.llvm.org/D25215

llvm-svn: 283836
2016-10-11 01:00:45 +00:00
Quentin Colombet d2623f8e38 [AArch64][InstructionSelector] Teach how to select FP load/store.
This patch allows to select 32 and 64-bit FP load and store.

llvm-svn: 283832
2016-10-11 00:21:14 +00:00
Quentin Colombet 0e5312787e [AArch64][InstructionSelector] Teach the selector how to handle vector OR.
This only adds the support for 64-bit vector OR. Adding more sizes is
not difficult, but it requires a bigger refactoring because ORs work on
any size, not necessarly the ones that match the width of the register
width. Right now, this is not expressed in the legalization, so don't
bother pushing the refactoring yet.

llvm-svn: 283831
2016-10-11 00:21:11 +00:00
Quentin Colombet d3126d5fb4 [AArch64][MachineLegalizer] Mark v2s32 G_LOAD as legal.
Actually every 64-bit loads are legal, but right now the API does not
offer a simple way to express that.

llvm-svn: 283829
2016-10-11 00:21:08 +00:00
Rui Ueyama 8af4988f35 Revert r283824 and r283823: Define DbiStreamBuilder::addDbgStream to add stream.
This reverts commit r283824 and r283823 to fix buildbots.

llvm-svn: 283828
2016-10-11 00:15:50 +00:00
Rui Ueyama 914eef6a64 Fix a bug in DbiStreamBuilder::addDbgStream.
This feature will be tested in LLD unit tests.

llvm-svn: 283824
2016-10-10 23:44:04 +00:00
Rui Ueyama 70edd9e41d Define DbiStreamBuilder::addDbgStream to add stream.
Previously, there is no way to create a stream other than pre-defined
special stream such as DBI or IPI. This patch adds a new method,
addDbgStream, to add a debug stream to a PDB file.

Differential Revision: https://reviews.llvm.org/D25356

llvm-svn: 283823
2016-10-10 23:35:36 +00:00
Peter Collingbourne 0da86301ad Revert r283690, "MC: Remove unused entities."
llvm-svn: 283814
2016-10-10 22:49:37 +00:00
Tim Northover bdf1624367 GlobalISel: select G_GLOBAL_VALUE uses on AArch64.
llvm-svn: 283809
2016-10-10 21:50:00 +00:00
Tim Northover ad0acca544 GlobalISel: allow G_GLOBAL_VALUEs in AArch64 legalization.
llvm-svn: 283808
2016-10-10 21:49:53 +00:00
Tim Northover 2fda4b08ae GlobalISel: support selecting G_GEP instructions.
They're basically just an alias for G_ADD on AArch64.

llvm-svn: 283807
2016-10-10 21:49:49 +00:00
Tim Northover 4edc60d785 GlobalISel: support selecting constants on AArch64.
llvm-svn: 283806
2016-10-10 21:49:42 +00:00
Dehao Chen 84287abf43 Rename isHotFunction/isColdFunction to isFunctionEntryHot/isFunctionEntryCold. (NFC)
This is in preparation for https://reviews.llvm.org/D25048

llvm-svn: 283805
2016-10-10 21:47:28 +00:00
Hal Finkel fcd2421667 [SelectionDAGBuilder] Support llvm.flt.rounds on targets where i32 is not legal
Add integer expansion for FLT_ROUNDS_ for targets where i32 is not a legal
type.

Patch by Edward Jones, thanks!

Differential Revision: https://reviews.llvm.org/D24459

llvm-svn: 283797
2016-10-10 20:45:15 +00:00
Adrian Prantl 3bfe1093df Teach llvm::StripDebugInfo() about global variable !dbg attachments.
This is a regression introduced by the global variable ownership
reversal performed in r281284.

rdar://problem/28448075

llvm-svn: 283784
2016-10-10 17:53:33 +00:00
Justin Lebar 611c5c225a Use unique_ptr in LLVMContextImpl's constant maps.
Reviewers: timshen

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25419

llvm-svn: 283767
2016-10-10 16:26:13 +00:00
Alexandros Lamprineas 20e9ddba73 [ARM] Fix invalid VLDM/VSTM access when targeting Big Endian with NEON
The instructions VLDM/VSTM can only access word-aligned memory
locations and produce alignment fault if the condition is not met.

The compiler currently generates VLDM/VSTM for v2f64 load/store
regardless the alignment of the memory access. Instead, if a v2f64
load/store is not word-aligned, the compiler should generate
VLD1/VST1. For each non double-word-aligned VLD1/VST1, a VREV
instruction should be generated when targeting Big Endian.

Differential Revision: https://reviews.llvm.org/D25281

llvm-svn: 283763
2016-10-10 16:01:54 +00:00
Nirav Dave f43cc9f8b5 Add return type for checkForValidSection parsing function. NFC Intended.
llvm-svn: 283761
2016-10-10 15:24:54 +00:00
Zvi Rackover 2a21f125bd [X86] Prefer rotate by 1 over rotate by imm
Summary:
Rotate by 1 is translated to 1 micro-op, while rotate with imm8 is translated to 2 micro-ops.

Fixes pr30644.

Reviewers: delena, igorb, craig.topper, spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D25399

llvm-svn: 283758
2016-10-10 14:43:55 +00:00
Chris Dewhurst 850131213f This pass, fixing an erratum in some LEON 2 processors ensures that the SDIV instruction is not issued, but replaced by SDIVcc instead, which does not exhibit the error. Unit test included.
Differential Review: https://reviews.llvm.org/D24660

llvm-svn: 283727
2016-10-10 08:53:06 +00:00
Daniel Jasper 0dea246b4f Fix WebAssembly build after r283702.
llvm-svn: 283723
2016-10-10 06:49:55 +00:00
Craig Topper 9ece2f7529 [AVX-512] Add missing pattern sext or zext from bytes to quad words with a 128-bit load as input.
llvm-svn: 283720
2016-10-10 06:25:48 +00:00
Michael Zuckerman 3eeac2d56b [x86][inline-asm][llvm] accept 'v' constraint
Commit in the name of:Coby Tayree
1.'v' constraint for (x86) non-avx arch imitates the already implemented 'x' constraint, i.e. allows XMM{0-15} & YMM{0-15} depending on the apparent arch & mode (32/64).
2.for the avx512 arch it allows [X,Y,Z]MM{0-31} (mode dependent)

This patch applies the needed changes to clang
 clang patch: https://reviews.llvm.org/D25004

Differential Revision: D25005
 

llvm-svn: 283717
2016-10-10 05:48:56 +00:00
Dylan McKay 1a523767dc [AVR] Enable generation of the TableGen assembly writer tables
This also changes the order of the statements in CMakeLists.txt to be
alphabetical.

llvm-svn: 283711
2016-10-10 01:28:45 +00:00
Craig Topper 64378f4378 [AVX-512] Port 128 and 256-bit memory->register sign/zero extend patterns from SSE file. Also add a minimal set for 512-bit.
llvm-svn: 283704
2016-10-09 23:08:39 +00:00
Craig Topper 29558b8284 [X86] Remove redundant patterns. The same pattern appears a few lines up.
llvm-svn: 283703
2016-10-09 23:08:33 +00:00
Mehdi Amini f42454b94b Move the global variables representing each Target behind accessor function
This avoids "static initialization order fiasco"

Differential Revision: https://reviews.llvm.org/D25412

llvm-svn: 283702
2016-10-09 23:00:34 +00:00
Elena Demikhovsky 5b10aa1f1e DAG: Setting Masked-Expand-Load as a variant of Masked-Load node
Masked-expand-load node represents load operation that loads a variable amount of elements from memory according to amount of "true" bits in the mask and expands the loaded elements according to their position in the mask vector.
Right now, the node is used in intrinsics for VEXPAND* instructions. 
The work is done towards implementation of masked.expandload and masked.compressstore intrinsics.

Differential Revision: https://reviews.llvm.org/D25322

llvm-svn: 283694
2016-10-09 10:48:52 +00:00
Craig Topper 43973154dd [AVX-512] Fix execution domain for EVEX encoded VINSERTPS.
llvm-svn: 283692
2016-10-09 06:41:47 +00:00
Peter Collingbourne cc723cccab MC: Remove unused entities.
llvm-svn: 283691
2016-10-09 04:39:13 +00:00
Peter Collingbourne 5c924d7117 Target: Remove unused entities.
llvm-svn: 283690
2016-10-09 04:38:57 +00:00
Craig Topper e30cb00dc0 [AVX-512] Add subvector insert and extract to load/store folding tables.
llvm-svn: 283689
2016-10-09 03:54:13 +00:00
Craig Topper 4262d53024 [AVX-512] Add the vector down convert instructions to the store folding tables.
llvm-svn: 283687
2016-10-09 03:54:05 +00:00
Kostya Serebryany 7abb95d3b3 [libFuzzer] make a test less flaky
llvm-svn: 283686
2016-10-09 03:45:38 +00:00
Kostya Serebryany c5325ed29d [libFuzzer] when shrinking the corpus, delete evicted files previously created by the current process
llvm-svn: 283682
2016-10-08 23:24:45 +00:00
Kostya Serebryany 9adc7c8b4a [libFuzzer] control the reload interval by a flag, make it 10 seconds by default
llvm-svn: 283676
2016-10-08 22:12:14 +00:00
Kostya Serebryany cd04ec25dd [libFuzzer] fix use-after-free in libFuzzer found by ... fuzzing.
llvm-svn: 283675
2016-10-08 21:57:48 +00:00
Mehdi Amini 732afdd09a Turn cl::values() (for enum) from a vararg function to using C++ variadic template
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:

 va_start(ValueArgs, Desc);

with Desc being a StringRef.

Differential Revision: https://reviews.llvm.org/D25342

llvm-svn: 283671
2016-10-08 19:41:06 +00:00
Craig Topper 086f0c1401 [AVX-512] Fix a bug in getLargestLegalSuperClass where we inflated to VR128X/VR256X even when VLX isn't supported.
This seems to have been responsible for the XMM16-31 spills observed in PR29112. With this fixed the test case has been modified to no longer have a spill of XMM16.

llvm-svn: 283668
2016-10-08 18:49:57 +00:00
Colin LeMahieu c69f7ff6c0 [Hexagon] Adding change of flow max 1 (cofMax1) TS flag for marking this restriction rather than implying it from TypeJR.
llvm-svn: 283665
2016-10-08 17:18:51 +00:00
Teresa Johnson 897bab9b35 [ThinLTO] Record calls to aliases
Summary:
When there is a call to an alias in the same module, we were not
adding a call edge. So we could incorrectly think that the alias
was dead if it was inlined in that function, despite having a
reference imported elsewhere. This resulted in unsats at link time.

Add a call edge when the call is to an alias.

Reviewers: davide, mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25384

llvm-svn: 283664
2016-10-08 16:11:42 +00:00
Sebastian Pop eb65d72d9c [AArch64] Avoid generating indexed vector instructions for Exynos
Avoid generating indexed vector instructions for Exynos. This is needed for
fmla/fmls/fmul/fmulx. For example, the instruction

  fmla v0.4s, v1.4s, v2.s[1]

is less efficient than the instructions

  dup v2.4s, v2.s[1]
  fmla v0.4s, v1.4s, v2.4s

Patch written by Abderrazek Zaafrani.

Differential Revision: https://reviews.llvm.org/D21571

llvm-svn: 283663
2016-10-08 12:30:07 +00:00
Adam Nemet ee5cf031ce [OptRemarks] Remove non-printable chars from function name
Value names may be prefixed with a binary '1' to indicate that the
backend should not modify the symbols due to any platform naming
convention.

This should not show up in the YAML opt record file because it breaks
the YAML parser.

llvm-svn: 283656
2016-10-08 04:47:20 +00:00
Mehdi Amini f82bda0a7a ThinLTO: don't perform incremental LTO on module without a hash
Clang always emit a hash for ThinLTO, but as other frontend are
starting to use ThinLTO, this could be a serious bug.

Differential Revision: https://reviews.llvm.org/D25379

llvm-svn: 283655
2016-10-08 04:44:23 +00:00
Mehdi Amini 00fa1409ec ThinLTO: handles modules with empty summaries
We need to add an entry in the combined-index for modules that have
a hash but otherwise empty summary, this is needed so that we can
get the hash for the module.

Also, if no entry is present in the combined index for a module, we
need to skip it when trying to compute a cache entry.

Differential Revision: https://reviews.llvm.org/D25300

llvm-svn: 283654
2016-10-08 04:44:18 +00:00
Kyle Butt 2facd194a2 Revert "Codegen: Tail-duplicate during placement."
This reverts commit 71c312652c10f1855b28d06697c08d47e7a243e4.

llvm-svn: 283647
2016-10-08 01:47:05 +00:00
Dylan McKay f96ffe1ebf [AVR] Add backend dependencies to MCTargetDesc/LLVMBuild.txt
llvm-svn: 283642
2016-10-08 01:14:23 +00:00
Zachary Turner 3b14764ce5 [pdb] Dump Module Symbols to Yaml.
This is the first step towards round-tripping symbol information,
and thusly being able to write symbol information to a PDB.

This patch writes the symbol information for each compiland to
the Yaml when running in pdb2yaml mode.  There's still some loose
ends, such as what to do about relocations (necessary in order to
print linkage names), how to print enums with friendly names, and
how to give the dumper access to the StringTable, but this is a
good first start.

llvm-svn: 283641
2016-10-08 01:12:01 +00:00
Dylan McKay 552b7856d3 Fix incorrect assertion in AVRFrameLowering.cpp
This wasn't looking at the right instruction, and would always fail.

llvm-svn: 283640
2016-10-08 01:10:36 +00:00
Dylan McKay b16b6d5739 [AVR] Don't worry about call frame size when initializing frame pointer
We previously only used the frame pointer if the frame pointer was too
big. This was to work around a bug (described in this old commit)

https://sourceforge.net/p/avr-llvm/code/204/tree//llvm/trunk/AVR/AVRFrameLowering.cpp?diff=50d64d912718465cb887d17a:203

I mistakenly invered the condition assuming it was a typo. I am now
removing it because it doesn't seem to be a problem anymore (plus it's a
dirty hack).

llvm-svn: 283639
2016-10-08 01:10:31 +00:00
Dylan McKay 7c2d41aa9f [AVR] Don't shadow container while iterating in range-based loop
This works on clang, but fails on GCC 4.6

llvm-svn: 283638
2016-10-08 01:09:06 +00:00
Dylan McKay a1a944e3cb [AVR] Use references rather than pointers in AVRISelLowering
llvm-svn: 283636
2016-10-08 01:06:21 +00:00
Dylan McKay 12109e7314 Allow a maximum of 64 bits to be returned in registers
The rest spills to the stack

Authored by Jake Goulding

llvm-svn: 283635
2016-10-08 01:05:09 +00:00
Dylan McKay c1ff65cf62 [AVR] Expand MULHS for all types
Once MULHS was expanded, this exposed an issue where the condition
register was thought to be 16-bit. This caused an attempt to copy a
16-bit register to an 8-bit register.

Authored by Jake Goulding

llvm-svn: 283634
2016-10-08 01:01:49 +00:00
Dylan McKay ddb7a59fe9 [AVR] Add the 'SoftFail' field to all instruction formats
This will be used in the future for disassembly.

llvm-svn: 283630
2016-10-08 00:55:46 +00:00
Dylan McKay 24d02ee141 [AVR] Set up the instruction printer and the assembly backend
llvm-svn: 283629
2016-10-08 00:50:11 +00:00
Dylan McKay 2b0936d41d [AVR] Add dependencies to AVR libraries in AVRCodeGen
llvm-svn: 283628
2016-10-08 00:45:24 +00:00
Dylan McKay 07897f5492 [AVR] Add missing subdirectories to LLVMBuild
llvm-svn: 283627
2016-10-08 00:42:58 +00:00
Gor Nishanov 1b6aec8e25 [coroutines] Store an address of destroy OR cleanup part in the coroutine frame.
Summary:
If heap allocation of a coroutine is elided, we need to make sure that we will update an address stored in the coroutine frame from f.destroy to f.cleanup.
Before this change, CoroSplit synthesized these stores after coro.begin:

```
    store void (%f.Frame*)* @f.resume, void (%f.Frame*)** %resume.addr
    store void (%f.Frame*)* @f.destroy, void (%f.Frame*)** %destroy.addr

```

In those cases where we did heap elision, but were not able to devirtualize all indirect calls, destroy call will attempt to "free" the coroutine frame stored on the stack. Oops.

Now we use select to put an appropriate coroutine subfunction in the destroy slot. As bellow:

```
    store void (%f.Frame*)* @f.resume, void (%f.Frame*)** %resume.addr
    %0 = select i1 %need.alloc, void (%f.Frame*)* @f.destroy, void (%f.Frame*)* @f.cleanup
    store void (%f.Frame*)* %0, void (%f.Frame*)** %destroy.addr
```

Reviewers: majnemer

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D25377

llvm-svn: 283625
2016-10-08 00:22:50 +00:00
Dylan McKay 4d82df32b9 [AVR] Add the assembly printer
Summary: This adds the AVRAsmPrinter class.

Reviewers: arsenm, kparzysz

Subscribers: llvm-commits, wdng, beanz, japaric, mgorny

Differential Revision: https://reviews.llvm.org/D25271

llvm-svn: 283623
2016-10-08 00:02:36 +00:00
Tom Stellard 5ab6154dc3 AMDGPU/SI: Handle div_fmas hazard in GCNHazardRecognizer
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25250

llvm-svn: 283622
2016-10-07 23:42:48 +00:00
Kyle Butt 37e676d857 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.

Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.

Differential revision: https://reviews.llvm.org/D18226

llvm-svn: 283619
2016-10-07 22:33:20 +00:00
Arnold Schwaighofer 3f25658143 swifterror: Don't compute swifterror vregs during instruction selection
The code used llvm basic block predecessors to decided where to insert phi
nodes. Instruction selection can and will liberally insert new machine basic
block predecessors. There is not a guaranteed one-to-one mapping from pred.
llvm basic blocks and machine basic blocks.

Therefore the current approach does not work as it assumes we can mark
predecessor machine basic block as needing a copy, and needs to know the set of
all predecessor machine basic blocks to decide when to insert phis.

Instead of computing the swifterror vregs as we select instructions, propagate
them at the end of instruction selection when the MBB CFG is complete.

When an instruction needs a swifterror vreg and we don't know the value yet,
generate a new vreg and remember this "upward exposed" use, and reconcile this
at the end of instruction selection.

This will only happen if the target supports promoting swifterror parameters to
registers and the swifterror attribute is used.

rdar://28300923

llvm-svn: 283617
2016-10-07 22:06:55 +00:00
Sanjay Patel 14c02052d6 [DAG] clean up foldSelectOfConstants(); NFCI
Rename variables, simplify logic. 
Not clear yet why we don't handle a target with ZeroOrNegativeOneBooleanContent too.

llvm-svn: 283613
2016-10-07 21:55:42 +00:00
Davide Italiano f6988d2980 [InstCombine] Don't unpack arrays that are too large (part 2).
This is similar to r283599, but for store instructions.
Thanks to David for pointing out!

llvm-svn: 283612
2016-10-07 21:53:09 +00:00
Zachary Turner 0d8407447d Refactor Symbol visitor code.
Type visitor code had already been refactored previously to
decouple the visitor and the visitor callback interface.  This
was necessary for having the flexibility to visit in different
ways (for example, dumping to yaml, reading from yaml, dumping
to ScopedPrinter, etc).

This patch merely implements the same visitation pattern for
symbol records that has already been implemented for type records.

llvm-svn: 283609
2016-10-07 21:34:46 +00:00
Davide Italiano da11412243 [InstCombine] Don't unpack arrays that are too large
Differential Revision:  https://reviews.llvm.org/D25376

llvm-svn: 283599
2016-10-07 20:57:42 +00:00
Sanjay Patel ecaf343fe7 [DAG] move fold (select C, 0, 1 -> xor C, 1) to a helper function; NFC
We're missing at least 3 other similar folds based on what we have in InstCombine. 

llvm-svn: 283596
2016-10-07 20:47:51 +00:00
Tom Stellard 6982bb8f25 AMDGPU/SI: Add support for 8-byte relocations
Reviewers: arsenm, kzhuravl

Subscribers: wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25375

llvm-svn: 283593
2016-10-07 20:36:58 +00:00
Colin LeMahieu 9694825d32 [Hexagon][NFC] Using documented instruction type name V4LDST instead of MEMOP.
llvm-svn: 283582
2016-10-07 19:11:28 +00:00
Mehdi Amini dc5a507c92 Recommit "Use StringRef in LTOModule implementation (NFC)""
This reverts commit r283456 and reapply r282997, with explicitly
zeroing the struct member to workaround a bug in MSVC2013 with
zero-initialization: https://connect.microsoft.com/VisualStudio/feedback/details/802160

llvm-svn: 283581
2016-10-07 19:05:14 +00:00
Davide Italiano c0169fa94f [LoopIdiomRecognize] Merge two if conditions into one. NFCI.
llvm-svn: 283579
2016-10-07 18:39:43 +00:00
Sanjay Patel 4326c4ac8f [InstCombine] fold select X, (ext X), C
If we're going to canonicalize IR towards select of constants, try harder to create those.
Also, don't lose the metadata.

This is actually 4 related transforms in one patch:
      // select X, (sext X), C --> select X, -1, C
      // select X, (zext X), C --> select X,  1, C
      // select X, C, (sext X) --> select X, C, 0
      // select X, C, (zext X) --> select X, C, 0

Differential Revision: https://reviews.llvm.org/D25126

llvm-svn: 283575
2016-10-07 17:53:07 +00:00
Tom Stellard ef33c4b3f2 AMDGPU/SI: Emit fixups for long branches
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25366

llvm-svn: 283570
2016-10-07 16:01:18 +00:00
Artem Tamazov 73f1ab28cd [AMDGPU][mc] Add support for buffer_load_dwordx3, buffer_store_dwordx3.
Partially fixes Bug 28232.
Lit tests added.

Differential Revision: https://reviews.llvm.org/D25367

llvm-svn: 283567
2016-10-07 15:53:16 +00:00
Dehao Chen 6e0c8446db Invoke add-discriminator at -g0 -fsample-profile
Summary: -fsample-profile needs discriminator, which will not be added if built with -g0. This patch makes sure the discriminator is added for sample-profile at -g0. A followup patch will be send out to update clang tests.

Reviewers: davidxl, dblaikie, echristo, dnovillo

Subscribers: mehdi_amini, probinson, llvm-commits

Differential Revision: https://reviews.llvm.org/D25132

llvm-svn: 283565
2016-10-07 15:21:31 +00:00
Matthew Simpson a371c14ffe [LV] Don't mark multi-use branch conditions uniform
Previously, we marked the branch conditions of latch blocks uniform after
vectorization if they were instructions contained in the loop. However, if a
condition instruction has users other than the branch, it may not remain
uniform. This patch ensures the conditions we mark uniform are only used by the
branch. This should fix PR30627.

Reference: https://llvm.org/bugs/show_bug.cgi?id=30627
llvm-svn: 283563
2016-10-07 15:20:13 +00:00
Krzysztof Parzyszek e513e17b23 Only track physical registers in LivePhysRegs
llvm-svn: 283561
2016-10-07 14:50:49 +00:00
Sam Kolton a3ec5c10e2 [AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx to AMDGPUBaseInfo.h
Reviewers: artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D25084

llvm-svn: 283560
2016-10-07 14:46:06 +00:00
Konstantin Zhuravlyov c09e2d7e46 [AMDGPU] AMDGPUCodeGenPrepare: remove extra ';'
llvm-svn: 283558
2016-10-07 14:39:53 +00:00
Tom Stellard 17eb3413cd [ValueTracking] Fix crash in GetPointerBaseWithConstantOffset()
Summary:
While walking defs of pointer operands we were assuming that the pointer
size would remain constant.  This is not true, because addresspacecast
instructions may cast the pointer to an address space with a different
pointer width.

This partial reverts r282612, which was a more conservative solution
to this problem.

Reviewers: reames, sanjoy, apilipenko

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24772

llvm-svn: 283557
2016-10-07 14:23:29 +00:00
Konstantin Zhuravlyov f74fc60a7d [AMDGPU] Promote uniform (i1, i16] operations to i32
Differential Revision: https://reviews.llvm.org/D25302

llvm-svn: 283555
2016-10-07 14:22:58 +00:00
Javed Absar 9797989ca7 [ARM]: add missing switch case for cortex-r52
Adds a missing switch case for handling cortex-r52
in init-subtarget-features.

llvm-svn: 283551
2016-10-07 13:41:55 +00:00
Martin Storsjo 04864f45b2 [ARM] Reapply: Use __rt_div functions for divrem on Windows
Reapplying r283383 after revert in r283442. The additional fix
is a getting rid of a stray space in a function name, in the
refactoring part of the commit.

This avoids falling back to calling out to the GCC rem functions
(__moddi3, __umoddi3) when targeting Windows.

The __rt_div functions have flipped the two arguments compared
to the __aeabi_divmod functions. To match MSVC, we emit a
check for division by zero before actually calling the library
function (even if the library function itself also might do
the same check).

Not all calls to __rt_div functions for division are currently
merged with calls to the same function with the same parameters
for the remainder. This is more wasteful than a div + mls as before,
but avoids calls to __moddi3.

Differential Revision: https://reviews.llvm.org/D25332

llvm-svn: 283550
2016-10-07 13:28:53 +00:00
Javed Absar fb4b6e8db9 [ARM]: Add Cortex-R52 target to LLVM
This patch adds Cortex-R52, the new ARM real-time processor, to LLVM. 
Cortex-R52 implements the ARMv8-R architecture.

llvm-svn: 283542
2016-10-07 12:06:40 +00:00
Simon Pilgrim a5d019ee95 [X86][SSE] Update register class during MOVSD/MOVSS - BLENDPD/BLENDPS commutation
MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input, the commutation code wasn't taking this into account leading to verification errors.

This patch inserts a vreg copy mi to ensure that the registers are correct.

Fix for PR30607

Differential Revision: https://reviews.llvm.org/D25280

llvm-svn: 283539
2016-10-07 11:18:38 +00:00
Alexey Bataev 6ad5da7c81 [SLPVectorizer] Fix for PR25748: reduction vectorization after loop
unrolling.

The next code is not vectorized by the SLPVectorizer:
```
 int test(unsigned int *p) {
  int sum = 0;
  for (int i = 0; i < 8; i++)
    sum += p[i];
  return sum;
 }
```
During optimization this loop is fully unrolled and SLPVectorizer is
unable to vectorize it. Patch tries to fix this problem.

Differential Revision: https://reviews.llvm.org/D24796

llvm-svn: 283535
2016-10-07 09:39:22 +00:00
Oliver Stannard 4df1cc0b00 [ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI
With the ROPI and RWPI relocation models we can't always have pointers
to global data or functions in constant data, so don't try to convert switches
into lookup tables if any value in the lookup table would require a relocation.
We can still safely emit lookup tables of other values, such as simple
constants.

Differential Revision: https://reviews.llvm.org/D24462

llvm-svn: 283530
2016-10-07 08:48:24 +00:00
Mehdi Amini 68c6c8cd78 Use StringRef in ARMELFStreamer (NFC)
llvm-svn: 283529
2016-10-07 08:48:07 +00:00
Nicolai Haehnle 87bc4c218b AMDGPU: Fix use-after-free in SIOptimizeExecMasking
Summary:
There was a bug with sequences like

   s_mov_b64 s[0:1], exec
   s_and_b64 s[2:3]<def>, s[0:1], s[2:3]<kill>
   ...
   s_mov_b64_term exec, s[2:3]

because s[2:3] was defined and used in the same instruction, ending up with
SaveExecInst inside OtherUseInsts.

Note that the test case also exposes an unrelated bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98028

Reviewers: tstellarAMD, arsenm

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25306

llvm-svn: 283528
2016-10-07 08:40:14 +00:00
Mehdi Amini a0016ec95f Use StringReg in TargetParser APIs (NFC)
llvm-svn: 283527
2016-10-07 08:37:29 +00:00
Mehdi Amini 9ff8e87ca4 Revert "Revert "Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe""
This reverts commit r283510 and reapply r283509, with updates to
clang-tools-extra as well.

llvm-svn: 283525
2016-10-07 08:25:42 +00:00
Craig Topper 948625633f [X86] Fix patterns for VPMULLD and VPCMPEQQ to not require aligned loads.
llvm-svn: 283524
2016-10-07 06:54:43 +00:00
Craig Topper 871da8ebea [X86] Remove unused PatFrags. NFC
llvm-svn: 283523
2016-10-07 06:54:39 +00:00
Dylan McKay e5d89e8001 [AVR] Add the AVRMCInstLower class
Summary:
This class deals with the lowering of CodeGen `MachineInstr` objects to
MC `MCInst` objects.

Reviewers: kparzysz, arsenm

Subscribers: wdng, beanz, japaric, mgorny

Differential Revision: https://reviews.llvm.org/D25269

llvm-svn: 283522
2016-10-07 06:13:09 +00:00
David Majnemer 8c03c1bade [SimplifyCFG] Correctly test for unconditional branches in GetCaseResults
GetCaseResults assumed that a terminator with one successor was an
unconditional branch.  This is not necessarily the case, it could be a
cleanupret.

Strengthen the check by querying whether or not the terminator is
exceptional.

llvm-svn: 283517
2016-10-07 01:38:35 +00:00
Peter Collingbourne 2261d78cd2 Target: Remove unused patterns and transforms. NFC.
llvm-svn: 283515
2016-10-07 00:30:49 +00:00
Colin LeMahieu 8ed1aee9dd [Hexagon] NFC Removing 'V4_' prefix from duplex instruction names.
llvm-svn: 283514
2016-10-07 00:15:07 +00:00
Mehdi Amini 292f376934 Revert "Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe"
This reverts commit r283509, clang is hitting the assert.

llvm-svn: 283510
2016-10-06 23:41:49 +00:00
Mehdi Amini a7e893f638 Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe
Summary:
I had for the second time today a bug where llvm::format("%s", Str)
was called with Str being a StringRef. The Linux and MacOS bots were
fine, but windows having different calling convention, it printed
garbage.

Instead we can catch this at compile-time: it is never expected to
call a C vararg printf-like function with non scalar type I believe.

Reviewers: bogner, Bigcheese, dexonsmith

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25266

llvm-svn: 283509
2016-10-06 23:26:29 +00:00
Colin LeMahieu 9675de5ba8 [Hexagon] NFC. Canonicalizing absolute address instruction names.
llvm-svn: 283507
2016-10-06 23:02:11 +00:00
Vedant Kumar 7beb423765 Delete some dead code in SelectionDAG (NFC)
Differential Revision: https://reviews.llvm.org/D24435

llvm-svn: 283505
2016-10-06 22:53:43 +00:00
Dan Gohman 2726b88c03 [WebAssemby] Implement block signatures.
Per spec changes, this implements block signatures, and adds just enough
logic to produce correct block signatures at the ends of functions.

Differential Revision: https://reviews.llvm.org/D25144

llvm-svn: 283503
2016-10-06 22:29:32 +00:00
Dan Gohman 3a643e8d46 [WebAssembly] Remove loop's bottom label.
Per spec changes, loop constructs no longer have a bottom label.

https://reviews.llvm.org/D25118

llvm-svn: 283502
2016-10-06 22:10:23 +00:00
Dan Gohman 7f1bdb2e02 [WebAssembly] Remove the output operand from stores.
Per spec changes, store instructions in WebAssembly no longer have a return
value. Update the instruction descriptions.

Differential Revision: https://reviews.llvm.org/D25122

llvm-svn: 283501
2016-10-06 22:08:28 +00:00
Wolfgang Pieb e51bede1d8 Preserve the debug location when CodeGenPrepare sinks a compare instruction into the
basic block of a user.

Patch by Andrea DiBiagio.

Differential Revision: https://reviews.llvm.org/D24632

llvm-svn: 283500
2016-10-06 21:43:45 +00:00
Pirama Arumuga Nainar cc152ac794 Handle *_EXTEND_VECTOR_INREG during Integer Legalization
Summary:
These nodes need legalization for 3-element vectors.  This commit
handles the legalization and adds tests for zext and sext.

This fixes PR30614.

Reviewers: RKSimon, srhines

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25268

llvm-svn: 283496
2016-10-06 21:27:05 +00:00
Rong Xu 0e79f7d11d [PGO] Create weak alias for the renamed Comdat function
Add a weak alias to the renamed Comdat function in IR level instrumentation,
using it's original name. This ensures the same behavior w/ and w/o IR
instrumentation, even for non standard conforming code.

Differential Revision: http://reviews.llvm.org/D25339

llvm-svn: 283490
2016-10-06 20:38:13 +00:00
Michael Kuperstein e524e22846 [X86] Preserve BasePtr for LEA64_32r
When replacing FrameIndex with BasePtr, we must preserve BasePtr for
LEA64_32r since BasePtr is used later for stack adjustment if it is
the same as StackPtr.

Patch by H.J Lu <hjl.tools@gmail.com>

Differential Revision: https://reviews.llvm.org/D23575

llvm-svn: 283486
2016-10-06 19:31:27 +00:00
Michael Kuperstein 7cc2123847 [DAG] Generalize build_vector -> vector_shuffle combine for more than 2 inputs
This generalizes the build_vector -> vector_shuffle combine to support any
number of inputs. The idea is to create a binary tree of shuffles, where
the first layer performs pairwise shuffles of the input vectors placing each
input element into the correct lane, and the rest of the tree blends these
shuffles together.

This doesn't try to be smart and create any sort of "optimal" shuffles.
The assumption is that even a "poor" shuffle sequence is better than extracting
and inserting the elements one by one.

Differential Revision: https://reviews.llvm.org/D24683

llvm-svn: 283480
2016-10-06 18:58:24 +00:00
Michael Ilseman 6d6b4d87a3 Revert "Add -strip-nonlinetable-debuginfo capability"
This reverts commit r283473.

Reverted until review is completed.

llvm-svn: 283478
2016-10-06 18:30:26 +00:00
Matt Arsenault 5e63a04e46 AMDGPU: Don't fold undef uses or copies with implicit uses
llvm-svn: 283476
2016-10-06 18:12:13 +00:00
Matt Arsenault c59a92387e AMDGPU: Remove scheduling info from si_mask_branch
llvm-svn: 283475
2016-10-06 18:12:07 +00:00
Michael Ilseman d0a4db7632 Add -strip-nonlinetable-debuginfo capability
This adds a new function to DebugInfo.cpp that takes an llvm::Module
as input and removes all debug info metadata that is not directly
needed for line tables, thus effectively stripping all type and
variable information from the module.

The primary motivation for this feature was the bitcode work flow
(cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html
for more background). This is not wired up yet, but will be in
subsequent patches.  For testing, the new functionality is exposed to
opt with a -strip-nonlinetable-debuginfo option.

The secondary use-case (and one that works right now!) is as a
reduction pass in bugpoint. I added two new bugpoint options
(-disable-strip-debuginfo and -disable-strip-debug-types) to control
the new features. By default it will first attempt to remove all debug
information, then only the type info, and then proceed to hack at any
remaining MDNodes.

llvm-svn: 283473
2016-10-06 17:58:38 +00:00
Matt Arsenault c2ee42cd16 AMDGPU: Remove leftover implicit operands when folding immediates
When constant folding an operation to a copy or an immediate
mov, the implicit uses/defs of the old instruction were left behind,
e.g. replacing v_or_b32 left the implicit exec use on the new copy.

llvm-svn: 283471
2016-10-06 17:54:30 +00:00
Matt Arsenault 11f7402075 Reapply "AMDGPU: Support using tablegened MC pseudo expansions"
Fix bad merge

llvm-svn: 283470
2016-10-06 17:19:11 +00:00
Matt Arsenault cbc879ee2f Revert "AMDGPU: Support using tablegened MC pseudo expansions"
llvm-svn: 283469
2016-10-06 17:08:01 +00:00
Matt Arsenault d20a2dd7ac AMDGPU: Support using tablegened MC pseudo expansions
Make the necessary refactorings to make use of PseudoInstExpansion

llvm-svn: 283467
2016-10-06 16:56:41 +00:00
Matt Arsenault 6bc43d8627 BranchRelaxation: Support expanding unconditional branches
AMDGPU needs to expand unconditional branches in a new
block with an indirect branch.

llvm-svn: 283464
2016-10-06 16:20:41 +00:00
Krzysztof Parzyszek d391d6f1c3 [Hexagon] Avoid replacing full regs with subregisters in tied operands
Doing so will result in the two-address pass generating incorrect code.

llvm-svn: 283463
2016-10-06 16:18:04 +00:00
Matt Arsenault ef5bba0136 BranchRelaxation: Account for function alignment
llvm-svn: 283462
2016-10-06 16:00:58 +00:00
Matt Arsenault 36919a4f7c Move AArch64BranchRelaxation to generic code
llvm-svn: 283459
2016-10-06 15:38:53 +00:00
Matt Arsenault 0a3ea89e85 AArch64: Move remaining target specific BranchRelaxation bits to TII
llvm-svn: 283458
2016-10-06 15:38:09 +00:00
Nirav Dave ee554e6155 [X86] Fix intel syntax push parsing bug
Change erroneous parsing of push immediate instructions in intel syntax
to default to pointer size by rewriting into the ATT style for matching.

This fixes PR22028.

Reviewers: majnemer, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25288

llvm-svn: 283457
2016-10-06 15:28:08 +00:00
Mehdi Amini a5ee89863c Revert "Use StringRef in LTOModule implementation (NFC)"
This reverts commit r282997, a windows bot is asserting in
one test apparently.

llvm-svn: 283456
2016-10-06 15:12:22 +00:00
Sam Kolton 3381d7a216 [AMDGPU] Disassembler: print label names in branch instructions
Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table.
Initialize MCObjectFileInfo with some default values.

Reviewers: vpykhtin, artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D24802

llvm-svn: 283450
2016-10-06 13:46:08 +00:00
Anna Thomas 488c05763c [RS4GC] Fix comment to show TODO. NFC
llvm-svn: 283449
2016-10-06 13:24:20 +00:00
Krzysztof Parzyszek 459a1c9f2b [RDF] Replace some expensive copies with references in range-based loops
llvm-svn: 283446
2016-10-06 13:05:46 +00:00
Krzysztof Parzyszek 61d9032bf3 [RDF] Replace potentially unclear autos with real types
llvm-svn: 283445
2016-10-06 13:05:13 +00:00
Diana Picus 6341e46cd1 Revert "[ARM] Use __rt_div functions for divrem on Windows"
This reverts commit r283383 because it broke some of the bots:
undefined reference to ` __aeabi_uldivmod'

It affected (at least) clang-cmake-armv7-a15-selfhost,
clang-cmake-armv7-a15-selfhost and clang-native-arm-lnt.

llvm-svn: 283442
2016-10-06 11:24:29 +00:00
Henric Karlsson 54a53bd303 Test commit access (NFC)
llvm-svn: 283439
2016-10-06 10:58:41 +00:00
Matt Arsenault 10c17ca6c6 AMDGPU: Partially fix reported code size for some instructions
These ones need to have the size on the pseudo instruction set for
getInstSizeInBytes to work correctly. These also have a statically
known size.

llvm-svn: 283437
2016-10-06 10:13:23 +00:00
Bjorn Pettersson 3961603921 [ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement.
Summary:
The computeKnownBits and ComputeNumSignBits functions in ValueTracking can now do a simple look-through of ExtractElement.

Reviewers: majnemer, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24955

llvm-svn: 283434
2016-10-06 09:56:21 +00:00
Sagar Thakur f9292220dc [EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual memory address.
Adding 40-bit shadow memory parameters because MIPS64 uses 40-bit virtual memory addresses.

Reviewed by rengolin.
Differential: https://reviews.llvm.org/D23801

llvm-svn: 283433
2016-10-06 09:52:06 +00:00
Nuno Lopes d3f5af0fe4 fix build on cygwin
Cygwin has dlfcn.h, but no Dl_info

llvm-svn: 283427
2016-10-06 09:32:16 +00:00
James Molloy 6215fad0e9 [ARM] Constant pool promotion - fix alignment calculation
Global variables are GlobalValues, so they have explicit alignment. Querying
DataLayout for the alignment was incorrect.

Testcase added.

llvm-svn: 283423
2016-10-06 07:56:00 +00:00
Petr Hosek e023d62e76 [Triple] Add triple for Fuchsia
Fuchsia is a new operating system.

Differential Revision: https://reviews.llvm.org/D25116

llvm-svn: 283419
2016-10-06 05:17:26 +00:00
Kostya Serebryany 936b1e774f [libFuzzer] be more careful with memory usage, print peak rss in status lines
llvm-svn: 283418
2016-10-06 05:14:00 +00:00
Konstantin Zhuravlyov b4eb5d5049 [AMDGPU] Promote uniform i16 bitreverse intrinsic to i32
Differential Revision: https://reviews.llvm.org/D25121

llvm-svn: 283415
2016-10-06 02:20:46 +00:00
Kostya Serebryany 3b564e9765 [libFuzzer] when re-running for lsan, don't look at the coverage
llvm-svn: 283411
2016-10-05 23:31:01 +00:00
Kostya Serebryany 1c73f1bf27 [libFuzzer] refactoring to make -shrink=1 work for value profile, added a test.
llvm-svn: 283409
2016-10-05 22:56:21 +00:00
Reid Kleckner bb96df602e [codeview] Truncate records to maximum record size near 64KB
If we don't truncate, LLVM asserts when the label difference doesn't fit
in a 16 bit field. This patch truncates two kinds of data: trailing null
terminated names in symbol records, and inline line tables. The inline
line table test that I have is too large (many MB), so I'm not checking
it in.

Hopefully fixes PR28264.

llvm-svn: 283403
2016-10-05 22:36:07 +00:00
Adrian Prantl b3510afcd1 Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace.
This came out of a discussion in https://reviews.llvm.org/D25285.

There used to be various other llvm.dbg.* nodes, but we don't support
upgrading them and we want to reserve the namespace for future uses.

This also removes an entirely obsolete and bitrotted testcase for PR7662.

Reapplies 283390 with a forgotten testcase.

llvm-svn: 283400
2016-10-05 22:15:37 +00:00
Adrian Prantl 497f085475 Revert "Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace."
Forgot to add a testcase in r283390.

llvm-svn: 283399
2016-10-05 22:15:34 +00:00
David Callahan c1051ab26e Modify df_iterator to support post-order actions
Summary: This makes a change to the state used to maintain visited information for depth first iterator. We know assume a method "completed(...)" which is called after all children of a node have been visited. In all existing cases, this method does nothing so this patch has no functional changes.  It will however allow a client to distinguish back from cross edges in a DFS tree.

Reviewers: nadav, mehdi_amini, dberlin

Subscribers: MatzeB, mzolotukhin, twoh, freik, llvm-commits

Differential Revision: https://reviews.llvm.org/D25191

llvm-svn: 283391
2016-10-05 21:36:16 +00:00
Adrian Prantl 71bba7253e Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace.
This came out of a discussion in https://reviews.llvm.org/D25285.

There used to be various other llvm.dbg.* nodes, but we don't support
upgrading them and we want to reserve the namespace for future uses.

This also removes an entirely obsolete and bitrotted testcase for PR7662.

llvm-svn: 283390
2016-10-05 21:31:19 +00:00
Dan Gohman 5a68ec7f09 [WebAssembly] Add binary-encoding opcode values to instruction descriptions.
llvm-svn: 283389
2016-10-05 21:24:08 +00:00
Reid Kleckner 2b3e6428e5 [codeview] Translate bitpiece metadata to DEFRANGE_SUBFIELD* records
This allows LLVM to describe locations of aggregate variables that have
been split by SROA.

Fixes PR29141

Reviewers: amccarth, majnemer

Differential Revision: https://reviews.llvm.org/D25253

llvm-svn: 283388
2016-10-05 21:21:33 +00:00
Lang Hames a5e873e2a1 [Object] Fix a crash in Archive::child_iterator's default constructor.
To be default constructible, Archive::child_iterator needs to be able to
construct an Archive::Child with a null parent, however Archive::Child's
constructor always dereferenced its Parent argument to compute the remaining
archive size. This commit fixes Archive::Child's constructor to only do the
size calculation when the parent is non-null.

llvm-svn: 283387
2016-10-05 21:20:00 +00:00
Martin Storsjo f997759aef [ARM] Use __rt_div functions for divrem on Windows
This avoids falling back to calling out to the GCC rem functions
(__moddi3, __umoddi3) when targeting Windows.

The __rt_div functions have flipped the two arguments compared
to the __aeabi_divmod functions. To match MSVC, we emit a
check for division by zero before actually calling the library
function (even if the library function itself also might do
the same check).

Not all calls to __rt_div functions for division are currently
merged with calls to the same function with the same parameters
for the remainder. This is more wasteful than a div + mls as before,
but avoids calls to __moddi3.

Differential Revision: https://reviews.llvm.org/D24076

llvm-svn: 283383
2016-10-05 21:08:02 +00:00
James Y Knight b0a473aaf8 [Sparc] Implement UMUL_LOHI and SMUL_LOHI instead of MULHS/MULHU/MUL.
This is what the instruction-set actually provides, and the default
expansions of the others into the lohi opcodes are good.

llvm-svn: 283381
2016-10-05 20:54:17 +00:00
Anna Zaks 9a6a6eff0e [asan] Reapply: Switch to using dynamic shadow offset on iOS
The VM layout is not stable between iOS version releases, so switch to dynamic shadow offset.

This is the LLVM counterpart of https://reviews.llvm.org/D25218

Differential Revision: https://reviews.llvm.org/D25219

llvm-svn: 283376
2016-10-05 20:34:13 +00:00
Matthew Simpson a58c50dff0 [LV] Pass profitability analysis in vectorizer constructor (NFC)
The vectorizer already holds a pointer to one cost model artifact in a member
variable (i.e., MinBWs). As we add more, it will be easier to communicate these
artifacts to the vectorizer if we simply pass a pointer to the cost model
instead.

llvm-svn: 283373
2016-10-05 20:23:46 +00:00
Krzysztof Parzyszek 3b6cbd55f7 [RDF] Fix live def propagation through basic block
llvm-svn: 283371
2016-10-05 20:08:09 +00:00
Matthias Braun 0a6916f303 AMDGPU: Do not re-use tmpreg in spill/restore lowering
The register scavenging code does not support multiple definitions of
the same vreg.

Differential Revision: https://reviews.llvm.org/D25220

llvm-svn: 283369
2016-10-05 20:02:51 +00:00
Matthew Simpson 386546124f [LV] Pass legality analysis in vectorizer constructor (NFC)
The vectorizer already holds a pointer to the legality analysis in a member
variable, so it makes sense that we would pass it in the constructor.

llvm-svn: 283368
2016-10-05 19:53:20 +00:00
Peter Collingbourne d799d28540 FastISel: Remove unused/un-overridden entry points. NFCI.
llvm-svn: 283366
2016-10-05 19:25:20 +00:00
Matthew Simpson 6a8e0bcf3d [LV] Remove obsolete comment (NFC)
llvm-svn: 283365
2016-10-05 19:19:49 +00:00
Matthew Simpson ee3fdc7e26 [LV] Use getScalarizationOverhead in memory instruction costs (NFC)
This patch refactors the cost estimation of scalarized loads and stores to
reuse getScalarizationOverhead for the cost of the extractelement and
insertelement instructions we might create. The existing code accounted for
this cost, but it was functionally equivalent to the helper function.

llvm-svn: 283364
2016-10-05 19:11:54 +00:00
Sanjay Patel a40c479fe9 fix documentation comments; NFC
llvm-svn: 283361
2016-10-05 18:51:12 +00:00
Rafael Espindola 37fc0183d7 Allow the caller to pass in the hash.
If the caller already has the hash we don't have to compute it. This
will be used in lld.

llvm-svn: 283359
2016-10-05 18:46:21 +00:00
Reid Kleckner f9dddec21c Improve DEBUG_VALUE assembly comments for spilled bitpieces
Previously we would give up when we saw the bitpiece DWARF expression
and print "[complex expression]" when actually we handled bitpiece
expressions outside the loop.

llvm-svn: 283355
2016-10-05 18:36:02 +00:00
Matthew Simpson 1755d81b29 [LV] Add helper function for predicated block probability (NFC)
The cost model has to estimate the probability of executing predicated blocks.
However, we currently always assume predicated blocks have a 50% chance of
executing (this value is hardcoded in several places throughout the code).
Since we always use the same value, this patch adds a helper function for
getting this uniform probability. The function simplifies some comments and
makes our assumptions more clear. In the future, we may want to extend this
with actual block probability information if it's available.

llvm-svn: 283354
2016-10-05 18:30:36 +00:00
Simon Dardis 299dbd6cd1 [mips][ias] fix li macro when values are negated with ~
The integrated assembler evaluates the expressions such as ~0x80000000 to
0xffffffff7fffffff early in the parsing process. This patch adds compatibility
with gas so that li loads the expected value (0x7fffffff) in those cases. This
only occurs iff all the upper 32bits are set and maintains existing checks by
not truncating the result down to 32 bits if any of the the upper bits are not
set.

Reviewers: dsanders, zoran.jovanovic

Differential Review: https://reviews.llvm.org/D23399

llvm-svn: 283353
2016-10-05 18:26:19 +00:00
Matthew Simpson c631167609 [LV] Add isScalarWithPredication helper function (NFC)
This patch adds a single helper function for checking if an instruction will be
scalarized with predication. Such instructions include conditional stores and
instructions that may divide by zero. Existing checks have been updated to use
the new function.

llvm-svn: 283350
2016-10-05 17:52:34 +00:00
Anna Zaks e732ce4dff Revert "[asan] LLVM: Switch to using dynamic shadow offset on iOS"
This reverts commit abe77a118615cd90b0d7f127e4797096afa2b394.

Revert as these changes broke a Chromium buildbot.

llvm-svn: 283348
2016-10-05 17:42:02 +00:00
Bjorn Pettersson 12559441bd [DAG] Teach computeKnownBits and ComputeNumSignBits in SelectionDAG to look through EXTRACT_VECTOR_ELT.
Summary: Both computeKnownBits and ComputeNumSignBits can now do a simple
look-through of EXTRACT_VECTOR_ELT. It will compute the result based
on the known bits (or known sign bits) for the vector that the element
is extracted from.

Reviewers: bogner, tstellarAMD, mkuper

Subscribers: wdng, RKSimon, jyknight, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D25007

llvm-svn: 283347
2016-10-05 17:40:27 +00:00
Rafael Espindola 24db10d8e1 Don't pass null to memcpy. Should fix the asan bots.
llvm-svn: 283336
2016-10-05 16:33:03 +00:00
Simon Dardis f45a59f80b Recommit: "[mips] Add rsqrt, recip for MIPS"
Add rsqrt.[ds], recip.[ds] for MIPS. Correct the microMIPS definitions for
architecture support and register usage.

Reviewers: vkalintiris, zoran.jovanoic

Differential Review: https://reviews.llvm.org/D24499

llvm-svn: 283334
2016-10-05 16:11:01 +00:00
Hans Wennborg c26c03d911 Revert r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)"
This is suspected to cause a miscompile in Chromium. Reverting while
investigating.

llvm-svn: 283329
2016-10-05 15:39:27 +00:00
Simon Dardis bbfd528748 Revert "[mips] Add rsqrt, recip for MIPS"
This reverts commit r282485 which contain two patches instead of
one.

llvm-svn: 283327
2016-10-05 15:28:33 +00:00
Douglas Katzman 0411e8669b [X86] Don't randomly encode %rip where illegal
Differential Revision: https://reviews.llvm.org/D25112

llvm-svn: 283326
2016-10-05 15:23:35 +00:00
James Molloy b7de497cb9 [Thumb] Don't try and emit LDRH/LDRB from the constant pool
This is not a valid encoding - these instructions cannot do PC-relative addressing.

The underlying problem here is of whitelist in ARMISelDAGToDAG that unwraps ARMISD::Wrappers during addressing-mode selection. This didn't realise TargetConstantPool was actually possible, so didn't handle it.

llvm-svn: 283323
2016-10-05 14:52:13 +00:00
Oren Ben Simhon a2010755fa Test commit permission
llvm-svn: 283318
2016-10-05 13:48:33 +00:00
Dylan McKay afff169f17 [AVR] Don't select 'MOVW' instructions when they are not supported
We have a subtarget feature which we were ignoring, which was causing us
to generate unsupported instructions for some older chips.

llvm-svn: 283317
2016-10-05 13:38:29 +00:00
Dylan McKay 82ef77091c [AVR] Add AVRRegisterInfo::splitReg function
No tests are included just yet - this is used from the pseudo
instruction expander pass, which hasn't been pulled in-tree yet.

llvm-svn: 283316
2016-10-05 13:27:30 +00:00
Krzysztof Parzyszek e7c72cdbb0 Fix machine operand traversal in ScheduleDAGInstrs::fixupKills
llvm-svn: 283315
2016-10-05 13:15:06 +00:00
Dylan McKay ea55554803 [AVR] Update return type of dynamic alloca pass
It was recently changed from 'const char*' to StringRef

llvm-svn: 283312
2016-10-05 12:32:24 +00:00
Dylan McKay 192405a31a [AVR] Add the AVR frame lowering code
Summary: This allows AVR to lower frames into assembly code.

Reviewers: arsenm, kparzysz

Subscribers: japaric, wdng, beanz, mgorny

Differential Revision: https://reviews.llvm.org/D25032

llvm-svn: 283311
2016-10-05 11:48:56 +00:00
Dylan McKay c1760424de [AVR] Split all of the AVR device definitions into a separate file
We have ~500 lines of subtarget feature definitions, they don't belong
in our main TableGen file.

llvm-svn: 283310
2016-10-05 10:28:45 +00:00
Dylan McKay 5af1248230 [AVR] Enable the instruction printer in the target definition
llvm-svn: 283309
2016-10-05 10:23:38 +00:00
Dylan McKay f66e120b3b [AVR] Add definitions for the ATTiny102 and ATtiny104 chips
llvm-svn: 283308
2016-10-05 10:20:33 +00:00
Mehdi Amini 149f6eaed9 Re-commit "Use StringRef in Support/Darf APIs (NFC)"
This reverts commit r283285 and re-commit r283275 with
a fix for format("%s", Str); where Str is a StringRef.

llvm-svn: 283298
2016-10-05 05:59:29 +00:00
Dylan McKay efe40389c0 [AVR] Add the machine code backend
Summary:
This adds the AVR machine code backend (`AVRAsmBackend.cpp`). This will
allow us to generate machine code from assembled AVR instructions.

Reviewers: arsenm, kparzysz

Subscribers: modocache, japaric, wdng, beanz, mgorny

Differential Revision: https://reviews.llvm.org/D25029

llvm-svn: 283297
2016-10-05 05:30:19 +00:00
Dean Michael Berris 27358cff88 [Support][CommandLine] Add cl::getRegisteredSubcommands()
This should allow users of the library to get a range to iterate through
all the subcommands that are registered to the global parser. This
allows users to define subcommands in libraries that self-register to
have dispatch done at a different stage (like main). It allows for
writing code like the following:

    for (auto *S : cl::getRegisteredSubcommands()) {
      if (*S) {
	// Dispatch on S->getName().
      }
    }

This change also contains tests that show this usage pattern.

Reviewers: zturner, dblaikie, echristo

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D24489

llvm-svn: 283296
2016-10-05 05:20:08 +00:00
Mehdi Amini e4f0b75e3d Blind attempt to fix windows build after r283290 - Use StringRef in StringSaver API (NFC)
llvm-svn: 283294
2016-10-05 01:41:11 +00:00
Mehdi Amini 5b00770c35 Use StringRef in ARMConstantPool APIs (NFC)
llvm-svn: 283293
2016-10-05 01:41:06 +00:00
Kyle Butt 25ac35d822 Revert "Codegen: Tail-duplicate during placement."
This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc.

Issue with loop info and block removal revealed by polly.
I have a fix for this issue already in another patch, I'll re-roll this
together with that fix, and a test case.

llvm-svn: 283292
2016-10-05 01:39:29 +00:00
Mehdi Amini 3e021be3b6 Use StringRef in FastISel API (NFC)
llvm-svn: 283291
2016-10-05 01:37:29 +00:00
Mehdi Amini ec4fb5ba97 Use StringRef in StringSaver API (NFC)
llvm-svn: 283290
2016-10-05 01:32:41 +00:00
Mehdi Amini a6f81ca8ea Use StringRef in ARCRuntimeEntryPoints APIs (NFC)
llvm-svn: 283288
2016-10-05 01:15:04 +00:00
Kostya Serebryany 379359c53a [libFuzzer] add ShrinkValueProfileTest, move code around, NFC
llvm-svn: 283286
2016-10-05 01:09:40 +00:00
Mehdi Amini 2bcac0fac4 Revert "Re-commit "Use StringRef in Support/Darf APIs (NFC)""
One test seems randomly broken: DebugInfo/X86/gnu-public-names.ll

llvm-svn: 283285
2016-10-05 01:04:02 +00:00
Mehdi Amini a28bb09f28 Use StringRef in MCSectionMachO (NFC)
llvm-svn: 283284
2016-10-05 01:02:34 +00:00
Mehdi Amini 215ff8df74 Use StringRef in DarwinAsmParser (NFC)
llvm-svn: 283283
2016-10-05 01:02:22 +00:00
Michael Zolotukhin 5cda89ad36 [LoopDistribute] Fix a typo in the pass name.
llvm-svn: 283282
2016-10-05 00:44:52 +00:00
Mehdi Amini 32b297a42f Re-commit "Use StringRef in Support/Darf APIs (NFC)"
This reverts commit r283278 and re-commit r283275 with
the update to fix the build on the LLDB side.

llvm-svn: 283281
2016-10-05 00:37:18 +00:00
Kostya Serebryany 2455f0d013 [libFuzzer] clear the corpus elements if they are evicted (i.e. smaller elements with proper coverage are found). Make sure we never try to mutate empty element. Print the corpus size in bytes in the status lines
llvm-svn: 283279
2016-10-05 00:25:17 +00:00
Mehdi Amini 78b04ae7ac Revert "Use StringRef in Support/Darf APIs (NFC)"
This reverts commit r283275, it broke LLDB Android debug server.

llvm-svn: 283278
2016-10-05 00:21:14 +00:00
Mehdi Amini c6caed8fa1 Use StringRef instead of raw pointers in ARMBuildAttrs (NFC)
llvm-svn: 283277
2016-10-05 00:15:18 +00:00
Mehdi Amini e0327be584 Use StringRef in Support/Darf APIs (NFC)
llvm-svn: 283275
2016-10-04 23:55:40 +00:00
Kyle Butt adabac2d57 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well.

Differential revision: https://reviews.llvm.org/D18226

llvm-svn: 283274
2016-10-04 23:54:18 +00:00
Manuel Jacob 49fafb1109 [C API] Add LLVMConstExactUDiv and LLVMBuildExactUDiv functions.
Summary:
These are analog to the existing LLVMConstExactSDiv and LLVMBuildExactSDiv
functions.

Reviewers: deadalnix, majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D25259

llvm-svn: 283269
2016-10-04 23:32:42 +00:00
Rafael Espindola 39751afc4e Misc improvements to StringTableBuilder.
This patch adds write methods to StringTableBuilder so that it is
easier to change the underlying implementation.

Using the write methods, avoid creating a temporary buffer when using
mmaped output.

It also uses a more compact key in the DenseMap. Overall this produces
a slightly faster lld:

firefox
  master 6.853419709
  patch  6.841968912 1.00167361138x faster
chromium
  master 4.297280174
  patch  4.298712163 1.00033323147x slower
chromium fast
  master 1.802335952
  patch  1.806872459 1.00251701521x slower
the gold plugin
  master 0.3247149
  patch  0.321971644 1.00852017888x faster
clang
  master 0.551279945
  patch  0.543733194 1.01387951128x faster
llvm-as
  master 0.032743458
  patch  0.032143478 1.01866568391x faster
the gold plugin fsds
  master 0.350814247
  patch  0.348571741 1.00643341309x faster
clang fsds
  master 0.6281672
  patch  0.621130222 1.01132931187x faster
llvm-as fsds
  master 0.030168899
  patch  0.029797155 1.01247582194x faster
scylla
  master 3.104222518
  patch  3.059590248 1.01458766252x faster

llvm-svn: 283266
2016-10-04 22:43:25 +00:00
Alina Sbirlea 9a78ebd6d8 [cpu-detection] Copy simplified version of get_cpuid_max to remove dependency to clang's implementation
Summary:
Attempting to fix PR30384.
Take the same approach as in compiler_rt and add a simplified version of __get_cpuid_max.
Including cpuid.h is no longer needed.

Reviewers: echristo, joerg

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24597

llvm-svn: 283265
2016-10-04 22:39:53 +00:00
David L Kreitzer 7c7ee89b01 Revert r283248. It caused failures in the hexagon buildbots.
llvm-svn: 283254
2016-10-04 20:57:19 +00:00
Sanjay Patel bfdbea6481 [Target] move reciprocal estimate settings from TargetOptions to TargetLowering
The motivation for the change is that we can't have pseudo-global settings for
codegen living in TargetOptions because that doesn't work with LTO.

Ideally, these reciprocal attributes will be moved to the instruction-level via
FMF, metadata, or something else. But making them function attributes is at least
an improvement over the current state.

The ingredients of this patch are:

    Remove the reciprocal estimate command-line debug option.
    Add TargetRecip to TargetLowering.
    Remove TargetRecip from TargetOptions.
    Clean up the TargetRecip implementation to work with this new scheme.
    Set the default reciprocal settings in TargetLoweringBase (everything is off).
    Update the PowerPC defaults, users, and tests.
    Update the x86 defaults, users, and tests.

Note that if this patch needs to be reverted, the related clang patch checked in
at r283251 should be reverted too.

Differential Revision: https://reviews.llvm.org/D24816

llvm-svn: 283252
2016-10-04 20:46:43 +00:00
Kevin Enderby f993d6e72c Next set of additional error checks for invalid Mach-O files for the
load commands that uses the MachO::encryption_info_command and
MachO::encryption_info_command types but not used in llvm libObject
code but used in llvm tool code.

This includes just LC_ENCRYPTION_INFO and
LC_ENCRYPTION_INFO_64 load commands.

llvm-svn: 283250
2016-10-04 20:37:43 +00:00
David L Kreitzer fedb9b67ca [safestack] Requires a valid TargetMachine to be passed to the SafeStack pass.
Patch by Michael LeMay

Differential revision: http://reviews.llvm.org/D24896

llvm-svn: 283248
2016-10-04 20:31:32 +00:00
Matthias Braun 46a5238682 AArch64: Macrofusion: Split features, add missing combinations.
AArch64InstrInfo::shouldScheduleAdjacent() determines whether two
instruction can benefit from macroop fusion on apple CPUs. The list
turned out to be incomplete:
- the "rr" variants of the instructions were missing
- even the "rs" variants can have shift value == 0 and behave like the
  "rr" variants

This also splits the MacropFusion target feature into
ArithmeticBccFusion and ArithmeticCbzFusion.

Differential Revision: https://reviews.llvm.org/D25142

llvm-svn: 283243
2016-10-04 19:28:21 +00:00
Anna Zaks ef97d2c589 [asan] LLVM: Switch to using dynamic shadow offset on iOS
The VM layout is not stable between iOS version releases, so switch to dynamic shadow offset.

This is the LLVM counterpart of https://reviews.llvm.org/D25218

Differential Revision: https://reviews.llvm.org/D25219

llvm-svn: 283239
2016-10-04 19:02:29 +00:00
Hal Finkel bdd6735a9e Don't filter diagnostics written as YAML to the output file
The purpose of the YAML diagnostic output file is to collect information on
optimizations performed, or not performed, for later processing by tools that
help users (and compiler developers) understand how code was optimized. As
such, the diagnostics that appear in the file should not be coupled to what a
user might want to see summarized for them as the compiler runs, and in fact,
because the user likely does not know what optimization diagnostics their tools
might want to use, the user cannot provide a useful filter regardless. As such,
we shouldn't filter the diagnostics going to the output file.

Differential Revision: https://reviews.llvm.org/D25224

llvm-svn: 283236
2016-10-04 18:13:45 +00:00
Adam Nemet 0428e93217 Serialize remark argument as a mapping to get proper quotation for the value.
llvm-svn: 283231
2016-10-04 17:05:04 +00:00
Adam Nemet 2780ee0dc1 Allow derived classes of OptimizationRemarkAnalysis in YAML
llvm-svn: 283230
2016-10-04 17:05:01 +00:00
Anna Thomas 479cbb9405 [RS4GC] Handle ShuffleVector instruction in findBasePointer
Summary:
This patch modifies the findBasePointer to handle the shufflevector instruction.

Tests run: RS4GC tests, local downstream tests.

Reviewers: reames, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25197

llvm-svn: 283219
2016-10-04 13:48:37 +00:00
Rafael Espindola fda3dc9266 Remove duplicated typedef. NFC.
llvm-svn: 283216
2016-10-04 13:09:59 +00:00
Nemanja Ivanovic 6354d23555 [Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register set
This patch corresponds to review:

The newly added VSX D-Form (register + offset) memory ops target the upper half
of the VSX register set. The existing ones target the lower half. In order to
unify these and have the ability to target all the VSX registers using D-Form
operations, this patch defines Pseudo-ops for the loads/stores which are
expanded post-RA. The expansion then choses the correct opcode based on the
register that was allocated for the operation.

llvm-svn: 283212
2016-10-04 11:25:52 +00:00
Simon Dardis 86b3a1e79b [mips][fastisel] Consider soft-float an unsupported floating point mode
Treat soft-float as unsupported for fast-isel. Additionally, ensure we check
that lowering f32 arguments also considers the case of soft-float mode.

Reviewers: ehostunreach, vkalintiris, zoran.jovanovic

Differential Review: https://reviews.llvm.org/D24505

llvm-svn: 283209
2016-10-04 10:35:07 +00:00
whitequark 7c4fe0e9a3 [SelectionDAG] Fix calling convention in expansion of ?MULO.
The SMULO/UMULO DAG nodes, when not directly supported by the target,
expand to a multiplication twice as wide. In case that the resulting
type is not legal, an __mul?i3 intrinsic is used. Since the type is
not legal, the legalizer cannot directly call the intrinsic with
the wide arguments; instead, it "pre-lowers" them by splitting them
in halves.

The "pre-lowering" code in essence made assumptions about
the calling convention, specifically that i(N*2) values will be
split into two iN values and passed in consecutive registers in
little-endian order. This, naturally, breaks on a big-endian system,
such as our OR1K out-of-tree backend.

Thanks to James Miller <james@aatch.net> for help in debugging.

Differential Revision: https://reviews.llvm.org/D25223

llvm-svn: 283203
2016-10-04 09:07:49 +00:00
Sjoerd Meijer 535529b41c Consistent fp denormal mode names. NFC.
This fixes the inconsistency of the fp denormal option names: in LLVM this was
DenormalType, but in Clang this is DenormalMode which seems better.

Differential Revision: https://reviews.llvm.org/D24906

llvm-svn: 283192
2016-10-04 08:03:36 +00:00
Nemanja Ivanovic 11049f8f07 [Power9] Part-word VSX integer scalar loads/stores and sign extend instructions
This patch corresponds to review:
https://reviews.llvm.org/D23155

This patch removes the VSHRC register class (based on D20310) and adds
exploitation of the Power9 sub-word integer loads into VSX registers as well
as vector sign extensions.
The new instructions are useful for a few purposes:

    Int to Fp conversions of 1 or 2-byte values loaded from memory
    Building vectors of 1 or 2-byte integers with values loaded from memory
    Storing individual 1 or 2-byte elements from integer vectors

This patch implements all of those uses.

llvm-svn: 283190
2016-10-04 06:59:23 +00:00
Kostya Serebryany 4820cc988f [libFuzzer] remove dfsan support and some related stale code. This is not being used and as is is pretty weak anyway
llvm-svn: 283187
2016-10-04 06:08:46 +00:00
Craig Topper ee2d995661 [X86] Add MOV8rm_NOREX to switch in isReallyTriviallyReMaterializable to match MOV8rm.
llvm-svn: 283184
2016-10-04 03:11:44 +00:00
Kostya Serebryany 5a52a11ce4 [libFuzzer] change the probabilities so that we choose only the inputs that are known to be minimal inputs for at least one coverage feature (works only with -shrink=1 for now)
llvm-svn: 283178
2016-10-04 01:51:44 +00:00
Matt Arsenault dcf0cfca4c AMDGPU: Refactor indirect vector lowering
Allow inserting multiple instructions in the
expanded loop.

llvm-svn: 283177
2016-10-04 01:41:05 +00:00
Matt Arsenault 283fbc24f6 AMDGPU: Factor SGPR spilling into separate functions
llvm-svn: 283175
2016-10-04 01:14:56 +00:00
Kyle Butt 3ffb8529bc Revert "Codegen: Tail-duplicate during placement."
This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2.

Causing crashes on aarch64 build.

llvm-svn: 283172
2016-10-04 00:38:23 +00:00
Eli Friedman 74bed9d757 Make GlobalsAA ignore dead constant expressions.
Slightly improves the precision of GlobalsAA in certain situations, and
makes the behavior of optimization passes more predictable.

Differential Revision: https://reviews.llvm.org/D24104

llvm-svn: 283165
2016-10-04 00:03:55 +00:00
Kyle Butt 396bfdd707 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

llvm-svn: 283164
2016-10-04 00:00:09 +00:00
Dan Gohman e040533ece [WebAssembly] Update to more stack-machine-oriented terminology.
WebAssembly has officially switched from being an AST to being a stack
machine. Update various bits of terminology and README.md entries
accordingly.

llvm-svn: 283154
2016-10-03 22:43:53 +00:00
Dan Gohman ffc184bb1d [WebAssemby] Clean up an obsolete comment.
The comment is present inside the body of GetVRegDef.

llvm-svn: 283153
2016-10-03 22:32:21 +00:00
Matthias Braun 9baa3e80a9 TargetMachine: Make the win32-macho workaround more specific.
This is to avoid problems with win32 + ELF which surprisingly happens a
lot in practice: If a user just specifies -march on the commandline the
object format changes along with the architecture to ELF in many
instances while the OS stays with the default/host OS.

llvm-svn: 283151
2016-10-03 22:12:37 +00:00
Dan Gohman 16fa0d8159 [WebAssembly] Delete an unused function. NFC.
llvm-svn: 283150
2016-10-03 22:06:28 +00:00
Dan Gohman 9850e87784 [WebAssembly] Fix indentation. NFC.
llvm-svn: 283147
2016-10-03 21:33:09 +00:00
Dan Gohman 4b8e8becf6 [WebAssembly] Rename OPERAND_FP32IMM to OPERAND_F32IMM.
WebAssembly documentation consistently says "f32" rather than "fp32" to
describe 32-bit floating-point.

llvm-svn: 283146
2016-10-03 21:31:31 +00:00
Quentin Colombet 3a06701913 [AArch64][RegisterBankInfo] Add getSameKindofOperandsMapping.
Refactor the code so that the same function can be used for all
instructions with all the same operands for up to 3 operands.

This is going to be useful for cast instructions.
NFC.

llvm-svn: 283144
2016-10-03 20:20:13 +00:00
Krzysztof Parzyszek c8b6ecabd8 [RDF] Fix liveness propagation through shadows
Each shadow only represents data flow that is restricted to its reaching
def. Propagating more than that could lead to spurious register liveness,
resulting in extra (incorrectly) block live-ins.

llvm-svn: 283143
2016-10-03 20:17:20 +00:00
Matthias Braun a827ed8891 AArch64Subtarget: Remove unused CPUString field
llvm-svn: 283142
2016-10-03 20:17:02 +00:00
Matthias Braun eccdee9196 X86: Do not produce GOT relocations on windows
Windows has no GOT relocations the way elf/darwin has. Some people use
x86_64-pc-win32-macho to build EFI firmware; Do not produce GOT
relocations for this target.

Differential Revision: https://reviews.llvm.org/D24627

llvm-svn: 283140
2016-10-03 20:11:24 +00:00
Sanjoy Das 0359a193a7 [PruneEH] Be correct in the face IPO
This fixes one spot I had missed in r265762.  Credit goes to Philip
Reames for spotting this one!

llvm-svn: 283137
2016-10-03 19:35:30 +00:00
Dehao Chen 92abc7e9f2 Refactor LICM pass in preparation for LoopSink pass.
Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778).

Reviewers: davidxl, danielcdh, hfinkel, chandlerc

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D24168

llvm-svn: 283134
2016-10-03 18:52:08 +00:00
Konstantin Zhuravlyov 60a8373778 [AMDGPU] Pass optimization level to SelectionDAGISel
llvm-svn: 283133
2016-10-03 18:47:26 +00:00
Konstantin Zhuravlyov 691e2e020b [AMDGPU] Sign extend AShr when promoting (instead of zero extending)
llvm-svn: 283130
2016-10-03 18:29:01 +00:00
Hans Wennborg b4d2678c6f Jump threading: avoid trying to split edge into landingpad block (PR27840)
Splitting the edge is nontrivial because of the landing pad, and we would
currently assert trying to do it.

Differential Revision: https://reviews.llvm.org/D24680

llvm-svn: 283129
2016-10-03 18:18:04 +00:00
Rafael Espindola 9ddfb04d03 Revert "Use getSize instead of data().size(). NFC."
This reverts commit r283125.

lld needs to be updated.

llvm-svn: 283127
2016-10-03 18:01:10 +00:00
Krzysztof Parzyszek ab26e2dd3b [RDF] Further improve readability of the graph
Print target basic block for a branch.

llvm-svn: 283126
2016-10-03 17:54:33 +00:00
Rafael Espindola 4bb425e848 Use getSize instead of data().size(). NFC.
Also assert isFinalized in getSize(). This just reduces the noise from
another patch.

llvm-svn: 283125
2016-10-03 17:49:19 +00:00
Krzysztof Parzyszek a77fe4eef3 [RDF] Replace RegisterAliasInfo with target-independent code using lane masks
llvm-svn: 283122
2016-10-03 17:14:48 +00:00
Sanjay Patel d27a21874b [x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics (PR30433)
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=30433

There are a couple of open questions about the codegen:
1. Should we let scalar ops be scalars and avoid vector constant loads/splats?
2. Should we have a pass to combine constants such as the inverted pair that we have here?

Differential Revision: https://reviews.llvm.org/D25165
 

llvm-svn: 283119
2016-10-03 16:38:27 +00:00
Rafael Espindola d7325ee702 Don't drop the llvm. prefix when renaming.
If the llvm. prefix is dropped other parts of llvm don't see this as
an intrinsic.  This means that the number of regular symbols depends
on the context the module is loaded into, which causes LTO to abort.

Fixes PR30509.

llvm-svn: 283117
2016-10-03 15:51:42 +00:00
Sanjay Patel f7df85af87 fix formatting; NFC
llvm-svn: 283115
2016-10-03 15:18:36 +00:00
Nirav Dave 157891c57f Prevent out of order HashDirective lexing in AsmLexer.
Retrying after buildbot reset.

To lex hash directives we peek ahead to find component tokens, create a
unified token, and unlex the peeked tokens so the parser does not need
to parse the tokens then. Make sure we do not to lex another hash
directive during peek operation.

This fixes PR28921.

Reviewers: rnk, loladiro

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24839

llvm-svn: 283111
2016-10-03 13:48:27 +00:00
Matt Arsenault 4a048a44e5 AMDGPU: Fix typo
llvm-svn: 283108
2016-10-03 13:06:58 +00:00
Volkan Keles 1c38681ae6 Add new target hooks for LoadStoreVectorizer
Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D24727

llvm-svn: 283099
2016-10-03 10:31:34 +00:00
Sjoerd Meijer 4dbe73c1ed [ARM] Code size optimisation to lower udiv+urem to udiv+mls instead of a
library call to __aeabi_uidivmod. This is an improved implementation of
r280808, see also D24133, that got reverted because isel was stuck in a loop.
That was caused by the optimisation incorrectly triggering on i64 ints, which
shouldn't happen because there is no 64bit hwdiv support; that put isel's type
legalization and this optimisation in a loop. A native ARM compiler and testing
now shows that this is fixed.

Patch mostly by Pablo Barrio.

Differential Revision: https://reviews.llvm.org/D25077

llvm-svn: 283098
2016-10-03 10:12:32 +00:00
Konstantin Zhuravlyov c9786f0d8a [AMDGPU] Remove unused variables from SIOptimizeExecMasking
Differential Revision: https://reviews.llvm.org/D25110

llvm-svn: 283087
2016-10-03 04:43:22 +00:00
Hal Finkel 530fa5fcc9 [PowerPC] Account for the ELFv2 function prologue during branch selection
The PPC branch-selection pass, which performs branch relaxation, needs to
account for the padding that might be introduced to satisfy block alignment
requirements. We were assuming that the first block was at offset zero (i.e.
had the alignment of the function itself), but under the ELFv2 ABI, a global
entry function prologue is added to the first block, and it is a
two-instruction sequence (i.e. eight-bytes long). If the function has 16-byte
alignment, the fact that the first block is eight bytes offset from the start
of the function is relevant to calculating where padding will be added in
between later blocks.

Unfortunately, I don't have a small test case.

llvm-svn: 283086
2016-10-03 04:06:44 +00:00
Craig Topper eab23d3bc4 [AVX-512] Remove isCheapAsAMove flag from VMOVAPSZ128rm_NOVLX and friends.
This was accidentally copy and pasted from other Pseudos in the file.

llvm-svn: 283084
2016-10-03 02:22:33 +00:00
Craig Topper 4e7b888ea4 [X86] Mark all sizes of (V)MOVUPD as trivially rematerializable.
I don't know for sure that we truly needs this, but its the only vector load that isn't rematerializable. Making it consistent allows it to not be a special case in the td files.

llvm-svn: 283083
2016-10-03 02:00:29 +00:00
Simon Pilgrim a8d2168cb0 [X86][AVX2] Add support for combining target shuffles to VPERMD/VPERMPS
llvm-svn: 283080
2016-10-02 21:07:58 +00:00
Sanjoy Das 4aeb0f2c7f [SCEV] Rely on ConstantRange instead of custom logic; NFCI
This was first landed in rL283058 and subsequenlty reverted since a
change this depends on (rL283057) was buggy and had to be reverted.

llvm-svn: 283079
2016-10-02 20:59:10 +00:00
Sanjoy Das c7d3291b68 [ConstantRange] Make getEquivalentICmp smarter
This change teaches getEquivalentICmp to be smarter about generating
ICMP_NE and ICMP_EQ predicates.

An earlier version of this change was landed as rL283057 which had a
use-after-free bug.  This new version has a fix for that bug, and a (C++
unittests/) test case that would have triggered it rL283057.

llvm-svn: 283078
2016-10-02 20:59:05 +00:00
Yaron Keren bd8731946c Rangify for loops.
llvm-svn: 283074
2016-10-02 19:21:41 +00:00
Simon Pilgrim 03afbe783d [X86][AVX] Ensure broadcast loads respect dependencies
To allow broadcast loads of a non-zero'th vector element, lowerVectorShuffleAsBroadcast can replace a load with a new load with an adjusted address, but unfortunately we weren't ensuring that the new load respected the same dependencies.

This patch adds a TokenFactor and updates all dependencies of the old load to reference the new load instead.

Bug found during internal testing.

Differential Revision: https://reviews.llvm.org/D25039

llvm-svn: 283070
2016-10-02 15:59:15 +00:00
Craig Topper 46413af7f7 [X86] Don't set i64 ADDC/ADDE/SUBC/SUBE as Custom if the target isn't 64-bit. This way we don't have to catch them and do nothing with them in ReplaceNodeResults.
llvm-svn: 283066
2016-10-02 06:13:43 +00:00