Commit Graph

43925 Commits

Author SHA1 Message Date
Sanne Wouda d4658ee634 [AArch64] [Assembler] option to disable negative immediate conversions
Summary:
Similar to the ARM target in https://reviews.llvm.org/rL298380, this
patch adds identical infrastructure for disabling negative immediate
conversions, and converts the existing aliases to the new infrastucture.

Reviewers: rengolin, javed.absar, olista01, SjoerdMeijer, samparker

Reviewed By: samparker

Subscribers: samparker, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D31243

llvm-svn: 298908
2017-03-28 10:02:56 +00:00
Igor Breger f580fce2c3 [GlobalISel][X86] support G_FRAME_INDEX instruction selection.
Summary:
    G_LOAD/G_STORE, add alternative RegisterBank mapping.
    For G_LOAD, Fast and Greedy mode choose the same RegisterBank mapping (GprRegBank ) for the G_GLOAD + G_FADD , can't get rid of cross register bank copy GprRegBank->VecRegBank.

    Reviewers: zvi, rovka, qcolombet, ab

    Reviewed By: zvi

    Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank

    Differential Revision: https://reviews.llvm.org/D30979

llvm-svn: 298907
2017-03-28 09:35:06 +00:00
Anna Thomas ba04f4e925 rename instcombine test file. NFC
llvm-svn: 298904
2017-03-28 08:34:07 +00:00
Weiming Zhao 320848458b Dont emit Mapping symbols for sections that contain only data.
Summary:
Dont emit mapping symbols for sections that contain only data.

Patched by Shankar Easwaran <shankare@codeaurora.org>

Reviewers: rengolin, peter.smith, weimingz, kparzysz, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, llvm-commits

Differential Revision: https://reviews.llvm.org/D30724

llvm-svn: 298901
2017-03-28 05:40:36 +00:00
Alex Shlyapnikov bbd5cc63d7 Revert "[asan] Delay creation of asan ctor."
Speculative revert. Some libfuzzer tests are affected.

This reverts commit r298731.

llvm-svn: 298890
2017-03-27 23:11:50 +00:00
Alex Shlyapnikov 09171aa31f Revert "[asan] Put ctor/dtor in comdat."
Speculative revert, some libfuzzer tests are affected.

This reverts commit r298756.

llvm-svn: 298889
2017-03-27 23:11:47 +00:00
Renato Golin be2f7d9d61 [ARM] Mark falky test unsupported until we find the cause
llvm-svn: 298887
2017-03-27 22:38:43 +00:00
Javed Absar 3d59437093 Improve machine schedulers for in-order processors
This patch enables schedulers to specify instructions that 
cannot be issued with any other instructions.
It also fixes BeginGroup/EndGroup.

Reviewed by: Andrew Trick
Differential Revision: https://reviews.llvm.org/D30744

llvm-svn: 298885
2017-03-27 20:46:37 +00:00
Kevin Enderby 6c1d2b4cb2 Add the error handling for Mach-O dyld compact lazy bind, weak bind and
rebase entry errors and test cases for each of the error checks.

Also verified with Nick Kledzik that a BIND_OPCODE_SET_ADDEND_SLEB
opcode is legal in a lazy bind table, so code that had that as an error
check was removed.

With MachORebaseEntry and MachOBindEntry classes now returning
an llvm::Error in all cases for malformed input the variables Malformed
and logic to set use them is no longer needed and has been removed
from those classes.

Also in a few places, removed the redundant Done assignment to true
when also calling moveToEnd() as it does that assignment.

This only leaves the dyld compact export entries left to have
error handling yet to be added for the dyld compact info.

llvm-svn: 298883
2017-03-27 20:09:23 +00:00
Matthew Simpson b8ff4a4a70 [LV] Transform truncations of non-primary induction variables
The vectorizer tries to replace truncations of induction variables with new
induction variables having the smaller type. After r295063, this optimization
was applied to all integer induction variables, including non-primary ones.
When optimizing the truncation of a non-primary induction variable, we still
need to transform the new induction so that it has the correct start value.
This should fix PR32419.

Reference: https://bugs.llvm.org/show_bug.cgi?id=32419
llvm-svn: 298882
2017-03-27 20:07:38 +00:00
Ahmed Bougacha f75782f9dc [GlobalISel][AArch64] Fold FI into LDR/STR ui addressing mode.
A majority of loads and stores at O0 access an alloca.

It's trivial to fold the G_FRAME_INDEX into the instruction; do it.

llvm-svn: 298864
2017-03-27 17:31:56 +00:00
Ahmed Bougacha 8a654085d0 [GlobalISel][AArch64] Fold G_GEP into LDR/STR ui addressing mode.
We're not to the point of supporting the load/store patterns yet
(because they extensively use PatFrags).

But in the meantime, we can implement some of the simplest addressing
modes.

llvm-svn: 298863
2017-03-27 17:31:52 +00:00
Ahmed Bougacha 85a66a6d9f [GlobalISel][AArch64] Select store of zero to WZR/XZR.
These occur very frequently, and are quite trivial to catch.

llvm-svn: 298862
2017-03-27 17:31:48 +00:00
Ahmed Bougacha 641cb203b6 [GlobalISel][AArch64] Select CBZ.
CBZ/CBNZ represent a substantial portion of all conditional branches.
Look through G_ICMP to select them.

We can't use tablegen yet because the existing patterns match an
AArch64ISD node.

llvm-svn: 298856
2017-03-27 16:35:31 +00:00
Ahmed Bougacha c1cbcee170 [GlobalISel][AArch64] Use proper constant types in test. NFC.
llvm-svn: 298854
2017-03-27 16:35:23 +00:00
Dmitry Preobrazhensky c512d44845 [AMDGPU][MC] Fix for Bug 28207 + LIT tests
Enabled clamp and omod for v_cvt_* opcodes which have src0 of an integer type

Reviewers: vpykhtin, arsenm

Differential Revision: https://reviews.llvm.org/D31327

llvm-svn: 298852
2017-03-27 15:57:17 +00:00
Chad Rosier 862a41270f [AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as not having side effects.
Among other things, this allows Machine LICM to hoist a costly 'mrs'
instruction from within a loop.

Differential Revision: http://reviews.llvm.org/D31151

llvm-svn: 298851
2017-03-27 15:52:38 +00:00
Anna Thomas f57ae33381 [InstCombine] Avoid incorrect folding of select into phi nodes when incoming element is a vector type
Summary:
We are incorrectly folding selects into phi nodes when the incoming value of a phi
node is a constant vector. This optimization is done in `FoldOpIntoPhi` when the
select condition is a phi node with constant incoming values.
Without the fix, we are miscompiling (i.e. incorrectly folding the
select into the phi node) when the vector contains non-zero
elements.
This patch fixes the miscompile and we will correctly fold based on the
select vector operand (see added test cases).

Reviewers: majnemer, sanjoy, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31189

llvm-svn: 298845
2017-03-27 13:52:51 +00:00
Gadi Haber 89d5f9391a [X86][AVX2] bugzilla bug 21281 Performance regression in vector interleave in AVX2
This is a patch for an on-going bugzilla bug 21281 on the generated X86 code for a matrix transpose8x8 subroutine which requires vector interleaving. The generated code in AVX2 is currently non-optimal and requires 60 instructions as opposed to only 40 instructions generated for AVX1.
 The patch includes a fix for the AVX2 case where vector unpack instructions use less operations than the vector blend operations available in AVX2.
 In this case using vector unpack instructions is more efficient.

Reviewers:
zvi  
delena  
igorb  
craig.topper  
guyblank  
eladcohen  
m_zuckerman  
aymanmus  
RKSimon 

llvm-svn: 298840
2017-03-27 12:13:37 +00:00
Serge Pavlov b71bb80c2d [LoopUnroll] Remap references in peeled iteration
References in cloned blocks must be remapped prior to dominator
calculation.

Differential Revision: https://reviews.llvm.org/D31281

llvm-svn: 298811
2017-03-26 16:46:53 +00:00
Simon Pilgrim 92925ea701 [X86][SSE] Add computeKnownBitsForTargetNode support for (V)PSLL/(V)PSRL instructions
llvm-svn: 298806
2017-03-26 13:17:55 +00:00
Simon Pilgrim 049d9c921f [X86][AVX512F] Fix reg class for VMOVSSZrr/VMOVSSZrrk and VMOVSDZrr/VMOVSDZrrk
Fixed -verify-machineinstrs errors in fast-isel-select-sse.ll (one of many in PR27481)

The VMOVSSZrr/VMOVSSZrrk and VMOVSDZrr/VMOVSDZrrk instructions were assuming both source registers were V128X when the second is actually supposed to be FR32X/FR64X

Differential Revision: https://reviews.llvm.org/D31200

llvm-svn: 298805
2017-03-26 12:52:28 +00:00
Simon Pilgrim 544f750de6 Regenerate test
llvm-svn: 298803
2017-03-26 10:33:03 +00:00
Simon Pilgrim a2b81dc411 Regenerate test
The CHECK-DAG aren't necessary and get in the way of automated checks

llvm-svn: 298802
2017-03-26 10:31:37 +00:00
Simon Pilgrim 1d8235a022 Regenerate tests to remove duplicated checks
llvm-svn: 298801
2017-03-26 10:28:39 +00:00
Igor Breger 531a203a06 [GlobalISel][X86] support G_FRAME_INDEX instruction selection.
Summary:
    Support G_FRAME_INDEX instruction selection.

    Reviewers: zvi, rovka, ab, qcolombet

    Reviewed By: ab

    Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank

    Differential Revision: https://reviews.llvm.org/D30980

llvm-svn: 298800
2017-03-26 08:11:12 +00:00
Joerg Sonnenberger fa7367428a Split the SimplifyCFG pass into two variants.
The first variant contains all current transformations except
transforming switches into lookup tables. The second variant
contains all current transformations.

The switch-to-lookup-table conversion results in code that is more
difficult to analyze and optimize by other passes. Most importantly,
it can inhibit Dead Code Elimination. As such it is often beneficial to
only apply this transformation very late. A common example is inlining,
which can often result in range restrictions for the switch expression.

Changes in execution time according to LNT:
SingleSource/Benchmarks/Misc/fp-convert +3.03%
MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk -11.20%
MultiSource/Benchmarks/Olden/perimeter/perimeter -10.43%
and a couple of smaller changes. For perimeter it also results 2.6%
a smaller binary.

Differential Revision: https://reviews.llvm.org/D30333

llvm-svn: 298799
2017-03-26 06:44:08 +00:00
Chandler Carruth 0d256c0f5d [IR] Make SwitchInst::CaseIt almost a normal iterator.
This moves it to the iterator facade utilities giving it full random
access semantics, etc. It can also now be used with standard algorithms
like std::all_of and std::any_of and range adaptors like llvm::reverse.

Also make the semantics of iterating match what every other iterator
uses and forbid decrementing past the begin iterator. This was used as
a hacky way to work around iterator invalidation. However, every
instance trying to do this failed to actually avoid touching invalid
iterators despite the clear documentation that the removed and all
subsequent iterators become invalid including the end iterator. So I've
added a return of the next iterator to removeCase and rewritten the
loops that were doing this to correctly follow the iterator pattern of
either incremneting or removing and assigning fresh values to the
iterator and the end.

In one case we were trying to go backwards to make this cleaner but it
doesn't actually work. I've made that code match the code we use
everywhere else to remove cases as we iterate. This changes the order of
cases in one test output and I moved that test to CHECK-DAG so it
wouldn't care -- the order isn't semantically meaningful anyways.

llvm-svn: 298791
2017-03-26 02:49:23 +00:00
Simon Pilgrim c0720a4052 [X86][SSE] Combine (VSRLI (VSRAI X, Y), (NumSignBits-1)) -> (VSRLI X, (NumSignBits-1))
Part 3 of 3.

Differential Revision: https://reviews.llvm.org/D31347

llvm-svn: 298782
2017-03-25 20:43:01 +00:00
Eric Christopher 0935875c40 Change the default attributes for llvm.prefetch to inaccessiblemem_or_argmemonly
so that we can perform some optimizations across it.

Fixes PR32365

llvm-svn: 298781
2017-03-25 20:20:23 +00:00
Simon Pilgrim 6397963c81 [X86][SSE] Added ComputeNumSignBitsForTargetNode support for (V)PSRAI
Part 2 of 3.

Differential Revision: https://reviews.llvm.org/D31347

llvm-svn: 298780
2017-03-25 19:58:36 +00:00
Sanjay Patel 9ebb68843e [x86] use PMOVMSK to replace memcmp libcalls for 16-byte equality
This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, 
we can replace memcmp calls with inline code that is both smaller and faster.

Differential Revision: https://reviews.llvm.org/D31290

llvm-svn: 298775
2017-03-25 16:05:33 +00:00
Simon Pilgrim c3e5c3c5bc [X86][SSE] Add extra computeNumSignBits test case for D31311.
llvm-svn: 298774
2017-03-25 15:43:36 +00:00
Yaxun Liu 14834c3e3d [AMDGPU] Switch data layout by triple environment amdgiz
Switch data layout by target triple environment amdgiz and amdgizcl indicating using of an address space mapping in which generic address space is 0.

amdgiz is for non-OpenCL environment where generic address space is 0.

amdgizcl is for OpenCL environment where generic address space is 0.

Differential Revision: https://reviews.llvm.org/D31211

llvm-svn: 298758
2017-03-25 02:05:44 +00:00
Evgeniy Stepanov 71bb8f1ad0 [asan] Put ctor/dtor in comdat.
When possible, put ASan ctor/dtor in comdat.

The only reason not to is global registration, which can be
TU-specific. This is not the case when there are no instrumented
globals. This is also limited to ELF targets, because MachO does
not have comdat, and COFF linkers may GC comdat constructors.

The benefit of this is a lot less __asan_init() calls: one per DSO
instead of one per TU. It's also necessary for the upcoming
gc-sections-for-globals change on Linux, where multiple references to
section start symbols trigger quadratic behaviour in gold linker.

llvm-svn: 298756
2017-03-25 01:01:11 +00:00
Eli Friedman 95ddd18703 [ARM] Fix mixup between Lo and Hi in SMLALBB formation.
llvm-svn: 298752
2017-03-25 00:13:24 +00:00
Reid Kleckner 6b78e16368 [codeview] Don't assert when the user violates the ODR
If we have an array of a user-defined aggregates for which there was an
ODR violation, then the array size will not necessarily match the number
of elements times the size of the element.

Fixes PR32383

llvm-svn: 298750
2017-03-24 23:28:42 +00:00
Sanjay Patel 650599d84d [x86] add 32-bit RUN for better memcmp coverage; NFC
llvm-svn: 298744
2017-03-24 22:09:48 +00:00
Matt Arsenault 0607a4427b AMDGPU: Fix annotating loops with nested loop conditions
If the branch condition for a loop was a phi which itself
was fed from a phi from a loop, it isn't safe to try
to delete the phi until after the loop is handled.

llvm-svn: 298737
2017-03-24 20:57:10 +00:00
Ivan Krasin c2124e185c Revert r298620: [LV] Vectorize GEPs
Reason: breaks linking Chromium with LLD + ThinLTO (a pass crashes)
LLVM bug: https://bugs.llvm.org//show_bug.cgi?id=32413

Original change description:

[LV] Vectorize GEPs

This patch adds support for vectorizing GEPs. Previously, we only generated
vector GEPs on-demand when creating gather or scatter operations. All GEPs from
the original loop were scalarized by default, and if a pointer was to be stored
to memory, we would have to build up the pointer vector with insertelement
instructions.

With this patch, we will vectorize all GEPs that haven't already been marked
for scalarization.

The patch refines collectLoopScalars to more exactly identify the scalar GEPs.
The function now more closely resembles collectLoopUniforms. And the patch
moves vector GEP creation out of vectorizeMemoryInstruction and into the main
vectorization loop. The vector GEPs needed for gather and scatter operations
will have already been generated before vectoring the memory accesses.

Original Differential Revision: https://reviews.llvm.org/D30710

llvm-svn: 298735
2017-03-24 20:49:43 +00:00
Evgeniy Stepanov 64e872a91f [asan] Delay creation of asan ctor.
Create the constructor in the module pass.
This in needed for the GC-friendly globals change, where the constructor can be
put in a comdat  in some cases, but we don't know about that in the function
pass.

llvm-svn: 298731
2017-03-24 20:42:15 +00:00
Matt Arsenault b5d23271e2 AMDGPU: Implement f16 fround
llvm-svn: 298730
2017-03-24 20:04:18 +00:00
Matt Arsenault b8f8dbc227 AMDGPU: Unify divergent function exits.
StructurizeCFG can't handle cases with multiple
returns creating regions with multiple exits.
Create a copy of UnifyFunctionExitNodes that only
unifies exit nodes that skips exit nodes
with uniform branch sources.

llvm-svn: 298729
2017-03-24 19:52:05 +00:00
Adrian Prantl 4994bc0474 Make testcase less nonsensical while still exercising the same code paths.
llvm-svn: 298726
2017-03-24 19:11:31 +00:00
Matt Arsenault 4c7795dd31 AMDGPU: Fold rcp/rsq of undef to undef
llvm-svn: 298725
2017-03-24 19:04:57 +00:00
Stanislav Mekhanoshin 70603dcef2 [AMDGPU] Fold V_CNDMASK with identical source operands
Such instructions sometimes appear after lowering and folding.

Differential Revision: https://reviews.llvm.org/D31318

llvm-svn: 298723
2017-03-24 18:55:20 +00:00
Konstantin Zhuravlyov 4986d9fb45 [AMDGPU] Rename Kind to ValueKind in metadata to be consistent
llvm-svn: 298722
2017-03-24 18:43:15 +00:00
Stanislav Mekhanoshin a27b2cac03 [AMDGPU] Add AMDGPUAliasAnalysis to opt pipeline
Previously it was added only to the BE.

Differential Revision: https://reviews.llvm.org/D31323

llvm-svn: 298721
2017-03-24 18:01:14 +00:00
Teresa Johnson 428b9e0627 [ThinLTO] Correct counting of functions in inliner stats
Summary: Declarations need to be filtered out when counting functions.

Reviewers: eraman

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31336

llvm-svn: 298720
2017-03-24 17:59:06 +00:00
Reid Kleckner 5d57752c81 [PDB] Split item and type records when merging type streams
Summary: MSVC does this when producing a PDB.

Reviewers: ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31316

llvm-svn: 298717
2017-03-24 17:26:38 +00:00
Simon Pilgrim 31c04590e6 [X86][SSE] Add ashr + mask test cases.
Test cases showing cases where we're missing an opportunity to lshr a value with an extended sign to avoid loading a mask

llvm-svn: 298716
2017-03-24 17:25:47 +00:00
Max Kazantsev 7696a7edf9 Revert "[ScalarEvolution] Re-enable Predicate implication from operations"
This reverts commit rL298690

Causes failures on clang.

llvm-svn: 298693
2017-03-24 07:04:31 +00:00
Max Kazantsev 89554446e7 [ScalarEvolution] Re-enable Predicate implication from operations
The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build.
The reason of the crash was type mismatch between either a or b and RHS in the following situation:

  LHS = sext(a +nsw b) > RHS.

This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type.
But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that
can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we
reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this
situation we don't need to create any non-constant SCEVs.

This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not
go further into range analysis etc (because in some situations these analyzes succeed even when the passed
arguments have wrong types, what should not normally happen).

The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong
usage of predicates in recursive invocations.

The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll

llvm-svn: 298690
2017-03-24 06:19:00 +00:00
Daniel Berlin 9d0796e5d0 NewGVN: Fix PR32403 - Handling of undef in phis was not quite correct
due to LLVM's view of phi nodes.  It would cause NewGVN not to fixpoint
in some interesting edge cases.

llvm-svn: 298687
2017-03-24 05:30:34 +00:00
Adrian Prantl f0ffc5233d Fix a bug when emitting debug info for partially constant global variables.
While fixing a malformed testcase, I discovered that the code
exercised by it was wrong, too.

llvm-svn: 298664
2017-03-23 23:35:00 +00:00
Reid Kleckner 392f062675 [sancov] Don't instrument blocks with no insertion point
This prevents crashes when attempting to instrument functions containing
C++ try.

Sanitizer coverage will still fail at runtime when an exception is
thrown through a sancov instrumented function, but that seems marginally
better than what we have now. The full solution is to color the blocks
in LLVM IR and only instrument blocks that have an unambiguous color,
using the appropriate token.

llvm-svn: 298662
2017-03-23 23:30:41 +00:00
Dehao Chen b197d5b0a0 Fix trellis layout to avoid mis-identify triangle.
Summary:
For the following CFG:

A->B
B->C
A->C

If there is another edge B->D, then ABC should not be considered as triangle.

Reviewers: davidxl, iteratee

Reviewed By: iteratee

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D31310

llvm-svn: 298661
2017-03-23 23:28:09 +00:00
Dehao Chen 722e94061b Set the prof weight correctly for call instructions in DeadArgumentElimination.
Summary: In DeadArgumentElimination, the call instructions will be replaced. We also need to set the prof weights so that function inlining can find the correct profile.

Reviewers: eraman

Reviewed By: eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31143

llvm-svn: 298660
2017-03-23 23:26:00 +00:00
Bryant Wong def79b21e4 [MetaRenamer] Don't rename library functions.
Library functions can have specific semantics that affect the behavior of
certain passes. DSE, for instance, gives special treatment to malloc-ed pointers
but not to pointers returned from an equivalently typed (but differently named)
function.

MetaRenamer ought not to alter program semantics, so library functions must
remain untouched.

Reviewers: mehdi_amini, majnemer, chandlerc, davide

Reviewed By: davide

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D31304

llvm-svn: 298659
2017-03-23 23:21:07 +00:00
Dehao Chen 775341a14c Use isFunctionHotInCallGraph to set the function section prefix.
Summary: The current prefix based function layout algorithm only looks at function's entry count, which is not sufficient. A function should be grouped together if its entry count or any call edge count is hot.

Reviewers: davidxl, eraman

Reviewed By: eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31225

llvm-svn: 298656
2017-03-23 23:14:11 +00:00
Krzysztof Parzyszek 10fbac009d [Hexagon] Avoid infinite loops in HexagonLoopIdiomRecognition
- Avoid explosive growth of the simplification queue by not queuing
  expressions that are alredy in it.
- Add an iteration counter and abort after a sufficiently large number
  of iterations (assuming that it's a symptom of an infinite loop).

llvm-svn: 298655
2017-03-23 23:01:22 +00:00
Reid Kleckner a5d187b0ff [PDB] Use two DBs when dumping the IPI stream
Summary:
When dumping these records from an object file section, we should use
only one type database. However, when dumping from a PDB, we should use
two: one for the type stream and one for the IPI stream.

Certain type records that normally live in the .debug$T object file
section get moved over to the IPI stream of the PDB file and they get
new indices.

So far, I've noticed that the MSVC linker always moves these records
into IPI:
- LF_FUNC_ID
- LF_MFUNC_ID
- LF_STRING_ID
- LF_SUBSTR_LIST
- LF_BUILDINFO
- LF_UDT_MOD_SRC_LINE

These records have index fields that can point into TPI or IPI. In
particular, LF_SUBSTR_LIST and LF_BUILDINFO point to LF_STRING_ID
records to describe compilation command lines.

I've modified the dumper to have an optional pointer to the item DB, and
to do type name lookup of these fields in that DB. See printItemIndex.
The result is that our pdbdump-headers.test is more faithful to the PDB
contents and the output is less confusing.

Reviewers: ruiu

Subscribers: amccarth, zturner, llvm-commits

Differential Revision: https://reviews.llvm.org/D31309

llvm-svn: 298649
2017-03-23 21:36:25 +00:00
Gil Rapaport 638d4538cd [LV] Add regression test for r297610
The new test asserts that scalarized memory operations get memcheck metadata
added even if the loop is only unrolled.

Differential Revision: https://reviews.llvm.org/D30972

llvm-svn: 298641
2017-03-23 20:02:23 +00:00
Teresa Johnson 0c6a4ff8dc [ThinLTO] Add support for emitting minimized bitcode for thin link
Summary:
The cumulative size of the bitcode files for a very large application
can be huge, particularly with -g. In a distributed build environment,
all of these files must be sent to the remote build node that performs
the thin link step, and this can exceed size limits.

The thin link actually only needs the summary along with a bitcode
symbol table. Until we have a proper bitcode symbol table, simply
stripping the debug metadata results in significant size reduction.

Add support for an option to additionally emit minimized bitcode
modules, just for use in the thin link step, which for now just strips
all debug metadata. I plan to add a cc1 option so this can be invoked
easily during the compile step.

However, care must be taken to ensure that these minimized thin link
bitcode files produce the same index as with the original bitcode files,
as these original bitcode files will be used in the backends.

Specifically:
1) The module hash used for caching is typically produced by hashing the
written bitcode, and we want to include the hash that would correspond
to the original bitcode file. This is because we want to ensure that
changes in the stripped portions affect caching. Added plumbing to emit
the same module hash in the minimized thin link bitcode file.
2) The module paths in the index are constructed from the module ID of
each thin linked bitcode, and typically is automatically generated from
the input file path. This is the path used for finding the modules to
import from, and obviously we need this to point to the original bitcode
files. Added gold-plugin support to take a suffix replacement during the
thin link that is used to override the identifier on the MemoryBufferRef
constructed from the loaded thin link bitcode file. The assumption is
that the build system can specify that the minimized bitcode file has a
name that is similar but uses a different suffix (e.g. out.thinlink.bc
instead of out.o).

Added various tests to ensure that we get identical index files out of
the thin link step.

Reviewers: mehdi_amini, pcc

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31027

llvm-svn: 298638
2017-03-23 19:47:39 +00:00
Nirav Dave 9ebefeb9b1 [X86] Fix Stale SDNode use in X86ISelDAGtoDAG
Summary: Fixes pr32329.

Reviewers: spatel, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31286

llvm-svn: 298633
2017-03-23 18:25:17 +00:00
Zhaoshi Zheng e3c9070f06 Model ashr(shl(x, n), m) as mul(x, 2^(n-m)) when n > m
Given below case:

  %y = shl %x, n
  %z = ashr %y, m

when n = m, SCEV models it as sext(trunc(x)). This patch tries to handle
the case where n > m by using sext(mul(trunc(x), 2^(n-m)))) as the SCEV
expression.

llvm-svn: 298631
2017-03-23 18:06:09 +00:00
Adrian McCarthy 3c0328e011 Somehow this still breaks because of ANSI color codes in test output on Linux.
Reverting until I can figure out the root cause.

Revert "Re-land:  Make NativeExeSymbol a concrete subclass of NativeRawSymbol [PDB]"

This reverts commit f461a70cc376f0f91c8b4917be79479cc86330a5.

llvm-svn: 298626
2017-03-23 17:18:50 +00:00
Adrian McCarthy e4e5a2b13e Fix build break after r298623
Use the -color-output option explicitly to eliminate the ANSI color codes in
pdb-native-summary.test.  (The default should have done this.)

llvm-svn: 298625
2017-03-23 16:57:53 +00:00
Pirama Arumuga Nainar bc26482717 [ARM] Fix computeKnownBits for ARMISD::CMOV
Summary:
The true and false operands for the CMOV are operands 0 and 1.
ARMISelLowering.cpp::computeKnownBits was looking at operands 1 and 2
instead.  This can cause CMOV instructions to be incorrectly folded into
BFI if value set by the CMOV is another CMOV, whose known bits are
computed incorrectly.

This patch fixes the issue and adds a test case.

Reviewers: kristof.beyls, jmolloy

Subscribers: llvm-commits, aemerson, srhines, rengolin

Differential Revision: https://reviews.llvm.org/D31265

llvm-svn: 298624
2017-03-23 16:47:47 +00:00
Adrian McCarthy 997a15c3c3 Re-land: Make NativeExeSymbol a concrete subclass of NativeRawSymbol [PDB]
The new test should pass on all platforms now that llvm-pdbdump has the
`-color-output` option.

This moves exe symbol-specific method implementations out of NativeRawSymbol
into a concrete subclass. Also adds implementations for hasCTypes and
hasPrivateSymbols and a simple test to ensure the native reader can access
the summary information for the executable from the PDB.

Original Differential Revision: https://reviews.llvm.org/D31059

llvm-svn: 298623
2017-03-23 16:45:20 +00:00
Matthew Simpson 4e7b71bc86 [LV] Vectorize GEPs
This patch adds support for vectorizing GEPs. Previously, we only generated
vector GEPs on-demand when creating gather or scatter operations. All GEPs from
the original loop were scalarized by default, and if a pointer was to be stored
to memory, we would have to build up the pointer vector with insertelement
instructions.

With this patch, we will vectorize all GEPs that haven't already been marked
for scalarization.

The patch refines collectLoopScalars to more exactly identify the scalar GEPs.
The function now more closely resembles collectLoopUniforms. And the patch
moves vector GEP creation out of vectorizeMemoryInstruction and into the main
vectorization loop. The vector GEPs needed for gather and scatter operations
will have already been generated before vectoring the memory accesses.

Differential Revision: https://reviews.llvm.org/D30710

llvm-svn: 298620
2017-03-23 16:29:58 +00:00
Simon Pilgrim 1c048ab6ba [X86][SSE] Extract elements from narrower shuffle masks.
Add support for widening narrow shuffle masks so we can directly extract from the relevant input vector of the shuffle.

llvm-svn: 298616
2017-03-23 16:09:34 +00:00
Matthew Simpson 1fb4064531 [LV] Delete unneeded scalar GEP creation code
The code for generating scalar base pointers in vectorizeMemoryInstruction is
not needed. We currently scalarize all GEPs and maintain the scalarized values
in VectorLoopValueMap. The GEP cloning in this unneeded code is the same as
that in scalarizeInstruction. The test cases that changed as a result of this
patch changed because we were able to reuse the scalarized GEP that we
previously generated instead of cloning a new one.

Differential Revision: https://reviews.llvm.org/D30587

llvm-svn: 298615
2017-03-23 16:07:21 +00:00
Tim Shen ce26a45f7c [PPC] Add generated tests for all atomic operations
Summary: Add tests for all atomic operations for powerpc64le, so that all changes can be easily examined.

Reviewers: kbarton, hfinkel, echristo

Subscribers: mehdi_amini, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D31285

llvm-svn: 298614
2017-03-23 16:02:47 +00:00
Sanjay Patel 8db22f9032 [x86] add memcmp tests, remove run
Add tests for vector lengths that could be handled without a libcall.

There's an explicit test for 'nobuiltin', so there's not much value
in a separate run that checks that same behavior over and over again.

llvm-svn: 298611
2017-03-23 15:38:22 +00:00
Igor Breger a8ba572dcf [GlobalISel][X86] Support G_STORE/G_LOAD operation
Summary:
1. Support pointer type as function argumnet and return value
2. G_STORE/G_LOAD - set legal action for i8/i16/i32/i64/f32/f64/vec128
3. RegisterBank - support typeless operations like G_STORE/G_LOAD, for scalar use GPR bank.
4. Support instruction selection for G_LOAD/G_STORE

Reviewers: zvi, rovka, ab, qcolombet

Reviewed By: rovka

Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank

Differential Revision: https://reviews.llvm.org/D30973

llvm-svn: 298609
2017-03-23 15:25:57 +00:00
Nirav Dave e9ca32ae52 [SDAG] Fix zeroExtend assertion error
Move CombineTo preventing deleted node from being returned in
visitZERO_EXTEND.

Fixes PR32284.

Reviewers: RKSimon, bogner

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31254

llvm-svn: 298604
2017-03-23 15:01:50 +00:00
Dehao Chen 53a0c082d2 Do not set branch weight if the branch weight annotation is present.
Summary: ThinLTO will annotate the CFG twice. If the branch weight is set by the first annotation, we should not set the branch weight again in the second annotation because the first annotation is more accurate as there is less optimization that could affect debug info accuracy.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: mehdi_amini, aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D31228

llvm-svn: 298602
2017-03-23 14:43:10 +00:00
Strahinja Petrovic cac14b5334 [Mips] Fix for decoding DINS instruction - disassembler
This patch fixes decoding of size and position for DINSM
and DINSU instructions.

Differential Revision: https://reviews.llvm.org/D31072

llvm-svn: 298593
2017-03-23 13:19:04 +00:00
Simon Pilgrim d341290b5c [X86][SSE] Add computeNumSignBits test for sitofp of (extended) i64 extracted element
llvm-svn: 298592
2017-03-23 13:18:09 +00:00
Michael Zuckerman 85436ece89 [X86][TD][vpmovm2 ] New TD pattern for the vpmovm2 instruction
Up until now, vpmovm2 instruction described its destination operand size
by the source operand size. This patch adds new pattern for the vpmovm2
instruction. The node describes new expansion of the destination (from
{128|256} to 512).

Differential Revision: https://reviews.llvm.org/D30654

llvm-svn: 298586
2017-03-23 09:57:01 +00:00
Artyom Skrobov 92c0653095 Reapply r298417 "[ARM] Recommit the glueless lowering of addc/adde in Thumb1"
The UB in t2_so_imm_neg conversion has been addressed under D31242 / r298512

This reverts commit r298482.

llvm-svn: 298562
2017-03-22 23:35:51 +00:00
Konstantin Zhuravlyov 4cbb68959b [AMDGPU] Do not emit isa info as code object metadata
- It was decided to expose this information through other means (rocr)

Differential Revision: https://reviews.llvm.org/D30970

llvm-svn: 298560
2017-03-22 23:27:09 +00:00
Konstantin Zhuravlyov a780ffaac2 [AMDGPU] Emit kernel debug properties as code object metadata
Differential Revision: https://reviews.llvm.org/D30969

llvm-svn: 298558
2017-03-22 23:10:46 +00:00
Konstantin Zhuravlyov ca0e7f6472 [AMDGPU] Emit kernel code properties as code object metadata
- These are not required for low level runtime

Differential Revision: https://reviews.llvm.org/D29949

llvm-svn: 298556
2017-03-22 22:54:39 +00:00
Sanjay Patel 51e077924d [x86] improve tests, add tests, auto-generate checks; NFC
llvm-svn: 298553
2017-03-22 22:39:17 +00:00
Konstantin Zhuravlyov 7498cd61fb [AMDGPU] Restructure code object metadata creation
- Rename runtime metadata -> code object metadata
  - Make metadata not flow
  - Switch enums to use ScalarEnumerationTraits
  - Cleanup and move AMDGPUCodeObjectMetadata.h to AMDGPU/MCTargetDesc
  - Introduce in-memory representation for attributes
  - Code object metadata streamer
  - Create metadata for isa and printf during EmitStartOfAsmFile
  - Create metadata for kernel during EmitFunctionBodyStart
  - Finalize and emit metadata to .note during EmitEndOfAsmFile
  - Other minor improvements/bug fixes

Differential Revision: https://reviews.llvm.org/D29948

llvm-svn: 298552
2017-03-22 22:32:22 +00:00
Saleem Abdulrasool 1723b8bbbc c++filt: support COFF import thunks
The synthetic thunk for the import is prefixed with __imp_.  Attempt to
undecorate the names when they begin with the __imp_ prefix.

llvm-svn: 298550
2017-03-22 21:15:19 +00:00
Anna Thomas e27b39a976 [LVI] Add an LVI printer pass to capture test LVI cache after transformations
Summary:
Adding a printer pass for printing the LVI cache values after transformations
that use LVI.
This will help us in identifying cases where LVI
invariants are violated, or transforms that leave LVI in an incorrect state.
Right now, I have added two test cases to show that the printer pass is working.
I will be adding more test cases in a later change, once this change is
checked in upstream.

Reviewers: reames, dberlin, sanjoy, apilipenko

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30790

llvm-svn: 298542
2017-03-22 19:27:12 +00:00
Luqman Aden 3f807c91dc Preserve nonnull metadata on Loads through SROA & mem2reg.
Summary:
https://llvm.org/bugs/show_bug.cgi?id=31142 :

SROA was dropping the nonnull metadata on loads from allocas that got optimized out. This patch simply preserves nonnull metadata on loads through SROA and mem2reg.

Reviewers: chandlerc, efriedma

Reviewed By: efriedma

Subscribers: hfinkel, spatel, efriedma, arielb1, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D27114

llvm-svn: 298540
2017-03-22 19:16:39 +00:00
Adrian Prantl 3eb1ec9767 Fix testcase on windows.
llvm-svn: 298521
2017-03-22 17:15:03 +00:00
Sanjay Patel 2f602cea41 [InstCombine] canonicalize insertelement of scalar constant ahead of insertelement of variable
insertelement (insertelement X, Y, IdxC1), ScalarC, IdxC2 -->
insertelement (insertelement X, ScalarC, IdxC2), Y, IdxC1

As noted in the code comment and seen in the test changes, the motivation is that by pulling
constant insertion up, we may be able to constant fold some insertelement instructions.

Differential Revision: https://reviews.llvm.org/D31196

llvm-svn: 298520
2017-03-22 17:10:44 +00:00
Adrian Prantl 5503077cde Fix PR32298 by adding an early exit to getFrameIndexExprs().
Also add an assertion for the case that there are multiple FI
expressions with a DW_OP_LLVM_fragment; which should violate internal
constraints in DbgVariable.

llvm-svn: 298518
2017-03-22 16:50:16 +00:00
Artyom Skrobov 50a066b313 [ARM] t2_so_imm_neg had a subtle bug in the conversion, and could trigger UB by negating (int)-2147483648. By pure luck, none of the pre-existing tests triggered this; so I'm adding one.
Summary: Thanks to Vitaly Buka for helping catch this.

Reviewers: rengolin, jmolloy, efriedma, vitalybuka

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D31242

llvm-svn: 298512
2017-03-22 15:09:30 +00:00
Rafael Espindola 72dc254532 Add default typo to .tbss.*
This matches gas behavior and is part of pr31888.

llvm-svn: 298508
2017-03-22 14:04:19 +00:00
Rafael Espindola f4b9da6286 Set the default type for .bss.foo.
This matches gas and is part of pr31888.

llvm-svn: 298506
2017-03-22 13:57:16 +00:00
Rafael Espindola ccd9f4f4c7 Produce INIT_ARRAY for sections named .init_array.*
These sections are merged together by the linker, so they should have
the same time.

llvm-svn: 298505
2017-03-22 13:35:41 +00:00
Dmitry Preobrazhensky 895d377dc7 [AMDGPU][MC] Fix for Bug 28204 + LIT tests
Fixed v_mad_i64_i32/u64_u32 encoding

Reviewers: artem.tamazov

Differential Revision: https://reviews.llvm.org/D30828

llvm-svn: 298502
2017-03-22 13:31:01 +00:00
Simon Pilgrim 345481599f [X86] Add multiply by constant tests (PR28513)
As discussed on PR28513, add tests for constant multiplication by constants between 1 to 32

llvm-svn: 298497
2017-03-22 12:03:56 +00:00
Max Kazantsev c6effaa495 Revert "[ScalarEvolution] Predicate implication from operations"
This reverts commit rL298481

Fails clang-with-lto-ubuntu build.

llvm-svn: 298489
2017-03-22 07:50:33 +00:00
Craig Topper ad5c2d04f7 [ValueTracking] Make sure we keep range metadata information when calculating known bits for calls to bitreverse intrinsic.
llvm-svn: 298488
2017-03-22 07:22:49 +00:00
Jonas Paulsson 808c89f467 [SystemZ] Don't drop any operands in expandZExtPseudo()
Make sure that any operands, e.g. of an implicit def of a super reg is
transferred to the new instruction.

Review: Ulrich Weigand
llvm-svn: 298484
2017-03-22 06:03:32 +00:00
Vitaly Buka e69c137f90 Revert "[ARM] Recommit the glueless lowering of addc/adde in Thumb1, including the amended (no UB anymore) fix for adding/subtracting -2147483648."
Fails check-llvm with ubsan

This reverts commit r298417.

llvm-svn: 298482
2017-03-22 05:07:44 +00:00
Max Kazantsev 15e76aa0f8 [ScalarEvolution] Predicate implication from operations
This patch allows SCEV predicate analysis to prove implication of some expression predicates
from context predicates related to arguments of those expressions.
It introduces three new rules:

For addition:
  (A >X && B >= 0) || (B >= 0 && A > X) ===> (A + B) > X.

For division:
  (A > X) && (0 < B <= X + 1) ===> (A / B > 0).
  (A > X) && (-B <= X < 0) ===> (A / B >= 0).

Using these rules, SCEV is able to prove facts like "if X > 1 then X / 2 > 0".
They can also be combined with the same context, to prove more complex expressions like
"if X > 1 then X/2 + 1 > 1".

Diffirential Revision: https://reviews.llvm.org/D30887

Reviewed by: sanjoy

llvm-svn: 298481
2017-03-22 04:48:46 +00:00
Craig Topper 07f2915ad8 [InstCombine] Teach SimplifyDemandedUseBits to shrink Constants on the left side of subtracts
Summary: Subtracts can have constants on the left side, but we don't shrink them based on demanded bits. This patch fixes that to match the right hand side.

Reviewers: davide, majnemer, spatel, sanjoy, hfinkel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31119

llvm-svn: 298478
2017-03-22 04:03:53 +00:00
Reid Kleckner 45928018c5 [codeview] Use separate records for LF_SUBSTR_LIST and LF_ARGLIST
They are structurally the same, but now we need to distinguish them
because one record lives in the IPI stream and the other lives in TPI.

llvm-svn: 298474
2017-03-22 01:37:38 +00:00
Aditya Nandakumar bc389badbc [GlobalISel]: Create VREGs for ConstantInt args
This patch changes the behavior of IRTranslating intrinsics where we
now create VREG + G_CONSTANT for ConstantInt values. We already do this
for FloatingPoint values. This makes it easier for the backends to
select code and it won't have to de-duplicate creation+selection of
constants.

Reviewed by: ab

llvm-svn: 298473
2017-03-22 01:16:39 +00:00
Adrian Prantl 0498baa4be Don't compose DWARF expressions with multiple subregisters.
If a register location can only be described by a complex expression
(i.e., multiple subregisters) it doesn't safely compose with another
complex expression. For example, it is not possible to apply a
DW_OP_deref operation to multiple DW_OP_pieces.

llvm-svn: 298472
2017-03-22 01:16:01 +00:00
Ahmed Bougacha 15b3e8a93a [GlobalISel] Update DBG_VALUEs referencing DCE'd instructions.
Quentin points out that r298358 would cause us to emit different code
with debug info.  That's a big no-no; also erase the instructions that
only live thanks to DBG_VALUE users.

Adrian explained how this is an existing problem and an OK thing to do:
clang has allocas for all variables so shouldn't be affected at -O0, but
swift uses a bit of inlineasm to explicitly keep values live for the
purpose of debug info quality.  I'm not sure there is a better scheme.

llvm-svn: 298460
2017-03-21 23:42:54 +00:00
Ahmed Bougacha e8e1fa3a7c [GlobalISel] Don't translate br to layout successor.
MI can represent fallthrough to layout successor blocks, and our
post-isel representation uses that extensively.

We might as well use it too, to avoid translating and carrying along
unnecessary branches.

llvm-svn: 298459
2017-03-21 23:42:50 +00:00
Matt Arsenault 5b20fbb748 AMDGPU: Rename SI_RETURN
This is used for a specific type of return to a shader part's
epilog code. Rename to try avoiding confusion from a true
call's return.

llvm-svn: 298452
2017-03-21 22:18:10 +00:00
Matthias Braun 8445cbd1ca SplitKit: Fix subreg copy related problems
Fix two problems related to r298025:
- SplitKit would create duplicate VNIs in some cases leading to crashs
  when hoisting copies.
- VirtRegMap could fail expanding copies at the beginning of a basic
  block.

This fixes http://llvm.org/PR32353

llvm-svn: 298448
2017-03-21 21:58:08 +00:00
Matt Arsenault 3dbeefa978 AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444
2017-03-21 21:39:51 +00:00
Tim Northover dd4b9d6d7b GlobalISel: widen booleans by zero-extending to a byte.
A bool is represented by a single byte, which the ARM ABI requires to be either
0 or 1. So we cannot use G_ANYEXT when legalizing the type.

llvm-svn: 298439
2017-03-21 21:12:04 +00:00
Sanjay Patel 68c6beabe9 [InstCombine] regenerate checks; NFC
llvm-svn: 298432
2017-03-21 20:14:38 +00:00
George Burgess IV 56c7e88c2c Let llvm.objectsize be conservative with null pointers
This adds a parameter to @llvm.objectsize that makes it return
conservative values if it's given null.

This fixes PR23277.

Differential Revision: https://reviews.llvm.org/D28494

llvm-svn: 298430
2017-03-21 20:08:59 +00:00
Artyom Skrobov 40a4f40679 [ARM] Recommit the glueless lowering of addc/adde in Thumb1,
including the amended (no UB anymore) fix for adding/subtracting -2147483648.

This reverts r298328 "[ARM] Revert r297443 and r297820."
and partially reverts r297842 "Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648""

llvm-svn: 298417
2017-03-21 18:39:41 +00:00
Dehao Chen 190f17cae7 Use ProfileSummary:getProfileCount to get ScaledCount for ModuleSummary
Summary: ModuleSummary should use the standard interface of ProfileSummary::getProfileCount.

Reviewers: eraman, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D31154

llvm-svn: 298404
2017-03-21 17:22:35 +00:00
Adrian Prantl 4dc0324ff8 Revert 298388 and 298389 because they broke some AMDGPU tests.
llvm-svn: 298401
2017-03-21 17:14:30 +00:00
Krzysztof Parzyszek d033d1fd82 Recommit r298282 with fixes for memory allocation/deallocation
[Hexagon] Recognize polynomial-modulo loop idiom again

Regain the ability to recognize loops calculating polynomial modulo
operation. This ability has been lost due to some changes in the
preceding optimizations. Add code to preprocess the IR to a form
that the pattern matching code can recognize.

llvm-svn: 298400
2017-03-21 17:09:27 +00:00
Marek Olsak 5c7a61d221 AMDGPU: Buffer descriptor changes for GFX9
Reviewers: arsenm

Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr

Differential Revision: https://reviews.llvm.org/D31158

llvm-svn: 298397
2017-03-21 17:00:39 +00:00
Marek Olsak e22fdb9cac AMDGPU: Always use VGPR indexing on GFX9
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr

Differential Revision: https://reviews.llvm.org/D31157

llvm-svn: 298396
2017-03-21 17:00:32 +00:00
Krzysztof Parzyszek 5e7f06f354 [Hexagon] Add -march=hexagon to a testcase
llvm-svn: 298395
2017-03-21 16:59:40 +00:00
Adrian Prantl 1db6486fb1 Don't compose DWARF expressions with multiple subregisters.
If a register location can only be described by a complex expression
(i.e., multiple subregisters) it doesn't safely compose with another
complex expression. For example, it is not possible to apply a
DW_OP_deref operation to multiple DW_OP_pieces.

llvm-svn: 298389
2017-03-21 16:37:39 +00:00
Matt Arsenault f8fb605a68 AMDGPU: Fix asserting on 0 dmask for image intrinsics
Fold these to undef during lowering so users get eliminated.

llvm-svn: 298387
2017-03-21 16:32:17 +00:00
Matt Arsenault 964a848514 AMDGPU: Convert image intrinsic uses in tests
llvm-svn: 298386
2017-03-21 16:24:12 +00:00
Matt Arsenault dce313c3cf DAG: Fold bitcast/extract_vector_elt of undef to undef
Fixes not eliminating store when intrinsic is lowered to undef.

llvm-svn: 298385
2017-03-21 16:20:16 +00:00
Simon Pilgrim 5e39cbaee5 Fix shufpd test name.
llvm-svn: 298381
2017-03-21 15:12:53 +00:00
Sanne Wouda 2409c6403d [ARM] [Assembler] Support negative immediates for A32, T32 and T16
Summary:
To support negative immediates for certain arithmetic instructions, the
instruction is converted to the inverse instruction with a negated (or inverted)
immediate. For example, "ADD r0, r1, #FFFFFFFF" cannot be encoded as an ADD
instruction.  However, "SUB r0, r1, #1" is equivalent.

These conversions are different from instruction aliases.  An alias maps
several assembler instructions onto one encoding.  A conversion, however, maps
an *invalid* instruction--e.g. with an immediate that cannot be represented in
the encoding--to a different (but equivalent) instruction.

Several instructions with negative immediates were being converted already, but
this was not systematically tested, nor did it cover all instructions.

This patch implements all possible substitutions for ARM, Thumb1 and
Thumb2 assembler and adds tests.  It also adds a feature flag
(-mattr=+no-neg-immediates) to turn these substitutions off.  This is
helpful for users who want their code to assemble to exactly what they
wrote.

Reviewers: t.p.northover, rovka, samparker, javed.absar, peter.smith, rengolin

Reviewed By: javed.absar

Subscribers: aadg, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D30571

llvm-svn: 298380
2017-03-21 14:59:17 +00:00
Sanjay Patel 00ece756c3 [InstCombine] auto-generate better checks; NFC
llvm-svn: 298377
2017-03-21 14:04:44 +00:00
Sanjay Patel 79379cae15 [x86] use PMOVMSK for vector-sized equality comparisons
We could do better by splitting any oversized type into whatever vector size the target supports, 
but I left that for future work if it ever comes up. The motivating case is memcmp() calls on 16-byte
structs, so I think we can wire that up with a TLI hook that feeds into this.

Differential Revision: https://reviews.llvm.org/D31156

llvm-svn: 298376
2017-03-21 13:50:33 +00:00
Simon Pilgrim 8bda035121 [X86][AVX] Tests showing missing SHUFPD + ZERO lowering
This lowers to SHUFPD if the input is zeroinitializer but not with a demanded elts optimized build vector.

llvm-svn: 298370
2017-03-21 13:30:40 +00:00
Valery Pykhtin fd4c410f4d [AMDGPU] Iterative scheduling infrastructure + minimal registry scheduler
Differential revision: https://reviews.llvm.org/D31046

llvm-svn: 298368
2017-03-21 13:15:46 +00:00
Volkan Keles 044e003203 [GlobalISel] Fix shufflevector tests
clang-lld-x86_64-2stage fails because of the order
of the instructions. `CHECK-DAG` directives should
fix the problem.

llvm-svn: 298367
2017-03-21 13:12:59 +00:00
Sam Kolton f60ad58dad [ADMGPU] SDWA peephole optimization pass.
Summary:
First iteration of SDWA peephole.

This pass tries to combine several instruction into one SDWA instruction. E.g. it converts:
'''
    V_LSHRREV_B32_e32 %vreg0, 16, %vreg1
    V_ADD_I32_e32 %vreg2, %vreg0, %vreg3
    V_LSHLREV_B32_e32 %vreg4, 16, %vreg2
'''
Into:
'''
   V_ADD_I32_sdwa %vreg4, %vreg1, %vreg3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD
'''

Pass structure:
    1. Iterate over machine instruction in basic block and try to apply "SDWA patterns" to each of them. SDWA patterns match machine instruction into either source or destination SDWA operand. E.g. ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1''' is matched to source SDWA operand '''%vreg1 src_sel:WORD_1'''.
    2. Iterate over found SDWA operands and find instruction that could be potentially coverted into SDWA. E.g. for source SDWA operand potential instruction are all instruction in this basic block that uses '''%vreg0'''
    3. Iterate over all potential instructions and check if they can be converted into SDWA.
    4. Convert instructions to SDWA.

This review contains basic implementation of SDWA peephole pass. This pass requires additional testing fot both correctness and performance (no performance testing done).
There are several ways this pass can be improved:
    1. Make this pass work on whole function not only basic block. As I can see this can be done right now without changes to pass.
    2. Introduce more SDWA patterns
    3. Introduce mnemonics to limit when SDWA patterns should apply

Reviewers: vpykhtin, alex-t, arsenm, rampitec

Subscribers: wdng, nhaehnle, mgorny

Differential Revision: https://reviews.llvm.org/D30038

llvm-svn: 298365
2017-03-21 12:51:34 +00:00
Andrea Di Biagio 7937be7dd3 [DebugInfo][X86] Teach Optimize LEAs pass to handle debug values
This patch fixes an issue in the Optimize LEAs pass where redundant LEAs were
not removed because they were being used by debug values. The debug values are
now ignored when determining whether LEAs are redundant.

For now the debug values for the redundant LEAs are marked as undefined,
effectively lost. The intention is for a follow up patch which will attempt to
preserve the debug values where possible.

Patch by Andrew Ng.

Differential Revision: https://reviews.llvm.org/D30835

llvm-svn: 298360
2017-03-21 11:36:21 +00:00
Jonas Paulsson 54c7680e1f [DAGTypeLegalizer] Handle widening truncate to vector of i1.
Previously, PromoteIntRes_TRUNCATE() did not handle the case where
the operand needs widening, which resulted in llvm_unreachable().

This patch adds the needed handling, along with a test case.

Review: Eli Friedman, Simon Pilgrim.
https://reviews.llvm.org/D31077

llvm-svn: 298357
2017-03-21 10:24:14 +00:00
David Green da21170c49 [ConstantFolding] Fix to prevent constant folding having to repeatedly scan operands. NFCI
After the loop unroll threshold was increased in r295538, very
large constant expressions can be created. This prevents them
from having to be recursively scanned, leading to a compile
time blow-up.

Differential Revision: https://reviews.llvm.org/D30689

llvm-svn: 298356
2017-03-21 10:17:39 +00:00
Volkan Keles 75bdc7690e [GlobalISel] Translate shufflevector
Reviewers: qcolombet, aditya_nandakumar, t.p.northover, javed.absar, ab, dsanders

Reviewed By: javed.absar

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30962

llvm-svn: 298347
2017-03-21 08:44:13 +00:00
Jonas Paulsson bd65421f08 [SystemZ] Don't drop MO flags in foldMemoryOperandImpl()
The def operand of the new LG/LD should have the old def operands
flags and subreg index.

New test: test/CodeGen/SystemZ/fold-memory-op-impl.ll

Review: Ulrich Weigand
llvm-svn: 298341
2017-03-21 05:49:40 +00:00
Vitaly Buka c12716e742 Revert "[Hexagon] Recognize polynomial-modulo loop idiom again"
Fix memory leaks on check-llvm tests detected by Asan.

This reverts commit r298282.

llvm-svn: 298329
2017-03-21 00:59:51 +00:00
Eli Friedman 76732acc23 [ARM] Revert r297443 and r297820.
The glueless lowering of addc/adde in Thumb1 has known serious
miscompiles (see https://reviews.llvm.org/D31081), and r297820
causes an infinite loop for certain constructs.  It's not
clear when they will be fixed, so let's just take them out
of the tree for now.

(I resolved a small conflict with r297453.)

llvm-svn: 298328
2017-03-21 00:26:39 +00:00
Vadzim Dambrouski ba789cbd3d [ARM] Fix PR32130: Handle promotion of zero sized constants.
The special case of zero sized values was previously not handled correctly.
This patch handles this by not promoting if the size is zero.

Patch by Tim Neumann.

Differential Revision: https://reviews.llvm.org/D31116

llvm-svn: 298320
2017-03-20 22:59:57 +00:00
Sanjay Patel f238902f52 [x86] add tests for setcc of i128/i256; NFC
llvm-svn: 298317
2017-03-20 22:15:40 +00:00
Matt Arsenault 6b00d40900 InstCombine: Check source value precision when reducing cast intrinsic
Missed this check when porting from the libcall version.

llvm-svn: 298312
2017-03-20 21:59:24 +00:00
Tim Northover 4340d64f91 GlobalISel: add implicit defs & uses when mutating an instruction.
Otherwise a scheduler might do bad things to the code we produce.

llvm-svn: 298311
2017-03-20 21:58:23 +00:00
Eli Friedman b1578d3612 [SCEV] Fix trip multiple calculation
If loop bound containing calculations like min(a,b), the Scalar
Evolution API getSmallConstantTripMultiple returns 4294967295 "-1"
as the trip multiple. The problem is that, SCEV use -1 * umax to
represent umin. The multiple constant -1 was returned, and the logic
of guarding against huge trip counts was skipped. Because -1 has 32
active bits.

The fix attempt to factor more general cases. First try to get the
greatest power of two divisor of trip count expression. In case
overflow happens, the trip count expression is still divisible by the
greatest power of two divisor returned. Returns 1 if not divisible by 2.

Patch by Huihui Zhang <huihuiz@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D30840

llvm-svn: 298301
2017-03-20 20:25:46 +00:00
David L. Jones d61548471c [X86] Clean up test/CodeGen/X86/2006-03-01-InstrSchedBug.ll
Summary:
- Migrated from grep to FileCheck.
- Re-indented, removed boilerplate comments.
- Added 'entry' label at beginning of basic block.

Patch by Jorge Gorbe!

Reviewed By: RKSimon

Subscribers: RKSimon, jgorbe, llvm-commits

Differential Revision: https://reviews.llvm.org/D30317

llvm-svn: 298298
2017-03-20 20:10:30 +00:00
Nirav Dave f5f0864ac2 Add test case for merging of chained stores of mismatched type.
llvm-svn: 298293
2017-03-20 19:48:22 +00:00
Kevin Enderby a8d256cb36 Add the rest of the error checking for Mach-O dyld compact bind entry errors
and test cases for each of the error checks.

To do this more plumbing was needed so that the segment indexes and
segment offsets can be checked.  Basically what was done was the SegInfo
from llvm-objdump’s MachODump.cpp was moved into libObject for Mach-O
objects as BindRebaseSegInfo and it is only created when an iterator for
bind or rebase entries are created.

This commit really only adds the error checking and test cases for the
bind table entires and the checking for the lazy bind and weak bind entries
are still to be fully done as well as the rebase entires.  Though some of
the plumbing for those are added with this commit.  Those other error
checks and test cases will be added in follow on commits.

Note, the two llvm_unreachable() calls should now actually be unreachable
with the error checks in place and would take a logic bug in the error
checking code to be reached if the segment indexes and segment
offsets are used from a checked bind entry.  Comments have been added
to the methods that require the arguments to have been checked
prior to calling.

llvm-svn: 298292
2017-03-20 19:46:55 +00:00