Commit Graph

145226 Commits

Author SHA1 Message Date
Craig Topper bb10c0f1ec [X86] FileCheckize one of the rotate tests.
llvm-svn: 295676
2017-02-20 19:44:10 +00:00
Steven Wu abfea28867 Fix use-after-free found by ASAN
DenseMap::lookup returns copy of the value in the map. Returning the
address of the temporary return value will cause use-after-free.

llvm-svn: 295675
2017-02-20 18:33:40 +00:00
Craig Topper 2012dda9a0 [AVX-512] Add a few more patterns for selecting masked vpternlog with broadcast loads where the passthru operand is not operand 0.
llvm-svn: 295673
2017-02-20 17:44:09 +00:00
Simon Pilgrim 2967ed1c7e [X86] Tidyup combineExtractVectorElt. NFCI.
Pull out repeated code for extraction index operand and source vector value type.

Use isNullConstant helper to check for zero extraction index.

llvm-svn: 295670
2017-02-20 16:09:45 +00:00
Simon Pilgrim e9a8145adb [X86][SSE] Regenerate extracted bitcasted constant tests and add 32-bit test target
llvm-svn: 295669
2017-02-20 15:57:14 +00:00
Daniel Sanders e604ef5f55 [globalisel] OperandPredicateMatcher's shouldn't need to generate the MachineOperand expr. NFC
Summary:
Each OperandPredicateMatcher shouldn't need to know how to generate the expression
to reference a MachineOperand. The OperandMatcher should provide it.

In addition to separating responsibilities, this also lays some groundwork for
decoupling source patterns from destination patterns to allow invented operands
or operands provided by GlobalISel's equivalent to the ComplexPattern<> class.

Depends on D29709

Reviewers: t.p.northover, ab, rovka, qcolombet, aditya_nandakumar

Reviewed By: ab

Subscribers: dberris, kristof.beyls, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D29710

llvm-svn: 295668
2017-02-20 15:30:43 +00:00
Simon Pilgrim 72d666e443 [X86][SSE] Regenerate re-materialized store tests and add 64-bit test target
llvm-svn: 295666
2017-02-20 15:20:37 +00:00
Simon Pilgrim 5a33d1c266 [X86][SSE] Regenerate vselect widening tests and add 32-bit test target
llvm-svn: 295665
2017-02-20 15:16:43 +00:00
Diana Picus 1c33c9f0b0 [ARM] GlobalISel: Don't select atomic loads
There used to be a check in the IRTranslator that prevented us from having to
deal with atomic loads/stores. That check has been removed in r294993 and the
AArch64 backend was updated accordingly. This commit does the same thing for the
ARM backend.

In general, in the ARM backend we introduce fences during the atomic expand
pass, so we don't have to worry about atomics, *except* for the 32-bit ARMv8
target, which handles atomics more like AArch64. Since we don't want to worry
about that yet, just bail out of instruction selection if we find any atomic
loads.

llvm-svn: 295662
2017-02-20 14:45:58 +00:00
Daniel Sanders b41ce2b392 [globalisel] Separate the SelectionDAG importer from the emitter. NFC
Summary:
In the near future the rules will be sorted between these two steps to
ensure that more important rules are not prevented by less important ones.

Reviewers: t.p.northover, ab, rovka, qcolombet, aditya_nandakumar

Reviewed By: ab

Subscribers: dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D29709

llvm-svn: 295661
2017-02-20 14:31:27 +00:00
Igor Breger fda32d266a [X86] Fix EXTRACT_VECTOR_ELT with variable index from v32i16 and v64i8 vector.
Its more profitable to go through memory (1 cycles throughput)
than using VMOVD + VPERMV/PSHUFB sequence ( 2/3 cycles throughput) to implement EXTRACT_VECTOR_ELT with variable index.
IACA tool was used to get performace estimation (https://software.intel.com/en-us/articles/intel-architecture-code-analyzer)
For example for var_shuffle_v16i8_v16i8_xxxxxxxxxxxxxxxx_i8 test from vector-shuffle-variable-128.ll I get 26 cycles vs 79 cycles. 
Removing the VINSERT node, we don't need it any more.

Differential Revision: https://reviews.llvm.org/D29690

llvm-svn: 295660
2017-02-20 14:16:29 +00:00
Alexey Bataev 19b35bf7f4 [SLP] Additional test for vectorization of cal/invoke args vectorization
llvm-svn: 295657
2017-02-20 12:41:16 +00:00
Simon Pilgrim 5910ebe720 [X86][AVX512] Add support for ASHR v2i64/v4i64 support without VLX
Use v8i64 ASHR instructions if we don't have VLX.

Differential Revision: https://reviews.llvm.org/D28537

llvm-svn: 295656
2017-02-20 12:16:38 +00:00
Sanne Wouda 47eb9723de [ARM] Add a div regression test for Cortex-M23
Summary:
This file was missed in the commit for Cortex-M23 and Cortex-M33
support.  See https://reviews.llvm.org/D29073?id=85814 .

Reviewers: rengolin, javed.absar, samparker

Reviewed By: samparker

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D30162

llvm-svn: 295655
2017-02-20 12:05:07 +00:00
Simon Pilgrim c0dc9a4913 Strip trailing whitespace.
llvm-svn: 295653
2017-02-20 11:56:43 +00:00
Simon Pilgrim 50b958c07a [SelectionDAG] Add scalarization support for ISD::*_EXTEND_VECTOR_INREG opcodes.
Thanks to Mikael Holmén for the initial test case

llvm-svn: 295652
2017-02-20 11:55:58 +00:00
Sjoerd Meijer e22a79e898 AArch64AsmParser: tablegen the isBranchTarget helper functions
Use tablegen to autogenerate isBranchtarget helper functions. This is a cleanup
that removes almost identical functions that differ only in a few constants.

Differential Revision: https://reviews.llvm.org/D30160

llvm-svn: 295649
2017-02-20 10:57:54 +00:00
Simon Dardis df943b02a9 [mips] Add test for mul macro variants
llvm-svn: 295648
2017-02-20 10:53:03 +00:00
NAKAMURA Takumi 5539d8d1c9 llvm/examples/Kaleidoscope/BuildingAJIT: More fixup corresponding to r295636.
I missed updating them since I just ran check-llvm (with examples) in r295645.

llvm-svn: 295646
2017-02-20 10:07:41 +00:00
NAKAMURA Takumi d4c7a12177 llvm/examples/Kaleidoscope/include/KaleidoscopeJIT.h: Fixup corresponding to r295636.
llvm-svn: 295645
2017-02-20 09:56:24 +00:00
Ayman Musa 51ffeab8c8 [X86][AVX] Extend hasVEX_WPrefix bit to accept WIG value (W Ignore) + update all AVX instructions with the new value.
Add WIG value to all of AVX instructions which ignore the W-bit in their encoding, instead of giving them the default value of 0.
This patch is needed for a follow up work on EVEX2VEX pass (replacing EVEX encoded instructions with their corresponding VEX version when possible).

Differential Revision: https://reviews.llvm.org/D29876

llvm-svn: 295643
2017-02-20 08:27:54 +00:00
Alexey Bataev f96465b9b8 [SLP] nullptr'ize initial value in `findBuildAggregate()`, NFC.
Initial value of V is sett nullptr, as it is not used.

llvm-svn: 295642
2017-02-20 08:04:11 +00:00
Alexey Bataev 2f6b124e01 [SLP] Rework `findBuildAggregate()` from ercursive form to iterative, NFC.
Reviewers: mkuper

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D30103

llvm-svn: 295641
2017-02-20 07:49:39 +00:00
Craig Topper c6c68f5958 [AVX-512] Add more patterns to fold masked VPTERNLOG with load when the passthru isn't operand 0.
llvm-svn: 295640
2017-02-20 07:00:40 +00:00
Craig Topper 5aef828ba7 [AVX-512] Add tests for missed opportunities to fold masked VPTERNLOG with load when the passthru op isn't operand 0.
llvm-svn: 295639
2017-02-20 07:00:37 +00:00
Craig Topper a5fa2e40f9 [AVX-512] Fix mistake in the immediate swizzle for some of the VPTERNLOG patterns.
llvm-svn: 295638
2017-02-20 07:00:34 +00:00
Craig Topper cb5b45cc36 [AVX-512] Use a better immediate in the VPTERNLOG commuting tests so its easier to spot bad swizzling.
llvm-svn: 295637
2017-02-20 07:00:31 +00:00
Lang Hames 67de5d24a9 [Orc] Rename ObjectLinkingLayer -> RTDyldObjectLinkingLayer.
The current ObjectLinkingLayer (now RTDyldObjectLinkingLayer) links objects
in-process using MCJIT's RuntimeDyld class. In the near future I hope to add new
object linking layers (e.g. a remote linking layer that links objects in the JIT
target process, rather than the client), so I'm renaming this class to be more
descriptive.

llvm-svn: 295636
2017-02-20 05:45:14 +00:00
Craig Topper 5b4e36aafa [AVX-512] Add more VPTERNLOG patterns to enable folding of broadcast loads that aren't in operand 2.
llvm-svn: 295634
2017-02-20 02:47:42 +00:00
Craig Topper c184b671d9 [X86] Use memory form of shift right by 1 when the rotl immediate is one less than the operation size.
An earlier commit already did this for the register form.

llvm-svn: 295626
2017-02-20 00:37:23 +00:00
Craig Topper 0f14411b57 [X86] Add test cases showing missed opportunities to use rotate right by 1 instructions when operation reads/writes memory.
llvm-svn: 295625
2017-02-20 00:37:20 +00:00
Daniel Jasper 5a51f8cae4 s/REQUIRES: Asserts/REQUIRES: asserts/
Other than this, we consistently use lower case.

llvm-svn: 295623
2017-02-19 23:26:00 +00:00
Craig Topper 63801df251 [AVX-512] Remove AddedComplexity from masked operations. The size of the patterns already increases their priority.
llvm-svn: 295619
2017-02-19 21:44:35 +00:00
Simon Pilgrim 14a7eee0b4 [X86] Use peekThroughOneUseBitcasts helper. NFCI.
llvm-svn: 295618
2017-02-19 21:40:51 +00:00
Davide Italiano 16b476ffcc [X86] Prefer static_cast<> to C-style cast. NFCI.
llvm-svn: 295617
2017-02-19 21:35:41 +00:00
Craig Topper 489057715e [AVX-512] Disable peephole optimizations on the VPTERNLOG commute test. Add new patterns to enable isel to fold the loads on it own.
llvm-svn: 295616
2017-02-19 21:32:15 +00:00
Davide Italiano 1aef59eb44 [AArch64] Prefer static_cast<> to C-style cast. NFCI.
llvm-svn: 295615
2017-02-19 21:31:14 +00:00
Simon Pilgrim d590de2998 [X86][SSE] Use getTargetConstantBitsFromNode to find zeroable shuffle elements.
Replaces existing approach that could only search BUILD_VECTOR nodes.

Requires getTargetConstantBitsFromNode to discriminate cases with all/partial UNDEF bits in each element - this should also be useful when we get around to supporting getTargetShuffleMaskIndices with UNDEF elements. 

llvm-svn: 295613
2017-02-19 19:40:31 +00:00
Craig Topper 4e794c71a6 [AVX-512] Add patterns to recognize masked vpternlog when the passthrough operand is not operand 0.
This uses a SDNodeXForm to swizzle the appropriate immediate bits to allow this to be matched.

llvm-svn: 295612
2017-02-19 19:36:58 +00:00
Craig Topper ab1afa85ba [AVX-512] Add test cases that show failure to select masked VPTERNLOG when a select is used to force the passthru operand to be not operand 0.
llvm-svn: 295611
2017-02-19 19:36:54 +00:00
Simon Pilgrim 4271186f9c [X86][SSE] Enable initial support for domain crossing at high shuffle combine depths.
As discussed on D27692, this permits another domain to be used to combine a shuffle at high depths.

We currently set the required depth at 4 or more combined shuffles, this is probably too high for most targets but is a good starting point and already helps avoid a number of costly variable shuffles.

llvm-svn: 295608
2017-02-19 17:19:38 +00:00
Artyom Skrobov be31754094 Remove redundant call to GluedNodes.back() [NFC]
llvm-svn: 295607
2017-02-19 16:56:18 +00:00
Simon Pilgrim 6d07d514de [X86][SSE] Generalize INSERTPS/SHUFPS/SHUFPD combines across domains.
Relax the INSERTPS/SHUFPS/SHUFPD combines to support integer inputs if permitted.

llvm-svn: 295606
2017-02-19 15:15:40 +00:00
Igor Kudrin 9e015dafbf [llvm-cov] Respect Windows line endings when parsing demangled symbols.
Differential Revision: https://reviews.llvm.org/D30096

llvm-svn: 295605
2017-02-19 14:26:52 +00:00
Simon Pilgrim b4460cf5a9 [X86][SSE] Add domain crossing support for target shuffle combines.
Add the infrastructure to flag whether float and/or int domains are permitable.

A future patch will enable domain crossing based off shuffle depth and the value types of the source vectors.

llvm-svn: 295604
2017-02-19 14:12:25 +00:00
Simon Pilgrim 5798180947 Removed extra ';'
llvm-svn: 295603
2017-02-19 12:32:44 +00:00
Craig Topper 218d1a020e [AVX-512] Add broadcast VPTERNLOG instructions to special case commuting switch.
The instructions are marked commutable, but without special handling we don't get the immediate correct.

While here also remove the masked memory forms that aren't commutable.

llvm-svn: 295602
2017-02-19 08:03:26 +00:00
Craig Topper 94de4b9330 [AVX-512] Add patterns to show missed opportunities for folding vpternlog with broadcast loads. Also demonstrates a bug in the commuting of broadcast vpternlog instructions when we are able to select them.
llvm-svn: 295601
2017-02-19 08:03:23 +00:00
NAKAMURA Takumi 7802984126 Untabify.
llvm-svn: 295599
2017-02-19 06:51:46 +00:00
Daniel Berlin 17b1375299 Re-add debugcounter.ll with Requires: Asserts so that it only triggers when asserts are on
llvm-svn: 295598
2017-02-19 06:45:02 +00:00
Daniel Berlin d46dfb3d0e Which, in turn, causes build bots to fail that have it unexpectedly passing. So remove debugcounter.ll for now
llvm-svn: 295597
2017-02-19 04:56:07 +00:00
Daniel Berlin 1ad7f5cab0 XFAIL this test until we figure out what to do here, since it will fail if NDEBUG defined
llvm-svn: 295596
2017-02-19 04:55:02 +00:00
Daniel Berlin ce3d348a5c Add two files lost in rebase, causing build break
llvm-svn: 295595
2017-02-19 04:29:50 +00:00
Daniel Berlin a4b5c01dd2 Add a DebugCounter for PredicateInfo renaming, and an associated test
llvm-svn: 295594
2017-02-19 04:29:01 +00:00
Daniel Berlin 25f1db1111 Add initial support for debug counting
Summary:

We have support for bisection, and bugpoint can reduce testcases
often to a single pass. But that doesn't help reduce it to a single
transform by a single pass.  Which debug counting lets us do.

Debug counting lets you instrument a pass so that it only executes a
certain thing (rwhatever you want) after skipping it a certain time of
times, and then only does a certain number of executions before saying
"skip" again.

To make it concrete, for predicateinfo, if i instrument use renaming,
i can make it so it skips renaming the first N uses, renames the next
N, and then skips the rest.

This lets you narrow down a miscompilation to, often, a single
transformation, and then also debug it (by using the same command line
parameters).

Reviewers: chandlerc, davide, mehdi_amini

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D29998

llvm-svn: 295593
2017-02-19 04:28:56 +00:00
NAKAMURA Takumi 486dfe11af llvm/test/CodeGen/AMDGPU/r600.alu-limits.ll should require +Asserts. This would run into infinite loop anyways with -Asserts.
llvm-svn: 295591
2017-02-19 02:31:06 +00:00
Craig Topper 007c93b2b9 [X86] Remove patterns for MOVSD with v4i32 types. We don't appear to really need them and if we do we should just use a bitcast to a 64-bit element type.
llvm-svn: 295589
2017-02-19 02:08:48 +00:00
Craig Topper 06ae5e821c [X86] Tighten up some of the SDNode type constraints.
llvm-svn: 295588
2017-02-19 01:54:47 +00:00
Simon Pilgrim dba9011942 Fix unused variable warning when assertions are disabled.
llvm-svn: 295587
2017-02-19 00:33:37 +00:00
Simon Pilgrim 599b872ca2 [X86] Fix enumeral/non-enumeral conditional expression warning.
gcc only allows you to mix enums / ints if they have the same signedness.

llvm-svn: 295586
2017-02-19 00:04:30 +00:00
Simon Pilgrim ee6ef4d6dd Fix 'variable set but not used' warning when assertions are disabled.
llvm-svn: 295585
2017-02-19 00:03:46 +00:00
Daniel Berlin f7d9580a08 NewGVN: Start making use of predicateinfo pass.
Summary: This begins using the predicateinfo pass in NewGVN.

Reviewers: davide

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D29682

llvm-svn: 295583
2017-02-18 23:06:50 +00:00
Daniel Berlin b355c4ff5f NewGVN: Make ranking prefer undef to constants. Fix direction of
shouldSwapOperands to be correct.

llvm-svn: 295582
2017-02-18 23:06:47 +00:00
Daniel Berlin 588e0be39d PredicateInfo: Clean up predicate info a little, using insertion
helpers, and fixing support for the renaming the comparison.

llvm-svn: 295581
2017-02-18 23:06:38 +00:00
Simon Pilgrim 2f2d8dc630 Fix signed/unsigned comparison warning.
llvm-svn: 295580
2017-02-18 22:56:17 +00:00
Craig Topper 811756b4dc [X86][XOP] Reduce the size of a multiclass by moving more stuff to parameters instead of doing 128-bit and 256-bit simultaneously.
This requires some instructions to be renamed to move the Y earlier in the instruction name. The new names are more consistent with other instructions.

llvm-svn: 295579
2017-02-18 22:53:43 +00:00
Craig Topper f6564c991b [TableGen] Make sure EnforceSameSize populates the type sets if necessary.
This was found by another commit I'm working on.

llvm-svn: 295578
2017-02-18 22:53:38 +00:00
Simon Pilgrim b092166a76 [AArch64] Fix enumeral/non-enumeral conditional expression warning.
gcc only allows you to mix enums / ints if they have the same signedness.

llvm-svn: 295577
2017-02-18 22:50:28 +00:00
Simon Pilgrim 7a87eebcad [X86] Fix enumeral/non-enumeral comparison warning.
gcc only allows you to mix enums / ints if they have the same signedness.

llvm-svn: 295576
2017-02-18 22:40:58 +00:00
Simon Pilgrim 2e78c94ea5 [X86][SSE] Avoid repeated calls to SDValue::getValueType.
Added assertion to check input type of X86ISD::VZEXT during target known bits calculation.

llvm-svn: 295575
2017-02-18 22:25:27 +00:00
Sanjay Patel 53c5c3d65d [InstCombine] add nsw/nuw X, signbit --> or X, signbit
Changing to 'or' (rather than 'xor' when no wrapping flags are set)
allows icmp simplifies to happen as expected.

Differential Revision: https://reviews.llvm.org/D29729

llvm-svn: 295574
2017-02-18 22:20:09 +00:00
Sanjay Patel fe67255961 [InstSimplify] add nsw/nuw (xor X, signbit), signbit --> X
The change to InstCombine in:
https://reviews.llvm.org/D29729
...exposes this missing fold in InstSimplify, so adding this
first to avoid a regression.

llvm-svn: 295573
2017-02-18 21:59:09 +00:00
Sanjay Patel 308eb22118 [InstSimplify] add tests for add nsw/nuw (xor X, signbit), signbit --> X; NFC
llvm-svn: 295572
2017-02-18 21:51:14 +00:00
Craig Topper de10312bea Recommit "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
Clang has now been fixed to not use these intrinsics.

llvm-svn: 295571
2017-02-18 21:50:58 +00:00
Sanjay Patel dc8a24ea4c [x86] remove stale comments from tests; NFC
llvm-svn: 295569
2017-02-18 21:07:37 +00:00
Sanjay Patel 12c2093e1e [x86] fold sext (xor Bool, -1) --> sub (zext Bool), 1
This is the same transform that is current used for:
select Bool, 0, -1

llvm-svn: 295568
2017-02-18 21:03:28 +00:00
Piotr Padlewski cc5868c186 [MemorySSA] NFC small fixes
Summary:
2 small fixes extracted from
https://reviews.llvm.org/D29064

Reviewers: kuhar, davide, dberlin, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30109

llvm-svn: 295566
2017-02-18 20:34:36 +00:00
Craig Topper ba2a726cc6 Revert "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
This reverts r295564. I missed that clang was still using the intrinsics despite our half implemented autoupgrade support.

llvm-svn: 295565
2017-02-18 20:14:20 +00:00
Craig Topper 884db3f85d [X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR.
It seems we were already upgrading 128-bit VPCMOV, but the intrinsic was still defined and being used in isel patterns. While I was here I also simplified the tablegen multiclasses.

llvm-svn: 295564
2017-02-18 19:51:25 +00:00
Craig Topper 03a9adc2ba [X86][IR] Simplify the XOP vpcmov autoupgrade code. NFC
llvm-svn: 295563
2017-02-18 19:51:19 +00:00
Craig Topper aa49f14496 [X86][IR] Merge together some very similar AutoUpgrade handling. NFC
llvm-svn: 295562
2017-02-18 19:51:14 +00:00
Matt Arsenault 2021f08080 AMDGPU: Fix assembler subtarget predicate for gfx9
This was accepting GFX9 instructions on VI.

llvm-svn: 295557
2017-02-18 19:12:26 +00:00
Matt Arsenault a3b3b489fb AMDGPU: Fix disassembly of aperture registers
llvm-svn: 295555
2017-02-18 18:41:41 +00:00
Matt Arsenault e823d92f7f AMDGPU: Merge initial gfx9 support
llvm-svn: 295554
2017-02-18 18:29:53 +00:00
Sanjay Patel 6d5dddb85f [InstCombine] add tests for trunc(insertelement); NFC
llvm-svn: 295553
2017-02-18 18:27:04 +00:00
Easwaran Raman 617f63640b Refactor instruction simplification code in visitors. NFC.
Several visitors check if operands to the instruction are constants,
either as it is or after looking up SimplifiedValues, check if the
result is a constant and update the SimplifiedValues map. This
refactoring splits it into a common function that does the checking of
whether the operands are constants and updating of the SimplifiedValues
table, and an instruction specific part that is implemented by each
instruction visitor as a lambda and passed to the common function.

Differential revision: https://reviews.llvm.org/D30104

llvm-svn: 295552
2017-02-18 17:22:52 +00:00
Sanjay Patel 86554de2bd [InstCombine] update trunc(shuffle) tests to reflect IR reality; NFC
We're ok shrinking splats, but not shuffles in general.

See https://reviews.llvm.org/D30123 for discussion.

llvm-svn: 295547
2017-02-18 15:24:31 +00:00
Brian Cain cc24826e1e opt-viewer: Fix syntax highlighting
Syntax highlighting has been done line-at-a-time. Done this way, the lexer
resets the context at each line, distorting the formatting.

This change will render the whole file at once and feed the highlighted text
line-at-a-time to be wrapped by the SourceFileRenderer.

Leading/trailing newlines were being ignored by Pygments but since each line
was rendered in its own row, it didn't matter. This bug was masked by the
line-at-a-time algorithm. So now we need to add "stripnl=False" to the 
CppLexer to change its behavior to match the expectation.

llvm-svn: 295546
2017-02-18 15:13:58 +00:00
Craig Topper a505169ca5 [AVX-512] Remove 128/256-bit masked fp max/min intrinsics. Upgrade them to legacy unmasked intrinsics and select instructions.
llvm-svn: 295543
2017-02-18 07:07:50 +00:00
Dehao Chen b70afdb5e8 Add default OptLevel value for createSimpleLoopUnrollPass to fix the build break introduced by r295538. (NFC)
llvm-svn: 295542
2017-02-18 06:42:16 +00:00
Jan Vesely 4b1243facb AMDGPU/R600: Assert on infinite loop in EmitClauseMarkers
Differential Revision: https://reviews.llvm.org/D29792

llvm-svn: 295539
2017-02-18 04:24:10 +00:00
Dehao Chen 7d230325ef Increases full-unroll threshold.
Summary:
The default threshold for fully unroll is too conservative. This patch doubles the full-unroll threshold

This change will affect the following speccpu2006 benchmarks (performance numbers were collected from Intel Sandybridge):

Performance:

403	0.11%
433	0.51%
445	0.48%
447	3.50%
453	1.49%
464	0.75%

Code size:

403	0.56%
433	0.96%
445	2.16%
447	2.96%
453	0.94%
464	8.02%

The compiler time overhead is similar with code size.

Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc

Reviewed By: hfinkel, chandlerc

Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D28368

llvm-svn: 295538
2017-02-18 03:46:51 +00:00
Davide Italiano 982bf827b5 [IR/Verifier] Don't visit DISubprograms more than needed.
Before this patch we happened to visit twice, one when scanning
MDNodes and the other one while visiting the function. Remove
the explicit call to visitDISubprogram there, so we don't emit
the same error twice in case the verifier fail and we save some
time when running it.
Thanks to Justin Bogner for the report and Adrian for the quick
review!

PR: 31995
llvm-svn: 295537
2017-02-18 03:02:44 +00:00
Dylan McKay 83788ff349 [AVR] Set UseIntegratedAssembler
llvm-svn: 295535
2017-02-18 02:26:11 +00:00
Justin Bogner 7bc978b543 OptDiag: Allow constructing DiagnosticLocation from DISubprograms
This avoids creating a DILocation just to represent a line number,
since creating Metadata is expensive. Creating a DiagnosticLocation
directly is much cheaper.

llvm-svn: 295531
2017-02-18 02:00:27 +00:00
Zachary Turner 607418771e Remove the is_trivially_copyable check entirely.
This is still breaking builds because some compilers think
this type is not trivially copyable even when it should be.

Reverting this static_assert until I have time to investigate.

llvm-svn: 295529
2017-02-18 01:51:00 +00:00
Zachary Turner 3720124545 Use llvm workaround for missing is_trivially_copyable.
some versions of GCC don't have this, so LLVM provides a
workaround.

llvm-svn: 295526
2017-02-18 01:46:01 +00:00
Zachary Turner 181fe17b6f Don't assume little endian in StreamReader / StreamWriter.
In an effort to generalize this so it can be used by more than
just PDB code, we shouldn't assume little endian.

llvm-svn: 295525
2017-02-18 01:35:33 +00:00
Matthias Braun 2a707a3d3d machine-region-info.mir: Slightly simplify test, -mtriple
llvm-svn: 295520
2017-02-18 00:48:43 +00:00
Justin Bogner d890f95bf6 OptDiag: Decouple backend diagnostics from debug info metadata
This creates and uses a DiagnosticLocation type rather than using
DebugLoc for this purpose in the backend diagnostics. This is NFC for
now, but will allow us to create locations for diagnostics without
having to create new metadata nodes when we don't have a DILocation.

llvm-svn: 295519
2017-02-18 00:42:23 +00:00
Matthias Braun 431305927f MachineRegionInfo: Fix pass initialization
- Adapt MachineBasicBlock::getName() to have the same behavior as the IR
  BasicBlock (Value::getName()).
- Add it to lib/CodeGen/CodeGen.cpp::initializeCodeGen so that it is linked in
  the CodeGen library.
- MachineRegionInfoPass's name conflicts with RegionInfoPass's name ("region").
- MachineRegionInfo should depend on MachineDominatorTree,
  MachinePostDominatorTree and MachineDominanceFrontier instead of their
  respective IR versions.
- Since there were no tests for this, add a X86 MIR test.

Patch by Francis Visoiu Mistrih<fvisoiumistrih@apple.com>

llvm-svn: 295518
2017-02-18 00:41:16 +00:00
Justin Bogner efc3fbf6a2 Verifier: Disallow a line number without a file in DISubprogram
A line number doesn't make much sense if you don't say where it's
from. Add a verifier check for this and update some tests that had
bogus debug info.

llvm-svn: 295516
2017-02-17 23:57:42 +00:00
Sanjay Patel f8346550bf [InstCombine] add tests for trunc(shuffle X, C, M); NFC
llvm-svn: 295513
2017-02-17 23:16:54 +00:00
Matthias Braun d9a59a8df8 AArch64LoadStoreOptimizer: Correctly clear kill flags
When promoting the Load of a Store-Load pair to a COPY all kill flags
between the store and the load need to be cleared.

rdar://30402435

Differential Revision: https://reviews.llvm.org/D30110

llvm-svn: 295512
2017-02-17 23:15:03 +00:00
Simon Pilgrim 8670993dc1 [X86] Add MOVBE targets to load combine tests
Test folded endian swap tests with MOVBE instructions.

llvm-svn: 295508
2017-02-17 23:00:21 +00:00
Guozhi Wei 7ec2c72095 [PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.

This patch fixes pr31492.

Differential Revision: https://reviews.llvm.org/D28630

This is resubmit of r292680, which was reverted by r293092. The internal application failures were actually caused by a source code bug.

llvm-svn: 295506
2017-02-17 22:29:39 +00:00
Eugene Zelenko be37db1882 [CodeGen] Revert changes in LowLevelType to pre-r295499 to fix broken buildbots.
llvm-svn: 295505
2017-02-17 22:23:34 +00:00
Krzysztof Parzyszek 1aaf41af54 [Hexagon] Start using regmasks on calls
Reapply r295371 with a fix for the Windows bot failures.

llvm-svn: 295504
2017-02-17 22:14:51 +00:00
Davide Italiano bca05df38b [NewGVN] isOnlyReachableViaThisEdge() is dead now. NFCI.
llvm-svn: 295503
2017-02-17 22:12:30 +00:00
Simon Pilgrim 7db8f42fe3 [X86] Simplify by pulling out valuetype. NFCI.
llvm-svn: 295502
2017-02-17 22:10:10 +00:00
Eugene Zelenko 9676ebec15 [CodeGen] Attempt to fix buildbots broken in r295499.
llvm-svn: 295501
2017-02-17 22:07:26 +00:00
Davide Italiano 6db0ecafd7 [NewGVN] createVariableOrConstant is not required anymore. NFCI.
llvm-svn: 295500
2017-02-17 21:55:47 +00:00
Eugene Zelenko 5db84df728 [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 295499
2017-02-17 21:43:25 +00:00
Simon Pilgrim 996f9b4cad [X86] Add subborrow stack folding tests
llvm-svn: 295496
2017-02-17 21:16:24 +00:00
Sanjay Patel 00872c3dfe [x86] add tests for sext (not bool); NFC
llvm-svn: 295495
2017-02-17 21:10:40 +00:00
Matthew Simpson a899f86054 [LAA] Remove unused code (NFC)
llvm-svn: 295493
2017-02-17 20:46:52 +00:00
Simon Pilgrim a4c350ff17 [X86][SSE] Add (V)MOVD folding pattern with zextloadi64i32 load node.
Fixes PRPR31309

llvm-svn: 295492
2017-02-17 20:43:32 +00:00
Adrian Prantl e6c6a945ca Fix windows bots by locking down the target triple on this testcase.
llvm-svn: 295490
2017-02-17 20:02:26 +00:00
Matt Arsenault f6cf1032fd AMDGPU: Fix crashes on invalid icmp/fcmp intrinsics
llvm-svn: 295489
2017-02-17 19:49:10 +00:00
Peter Collingbourne 184773d81f WholeProgramDevirt: For VCP use a 32-bit ConstantInt for the byte offset.
A future change will cause this byte offset to be inttoptr'd and then exported
via an absolute symbol. On the importing end we will expect the symbol to be
in range [0,2^32) so that it will fit into a 32-bit relocation. The problem
is that on 64-bit architectures if the offset is negative it will not be in
the correct range once we inttoptr it.

This change causes us to use a 32-bit integer so that it can be inttoptr'd
(which zero extends) into the correct range.

Differential Revision: https://reviews.llvm.org/D30016

llvm-svn: 295487
2017-02-17 19:43:45 +00:00
Adrian Prantl 67c2442210 Debug Info: Sort frame index expressions before emitting them.
This fixes PR31381, which caused an assertion and/or invalid debug info.

This affects debug variables that have multiple fragments in the MMI
side (i.e.: in the stack frame) table.
rdar://problem/30571676

llvm-svn: 295486
2017-02-17 19:42:32 +00:00
Petr Hosek 7acbeaf1bc [CMake] Support externalizing debug info on non-Darwin platforms
On other platorms, we use objcopy to export the debug info.

Differential Revision: https://reviews.llvm.org/D28575

llvm-svn: 295481
2017-02-17 19:29:12 +00:00
Simon Pilgrim a3f2803905 [X86][SHA] Add SHA stack folding tests
llvm-svn: 295479
2017-02-17 19:24:55 +00:00
Artyom Skrobov 4592f6206c In Thumb1 mode, the custom lowering for ARMISD::CMPZ could never emit tADDi3
Reviewers: jmolloy, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D30097

llvm-svn: 295478
2017-02-17 18:59:16 +00:00
Simon Pilgrim f4f5cd5d19 [X86][TBM] Add TBM stack folding tests
llvm-svn: 295477
2017-02-17 18:51:53 +00:00
Tim Northover 88634996c7 GlobalISel: verify that generic loads & stores have a mem operand.
The mem operand is used by GlobalISel to convey atomic constraints so dropping
it is invalid.

llvm-svn: 295476
2017-02-17 18:50:15 +00:00
Joel Jones ab0f3b43e3 [AArch64] Add Cavium ThunderX support
This set of patches adds support for Cavium ThunderX ARM64 processors:

  * ThunderX
  * ThunderX T81
  * ThunderX T83
  * ThunderX T88

Patch by Stefan Teleman
Differential Revision: https://reviews.llvm.org/D28891

llvm-svn: 295475
2017-02-17 18:34:24 +00:00
Peter Collingbourne 37317f1207 WholeProgramDevirt: Examine the function body when deciding whether functions are readnone.
The goal is to get an analysis result even for de-refineable functions.

Differential Revision: https://reviews.llvm.org/D29803

llvm-svn: 295472
2017-02-17 18:17:04 +00:00
Simon Pilgrim 99193de8ab [X86][BMI] Add BMI2 stack folding tests
llvm-svn: 295470
2017-02-17 18:00:43 +00:00
Peter Collingbourne 10c500ddc0 opt: Rename -default-data-layout flag to -data-layout and make it always override the layout.
There isn't much point in a flag that only works if the data layout is empty.

Differential Revision: https://reviews.llvm.org/D30014

llvm-svn: 295468
2017-02-17 17:36:52 +00:00
Justin Bogner 073f56dc1a OptDiag: Rename DiagnosticInfoWithDebugLoc to WithLocation. NFC
This generalizes the name in preparation for decoupling the concept
from DebugLoc.

llvm-svn: 295465
2017-02-17 17:34:37 +00:00
Rui Ueyama ac20c17962 MC/COFF: Do not emit forward associative section referenceds.
MSVC link.exe cannot handle associative sections that refer later
sections in the section header. Technically, such COFF object doesn't
violate the Microsoft COFF spec, as the spec doesn't say anything
about that, but still we should avoid doing that to make it compatible
with MS tools.

This patch assigns smaller section numbers to non-associative sections
and larger numbers to associative sections. This should resolve the
compatibility issue.

Differential Revision: https://reviews.llvm.org/D30080

llvm-svn: 295464
2017-02-17 17:32:54 +00:00
Sanjay Patel 7f2e58972c [DAGCombiner] split i1 select-of-constants from non-i1 case; NFCI
I can't find any tests of the non-i1 code path, so it may be unnecessary at this point.

llvm-svn: 295463
2017-02-17 17:13:27 +00:00
Simon Pilgrim 09dde435ab [X86][BMI] Add BMI stack folding tests
llvm-svn: 295462
2017-02-17 17:11:00 +00:00
Sanjay Patel f2a345c8ee [PowerPC] add tests for select-of-constants; NFC
llvm-svn: 295460
2017-02-17 16:43:43 +00:00
Sanjay Patel 9b6cfaa7b1 [ARM] add tests for select-of-constants; NFC
llvm-svn: 295459
2017-02-17 16:34:13 +00:00
Matthew Simpson f68e183f91 [LV] Remove constant restriction for vector phi creation
We previously only created a vector phi node for an induction variable if its
step had a constant integer type. However, the step actually only needs to be
loop-invariant. We only handle inductions having loop-invariant steps, so this
patch should enable vector phi node creation for all integer induction
variables that will be vectorized.

Differential Revision: https://reviews.llvm.org/D29956

llvm-svn: 295456
2017-02-17 16:09:07 +00:00
Simon Pilgrim 0429c0cf8b Fix signed/unsigned comparison warning.
llvm-svn: 295453
2017-02-17 16:01:16 +00:00
Sam Parker 58af0c55d2 [ARM] Replace HasT2ExtractPack with HasDSP
Removed the HasT2ExtractPack feature and replaced its references
with HasDSP. This then allows the Thumb2 extend instructions to be
selected for ARMv8M +dsp. These instruction descriptions have also
been refactored and more target tests have been added for their isel.

Differential Revision: https://reviews.llvm.org/D29623

llvm-svn: 295452
2017-02-17 15:42:44 +00:00
Simon Pilgrim 511d788a95 [DAGCombine] Recognise any_extend_vector_inreg and truncation style shuffle masks
During legalization we are often creating shuffles (via a build_vector scalarization stage) that are "any_extend_vector_inreg" style masks, and also other masks that are the equivalent of "truncate_vector_inreg" (if we had such a thing).

This patch is an attempt to match these cases to help undo the effects of just leaving shuffle lowering to handle it - which typically means we lose track of the undefined elements of the shuffles resulting in an unnecessary extension+truncation stage for widened illegal types.

The 2011-10-21-widen-cmp.ll regression will be fixed by making SIGN_EXTEND_VECTOR_IN_REG legal in SSE instead of lowering them to X86ISD::VSEXT (PR31712).

Differential Revision: https://reviews.llvm.org/D29454

llvm-svn: 295451
2017-02-17 15:14:48 +00:00
Sanjay Patel 5573042035 [DAGCombiner] improve readability; NFCI
llvm-svn: 295447
2017-02-17 14:21:59 +00:00
Diana Picus e836878bf1 [ARM] GlobalISel: Clean up some helpers
Return invalid opcodes when some of the helpers in the instruction selection
pass can't handle a given combination.

llvm-svn: 295446
2017-02-17 13:44:19 +00:00
Diana Picus 38699dbac5 [ARM] GlobalISel: Check mappings used by reg bank select
Add some asserts to make sure we're using the mappings that we think we're
using. This is to keep us from accidentally breaking functionality while moving
to TableGen'erated mappings.

llvm-svn: 295441
2017-02-17 13:14:25 +00:00
Diana Picus 7cab0786bd [ARM] GlobalISel: Use Subtarget in Legalizer
Start using the Subtarget to make decisions about what's legal. In particular,
we only mark floating point operations as legal if we have VFP2, which is
something we should've done from the very start.

llvm-svn: 295439
2017-02-17 11:25:17 +00:00
Diana Picus d2f3ba71c9 [ARM] GlobalISel: Add end-to-end tests for double
Test some really basic functionality through the whole GlobalISel pipeline.

llvm-svn: 295438
2017-02-17 11:25:11 +00:00
Ismail Donmez c7ff81435d Update Bugzilla URLs in docs
llvm-svn: 295432
2017-02-17 08:26:11 +00:00
Eugene Leviant 958fcd7502 InstCombine: fix extraction when performing vector/array punning
Differential revision: https://reviews.llvm.org/D29491

llvm-svn: 295429
2017-02-17 07:36:03 +00:00
Craig Topper cbd1b60e42 [IR][X86] Simplify some AutoUpgrade code slightly. NFC
llvm-svn: 295426
2017-02-17 07:07:24 +00:00
Craig Topper 905cc75f97 [IR][X86] Rename an AutoUpgrade helper function to more accurately match what intrinsics it handles. NFC
llvm-svn: 295425
2017-02-17 07:07:21 +00:00
Craig Topper b9b9cb0ce6 [IR][X86] Move X86 specific portions of UpgradeIntrinsicFunction1 to a couple helper functions. NFC
This enables some early outs to avoid repeatedly using IsX86 check to qualify. I hope to continue to improve this to shorten the lengths of some of the string comparisons.

llvm-svn: 295424
2017-02-17 07:07:19 +00:00