Commit Graph

38219 Commits

Author SHA1 Message Date
Sanjay Patel 9cc21ac412 fix typo; NFC
llvm-svn: 274636
2016-07-06 16:42:46 +00:00
Elena Demikhovsky ad0a56f3da Re-commit of 274613.
The prev commit failed on compilation.
A minor change in one pattern in lib/Target/X86/X86InstrAVX512.td fixes the failure.

llvm-svn: 274626
2016-07-06 14:15:43 +00:00
Diana Picus b772e409ba [ARM] Do not test for CPUs, use SubtargetFeatures. Also remove 2 flags.
This is a follow-up for r273544.

The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.

This commit also removes two command-line flags that weren't used in any of the
tests: widen-vmovs and swift-partial-update-clearance. The former may be easily
replaced with the mattr mechanism, but the latter may not (as it is a subtarget
property, and not a proper feature).

Differential Revision: http://reviews.llvm.org/D21797

llvm-svn: 274620
2016-07-06 11:22:11 +00:00
Diana Picus 4879b050cc [ARM] Do not test for CPUs, use SubtargetFeatures (Part 3). NFCI
This is a follow-up for r273544 and r273853.

The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.
This commit also marks them as obsolete.

Differential Revision: http://reviews.llvm.org/D21796

llvm-svn: 274616
2016-07-06 09:22:23 +00:00
Elena Demikhovsky 02ced295aa Reverted 274613 due to compilation failue.
llvm-svn: 274615
2016-07-06 09:11:49 +00:00
Elena Demikhovsky 5a4f2476fd AVX-512: Optimization for patterns with i1 scalar type
The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc".
I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction.
I also changed zext, aext and trunc patterns in the .td file. It allows to remove extra "kmov" instruictions.

This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173.

Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails).

Differential revision: http://reviews.llvm.org/D21956

llvm-svn: 274613
2016-07-06 09:01:20 +00:00
Nicolai Haehnle e40530ea7b AMDGPU: Fix return of non-void-returning shaders
Summary:
Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that
ensures that a non-void-returning shader falls off the end of the last
basic block was effectively disabled, since SI_RETURN is now used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21975

llvm-svn: 274612
2016-07-06 08:35:17 +00:00
Tim Northover 449c15e1bd AArch64: try to fix optimized build failure.
I think the Ops filled out by Regex::match contain pointers into the temporary
std::string returned by StringRef::upper. Its lifetime is extended by the call
to match, but only until the end of that call (not to the uses of Ops later
on).

llvm-svn: 274586
2016-07-05 23:15:58 +00:00
Simon Pilgrim 7643b337a2 [X86][AVX2] Simplified BROADCAST combining to avoid repeated matching attempts
llvm-svn: 274583
2016-07-05 22:41:04 +00:00
Manman Ren 39b37c0f9d Fix an ordering problem in r274431
llvm-svn: 274582
2016-07-05 22:24:44 +00:00
Matt Arsenault e8dbf791b1 AMDGPU: Remove unnecessary string usage in AsmPrinter
Registers are printed a lot, so don't create temporary
std::strings. Using char instead of a string to an ostream
saves a function call.

llvm-svn: 274581
2016-07-05 22:06:56 +00:00
Tim Northover e6ae6767d9 AArch64: TableGenerate system instruction operands.
The way the named arguments for various system instructions are handled at the
moment has a few problems:

  - Large-scale duplication between AArch64BaseInfo.h and AArch64BaseInfo.cpp
  - That weird Mapping class that I have no idea what I was on when I thought
    it was a good idea.
  - Searches are performed linearly through the entire list.
  - We print absolutely all registers in upper-case, even though some are
    canonically mixed case (SPSel for example).
  - The ARM ARM specifies sysregs in terms of 5 fields, but those are relegated
    to comments in our implementation, with a slightly opaque hex value
    indicating the canonical encoding LLVM will use.

This adds a new TableGen backend to produce efficiently searchable tables, and
switches AArch64 over to using that infrastructure.

llvm-svn: 274576
2016-07-05 21:23:04 +00:00
Balaram Makam d4acd7ed10 Revert r259387: "AArch64: Implement missed conditional compare sequences."
This reverts commit r259387 because it inserts illegal code after legalization
    in some backends where i64 OR type is illegal for example.

llvm-svn: 274573
2016-07-05 20:24:05 +00:00
Simon Pilgrim bec6543d17 [X86][AVX2] Add support for target shuffle combining to BROADCAST
Only support broadcast from vector register so far - memory folding support will have to wait.

llvm-svn: 274572
2016-07-05 20:11:29 +00:00
Simon Pilgrim 48adedffb7 [X86][AVX512] Fixed decoding of permd/permpd variable mask shuffles + enabled them for target shuffle combining
Corrected element mask masking to extract the bottom index bits (now matches the perm2 implementation but for unary inputs).

llvm-svn: 274571
2016-07-05 18:31:17 +00:00
Saleem Abdulrasool 4d950ef892 ARM: fix `-mlong-calls` for WoA
Not all code-paths set the relocation model to static for Windows.  This
currently breaks on Windows ARM with `-mlong-calls` when built with clang.
Loosen the assertion to what it was previously.  We would ideally ensure that
all the configuration sets Windows to static relocation model.

llvm-svn: 274570
2016-07-05 18:30:52 +00:00
Tim Northover 01dff9d18a AArch64: use correct SDValue # when looking for bitfield placement.
The other use really does only care about the SDNode (it checks the
opcode against a whitelist), but bitFieldPlacement can be misled if
the node produces multiple results.

Patch by Ismail Badawi.

llvm-svn: 274567
2016-07-05 18:02:57 +00:00
Matt Arsenault ffc8275f2b AMDGPU: Fix folding SGPRs into madak/madmk src0
Because of the special immediate operand, the constant
bus is already used so SGPRs are never useful.

r263212 changed the name of the immediate operand, which
broke the verifier check for the restriction.

llvm-svn: 274564
2016-07-05 17:09:01 +00:00
Tom Stellard a4b746d808 AMDGPU/SI: Remove address space query functions from AMDGPUDAGToDAGISel
Summary:
These have been replaced with TableGen code (except for isConstantLoad,
which is still used for R600).  The queries were broken for cases
where MemOperand was a PseudoSourceValue.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21684

llvm-svn: 274561
2016-07-05 16:10:44 +00:00
Valery Pykhtin e65b39ec09 [AMDGPU] rename DS_1A1D_Off8_NORET to DS_1A2D_Off8_NORET as ds_write2xx use 2 source registers. NFC.
llvm-svn: 274556
2016-07-05 15:15:28 +00:00
Simon Pilgrim 9769428e08 [X86][AVX512] Remove vector BROADCAST builtins.
llvm-svn: 274555
2016-07-05 14:49:58 +00:00
Michael Zuckerman bdc5f40dca [LLVM][INTRINSICS] adding intrinsics of CLFLUSHOPT
Differential Revision: http://reviews.llvm.org/D21789

llvm-svn: 274553
2016-07-05 14:42:12 +00:00
Sam Kolton a9cd6aa895 [AMDGPU] Assembler: Fix parsing error with floating-point literals passed to integer instructions
Differential Revision: http://reviews.llvm.org/D21972

llvm-svn: 274551
2016-07-05 14:01:11 +00:00
Daniel Sanders 976d938c1e [mips][ias] Remove k_PhysReg since it's not possible to create an operand of this kind.
Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21986

llvm-svn: 274547
2016-07-05 13:38:40 +00:00
James Molloy ae5ff990ae [Thumb] Reapply r272251 with a fix for PR28348 (mk 2)
The important thing I was missing was ensuring newly added constants were kept in topological order. Repositioning the node is correct if the constant is newly added (so it has no topological ordering) but wrong if it already existed - positioning it next in the worklist would break the topological ordering.

Original commit message:
  [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated

  If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead;

    int i(int a) {
      return a & 0xfffffeec;
    }

  Used to produce:
      ldr r1, [CONSTPOOL]
      ands r0, r1
    CONSTPOOL: 0xfffffeec

  And now produces:
      movs    r1, #255
      adds    r1, #20  ; Less costly immediate generation
      bics    r0, r1

llvm-svn: 274543
2016-07-05 12:37:13 +00:00
Daniel Sanders 7b361a2cc3 Revert r274536: [mips][ias] Don't break apart and reconstruct StringRef's for k_Token. NFC.
It turns out that MSVC requires this.

llvm-svn: 274538
2016-07-05 10:44:24 +00:00
Daniel Sanders b2e0ca8e9c [mips][ias] Don't break apart and reconstruct StringRef's for k_Token. NFC.
llvm-svn: 274536
2016-07-05 10:10:36 +00:00
Nemanja Ivanovic 44513e545f [PowerPC] - Legalize vector types by widening instead of integer promotion
This patch corresponds to review:
http://reviews.llvm.org/D20443

It changes the legalization strategy for illegal vector types from integer
promotion to widening. This only applies for vectors with elements of width
that is a multiple of a byte since we have hardware support for vectors with
1, 2, 3, 8 and 16 byte elements.
Integer promotion for vectors is quite expensive on PPC due to the sequence
of breaking apart the vector, extending the elements and reconstituting the
vector. Two of these operations are expensive.
This patch causes between minor and major improvements in performance on most
benchmarks. There are very few benchmarks whose performance regresses. These
regressions can be handled in a subsequent patch with a DAG combine (similar
to how this patch handles int -> fp conversions of illegal vector types).

llvm-svn: 274535
2016-07-05 09:22:29 +00:00
Tom Stellard 4a105d73a9 AMDGPU/R600: Add PatFrags for selecting the correct vtx id for loads
This moves of the r600 logic out of isGlobalLoad() and into the
TableGen files.

Differential Revision: http://reviews.llvm.org/D21710

llvm-svn: 274527
2016-07-05 00:12:51 +00:00
Tom Stellard 17a0ec5400 AMDGPU/SI: Remove hack for selecting < 32-bit loads to MUBUF instructions
Summary:
The isGlobalLoad() query was returning true for constant address space loads
with memory types less than 32-bits, which is wrong.  This logic has been
replaced with PatFrag in the TableGen files, to provide the same functionality.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21696

llvm-svn: 274521
2016-07-04 20:41:48 +00:00
Simon Pilgrim 3ad040909a [X86][AVX512] Add support for lowering shuffles to VSHUFPD
llvm-svn: 274520
2016-07-04 20:41:24 +00:00
Craig Topper 5d16cd9d63 [AVX512] Remove masked VPERMD/VPERMQ/VPERMILPS/VPERMILPD intrinsics. They were autoupgraded to native IR in r274506 and r274506.
llvm-svn: 274519
2016-07-04 19:58:38 +00:00
Jan Vesely 991dfd7b07 AMDGPU/R600: Add indentation to VTX and TEX fetch asm strings
These are printed as part of Fetch clauses.

Differential Revision: http://reviews.llvm.org/D21730

llvm-svn: 274517
2016-07-04 19:45:00 +00:00
James Molloy c3b4ed4a70 Revert "[Thumb] Reapply r272251 with a fix for PR28348"
This reverts commit r274510 - it made green dragon unhappy.

llvm-svn: 274512
2016-07-04 17:14:24 +00:00
James Molloy 9f019835ef [Thumb] Reapply r272251 with a fix for PR28348
We were using DAG->getConstant instead of DAG->getTargetConstant. This meant that we could inadvertently increase the use count of a constant if stars aligned, which it did in this testcase. Increasing the use count of the constant could cause ISel to fall over (because DAGToDAG lowering assumed the constant had only one use!)

Original commit message:
  [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated

  If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead;

    int i(int a) {
      return a & 0xfffffeec;
    }

  Used to produce:
      ldr r1, [CONSTPOOL]
      ands r0, r1
    CONSTPOOL: 0xfffffeec

  And now produces:
      movs    r1, #255
      adds    r1, #20  ; Less costly immediate generation
      bics    r0, r1

llvm-svn: 274510
2016-07-04 16:35:41 +00:00
Simon Pilgrim c804751a18 [X86] Add shuffle mask rescaling helper function. NFCI.
llvm-svn: 274476
2016-07-03 21:28:17 +00:00
Simon Pilgrim 8e84fcf118 [X86][AVX2] Merge unary permute matching behind the same V2.isUndef() condition. NFCI.
llvm-svn: 274474
2016-07-03 20:39:42 +00:00
Simon Pilgrim 7f096de0b8 [X86][AVX512] Add support for 512-bit shuffle lowering to VPERMPD/VPERMQ
llvm-svn: 274473
2016-07-03 19:50:06 +00:00
Simon Pilgrim 68ea80649b [X86][AVX512] Add support for VPERMPD/VPERMQ masked shuffle comments
llvm-svn: 274469
2016-07-03 18:40:24 +00:00
Simon Pilgrim a0d73835b2 [X86][AVX512] Add support for 512-bit shuffle decoding of VPERMPD/VPERMQ
llvm-svn: 274468
2016-07-03 18:27:37 +00:00
Simon Pilgrim 5080e7f56c [X86][AVX] Renamed VPERMILPI shuffle comment macros to be more specific
llvm-svn: 274467
2016-07-03 18:02:43 +00:00
Simon Pilgrim dbd6db0dc7 [X86][AVX512] Add support for VPALIGNR/PSHUFD/PSHUFHW/PSHUFLW masked shuffle comments
llvm-svn: 274466
2016-07-03 15:00:51 +00:00
Simon Pilgrim 598bdb6bfe [X86][AVX512] Add support for UNPCK masked shuffle comments
llvm-svn: 274464
2016-07-03 14:26:21 +00:00
Simon Pilgrim 1f59076196 [X86][AVX512] Add support for VPERM/VSHUF masked shuffle comments
llvm-svn: 274462
2016-07-03 13:55:41 +00:00
Simon Pilgrim 68f438a036 [X86][AVX512] Add support for PMOVZX masked shuffle comments
llvm-svn: 274461
2016-07-03 13:33:28 +00:00
Simon Pilgrim 7c2fbdc101 [X86][AVX512] Add support for masked shuffle comments
This patch adds support for including the avx512 mask register information in the mask/maskz versions of shuffle instruction comments.

This initial version just adds support for MOVDDUP/MOVSHDUP/MOVSLDUP to reduce the mass of test regenerations, other shuffle instructions can be added in due course.

Differential Revision: http://reviews.llvm.org/D21953

llvm-svn: 274459
2016-07-03 13:08:29 +00:00
Simon Pilgrim 129b720c18 [X86][AVX512] Add support for lowering shuffles to VPERMILPS
llvm-svn: 274458
2016-07-03 12:47:21 +00:00
Simon Pilgrim a7329dac6f Fix spelling.
llvm-svn: 274451
2016-07-02 20:21:39 +00:00
Simon Pilgrim 99e8a1aa0b [X86][AVX512] Add support for lowering shuffles to VPERMILPD
llvm-svn: 274450
2016-07-02 20:20:12 +00:00
Simon Pilgrim cde7c54baa [X86][AVX512] Add support for 512-bit PSHUFB lowering
llvm-svn: 274444
2016-07-02 18:14:31 +00:00