Commit Graph

36827 Commits

Author SHA1 Message Date
Tom Stellard 9112758077 AMDGPU/SI: Add latency for export instructions
Reviewers: arsenm, nhaehnle

Subscribers: nhaehnle, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18599

llvm-svn: 265708
2016-04-07 18:30:05 +00:00
Quentin Colombet d21115876c [RegisterBank] Rename RegisterBank::contains into RegisterBank::covers.
llvm-svn: 265695
2016-04-07 17:09:39 +00:00
Ulrich Weigand 79391ee0f2 [SystemZ] Fix build break from r265689
Fix build error seen on some build bots due to:
error: default label in switch which covers all enumeration values

llvm-svn: 265693
2016-04-07 16:33:25 +00:00
Kevin B. Smith 3802c4af59 [X86]: Fix for PR27251.
Differential Revision: http://reviews.llvm.org/D18850

llvm-svn: 265690
2016-04-07 16:15:34 +00:00
Ulrich Weigand 2eb027d21f [SystemZ] Implement conditional returns
Return is now considered a predicable instruction, and is converted
to a newly-added CondReturn (which maps to BCR to %r14) instruction by
the if conversion pass.

Also, fused compare-and-branch transform knows about conditional
returns, emitting the proper fused instructions for them.

This transform triggers on a *lot* of tests, hence the huge diffstat.
The changes are mostly jX to br %r14 -> bXr %r14.

Author: koriakin

Differential Revision: http://reviews.llvm.org/D17339

llvm-svn: 265689
2016-04-07 16:11:44 +00:00
Ehsan Amiri 4701a91e59 [PPC] Enable transformations in PPCPassConfig::addIRPasses at O2
http://reviews.llvm.org/D18562

A large number of testcases has been modified so they pass after this test.
One testcase is deleted, because I realized even after undoing the original
change that was committed with this testcase, the testcase still passes. So
I removed it. The change to one other testcase (test/CodeGen/PowerPC/pr25802.ll)
is an arbitrary change to keep it passing. Given the original intention of the
testcase, and the fact that fixing it will require some time to change the testcase,
we concluded that this quick change will be enough.

llvm-svn: 265683
2016-04-07 15:30:55 +00:00
Tom Stellard d37630e461 AMDGPU/SI: Add MachineBasicBlock parameter to SIInstrInfo::insertWaitStates
Summary: This makes it possible to insert nops at the end of blocks.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18549

llvm-svn: 265678
2016-04-07 14:47:07 +00:00
Valery Pykhtin e23b6deb01 [AMDGPU] fix readlane/readfirstlane src vgpr operand type.
For VGPR_32 operand disassembler expects a VGPR register encoded as 0..255 (enum8 src operand).
readfirstlane/readline actually has enum9 operand and this change fixes VGPR_32 to VS_32 (enum9 encoding).

Differential Revision: http://reviews.llvm.org/D18696

llvm-svn: 265670
2016-04-07 13:41:51 +00:00
Benjamin Kramer 4fb78518f1 Make helper functions static. NFC.
llvm-svn: 265653
2016-04-07 10:10:09 +00:00
Simon Pilgrim d54bae6525 [X86][SSE] Add support for VZEXT constant folding
llvm-svn: 265646
2016-04-07 07:52:45 +00:00
Ahmed Bougacha 1cf67fb9cb [X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC.
Re-apply r265450 which caused PR27245 and was reverted in r265559
because of a wrong generalization: the fetch_and_add->add_and_fetch
combine only works in specific, but pretty common, cases:
  (icmp slt x, 0) -> (icmp sle (add x, 1), 0)
  (icmp sge x, 0) -> (icmp sgt (add x, 1), 0)
  (icmp sle x, 0) -> (icmp slt (sub x, 1), 0)
  (icmp sgt x, 0) -> (icmp sge (sub x, 1), 0)

Original Message:

We only generate LOCKed versions of add/sub when the result is unused.
It often happens that the result is used, but only by a comparison. We
can optimize those out by reusing EFLAGS, which lets us use the proper
instructions, instead of having to fallback to LXADD.

Instead of doing this as an MI peephole (as we do for the other
non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite
tricky later.

This also makes it eventually possible to stop expanding and/or/xor
if the only user is an icmp (also see D18141).

This uses the LOCK ISD opcodes added by r262244.

Differential Revision: http://reviews.llvm.org/D17633

llvm-svn: 265636
2016-04-07 02:07:10 +00:00
Quentin Colombet b073c12912 [AArch64] Teach RegisterBankInfo about the CC register bank.
We need to cover each register class with a register bank.

llvm-svn: 265629
2016-04-07 00:39:29 +00:00
Quentin Colombet cbc353a422 [AArch64] Teach RegisterBankInfo about the mapping of register classes
on register banks.

llvm-svn: 265626
2016-04-07 00:14:30 +00:00
Hans Wennborg ab16be799c Re-commit r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)"
Third time's the charm? The previous attempt (r265345) caused ASan test
failures on X86, as broken CFI caused stack traces to not work.

This version of the patch makes sure not to merge with stack adjustments
that have CFI, and to not add merged instructions' offests to the CFI
about to be generated.

This is already covered by the lit tests; I just got the expectations
wrong previously.

llvm-svn: 265623
2016-04-07 00:05:49 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
Ehsan Amiri 322eca3849 [PPC] Use VSX/FP Facility integer load when an integer load's only users are conversion to FP
http://reviews.llvm.org/D18405

When the integer value loaded is never used directly as integer we should use VSX 
or Floating Point Facility integer loads and avoid extra direct move

llvm-svn: 265593
2016-04-06 20:12:29 +00:00
James Y Knight 037b9894bd Put quotes around #error string.
GCC reports "missing terminating ' character", even when it's being
skipped by preprocessing.

llvm-svn: 265590
2016-04-06 19:52:32 +00:00
Nicolai Haehnle df3a20cd80 AMDGPU: Add a shader calling convention
This makes it possible to distinguish between mesa shaders
and other kernels even in the presence of compute shaders.

Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Differential Revision: http://reviews.llvm.org/D18559

llvm-svn: 265589
2016-04-06 19:40:20 +00:00
Quentin Colombet 4f03c0b806 [AArch64] Change the CMake to avoid to build GlobalISel related APIs
when GISel is not built.
The positive side effects are:
- We do not have to define dummy implementation
- We do not have to do weird gymnastic to avoid like issues (like
  missing constructor or vtable for the base classes)

llvm-svn: 265570
2016-04-06 17:38:12 +00:00
Quentin Colombet c17f744001 [AArch64] Teach the subtarget how to get to the RegisterBankInfo.
Rework the access to GlobalISel APIs to contain how much of
the APIs we need to access for the final executable to build when
GlobalISel is not built.

This prevents massive usage of ifdefs in various places. Now, all the
GlobalISel ifdefs will be happing only in AArch64TargetMachine.cpp.

llvm-svn: 265567
2016-04-06 17:26:03 +00:00
Hans Wennborg 6849f8f15f Revert r265450 "[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC."
It caused ASan 32-bit tests to hang (PR27245).

llvm-svn: 265559
2016-04-06 16:44:38 +00:00
Hans Wennborg a7e396b5ef Revert "Re-commit r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)""
It seems to be causing ASan tests to crash, probably due to
miscompiling the run-time somehow.

llvm-svn: 265551
2016-04-06 16:10:20 +00:00
Quentin Colombet d1d324b2ae [AArch64] Use the default constructor of RegisterBankInfo when GlobalISel is not built.
This will avoid link-time error as the defautl constructor of RegisterBankInfo is
the only one available when GlobalISel is not built.

llvm-svn: 265549
2016-04-06 15:53:13 +00:00
Sam Kolton ff90c60a78 [AMDGPU] AsmParser: disable DPP for unsupported instructions. New dpp tests. Fix v_nop_dpp.
Summary:
1. Disable DPP encoding for instructions that do not support it:
    - VOP1:
        - v_readfirstlane_b32
        - v_clrexcp
        - v_movreld_b32
        - v_movrels_b32
        - v_movrelsd_b32
    - VOP2:
        - v_madmk_f16/32
        - v_madak_f16/32
    - VOPC, VINTRP, VOP3
2. Fix DPP for v_nop
3. New DPP tests for VOP1 and VOP2 instructions

Reviewers: nhaustov, tstellarAMD, vpykhtin

Subscribers: tstellarAMD, arsenm

Differential Revision: http://reviews.llvm.org/D18552

llvm-svn: 265538
2016-04-06 13:29:59 +00:00
Evgeny Astigeevich 9c24ebfa6d [AArch64][CodeGen] NFC refactor AArch64InstrInfo::optimizeCompareInstr to prepare it for fixing a bug in it
AArch64InstrInfo::optimizeCompareInstr has a bug which causes generation of incorrect code (PR#27158).
The patch refactors the function to simplify reviewing the fix of the bug.

1. Function name ‘modifiesConditionCode’ is changed to ‘areCFlagsAccessedBetweenInstrs’
   to reflect that the function can check modifying accesses, reading accesses or both.
2. Function ‘AArch64InstrInfo::optimizeCompareInstr’
   - Documented the function
   - Cmp_NZCV is DeadNZCVIdx to reflect that it is an operand index of dead NZCV
   - The code for the case of substituting CmpInstr is put into separate
     functions the main of them is ‘substituteCmpInstr’.

Differential Revision: http://reviews.llvm.org/D18609

llvm-svn: 265531
2016-04-06 11:39:00 +00:00
Chuang-Yu Cheng 6e1408a891 [ppc64] Temporary disable sibling call optimization on ppc64 due to breaking test case
r265506 breaks print-stack-trace.cc test case of compiler-rt in bootstrap
test.

http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/1708

llvm-svn: 265528
2016-04-06 10:48:36 +00:00
Matthias Braun 8e594fdf19 AArch64: Fix compile error
Fixed to adapt a use of enterBasicBlock() in my last commit (because I
had follow on patches in my repository that change the code).

llvm-svn: 265513
2016-04-06 02:59:44 +00:00
Matthias Braun 7dc03f060e RegisterScavenger: Take a reference as enterBasicBlock() argument.
Make it obvious that the argument cannot be nullptr.
Remove an unnecessary nullptr check in initRegState.

llvm-svn: 265511
2016-04-06 02:47:09 +00:00
Chuang-Yu Cheng 2e5973ef74 [ppc64] Enable sibling call optimization on ppc64 ELFv1/ELFv2 abi
This patch enable sibling call optimization on ppc64 ELFv1/ELFv2 abi, and
add a couple of test cases. This patch also passed llvm/clang bootstrap
test, and spec2006 build/run/result validation.

Original issue: https://llvm.org/bugs/show_bug.cgi?id=25617

Great thanks to Tom's (tjablin) help, he contributed a lot to this patch.
Thanks Hal and Kit's invaluable opinions!

Reviewers: hfinkel kbarton

http://reviews.llvm.org/D16315

llvm-svn: 265506
2016-04-06 02:04:38 +00:00
Chuang-Yu Cheng 024a623c55 [Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance
This patch implement the following instructions:
- addpcis subpcis
- maddhd maddhdu maddld
- modsw moduw modsd modud
- darn
- extswsli extswsli.
- setb
- dtstsfi dtstsfiq

Total 15 instructions

Reviewers: nemanjai hfinkel tjablin amehsan kbarton

http://reviews.llvm.org/D17885

llvm-svn: 265505
2016-04-06 01:47:02 +00:00
Chuang-Yu Cheng eaf4b3d75c [Power9] Implement copy-paste, msgsync, slb, and stop instructions
This patch implements the following BookII and Book III instructions:
- copy copy_first cp_abort paste paste. paste_last
- msgsync
- slbieg slbsync
- stop

Total 10 instructions

Reviewers: nemanjai hfinkel tjablin amehsan kbarton
llvm-svn: 265504
2016-04-06 01:46:45 +00:00
NAKAMURA Takumi 285c8ff753 AArch64CodeGen: Make AArch64RegisterBankInfo.cpp optional along LLVM_BUILD_GLOBAL_ISEL.
llvm-svn: 265499
2016-04-06 01:18:08 +00:00
Quentin Colombet 5300950f3a [AArch64] Initial implementation of the targeting of the register bank information.
llvm-svn: 265489
2016-04-05 23:34:59 +00:00
Manman Ren 802cd6f9d7 Swift Calling Convention: swiftcc for ARM.
Differential Revision: http://reviews.llvm.org/D18769

llvm-svn: 265482
2016-04-05 22:44:44 +00:00
Evgeniy Stepanov dde29e2799 Faster stack-protector for Android/AArch64.
Bionic has a defined thread-local location for the stack protector
cookie. Emit a direct load instead of going through __stack_chk_guard.

llvm-svn: 265481
2016-04-05 22:41:50 +00:00
Manman Ren f8bdd88cd9 Swift Calling Convention: add swiftcc.
Differential Revision: http://reviews.llvm.org/D17863

llvm-svn: 265480
2016-04-05 22:41:47 +00:00
Duncan P. N. Exon Smith 1de3c7e790 IR: Introduce ConstantAggregate, NFC
Add a common parent class for ConstantArray, ConstantVector, and
ConstantStruct called ConstantAggregate.  These are the aggregate
subclasses of Constant that take operands.

This is mainly a cleanup, adding common `isa` target and removing
duplicated code.  However, it also simplifies caching which constants
point transitively at `GlobalValue` (a possible future direction).

llvm-svn: 265466
2016-04-05 21:10:45 +00:00
Duncan P. N. Exon Smith 91d3cfed78 Revert "Fix Clang-tidy modernize-deprecated-headers warnings in remaining files; other minor fixes."
This reverts commit r265454 since it broke the build.  E.g.:

  http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_build/22413/

llvm-svn: 265459
2016-04-05 20:45:04 +00:00
Eugene Zelenko 1760dc2a23 Fix Clang-tidy modernize-deprecated-headers warnings in remaining files; other minor fixes.
Some Include What You Use suggestions were used too.

Use anonymous namespaces in source files.

Differential revision: http://reviews.llvm.org/D18778

llvm-svn: 265454
2016-04-05 20:19:49 +00:00
Ahmed Bougacha 50e6cd4a3a [X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC.
We only generate LOCKed versions of add/sub when the result is unused.
It often happens that the result is used, but only by a comparison. We
can optimize those out by reusing EFLAGS, which lets us use the proper
instructions, instead of having to fallback to LXADD.

Instead of doing this as an MI peephole (as we do for the other
non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite
tricky later.

This also makes it eventually possible to stop expanding and/or/xor
if the only user is an icmp (also see D18141).

This uses the LOCK ISD opcodes added by r262244.

Differential Revision: http://reviews.llvm.org/D17633

llvm-svn: 265450
2016-04-05 20:02:57 +00:00
Ahmed Bougacha 629446ba03 [X86] Simplify early-exit check. NFC.
llvm-svn: 265447
2016-04-05 20:02:22 +00:00
Sanjay Patel 4c7d094451 fix typo; NFC
llvm-svn: 265442
2016-04-05 19:27:39 +00:00
Jacques Pienaar 42991b3e5a [lanai] LanaiSetflagAluCombiner more conservative
Summary: LanaiSetflagAluCombiner could previously combine instructions across basic building blocks even when not legal. Make the LanaiSetflagAluCombiner more conservative to avoid this.

Reviewers: eliben

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D18746

llvm-svn: 265411
2016-04-05 16:18:13 +00:00
Sam Parker 0d3a3a537c [ARM] Cleanup of smul and smla instruction descriptions
Removed the SDNode argument passed to the AI_smul and AI_smla multiclass
definitions as they are always mul.

Differential Revision: http://reviews.llvm.org/D18791

llvm-svn: 265409
2016-04-05 16:01:25 +00:00
Konstantin Zhuravlyov e63e02cb0c [AMDGPU] Emit linkonce and linkonce_odr symbols
Differential Revision: http://reviews.llvm.org/D18726

llvm-svn: 265408
2016-04-05 16:00:58 +00:00
Simon Dardis d9d41f531e [mips] MIPSR6 Compact jump support
This patch adds support for compact jumps similiar to the previous compact
branch support for MIPSR6. Unlike compact branches, compact jumps do not
have a forbidden slot.

As MipsInstrInfo::getEquivalentCompactForm can determine the correct
expansion for jumps and branches for both microMIPS and MIPSR6, remove the
unnecessary distinction in the delay slot filler.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders
llvm-svn: 265390
2016-04-05 12:50:29 +00:00
Justin Holewinski c79979299a [NVPTX] Handle ldg created from sign-/zero-extended load
Reviewers: jingyue

Subscribers: jholewinski

Differential Revision: http://reviews.llvm.org/D18053

llvm-svn: 265389
2016-04-05 12:38:01 +00:00
JF Bastien 1c3c223b65 Lanai: fix -Wsign-compare warning
llvm-svn: 265368
2016-04-05 00:20:27 +00:00
JF Bastien 393b79ee00 Lanai: fix -Wpedantic warnings
Extra semicolon.

llvm-svn: 265365
2016-04-04 23:47:30 +00:00
Hans Wennborg a47a692341 Re-commit r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)"
The original commit miscompiled things on 32-bit Windows, e.g. a Clang
boostrap. It turns out that mergeSPUpdates() was a bit too generous in
what it interpreted as a stack adjustment, causing the following code:

        addl    $12, %esp
        leal    -4(%ebp), %esp

To be "optimized" into simply:

        addl    $8, %esp

This commit tightens up mergeSPUpdates() and includes a new test
(test14 in movtopush.ll) for this situation.

llvm-svn: 265345
2016-04-04 21:02:46 +00:00