Commit Graph

37179 Commits

Author SHA1 Message Date
Sriraman Tallam 7da9b445ea Differential Revision: http://reviews.llvm.org/D19733
llvm-svn: 268106
2016-04-29 21:19:16 +00:00
Matt Arsenault dc4ebad6d4 AMDGPU: Add kernarg.segment.ptr intrinsic
llvm-svn: 268105
2016-04-29 21:16:52 +00:00
Matt Arsenault cf2744f1c8 AMDGPU/SI: Move post regalloc run of SIShrinkInstructions
Move to addPreEmitPass. This is so it runs after post-RA
scheduling so we can merge s_nops emitted by the scheduler
and hazard recognizer.

llvm-svn: 268095
2016-04-29 20:23:42 +00:00
Artem Tamazov 38e496b175 Fixed/Recommitted r267733 "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD."
Previously reverted by r267752.

r267733 review:
Differential Revision: http://reviews.llvm.org/D19342

llvm-svn: 268066
2016-04-29 17:04:50 +00:00
Guozhi Wei fa3e04298b [PPC] Enable shuffling of VSX vectors
This patch fixes PR27078 by enabling shuffling of vectors if VSX is available.

llvm-svn: 268064
2016-04-29 17:00:54 +00:00
Daniel Sanders 7225cd52e7 [mips][ias] Move createCpRestoreMemOp to MipsTargetStreamer. NFC.
Summary:
This removes the temporary call to isIntegratedAssemblerRequired() which was
added recently. It's effect is now acheived directly in the MipsTargetStreamer
hierarchy.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19715

llvm-svn: 268058
2016-04-29 16:16:49 +00:00
Krzysztof Parzyszek 173fc57b54 Fix NDEBUG build: variables used only in debug code causing compile error
llvm-svn: 268057
2016-04-29 16:14:00 +00:00
Simon Dardis d8bceb9d3a [mips][FastISel] A store is not a load.
Correct trivial error. One of the failing tests from PR/27458.

Reviewers: dsanders, vkalintiris, mcrosier

Differential Review: http://reviews.llvm.org/D19726

llvm-svn: 268053
2016-04-29 16:07:47 +00:00
Simon Dardis 7383bfd8bd [PATCH] [mips] Fix forbidden slot hazard handling
MipsHazardSchedule has to determine what the next physical machine instruction
is to decide whether to insert a nop. In case where a branch with a forbidden
slot appears at the end of a basic block, first *real* instruction of the next
physical basic block was determined using getFirstNonDebugInstr().

Unfortunately this only considers DBG_VALUEs and not other transient opcodes
such as EHLABEL. As EHLABEL passes the SafeInForbiddenSlot predicate and the
instruction after the EHLABEL can be a CTI, we observed test failures in the
LNT testsuite.

Reviewers: dsanders

Differential Review: http://reviews.llvm.org/D19051

llvm-svn: 268052
2016-04-29 16:04:18 +00:00
Krzysztof Parzyszek f5cbac93eb [Hexagon] Optimize addressing modes for load/store
Patch by Jyotsna Verma.

llvm-svn: 268051
2016-04-29 15:49:13 +00:00
Filipe Cabecinhas 0da9937517 Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them.
Summary:
Historically, we had a switch in the Makefiles for turning on "expensive
checks". This has never been ported to the cmake build, but the
(dead-ish) code is still around.

This will also make it easier to turn it on in buildbots.

Reviewers: chandlerc

Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits

Differential Revision: http://reviews.llvm.org/D19723

llvm-svn: 268050
2016-04-29 15:22:48 +00:00
Tom Stellard 92b24f324b AMDGPU/SI: Add offset field to ds_permute/ds_bpermute instructions
Summary:
These instructions can add an immediate offset to the address, like other
ds instructions.

Reviewers: arsenm

Subscribers: arsenm, scchan

Differential Revision: http://reviews.llvm.org/D19233

llvm-svn: 268043
2016-04-29 14:34:26 +00:00
Daniel Sanders fba875f902 [mips][ias] Split expandMemInst between MipsAsmParser and MipsTargetStreamer. Almost NFC.
Summary:
The portion in MipsAsmParser is responsible for figuring out which expansion to
use, while the portion in MipsTargetStreamer is responsible for emitting it.

This allows us to remove the call to isIntegratedAssemblerRequired() which is
currently ensuring the effect of .cprestore only occurs when writing objects.

The small functional change is that the memory offsets are now correctly
printed as signed values.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19714

llvm-svn: 268042
2016-04-29 13:43:45 +00:00
Daniel Sanders a736b37a25 [mips][ias] Moved most instruction emission helpers to MipsTargetStreamer. NFC.
Summary:
* Moved all the emit*() helpers to MipsTargetStreamer.
* Moved createNop() to MipsTargetStreamer as emitNop() and emitEmptyDelaySlot().
  This instruction has been split to distinguish between the 'nop' instruction
  and the nop used in delay slots which is sometimes a different nop to the
  'nop' instruction (e.g. for short delay slots on microMIPS).
* Moved createAddu() to MipsTargetStreamer as emitAddu().
* Moved createAppropriateDSLL() to MipsTargetStreamer as emitDSLL().

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19712

llvm-svn: 268041
2016-04-29 13:33:12 +00:00
Daniel Sanders 9db710a171 [mips][ias] Make section sizes a multiple of the alignment.
Reviewers: sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: http://reviews.llvm.org/D19008

llvm-svn: 268036
2016-04-29 12:44:07 +00:00
Nikolay Haustov 4f672a34ed AMDGPU/SI: Assembler: Unify parsing/printing of operands.
Summary:
The goal is for each operand type to have its own parse function and
at the same time share common code for tracking state as different
instruction types share operand types (e.g. glc/glc_flat, etc).

Introduce parseAMDGPUOperand which can parse any optional operand.
DPP and Clamp/OMod have custom handling for now. Sam also suggested
to have class hierarchy for operand types instead of table. This
can be done in separate change.

Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps,
parseMubufOptionalOps, parseDPPOptionalOps.
Reduce number of definitions of AsmOperand's and MatchClasses' by using common base class.
Rename AsmMatcher/InstPrinter methods accordingly.
Print immediate type when printing parsed immediate operand.
Use 'off' if offset/index register is unused instead of skipping it to make it more readable (also agreed with SP3).
Update tests.

Reviewers: tstellarAMD, SamWot, artem.tamazov

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19584

llvm-svn: 268015
2016-04-29 09:02:30 +00:00
Zlatko Buljan 531809d340 [mips][microMIPS] Fix offsets for LLE, LWE, SBE, SCE and SHE instructions
Differential Revision: http://reviews.llvm.org/D18645

llvm-svn: 268012
2016-04-29 08:36:54 +00:00
Matt Arsenault 7d1b6c81af AMDGPU: Stop reporting an addressing mode for unknown addrspace
This was being treated the same as private, which has an immediate
offset. For unknown, it probably means it's for a computation not
actually being used for accessing memory, so it should not have a
nontrivial addressing mode.

llvm-svn: 268002
2016-04-29 06:25:10 +00:00
Craig Topper b805723294 [X86] Remove unnecessary header file containing a small class. It was only included in one place. Just define the class directly in the cpp file. NFC
llvm-svn: 267985
2016-04-29 04:22:28 +00:00
Craig Topper e7c1cd18d3 [X86] Include X86MCTargetDesc.h directly in X86Disassembler.cpp instead of duplicating parts of it. NFC
llvm-svn: 267984
2016-04-29 04:22:26 +00:00
Craig Topper 184310d6a9 [X86] Use nested switches to vary the operand to helper functions that were previously called in multiple cases. This seems to help the inliner reduce code. NFC
llvm-svn: 267964
2016-04-29 00:51:30 +00:00
Matthias Braun f84547c6e0 LiveIntervalAnalysis: Remove LiveVariables requirement
This requirement was a huge hack to keep LiveVariables alive because it
was optionally used by TwoAddressInstructionPass and PHIElimination.
However we have AnalysisUsage::addUsedIfAvailable() which we can use in
those passes.

This re-applies r260806 with LiveVariables manually added to PowerPC to
hopefully not break the stage 2 bots this time.

llvm-svn: 267954
2016-04-28 23:42:51 +00:00
Marcin Koscielnicki 7b32957852 [PowerPC] Fix the EH_SjLj_Setup pseudo.
This instruction is just a control flow marker - it should not
actually exist in the object file.  Unfortunately, nothing catches
it before it gets to AsmPrinter.  If integrated assembler is used,
it's considered to be a normal 4-byte instruction, and emitted as
an all-0 word, crashing the program.  With external assembler,
a comment is emitted.

Fixed by setting Size to 0 and handling it in MCCodeEmitter - this
means the comment will still be emitted if integrated assembler
is not used.

This broke an ASan test, which has been disabled for a long time
as a result (see the discussion on D19657).  We can reenable it
once this lands.

llvm-svn: 267943
2016-04-28 21:24:37 +00:00
Krzysztof Parzyszek bf90d5a3b3 [RDF] Recognize tail calls in graph creation
llvm-svn: 267939
2016-04-28 20:40:08 +00:00
Krzysztof Parzyszek c5a4e26410 [RDF] Improve handling of inline-asm
- Keep implicit defs from inline-asm instructions.
- Treat register references from inline-asm as fixed.

llvm-svn: 267936
2016-04-28 20:33:33 +00:00
Krzysztof Parzyszek 55874cf02b [RDF] Add option to keep dead phi nodes in DFG
Dead phi nodes are needed for code motion (such as copy propagation),
where a new use would be placed in a location that would be dominated
by a dead phi. Such a transformation is not legal for copy propagation,
and the existence of the phi would prevent it, but if the phi is not
there, it may appear to be valid.

llvm-svn: 267932
2016-04-28 20:17:06 +00:00
Kit Barton 7a1a9e01ad This reverts commit r265505.
Revert "[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance".
This patch has caused a functional regression in SPEC2k6 namd, and a performance regression in mesa-pipe.

llvm-svn: 267927
2016-04-28 20:00:42 +00:00
Krzysztof Parzyszek e5fcce2d2b [Hexagon] Add instruction aliases for vector unsigned compare-equal
Unsigned compare-equal instructions are mapped to signed compare-equal.

llvm-svn: 267925
2016-04-28 19:49:18 +00:00
Matt Arsenault 1c4d0efe56 AMDGPU: Emit error if too much LDS is used
llvm-svn: 267922
2016-04-28 19:37:35 +00:00
Matt Arsenault c5fce69031 AMDGPU: Fix mishandling array allocations when promoting alloca
The canonical form for allocas is a single allocation of the array type.
In case we see a non-canonical array alloca, make sure we aren't
replacing this with an array N times smaller.

llvm-svn: 267916
2016-04-28 18:38:48 +00:00
Krzysztof Parzyszek 0e7d2d339d [Hexagon] Define certain aliases for vector instructions
Specifically:
  Vd = #0   -> Vd = vxor(Vd, Vd)
  Vdd = #0  -> Vdd.w = vsub(Vdd.w, Vdd.w)
  Vdd = Vss -> Vdd = vcombine(Vss.H, Vss.L)

llvm-svn: 267901
2016-04-28 16:43:16 +00:00
Simon Dardis a2d8cc3db9 [mips][atomics] Fix partword atomic binary operation implementation
Currently Mips::emitAtomicBinaryPartword() does not properly respect the
width of pointers. For MIPS64 this causes the memory address that the ll/sc
sequence uses to be truncated. At runtime this causes a segmentation fault.

This can be fixed by applying similar changes as r266204, so that a full 64bit
pointer is loaded.

Reviewers: dsanders

Differential Review: http://reviews.llvm.org/D19651

llvm-svn: 267900
2016-04-28 16:26:43 +00:00
Krzysztof Parzyszek e737b86f8c [Hexagon] Handle double-vector registers as new-value producers
Patch by Colin LeMahieu.

llvm-svn: 267897
2016-04-28 15:54:48 +00:00
Krzysztof Parzyszek efd72857a3 [RDF] Handle undefined registers in RDF copy propagation
When updating the graph, make sure that new uses without reaching defs
are handled correctly.

llvm-svn: 267891
2016-04-28 15:09:19 +00:00
Craig Topper 477649a4c0 [X86] Remove unused operand from a function and all its callers. NFC
llvm-svn: 267854
2016-04-28 05:58:46 +00:00
Craig Topper 33772c5375 [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior.
llvm-svn: 267853
2016-04-28 03:34:31 +00:00
Craig Topper 3b4842b56f [AArch64] Expand CTTZ for all vector types.
llvm-svn: 267837
2016-04-28 01:58:21 +00:00
Bryan Chan 893110ecaf [SystemZ] Support Swift Calling Convention
Summary:
Port rL265480, rL264754, rL265997 and rL266252 to SystemZ, in order to enable the Swift port on the architecture. SwiftSelf and SwiftError are assigned to R10 and R9, respectively, which are normally callee-saved registers. For more information, see:

RFC: Implementing the Swift calling convention in LLVM and Clang
https://groups.google.com/forum/#!topic/llvm-dev/epDd2w93kZ0

Reviewers: kbarton, manmanren, rjmccall, uweigand

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19414

llvm-svn: 267823
2016-04-28 00:17:23 +00:00
Mitch Bodart e60465ddf7 [X86] Enable the post-RA-scheduler for clang's default 32-bit cpu.
For compilations with no explicit cpu specified, this exhibits
nice gains on Silvermont, with neutral performance on big cores.

Differential Revision: http://reviews.llvm.org/D19138

llvm-svn: 267809
2016-04-27 22:52:35 +00:00
Quentin Colombet bf200688de [X86][FastISel] Make sure we use the right register class when we select stores.
llvm-svn: 267806
2016-04-27 22:33:42 +00:00
Colin LeMahieu a3782da3e3 [Hexagon] Merging nops in to previous packet rather than always creating a new one.
llvm-svn: 267798
2016-04-27 21:37:44 +00:00
Quentin Colombet d6dbec4c6f [X86] Fix the lowering of TLS calls.
The callseq_end node must be glued with the TLS calls, otherwise,
the generic code will miss the uses of the returned value and will
mark it dead.
Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI,
the pseudo uses the symbol address at this point not RDI and the
lowering will do the right thing.

llvm-svn: 267797
2016-04-27 21:37:37 +00:00
Matt Arsenault 0547b016b1 AMDGPU: Account for globals in AMDGPUPromoteAlloca pass
Patch by Bas Nieuwenhuizen

llvm-svn: 267791
2016-04-27 21:05:08 +00:00
Ahmed Bougacha 65572afea8 [ARM] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs.
We run after PEI.
Found via inspection; no obvious testcase.

Follow-up to r266679.

llvm-svn: 267781
2016-04-27 20:33:07 +00:00
Ahmed Bougacha 5a3bf6a4a9 [AArch64] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs.
We run after PEI.
Found via inspection; no obvious testcase.

Follow-up to r266339.

llvm-svn: 267780
2016-04-27 20:33:05 +00:00
Ahmed Bougacha 9e71425f54 [AArch64] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB,
whereas it should be a successor of the original MBB.

Follow-up to r266339.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267779
2016-04-27 20:33:02 +00:00
Ahmed Bougacha b4af107239 [ARM] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas
it should be a successor of the original MBB.

The testcase changes are caused by Thumb2SizeReduction, which
was previously confused by the broken CFG.

Follow-up to r266679.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267778
2016-04-27 20:32:54 +00:00
Kevin B. Smith c378a99ba5 [X86]: Quit promoting 16 bit loads to 32 bit.
Differential Revision: http://reviews.llvm.org/D19592

llvm-svn: 267773
2016-04-27 19:58:03 +00:00
Andrew Kaylor 289bd5f684 Add optimization bisect opt-in calls for PowerPC passes
Differential Revision: http://reviews.llvm.org/D19554

llvm-svn: 267769
2016-04-27 19:39:32 +00:00
Justin Lebar 7cdbce5946 [NVPTX] Run NVVMReflect at the beginning of IR passes.
Summary:
Currently the NVVMReflect pass is run at the beginning of our backend
passes.  But really, it should be run as early as possible, as it's
simply resolving an "if" statement in code.  So copy it into
TargetMachine::addEarlyAsPossiblePasses.

We still run it at the beginning of the backend passes, since it's
needed for correctness when lowering to nvptx.

(Specifically, NVVMReflect changes each call to the __nvvm_reflect
function or llvm.nvvm.reflect intrinsic into an integer constant, based
on the pass's configuration.  Clearly we miss many optimization
opportunities if we perform this transformation at the beginning of
codegen.)

Reviewers: rnk

Subscribers: tra, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D18616

llvm-svn: 267765
2016-04-27 19:13:37 +00:00