Commit Graph

885 Commits

Author SHA1 Message Date
Matt Arsenault 52a4d9b429 AMDGPU: Move R600 only pieces into R600 classes
llvm-svn: 274979
2016-07-09 18:11:15 +00:00
Matt Arsenault 48d70cb486 Revert "AMDGPU: Remove unused control flow intrinsic"
llvm-svn: 274978
2016-07-09 17:18:39 +00:00
NAKAMURA Takumi f35424b73f AMDGPU: Prune AMDGPUAsmParser in libdeps.
llvm-svn: 274970
2016-07-09 07:54:27 +00:00
Matt Arsenault dfec5ce032 AMDGPU: Fix fdiv lowering when f32 denormals supported
Also fix test not actually using function labels.

llvm-svn: 274969
2016-07-09 07:48:11 +00:00
Matt Arsenault 1322b6f8bb AMDGPU: Improve offset folding for register indexing
llvm-svn: 274954
2016-07-09 01:13:56 +00:00
Matt Arsenault 95c7897555 AMDGPU: Simplify isSchedulingBoundary
llvm-svn: 274953
2016-07-09 01:13:51 +00:00
Matt Arsenault 8f0a92f0ba AMDGPU: Remove unused control flow intrinsic
llvm-svn: 274939
2016-07-08 21:39:44 +00:00
Duncan P. N. Exon Smith 4d29511894 AMDGPU: Remove implicit iterator conversions, NFC
Remove remaining implicit conversions from MachineInstrBundleIterator to
MachineInstr* from the AMDGPU backend.  In most cases, I made them less
attractive by preferring MachineInstr& or using a ranged-based for loop.

Once all the backends are fixed I'll make the operator explicit so that
this doesn't bitrot back.

llvm-svn: 274906
2016-07-08 19:16:05 +00:00
Duncan P. N. Exon Smith 221847ef63 AMDGPU: Make infinite loop clear, NFC
Change a while loop that was checking for nullptr on an
iterator-to-pointer conversion to an infinite for loop.  Now it's clear
that the condition doesn't terminate.

The only change in behaviour is if an invalid iterator (holding nullptr)
was passed into AMDGPUCFGStructurizer::reversePredicateSetter.  There
are only two callers, and they both dereference the iterator before
sending it in, so rather than adding an early return to avoid the loop
I've just asserted (using a static_cast, to avoid an implicit conversion
to pointer).

llvm-svn: 274902
2016-07-08 19:00:17 +00:00
Matt Arsenault b63f18c9c3 AMDGPU: Minor adjustment to r274817
The commit message is inaccurate, modifiesRegister
will check for partial defs of exec.

We currently don't ever emit partial defs of exec,
so it doesn't really matter.

llvm-svn: 274886
2016-07-08 17:06:48 +00:00
Valery Pykhtin 68853ab2c5 [AMDGPU] fix ds_swizzle_b32 opcode for VI (bz 28371)
Differential Revision: http://reviews.llvm.org/D22049

llvm-svn: 274852
2016-07-08 15:12:46 +00:00
Matt Arsenault a74374a86b AMDGPU: Move si_mask_branch register operand to be a use
llvm-svn: 274818
2016-07-08 00:55:44 +00:00
Matt Arsenault d4a84b1ed2 AMDGPU: Cleanup. Use definesRegister instead of manual loop
Also this will be more precise since it will check
exec_lo/exec_hi writes.

llvm-svn: 274817
2016-07-08 00:55:39 +00:00
Valery Pykhtin af8b1bddbd [AMDGPU] fix ds_write_src2 encoding (bz26027)
Differential revision: http://reviews.llvm.org/D22041

llvm-svn: 274756
2016-07-07 14:23:38 +00:00
Michael Kuperstein aa71bdd3af [TTI] The cost model should not assume vector casts get completely scalarized
The cost model should not assume vector casts get completely scalarized, since
on targets that have vector support, the common case is a partial split up to
the legal vector size. So, when a vector cast  gets split, the resulting casts
end up legal and cheap.

Instead of pessimistically assuming scalarization, base TTI can use the costs
the concrete TTI provides for the split vector, plus a fudge factor to account
for the cost of the split itself. This fudge factor is currently 1 by default,
except on AMDGPU where inserts and extracts are considered free.

Differential Revision: http://reviews.llvm.org/D21251

llvm-svn: 274642
2016-07-06 17:30:56 +00:00
Nicolai Haehnle e40530ea7b AMDGPU: Fix return of non-void-returning shaders
Summary:
Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that
ensures that a non-void-returning shader falls off the end of the last
basic block was effectively disabled, since SI_RETURN is now used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21975

llvm-svn: 274612
2016-07-06 08:35:17 +00:00
Matt Arsenault e8dbf791b1 AMDGPU: Remove unnecessary string usage in AsmPrinter
Registers are printed a lot, so don't create temporary
std::strings. Using char instead of a string to an ostream
saves a function call.

llvm-svn: 274581
2016-07-05 22:06:56 +00:00
Matt Arsenault ffc8275f2b AMDGPU: Fix folding SGPRs into madak/madmk src0
Because of the special immediate operand, the constant
bus is already used so SGPRs are never useful.

r263212 changed the name of the immediate operand, which
broke the verifier check for the restriction.

llvm-svn: 274564
2016-07-05 17:09:01 +00:00
Tom Stellard a4b746d808 AMDGPU/SI: Remove address space query functions from AMDGPUDAGToDAGISel
Summary:
These have been replaced with TableGen code (except for isConstantLoad,
which is still used for R600).  The queries were broken for cases
where MemOperand was a PseudoSourceValue.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21684

llvm-svn: 274561
2016-07-05 16:10:44 +00:00
Valery Pykhtin e65b39ec09 [AMDGPU] rename DS_1A1D_Off8_NORET to DS_1A2D_Off8_NORET as ds_write2xx use 2 source registers. NFC.
llvm-svn: 274556
2016-07-05 15:15:28 +00:00
Sam Kolton a9cd6aa895 [AMDGPU] Assembler: Fix parsing error with floating-point literals passed to integer instructions
Differential Revision: http://reviews.llvm.org/D21972

llvm-svn: 274551
2016-07-05 14:01:11 +00:00
Tom Stellard 4a105d73a9 AMDGPU/R600: Add PatFrags for selecting the correct vtx id for loads
This moves of the r600 logic out of isGlobalLoad() and into the
TableGen files.

Differential Revision: http://reviews.llvm.org/D21710

llvm-svn: 274527
2016-07-05 00:12:51 +00:00
Tom Stellard 17a0ec5400 AMDGPU/SI: Remove hack for selecting < 32-bit loads to MUBUF instructions
Summary:
The isGlobalLoad() query was returning true for constant address space loads
with memory types less than 32-bits, which is wrong.  This logic has been
replaced with PatFrag in the TableGen files, to provide the same functionality.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21696

llvm-svn: 274521
2016-07-04 20:41:48 +00:00
Jan Vesely 991dfd7b07 AMDGPU/R600: Add indentation to VTX and TEX fetch asm strings
These are printed as part of Fetch clauses.

Differential Revision: http://reviews.llvm.org/D21730

llvm-svn: 274517
2016-07-04 19:45:00 +00:00
Matt Arsenault 7f681ac7a9 AMDGPU: Add feature for unaligned access
llvm-svn: 274398
2016-07-01 23:03:44 +00:00
Matt Arsenault 8af47a09e5 AMDGPU: Expand unaligned accesses early
Due to visit order problems, in the case of an unaligned copy
the legalized DAG fails to eliminate extra instructions introduced
by the expansion of both unaligned parts.

llvm-svn: 274397
2016-07-01 22:55:55 +00:00
Matt Arsenault 327bb5ad82 AMDGPU: Improve load/store of illegal types.
There was a combine before to handle the simple copy case.
Split this into handling loads and stores separately.

We might want to change how this handles some of the vector
extloads, since this can result in large code size increases.

llvm-svn: 274394
2016-07-01 22:47:50 +00:00
Matt Arsenault 105c2a204c AMDGPU/SI: Enable testing several variants for si scheduler
Enable testing different scheduling variants if sgpr usage
is very high. It was previously disabled because of a bug
in handleMove, but it has been fixed since.

Patch by Axel Davy

llvm-svn: 274372
2016-07-01 18:03:46 +00:00
Nikolay Haustov beb24f5b20 Resubmit r268719 - AMDGPU/SI: Add amdgpu_kernel calling convention. Part 2.
This was reverted in r268740 because of problems with corresponding Clang change.
Clang change was updated and resubmitted in r274220.

Check calling convention in AMDGPUMachineFunction::isKernel

This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.

Also, in the future unused non-kernels may be optimized.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19917

llvm-svn: 274341
2016-07-01 10:00:58 +00:00
Sam Kolton 5196b88f07 [AMDGPU] Assembler: support SDWA for VOPC instructions
Summary: dst_sel and dst_unused disabled for VOPC as they have no effect on result

Reviewers: artem.tamazov, tstellarAMD, vpykhtin

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D21376

llvm-svn: 274340
2016-07-01 09:59:21 +00:00
NAKAMURA Takumi 566597330a Update libdeps; AMDGPUCodeGen requires LLVMVectorize.
llvm-svn: 274339
2016-07-01 09:55:23 +00:00
Matt Arsenault 908b9e26a6 AMDGPU: Add option to run the load/store vectorizer
llvm-svn: 274329
2016-07-01 03:33:52 +00:00
Matt Arsenault 0994bd57fb AMDGPU: Implement getLoadStoreVecRegBitWidth
llvm-svn: 274312
2016-07-01 00:56:27 +00:00
Duncan P. N. Exon Smith 632987296f Target: Remove unused arguments from overrideSchedPolicy, NFC
TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr*
arguments (begin and end) that invite implicit conversions from
MachineInstrBundleIterator.  One option would be to change their type to
an iterator, but since they don't seem to have been used since the API
was added in 2010, I'm deleting the dead code.

llvm-svn: 274304
2016-07-01 00:23:27 +00:00
Duncan P. N. Exon Smith e4f5e4f4d1 CodeGen: Use MachineInstr& in TargetLowering, NFC
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr.  In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

llvm-svn: 274287
2016-06-30 22:52:52 +00:00
Matt Arsenault c1142725bd AMDGPU: Add m0 vgpr load loop block as successor
This shows up as a verifier error when I move this
earlier, not sure why it didn't before.

llvm-svn: 274275
2016-06-30 20:49:28 +00:00
Rafael Espindola d86e8bb0ed Delete MCCodeGenInfo.
MC doesn't really care about CodeGen stuff, so this was just
complicating target initialization.

llvm-svn: 274258
2016-06-30 18:25:11 +00:00
Duncan P. N. Exon Smith 9cfc75c214 CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.

llvm-svn: 274189
2016-06-30 00:01:54 +00:00
Matt Arsenault eb9025d698 AMDGPU: Fix global isel crashes
llvm-svn: 274039
2016-06-28 17:42:09 +00:00
Matt Arsenault 254a6450dd AMDGPU: Fix typo
llvm-svn: 274034
2016-06-28 16:59:53 +00:00
Matt Arsenault 3d846501fb AMDGPU: Remove unused function
llvm-svn: 274033
2016-06-28 16:59:49 +00:00
Matt Arsenault b4d9503171 AMDGPU: Fix out of bounds indirect indexing errors
This was producing acceses to registers beyond the super
register's limits, resulting in verifier failures.

llvm-svn: 273977
2016-06-28 01:09:00 +00:00
Matt Arsenault 55dff27122 AMDGPU: Fix global isel build
llvm-svn: 273964
2016-06-28 00:11:26 +00:00
Matt Arsenault ed1fd7e056 AMDGPU: Set MinInstAlignment
Not sure this actually changes anything

llvm-svn: 273947
2016-06-27 21:42:49 +00:00
Matt Arsenault 59c0ffa22a AMDGPU: Implement per-function subtargets
llvm-svn: 273940
2016-06-27 20:48:03 +00:00
Matt Arsenault 03d8584590 AMDGPU: Move subtarget feature checks into passes
llvm-svn: 273937
2016-06-27 20:32:13 +00:00
Matt Arsenault 21a4625a16 AMDGPU: Fix verifier errors with undef vector indices
Also fix pointlessly adding exec to liveins.

llvm-svn: 273916
2016-06-27 19:57:44 +00:00
Simon Pilgrim 634dde36b4 Fix "not all control paths return a value" warning on MSVC
llvm-svn: 273872
2016-06-27 12:58:10 +00:00
NAKAMURA Takumi 5cbd41e0a8 SIMachineFunctionInfo.cpp: Appease msc18 to use std::array.
llvm-svn: 273860
2016-06-27 10:26:43 +00:00
NAKAMURA Takumi f619b50fab Reformat.
llvm-svn: 273859
2016-06-27 10:26:36 +00:00