llvm-project

Commit Graph

Author	SHA1	Message	Date
Vincent Lejeune	d623644d17	R600/SI: Support byval arguments llvm-svn: 192555	2013-10-13 17:56:16 +00:00
Vincent Lejeune	fa58a5fb60	R600: Use masked read sel for texture instructions llvm-svn: 192554	2013-10-13 17:56:10 +00:00
Vincent Lejeune	301beb80d4	R600: fix swizzle export llvm-svn: 192553	2013-10-13 17:56:04 +00:00
Vincent Lejeune	533352f696	R600: Clear the VPM bit of export instructions. Setting this bit or not apparently makes no difference, but the docs recommend leaving it cleared. llvm-svn: 192552	2013-10-13 17:55:57 +00:00
Tom Stellard	ed69925998	R600: Store disassembly in a special ELF section when feature +DumpCode is enabled. Patch by: Jay Cornwall Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 192523	2013-10-12 05:02:51 +00:00
Matt Arsenault	8fb373891f	Fix typo llvm-svn: 192499	2013-10-11 21:03:36 +00:00
Matt Arsenault	1408b60291	Fix typo llvm-svn: 192406	2013-10-10 23:05:37 +00:00
Matt Arsenault	204cfa6e43	R600: Fix trunc i64 to i32 on SI llvm-svn: 192375	2013-10-10 18:04:16 +00:00
Tom Stellard	93fabcebf1	R600/SI: Implement SIInstrInfo::verifyInstruction() for VOP* The function is used by the machine verifier and checks that VOP* instructions have legal operands. llvm-svn: 192367	2013-10-10 17:11:55 +00:00
Tom Stellard	682bfbc43d	R600/SI: Define a separate MIMG instruction for each possible output value type During instruction selection, we rewrite the destination register class for MIMG instructions based on their writemasks. This creates machine verifier errors since the new register class does not match the register class in the MIMG instruction definition. We can avoid this by defining different MIMG instructions for each possible destination type and then switching to the correct instruction when we change the register class. llvm-svn: 192365	2013-10-10 17:11:24 +00:00
Tom Stellard	1b99ed8290	R600/SI: Mark the EXEC register as reserved This prevents the machine verifier from complaining about uses of an undefined physical register. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 192364	2013-10-10 17:11:19 +00:00
Tom Stellard	ed0ceec1c1	R600: Use StructurizeCFGPass for non SI targets The StructurizeCFG pass makes complex CFGs reducible; it allows a lot of shaders from shadertoy (which exhibit complex control flow constructs) to work correctly with respect to CFG handling (and allows us to detect potential bugs in other parts of the backend). We provide a command-line argument to disable the pass for debugging purposes. Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 192363	2013-10-10 17:11:12 +00:00
Rafael Espindola	a17151ad5a	Add a MCTargetStreamer interface. This patch fixes an old FIXME by creating a MCTargetStreamer interface and moving the target specific functions for ARM, Mips and PPC to it. The ARM streamer is still declared in a common place because it is used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are completely hidden in the corresponding Target directories. I will send an email to llvmdev with instructions on how to use this. llvm-svn: 192181	2013-10-08 13:08:17 +00:00
Vincent Lejeune	6df39438af	R600: Add a ldptr intrinsic to support MSAA. llvm-svn: 191838	2013-10-02 16:00:33 +00:00
Vincent Lejeune	a4da6fb535	R600: add a pass that merges clauses. llvm-svn: 191790	2013-10-01 19:32:58 +00:00
Vincent Lejeune	0b342d6f74	R600: Put PRED_X instruction in its own clause llvm-svn: 191789	2013-10-01 19:32:49 +00:00
Vincent Lejeune	269708b98d	R600: Enable -verify-machineinstrs in some tests. llvm-svn: 191788	2013-10-01 19:32:38 +00:00
Arnold Schwaighofer	d2f96b91ca	IfConverter: Use TargetSchedule for instruction latencies For targets that have instruction itineraries this means no change. Targets that move over to the new schedule model will be able to use the new schedule model for instruction latencies in the if-converter (the logic is such that if there is no itinerary we will use the new sched model for the latencies). Before, we queried "TTI->getInstructionLatency()" for the instruction latency and the extra predication cost. Now, we query the TargetSchedule abstraction for the instruction latency and TargetInstrInfo for the extra predication cost. The TargetSchedule abstraction will internally call "TTI->getInstructionLatency" if an itinerary exists, otherwise it will use the new schedule model. ATTENTION: Out of tree targets! (I will also send out an email later to LLVMDev) This means, if your target implements unsigned getInstrLatency(const InstrItineraryData *ItinData, const MachineInstr *MI, unsigned *PredCost); and returns a value for "PredCost", you now also need to implement unsigned getPredicationCost(const MachineInstr *MI); (if your target uses the IfConversion.cpp pass) radar://15077010 llvm-svn: 191671	2013-09-30 15:28:56 +00:00
Robert Wilhelm	2788d3ec99	Even more spelling fixes for "instruction". llvm-svn: 191611	2013-09-28 13:42:22 +00:00
Tom Stellard	0351ea2010	R600: Fix handling of NAN in comparison instructions We were completely ignoring the unorder/ordered attributes of condition codes and also incorrectly lowering seto and setuo. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 191603	2013-09-28 02:50:50 +00:00
Tom Stellard	5694d3090a	SelectionDAG: Improve legalization of SELECT_CC with illegal condition codes SelectionDAG will now attempt to invert an illegal condition in order to find a legal one and if that doesn't work, it will attempt to swap the operands using the inverted condition. There are no new test cases for this, but a number of the existing R600 tests hit this path. llvm-svn: 191602	2013-09-28 02:50:43 +00:00
Tom Stellard	cd42818d86	SelectionDAG: Try to expand all condition codes using getCCSwappedOperands() This is useful for targets like R600, which only support GT, GE, NE, and EQ condition codes as it removes the need to handle unsupported condition codes in target specific code. There are no tests with this commit, but R600 has been updated to take advantage of this new feature, so its existing selectcc tests are now testing the swapped operands path. llvm-svn: 191601	2013-09-28 02:50:38 +00:00
David Majnemer	1ccd2f2aee	MC: Remove vestigial PCSymbol field from AsmInfo llvm-svn: 191362	2013-09-25 09:36:11 +00:00
Tim Northover	31d093c705	ISelDAG: spot chain cycles involving MachineNodes Previously, the DAGISel function WalkChainUsers was spotting that it had entered already-selected territory by whether a node was a MachineNode (amongst other things). Since it's fairly common practice to insert MachineNodes during ISelLowering, this was not the correct check. Looking around, it seems that other nodes get their NodeId set to -1 upon selection, so this makes sure the same thing happens to all MachineNodes and uses that characteristic to determine whether we should stop looking for a loop during selection. This should fix PR15840. llvm-svn: 191165	2013-09-22 08:21:56 +00:00
Andrew Trick	978674b2bc	Allow subtarget selection of the default MachineScheduler and document the interface. The global registry is used to allow command line override of the scheduler selection, but does not work well as the normal selection API. For example, the same LLVM process should be able to target multiple targets or subtargets. llvm-svn: 191071	2013-09-20 05:14:41 +00:00
Vincent Lejeune	0167a313da	R600: Move clamp handling code to R600ISelLowering.cpp llvm-svn: 190645	2013-09-12 23:45:00 +00:00
Vincent Lejeune	9a248e5c2d	R600: Move code handling literal folding into R600ISelLowering. llvm-svn: 190644	2013-09-12 23:44:53 +00:00
Vincent Lejeune	ab3baf80a8	R600: Move fabs/fneg/sel folding logic into PostProcessIsel This move makes it possible to correctly handle multiple instructions from a single pattern. llvm-svn: 190643	2013-09-12 23:44:44 +00:00
Tom Stellard	afcf12f33a	R600/SI: expose TBUFFER_STORE_FORMAT_* for OpenGL transform feedback For _XYZ, the type of VDATA is v4i32, because v3i32 doesn't exist. The ADDR64 bit is not exposed. A simpler intrinsic that doesn't take a resource descriptor might be nicer. The maximum number of input SGPRs is bumped to 17. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 190575	2013-09-12 02:55:14 +00:00
Tom Stellard	7f6fa4c4c5	R600: Don't use trans slot for instructions that read LDS source registers This fixes some regressions in the piglit local memory store tests introduced by recent commits which made the scheduler aware of the trans slot. It's not possible to test this using lit, because there is no way to determine from the assembly dumps whether or not an instruction is in the trans slot. Even if this were possible, the test would be highly sensitive to changes in the scheduler and might generate confusing false negatives. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 190574	2013-09-12 02:55:06 +00:00
Bill Wendling	58e2d3d856	Generate compact unwind encoding from CFI directives. We used to generate the compact unwind encoding from the machine instructions. However, this had the problem that if the user used `-save-temps' or compiled their hand-written `.s' file (with CFI directives), we wouldn't generate the compact unwind encoding. Move the algorithm that generates the compact unwind encoding into the MCAsmBackend. This way we can generate the encoding whether the code is from a `.ll' or `.s' file. <rdar://problem/13623355> llvm-svn: 190290	2013-09-09 02:37:14 +00:00
Aaron Watry	372cecf642	R600: Add support for LDS atomic subtract Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 190200	2013-09-06 20:17:42 +00:00
Tom Stellard	8bc633ac09	R600: Coding style llvm-svn: 190110	2013-09-05 23:55:13 +00:00
Matt Arsenault	6f24379974	R600: Fix i64 to i32 trunc on SI llvm-svn: 190091	2013-09-05 19:41:10 +00:00
Tom Stellard	13c68ef88b	R600: Add support for local memory atomic add llvm-svn: 190080	2013-09-05 18:38:09 +00:00
Tom Stellard	53f2f90eb4	R600: Expand SELECT nodes rather than custom lowering them llvm-svn: 190079	2013-09-05 18:38:03 +00:00
Tom Stellard	de60e25278	R600: Fix incorrect LDS size calculation GlobalAddress nodes that appeared in more than one basic block were being counted twice. llvm-svn: 190078	2013-09-05 18:37:57 +00:00
Tom Stellard	d50bb3c8d4	R600/SI: Don't emit S_WQM_B64 instruction for compute shaders llvm-svn: 190077	2013-09-05 18:37:52 +00:00
Tom Stellard	624741fded	R600: Fix segfault in R600TextureIntrinsicReplacer This pass was segfaulting when it ran into a non-intrinsic function call. Function calls are not supported, so now instead of segfaulting, we will get an assertion failure with a nice error message. I'm not sure how to test this using lit. llvm-svn: 190076	2013-09-05 18:37:45 +00:00
Vincent Lejeune	744efa4dca	R600: Use shared op optimization when checking cycle compatibility llvm-svn: 189981	2013-09-04 19:53:54 +00:00
Vincent Lejeune	7e2c83256b	R600: Non vector only instruction can be scheduled on trans unit llvm-svn: 189980	2013-09-04 19:53:46 +00:00
Vincent Lejeune	4d5c5e53d0	R600: Use SchedModel enum for is{Trans,Vector}Only functions llvm-svn: 189979	2013-09-04 19:53:30 +00:00
Michael Gottesman	c9f5859f81	Add llvm namespace to llvm::next. llvm-svn: 189912	2013-09-04 04:26:09 +00:00
Michael Gottesman	114ac1a230	Use llvm::next() instead of incrementing begin iterators of std::vector. Iterator of std::vector may be implemented as a raw pointer. In this case begin iterators are rvalues and cannot be incremented. For example, this is the case with STDCXX implementation of vector. Patch by Konstantin Tokarev <annulen@yandex.ru>. llvm-svn: 189911	2013-09-04 04:19:01 +00:00
Benjamin Kramer	bda73fff49	Mark an unreachable code path with llvm_unreachable. Pacifies GCC. llvm-svn: 189726	2013-08-31 21:20:04 +00:00
Tom Stellard	35bb18c2a7	R600: Add support for vector local memory loads llvm-svn: 189226	2013-08-26 15:06:04 +00:00
Tom Stellard	c6f4a29ed5	R600: Add support for i8 and i16 local memory loads llvm-svn: 189225	2013-08-26 15:05:59 +00:00
Tom Stellard	f3d166aa1e	R600: Add support for i8 and i16 local memory stores llvm-svn: 189223	2013-08-26 15:05:49 +00:00
Tom Stellard	2ffc330673	R600: Add support for v4i32 and v2i32 local stores llvm-svn: 189222	2013-08-26 15:05:44 +00:00
Tom Stellard	fd155828ed	SelectionDAG: Use correct pointer size when lowering function arguments v2 This adds minimal support to the SelectionDAG for handling address spaces with different pointer sizes. The SelectionDAG should now correctly lower pointer function arguments to the correct size as well as generate the correct code when lowering getelementptr. This patch also updates the R600 DataLayout to use 32-bit pointers for the local address space. v2: - Add more helper functions to TargetLoweringBase - Use CHECK-LABEL for tests llvm-svn: 189221	2013-08-26 15:05:36 +00:00
Tom Stellard	15e4811455	R600/SI: Fix another case of illegal VGPR to SGPR copy This fixes a crash in Unigine Tropics. https://bugs.freedesktop.org/show_bug.cgi?id=68389 llvm-svn: 189057	2013-08-22 20:21:02 +00:00
Tom Stellard	f6d8023ca4	R600: Remove unnecessary casts Spotted by Bill Wendling. llvm-svn: 188942	2013-08-21 22:14:17 +00:00
Dmitri Gribenko	8b2a3d1fea	Remove unused stdio.h includes llvm-svn: 188626	2013-08-18 08:29:51 +00:00
Tom Stellard	59ed08b238	R600: Fix possible use of an uninitialized variable Spotted by Nick Lewycky! llvm-svn: 188599	2013-08-17 00:06:51 +00:00
Tom Stellard	b249b75726	R600: Expand vector FRINT ops Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 188598	2013-08-16 23:51:33 +00:00
Tom Stellard	ad3aff246c	R600: Expand vector FFLOOR ops Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 188597	2013-08-16 23:51:29 +00:00
Tom Stellard	a92ff87929	R600: Expand vector float operations for both SI and R600 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 188596	2013-08-16 23:51:24 +00:00
Michel Danzer	8522270d7e	R600/SI: Add pattern for xor of i1 Fixes two recent piglit regressions with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188559	2013-08-16 16:19:31 +00:00
Michel Danzer	20680b1cc5	R600/SI: Fix broken encoding of DS_WRITE_B32 The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused it to corrupt the encoding of that by clobbering the first operand with the second one. Undo that damage and only apply the SMRD logic to that. Fixes some derivatives-related piglit regressions with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188558	2013-08-16 16:19:24 +00:00
Benjamin Kramer	a8eecee121	R600: Allocate memoperand in the MachineFunction so it doesn't leak. llvm-svn: 188555	2013-08-16 14:48:09 +00:00
Tom Stellard	dba25713a6	Revert "R600/SI: Fix incorrect encoding of DS_WRITE_B32 instructions" This reverts commit a6a39ced095c2f453624ce62c4aead25db41a18f. This is the wrong version of this fix. llvm-svn: 188523	2013-08-16 01:18:43 +00:00
Tom Stellard	82bef57f20	R600/SI: Fix incorrect encoding of DS_WRITE_B32 instructions The SIInsertWaits pass was overwriting the first operand (gds bit) of DS_WRITE_B32 with the second operand (value to write). This meant that any time the value to write was stored in an odd number VGPR, the gds bit would be set causing the instruction to write to GDS instead of LDS. llvm-svn: 188522	2013-08-16 01:12:20 +00:00
Tom Stellard	b03edeca67	R600: Add support for global vector loads with element types less than 32-bits Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188521	2013-08-16 01:12:16 +00:00
Tom Stellard	fbab827e2a	R600: Add support for global vector stores with elements less than 32-bits Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188520	2013-08-16 01:12:11 +00:00
Tom Stellard	d3ee8c103a	R600: Add support for i16 and i8 global stores Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188519	2013-08-16 01:12:06 +00:00
Tom Stellard	6d1379e180	R600: Add support for v4i32 stores on Cayman Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188518	2013-08-16 01:12:00 +00:00
Tom Stellard	16da74c205	R600: Enable folding of inline literals into REG_SEQUENCE instructions Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188517	2013-08-16 01:11:55 +00:00
Tom Stellard	676c16d088	R600: Add IsExport bit to TableGen instruction definitions Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188516	2013-08-16 01:11:51 +00:00
Tom Stellard	ac00f9df79	R600: Change the RAT instruction assembly names so they match the docs Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188515	2013-08-16 01:11:46 +00:00
Matt Arsenault	5cae894a13	Fix spelling llvm-svn: 188506	2013-08-15 23:11:03 +00:00
Alexey Samsonov	3186eb3efd	Tentative fix for global-buffer-overflow caused by r188426. Found by AddressSanitizer llvm-svn: 188448	2013-08-15 07:11:34 +00:00
Tom Stellard	d86003e31f	R600/SI: Improve legalization of vector operations This should fix hangs in the OpenCL piglit tests. llvm-svn: 188431	2013-08-14 23:25:00 +00:00
Tom Stellard	6785065ace	R600/SI: Replace v1i32 type with i32 in imageload and sample intrinsics llvm-svn: 188430	2013-08-14 23:24:53 +00:00
Tom Stellard	9fa1791a1b	R600/SI: Convert v16i8 resource descriptors to i128 Now that compute support is better on SI, we can't continue using v16i8 for descriptors since this is also a legal type in OpenCL. This patch fixes numerous hangs with the piglit OpenCL test and since we now use a target specific DAG node for LOAD_CONSTANT with the correct MemOperandFlags, this should also fix: https://bugs.freedesktop.org/show_bug.cgi?id=66805 llvm-svn: 188429	2013-08-14 23:24:45 +00:00
Tom Stellard	8e5da41374	R600/SI: Lower BUILD_VECTOR to REG_SEQUENCE v2 Using REG_SEQUENCE for BUILD_VECTOR rather than a series of INSERT_SUBREG instructions should make it easier for the register allocator to coalesce unnecessary copies. v2: - Use an SGPR register class if all the operands of BUILD_VECTOR are SGPRs. llvm-svn: 188427	2013-08-14 23:24:32 +00:00
Tom Stellard	df94dc3917	R600/SI: Choose the correct MOV instruction for copying immediates The instruction selector will now try to infer the destination register so it can decide whether to use V_MOV_B32 or S_MOV_B32 when copying immediates. llvm-svn: 188426	2013-08-14 23:24:24 +00:00
Tom Stellard	16a9a205c8	R600/SI: Assign a register class to the $vaddr operand for MIMG instructions The previous code declared the operand as unknown:$vaddr, which made it possible for scalar registers to be used instead of vector registers. llvm-svn: 188425	2013-08-14 23:24:17 +00:00
Tom Stellard	3494b7ee42	R600/SI: Handle MSAA texture targets Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188421	2013-08-14 22:22:14 +00:00
Tom Stellard	20ee94f152	R600/SI: Allow conversion between v32i8 and v8i32 Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188420	2013-08-14 22:22:09 +00:00
Tom Stellard	a36f077159	R600/SI: Fix an obvious typo Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188419	2013-08-14 22:22:03 +00:00
Tom Stellard	73c31d541e	R600/SI: Add pattern for fp_to_uint This fixes the F2U opcode for the Mesa driver. Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188418	2013-08-14 22:21:57 +00:00
Tom Stellard	fc455471c3	R600: Set scheduling preference to Sched::Source R600 doesn't need to do any scheduling on the SelectionDAG now that it has a very good MachineScheduler. Also, using the VLIW SelectionDAG scheduler was having a major impact on compile times. For example with the phatk kernel here are the LLVM IR to machine code compile times: With Sched::VLIW Total Compile Time: 1.4890 Seconds (User + System) SelectionDAG Instruction Scheduling: 1.1670 Seconds (User + System) With Sched::Source Total Compile Time: 0.3330 Seconds (User + System) SelectionDAG Instruction Scheduling: 0.0070 Seconds (User + System) The code output was identical with both schedulers. This may not be true for all programs, but it gives me confidence that there won't be much reduction, if any, in code quality by using Sched::Source. llvm-svn: 188215	2013-08-12 22:33:21 +00:00
Niels Ole Salscheider	d3a039fed2	R600/SI: FMA is faster than fmul and fadd for f64 llvm-svn: 188136	2013-08-10 10:38:54 +00:00
Niels Ole Salscheider	6509ac65a9	R600/SI: Add FMA pattern llvm-svn: 188135	2013-08-10 10:38:47 +00:00
Niels Ole Salscheider	719fbc9ae7	R600/SI: Implement fp32<->fp64 conversions llvm-svn: 187988	2013-08-08 16:06:15 +00:00
Niels Ole Salscheider	4715d886f8	R600/SI: Implement sint<->fp64 conversions llvm-svn: 187987	2013-08-08 16:06:08 +00:00
Evgeniy Stepanov	bc8808ce4a	Initialize SIInsertWaits::ExpInstrTypesSeen in the pass constructor. This value may be used uninitialized in SIInsertWaits::insertWait. Found with MemorySanitizer. llvm-svn: 187869	2013-08-07 07:47:41 +00:00
Tom Stellard	f5a988b35f	R600: Add new file from r187831 to CMakeLists.txt llvm-svn: 187834	2013-08-06 23:12:34 +00:00
Tom Stellard	2f7cdda57e	R600/SI: Use VSrc_* register classes as the default classes for types Since the VSrc_* register classes contain both VGPRs and SGPRs, copies that used to be emitted by isel like this: SGPR = COPY VGPR Will now be emitted like this: VSrc = COPY VGPR This patch also adds a pass that tries to identify and fix situations where a VGPR to SGPR copy may occur. Hopefully, these changes will make it impossible for the compiler to generate illegal VGPR to SGPR copies. llvm-svn: 187831	2013-08-06 23:08:28 +00:00
Tom Stellard	4c0ffccbbf	R600/SI: Add more special cases for opcodes to ensureSRegLimit() Also factor out the register class lookup to its own function. llvm-svn: 187830	2013-08-06 23:08:18 +00:00
NAKAMURA Takumi	aaf66c7357	Target/*/CMakeLists.txt: Add the dependency to CommonTableGen explicitly for each corresponding CodeGen. Without explicit dependencies, both per-file action and in-CommonTableGen action could run in parallel. It races to emit .inc files simultaneously. llvm-svn: 187780	2013-08-06 06:38:37 +00:00
Tom Stellard	aa664d9b92	Factor FlattenCFG out from SimplifyCFG Patch by: Mei Ye llvm-svn: 187764	2013-08-06 02:43:45 +00:00
Tom Stellard	28d06de6f6	R600: Implement TargetLowering::getVectorIdxTy() We use MVT::i32 for the vector index type, because we use 32-bit operations to calculate offsets when dynamically indexing vectors. llvm-svn: 187749	2013-08-05 22:22:07 +00:00
Tom Stellard	0344cdfe39	R600: Add 64-bit float load/store support * Added R600_Reg64 class * Added T#Index#.XY registers definition * Added v2i32 register reads from parameter and global space * Added f32 and i32 elements extraction from v2f32 and v2i32 * Added v2i32 -> v2f32 conversions Tom Stellard: - Mark vec2 operations as expand. The addition of a vec2 register class made them all legal. Patch by: Dmitry Cherkassov Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com> llvm-svn: 187582	2013-08-01 15:23:42 +00:00
Tom Stellard	53698938a4	R600: Use 64-bit alignment for 64-bit kernel arguments llvm-svn: 187581	2013-08-01 15:23:31 +00:00
Tom Stellard	98f675a994	R600/SI: Custom lower i64 ZERO_EXTEND llvm-svn: 187580	2013-08-01 15:23:26 +00:00
Tom Stellard	ca69a53bae	Revert "R600: Non vector only instruction can be scheduled on trans unit" This reverts commit 98ce62780ea7185ba710868bf83c8077e8d7f6d6. llvm-svn: 187526	2013-07-31 20:43:27 +00:00
Tom Stellard	4dd41845ec	Revert "R600: Use SchedModel enum for is{Trans,Vector}Only functions" This reverts commit 3f1de26cb5cc0543a6a1d71259a7a39d97139051. llvm-svn: 187524	2013-07-31 20:43:03 +00:00
Vincent Lejeune	220db748b0	R600: Do not merge vectors after a vector reg is used If we merge vectors when a vector is used, it will generate an artificial antidependency that can prevent 2 tex/vtx instructions from using the same clause and thus generate extra clauses that reduce performance. There is no test case as such a situation is really hard to predict. llvm-svn: 187516	2013-07-31 19:32:12 +00:00
Vincent Lejeune	bb3f931123	R600: Avoid more than 4 literals in the same instruction group at scheduling llvm-svn: 187515	2013-07-31 19:32:07 +00:00
Vincent Lejeune	df18804e26	R600: Non vector only instruction can be scheduled on trans unit llvm-svn: 187514	2013-07-31 19:31:56 +00:00
Vincent Lejeune	21de8baa15	R600: Don't mix LDS and non-LDS instructions in the same group There are a lot of restrictions on instruction groups that contain LDS instructions, so for now we will be conservative and not packetize anything else with them. llvm-svn: 187513	2013-07-31 19:31:41 +00:00
Vincent Lejeune	79afe17e99	R600: Use SchedModel enum for is{Trans,Vector}Only functions llvm-svn: 187512	2013-07-31 19:31:35 +00:00
Vincent Lejeune	0c5ed2b437	R600: Remove predicated_break inst We were using two instructions for a similar purpose: break and predicated break. Only predicated_break was emitted and it was lowered at R600ControlFlowFinalizer to JUMP;CF_BREAK;POP. This commit simplifies the situation by making AMDILCFGStructurizer emit IF_PREDICATE;BREAK;ENDIF; instead of predicated_break (which is now removed). There is no functionality change. llvm-svn: 187510	2013-07-31 19:31:14 +00:00
Tom Stellard	aa313d0a74	R600/SI: Expand vector fp <-> int conversions llvm-svn: 187421	2013-07-30 14:31:03 +00:00
Quentin Colombet	e2e0548d77	[R600] Replicate old DAGCombiner behavior in target specific DAG combine. build_vector is lowered to REG_SEQUENCE, which is something the register allocator does a good job at optimizing. llvm-svn: 187397	2013-07-30 00:27:16 +00:00
Tom Stellard	8b1e021e85	SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278	2013-07-27 00:01:07 +00:00
Tom Stellard	c54731aa9d	DAGCombiner: Pass the correct type to TargetLowering::isF(Abs|Neg)Free This commit also implements these functions for R600 and removes a test case that was relying on the buggy behavior. llvm-svn: 187007	2013-07-23 23:55:03 +00:00
Tom Stellard	8cb0e47c9e	R600: Treat CONSTANT_ADDRESS loads like GLOBAL_ADDRESS loads when necessary These are really the same address space in hardware. The only difference is that CONSTANT_ADDRESS uses a special cache for faster access. When we are unable to use the constant kcache for some reason (e.g. smaller types or lack of indirect addressing) then the instruction selector must use GLOBAL_ADDRESS loads instead. llvm-svn: 187006	2013-07-23 23:54:56 +00:00
Tom Stellard	5263948a7b	R600: Add support for 24-bit MAD instructions Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186923	2013-07-23 01:48:49 +00:00
Tom Stellard	41fc7853be	R600: Add support for 24-bit MUL instructions Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186922	2013-07-23 01:48:42 +00:00
Tom Stellard	9f95033d33	R600: Improve support for < 32-bit loads Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186921	2013-07-23 01:48:35 +00:00
Tom Stellard	ba30932908	R600: Rename AMDILISelDAGToDAG.cpp -> AMDGPUISelDAGToDAG.cpp Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186920	2013-07-23 01:48:29 +00:00
Tom Stellard	840214437b	R600: Move CONST_ADDRESS folding into AMDGPUDAGToDAGISel::Select() This increases the number of opportunities we have for folding. With the previous implementation we were unable to fold into any instructions other than the first when multiple instructions were selected from a single SDNode. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186919	2013-07-23 01:48:24 +00:00
Tom Stellard	1e80309ebe	R600: Use KCache for kernel arguments Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186918	2013-07-23 01:48:18 +00:00
Tom Stellard	34ed721af4	R600: Simplify assembly for KCache registers using the TableGen !add operator Before: MOV * T0.W, KC0[131-128].Y After: MOV * T0.W, KC0[3].Y Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186917	2013-07-23 01:48:08 +00:00
Tom Stellard	acfeebf883	R600: Use the same compute kernel calling convention for all GPUs A side-effect of this is that now the compiler expects kernel arguments to be 4-byte aligned. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186916	2013-07-23 01:48:05 +00:00
Tom Stellard	78e012969c	R600: Use correct LoadExtType when lowering kernel arguments Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186915	2013-07-23 01:47:58 +00:00
Tom Stellard	33dd04bfbe	R600: Clean up extended load patterns Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186914	2013-07-23 01:47:52 +00:00
Tom Stellard	beed74af48	R600: Expand vector FNEG llvm-svn: 186913	2013-07-23 01:47:46 +00:00
Vincent Lejeune	8b8a7b5514	R600: Don't emit empty then clause and use alu_pop_after llvm-svn: 186725	2013-07-19 21:45:15 +00:00
Vincent Lejeune	960a622ca6	R600: Simplify AMDILCFGStructurizer by removing templates and assuming single exit llvm-svn: 186724	2013-07-19 21:45:06 +00:00
Vincent Lejeune	a8c38fedd6	R600: Replace legacy debug code in AMDILCFGStructurizer.cpp llvm-svn: 186723	2013-07-19 21:44:56 +00:00
Tom Stellard	8374720aad	R600/SI: Fix crash with VSELECT https://bugs.freedesktop.org/show_bug.cgi?id=66175 llvm-svn: 186616	2013-07-18 21:43:53 +00:00
Tom Stellard	adf732cfbc	R600/SI: Add support for v2f32 loads llvm-svn: 186615	2013-07-18 21:43:48 +00:00
Tom Stellard	ed2f6149f3	R600/SI: Add support for v2f32 stores llvm-svn: 186614	2013-07-18 21:43:42 +00:00
Tom Stellard	67ae4762ef	R600: Expand VSELECT for all types llvm-svn: 186613	2013-07-18 21:43:35 +00:00
Craig Topper	8fc4096fab	Move string pointer from being a static class member to just a static global in the one file its needed in. llvm-svn: 186476	2013-07-17 00:31:35 +00:00
Craig Topper	d3a34f81f8	Add 'const' qualifiers to static const char* variables. llvm-svn: 186371	2013-07-16 01:17:10 +00:00
Tom Stellard	31209cc8eb	R600/SI: Add support for 64-bit loads https://bugs.freedesktop.org/show_bug.cgi?id=65873 llvm-svn: 186339	2013-07-15 19:00:09 +00:00
Craig Topper	0afd0ab749	Make some arrays 'static const' llvm-svn: 186307	2013-07-15 06:39:13 +00:00
Craig Topper	5871321e49	Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]). llvm-svn: 186301	2013-07-15 04:27:47 +00:00
Craig Topper	b94011fd28	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Benjamin Kramer	c22c790f89	R600: Remove unsafe type punning. No intended functionality change. llvm-svn: 186196	2013-07-12 20:18:05 +00:00
Tom Stellard	ccae60acc3	R600/SI: Add support for f64 kernel arguments Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186182	2013-07-12 18:15:26 +00:00
Tom Stellard	4e1100ab75	R600/SI: Implement select and compares for SI Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186181	2013-07-12 18:15:19 +00:00
Tom Stellard	8ed7b45da3	R600/SI: Add fsqrt pattern for SI Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186180	2013-07-12 18:15:13 +00:00
Tom Stellard	2a6a610516	R600/SI: Add double precision fsub pattern for SI Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186179	2013-07-12 18:15:08 +00:00
Tom Stellard	ab8a8c84d4	R600/SI: SI support for 64bit ConstantFP Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186178	2013-07-12 18:15:02 +00:00
Tom Stellard	7512c0803c	R600/SI: Add initial double precision support for SI Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186177	2013-07-12 18:14:56 +00:00
Aaron Ballman	f04bbd8b7f	Replacing an empty switch with its moral equivalent. No functional changes intended. llvm-svn: 186017	2013-07-10 17:19:22 +00:00
Michel Danzer	49812b5bbd	R600/SI: Initial local memory support Enough for the radeonsi driver to use it for calculating derivatives. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186012	2013-07-10 16:37:07 +00:00
Michel Danzer	1f87df365f	R600/SI: Add pattern for the AMDGPU.barrier.local intrinsic lit test coverage to follow in the next commit. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186011	2013-07-10 16:36:57 +00:00
Michel Danzer	8d69617b27	R600/SI: Add intrinsic for retrieving the current thread ID Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186010	2013-07-10 16:36:52 +00:00
Michel Danzer	1c45430e76	R600/SI: Initial support for LDS/GDS instructions Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186009	2013-07-10 16:36:43 +00:00
Michel Danzer	83f87c4c2e	R600/SI: Add intrinsics for texture sampling with user derivatives Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186008	2013-07-10 16:36:36 +00:00
Vincent Lejeune	ce499744b3	R600: Do not predicated basic block with multiple alu clause Test is not included as it is several 1000 lines long. To test this functionality, a test case must generate at least 2 ALU clauses, where an ALU clause is ~110 instructions long. NOTE: This is a candidate for the stable branch. llvm-svn: 185943	2013-07-09 15:03:33 +00:00
Vincent Lejeune	b8aac8d720	R600: Fix a rare bug where swizzle optimization returns wrong values llvm-svn: 185942	2013-07-09 15:03:25 +00:00
Vincent Lejeune	a4d8d2ef2b	R600: Fix wrong export reswizzling llvm-svn: 185941	2013-07-09 15:03:19 +00:00
Vincent Lejeune	b55940cc7d	R600: Use DAG lowering pass to handle fcos/fsin NOTE: This is a candidate for the stable branch. llvm-svn: 185940	2013-07-09 15:03:11 +00:00
Vincent Lejeune	f10d1cd2a3	R600: Print Export Swizzle llvm-svn: 185939	2013-07-09 15:03:03 +00:00
Craig Topper	31ee5866de	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185540	2013-07-03 15:07:05 +00:00
Rafael Espindola	64e1af8eb9	Remove address spaces from MC. This is dead code since PIC16 was removed in 2010. The result was an odd mix, where some parts would carefully pass it along and others would assert it was zero (most of the object streamer for example). llvm-svn: 185436	2013-07-02 15:49:13 +00:00
Chad Rosier	797ee3e3c6	Add a newline. llvm-svn: 185385	2013-07-01 21:31:10 +00:00
Vincent Lejeune	a8a50248d8	R600: Fix an uninitialized variable in R600InstrInfo.cpp llvm-svn: 185294	2013-06-30 21:44:06 +00:00
Benjamin Kramer	396906456f	R600: Unbreak GCC build. operator++ on an enum is not legal. clang happens to accept it anyways, I think that's a known bug. llvm-svn: 185269	2013-06-29 20:04:19 +00:00
Vincent Lejeune	77a8352476	R600: Support schedule and packetization of trans-only inst llvm-svn: 185268	2013-06-29 19:32:43 +00:00
Vincent Lejeune	bb8a872158	R600: Bank Swizzle now display SCL equivalent llvm-svn: 185267	2013-06-29 19:32:29 +00:00
Tom Stellard	c46e56721e	R600/SI: Add processor types for each CIK variant Patch By: Alex Deucher Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> llvm-svn: 185209	2013-06-28 20:23:29 +00:00
Tom Stellard	c026e8bc8e	R600: Add local memory support via LDS Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 185162	2013-06-28 15:47:08 +00:00
Tom Stellard	ce540330df	R600: Add support for GROUP_BARRIER instruction Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 185161	2013-06-28 15:46:59 +00:00
Tom Stellard	5eb903d9c5	R600: Add ALUInst bit to tablegen definitions v2 v2: - Remove functions left over from a previous rebase. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 185160	2013-06-28 15:46:53 +00:00
Tom Stellard	02661d9605	R600: Use new getNamedOperandIdx function generated by TableGen llvm-svn: 184880	2013-06-25 21:22:18 +00:00
Aaron Watry	0a794a4612	R600: Consolidate expansion of v2i32/v4i32 ops for EG/SI By default, we expand these operations for both EG and SI. Move the duplicated code into a common space for now. If the targets ever actually implement these operations as instructions, we can override that in the relevant target. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184848	2013-06-25 13:55:57 +00:00
Aaron Watry	daabb20e1b	R600/SI: Expand xor v2i32/v4i32 Add test cases for both vector sizes on SI and also add v2i32 test for EG. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184846	2013-06-25 13:55:52 +00:00
Aaron Watry	83fa6006bc	R600/SI: Expand urem of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Note: I followed the guidance of the v4i32 EG check... UREM produces really complex code, so let's just check that the instruction was lowered successfully. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184844	2013-06-25 13:55:46 +00:00
Aaron Watry	5527b6c6b6	R600/SI: Expand udiv v[24]i32 for SI and v2i32 for EG Also add lit test for both cases on SI, and v2i32 for evergreen. Note: I followed the guidance of the v4i32 EG check... UDIV produces really complex code, so let's just check that the instruction was lowered successfully. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184843	2013-06-25 13:55:43 +00:00
Aaron Watry	16d80c0529	R600/SI: Expand ashr of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184842	2013-06-25 13:55:40 +00:00
Aaron Watry	f63791e778	R600/SI: Expand srl of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184841	2013-06-25 13:55:37 +00:00
Aaron Watry	5584553984	R600/SI: Expand shl of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184840	2013-06-25 13:55:32 +00:00
Aaron Watry	2fa162e88e	R600/SI: Expand or of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184839	2013-06-25 13:55:29 +00:00
Aaron Watry	265eef5efe	R600/SI: Expand mul of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184838	2013-06-25 13:55:26 +00:00
Aaron Watry	00aeb119db	R600/SI: Expand and of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184837	2013-06-25 13:55:23 +00:00
Tom Stellard	0125f2a6e4	R600/SI: Report unaligned memory accesses as legal for > 32-bit types In reality, some unaligned memory accesses are legal for 32-bit types and smaller too, but it all depends on the address space. Allowing unaligned loads/stores for > 32-bit types is mainly to prevent the legalizer from splitting one load into multiple loads of smaller types. https://bugs.freedesktop.org/show_bug.cgi?id=65873 llvm-svn: 184822	2013-06-25 02:39:35 +00:00
Tom Stellard	9810ec613c	R600: Add support for i32 loads from the constant address space on Cayman Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 184821	2013-06-25 02:39:30 +00:00
Tom Stellard	b06f3fc1be	R600/SI: Add support for v4i32 and v4f32 kernel args Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 184820	2013-06-25 02:39:25 +00:00
Tom Stellard	9d2e1500b4	R600: Fix typo in R600Schedule.td This should only make a difference in programs that use a lot of the vector ALU instructions like BFI_INT and BIT_ALIGN. There is a slight improvement in the phatk bitcoin mining kernel with this patch on Evergreen (vector size == 1): Before: 1173 Instruction Groups / 9520 dwords After: 1167 Instruction Groups / 9510 dwords Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184819	2013-06-25 02:39:20 +00:00
Aaron Watry	52a72c926c	R600: Fix spelling error in comment our -> or llvm-svn: 184756	2013-06-24 16:57:57 +00:00
Tom Stellard	96d38760fc	R600/SI: Expand sub for v2i32 and v4i32 for SI Also add a v2i32 test to the existing v4i32 test. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry<awatry@gmail.com> llvm-svn: 184482	2013-06-20 21:55:37 +00:00
Tom Stellard	043795e818	R600/SI: Expand add for v2i32 and v4i32 Also add SI tests to existing file and a v2i32 test for both R600 and SI. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> llvm-svn: 184481	2013-06-20 21:55:30 +00:00
Tom Stellard	6ec9e8043c	R600: Expand v2i32 load/store instead of custom lowering The custom lowering causes llc to crash with a segfault. Ideally, the custom lowering can be fixed, but this allows programs which load/store v2i32 to work without crashing. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry<awatry@gmail.com> llvm-svn: 184480	2013-06-20 21:55:23 +00:00
Bill Wendling	a3cd350249	Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change. llvm-svn: 184360	2013-06-19 21:36:55 +00:00
Matt Arsenault	d46fce1141	Move StructurizeCFG out of R600 to generic Transforms. Register it with PassManager llvm-svn: 184343	2013-06-19 20:18:24 +00:00
Matt Arsenault	2aabb06175	Use GetUnderlyingObject instead of custom function llvm-svn: 184261	2013-06-18 23:37:58 +00:00
Bill Wendling	b7b1681157	Remove dead prototype. llvm-svn: 184173	2013-06-18 06:24:14 +00:00
Vincent Lejeune	41d4cf26b4	R600: PV stores Reg id, not index llvm-svn: 184117	2013-06-17 20:16:40 +00:00
Vincent Lejeune	8bd10421ec	R600: Properly set COUNT_3 bit in TEX clause initiating inst for pre EG gen. Fixes rv7x0 bug in Heaven reported here: https://bugs.freedesktop.org/show_bug.cgi?id=64257 llvm-svn: 184116	2013-06-17 20:16:26 +00:00
Tom Stellard	371573448c	R600: Add SI load support for v[24]i32 and store for v2i32 Also add a separate vector lit test file, since r600 doesn't seem to handle v2i32 load/store yet, but we can test both for SI. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> llvm-svn: 184021	2013-06-15 00:09:31 +00:00
Tom Stellard	ecf9d86404	R600: Use correct encoding for Vertex Fetch instructions on Cayman Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184016	2013-06-14 22:12:30 +00:00
Tom Stellard	6aa0d5578d	R600: Use EXPORT_RAT_INST_STORE_DWORD for stores on Cayman We were using RAT_INST_STORE_RAW, which seemed to work, but the docs say this instruction doesn't exist for Cayman, so it's probably safer to use a documented instruction instead. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184015	2013-06-14 22:12:24 +00:00
Tom Stellard	d99b7932ae	R600: Factor the instruction encoding out the RAT_WRITE_CACHELESS_eg class Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184014	2013-06-14 22:12:19 +00:00
Tom Stellard	3d0823f1cd	R600: Move instruction encoding definitions into a separate .td file Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 184013	2013-06-14 22:12:09 +00:00
Tom Stellard	adba083bc2	R600: Don't try to fix reg class when copying IMPLICIT_DEF to a register The test case for this is way too complex to be useful as a lit test, and I was unable to reduce it. https://bugs.freedesktop.org/show_bug.cgi?id=65438 llvm-svn: 183937	2013-06-13 20:14:00 +00:00
Benjamin Kramer	193960c822	R600: Make helper functions static. llvm-svn: 183744	2013-06-11 13:32:25 +00:00
Vincent Lejeune	d1a9d18120	R600: Use a refined heuristic to choose when switching clause This is using a hint from AMD APP OpenCL Programming Guide with empirically tweaked parameters. I used Unigine Heaven 3.0 to determine best parameters on my system (i7 2600/Radeon 6950/Kernel 3.9.4) the benchmark : it went from 38.8 average fps to 39.6, which is ~3% gain. (Lightmark 2008.2 gain is much more marginal: from 537 to 539) There is no lit test provided as the parameters were determined empirically and it would be nearly impossible to find a test program that checks for optimal behavior. llvm-svn: 183593	2013-06-07 23:30:34 +00:00
Vincent Lejeune	4d143328df	R600: Anti dep better handled in tex clause llvm-svn: 183592	2013-06-07 23:30:26 +00:00
Tom Stellard	d74583777f	R600: Fix calculation of stack offset in AMDGPUFrameLowering We weren't computing structure size correctly and we were relying on the original alloca instruction to compute the offset, which isn't always reliable. Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183568	2013-06-07 20:52:05 +00:00
Tom Stellard	a6c6e1bfc2	R600: Rework subtarget info and remove AMDILDevice classes This should simplify the subtarget definitions and make it easier to add new ones. Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183566	2013-06-07 20:37:48 +00:00
Bill Wendling	37e9adb091	Don't cache the instruction and register info from the TargetMachine, because the internals of TargetMachine could change. No functionality change intended. llvm-svn: 183561	2013-06-07 20:28:55 +00:00
Tom Stellard	3498e4ff1d	R600: Fix the fetch limits for R600 generation GPUs Reviewed-by: Vincent Lejeune <vljn@ovi.com> https://bugs.freedesktop.org/show_bug.cgi?id=64257 llvm-svn: 183560	2013-06-07 20:28:55 +00:00
Tom Stellard	99792774a4	R600: Move Subtarget feature definitions into AMDGPU.td This is the convention used by the other targets. Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183559	2013-06-07 20:28:49 +00:00
Tom Stellard	b0804ec2ad	R600: Remove unnecessary include Reviewed-by: Vincent Lejeune <vljn@ovi.com> llvm-svn: 183558	2013-06-07 20:28:43 +00:00
Benjamin Kramer	705d841bb6	R600: Don't compare iterators of different maps. Found by libstdc's debug mode. llvm-svn: 183549	2013-06-07 19:59:34 +00:00
Benjamin Kramer	ebe0be9ca4	Vincent says the element is at most once in the vector, so we don't need a full std::remove. llvm-svn: 183541	2013-06-07 18:18:12 +00:00
Benjamin Kramer	a857fe115b	R600: Fix a potential iterator invalidation issue. As a bonus this reduces the loop from O(n^2) to O(n). llvm-svn: 183532	2013-06-07 16:13:49 +00:00
Vincent Lejeune	931bb768fd	R600: Remove an extra break in R600OptimizeVectorRegisters.cpp llvm-svn: 183528	2013-06-07 15:44:53 +00:00
Vincent Lejeune	0030362ed9	R600: Rewrite an awkward loop in R600MachineScheduler llvm-svn: 183458	2013-06-06 23:08:32 +00:00
Vincent Lejeune	54476a1503	R600: Remove leftover code in R600MachineScheduler.cpp Spotted by Benjamin Kramer. llvm-svn: 183413	2013-06-06 14:18:29 +00:00
Bill Wendling	b91216817f	Cast to the correct type. Pointer, not reference. llvm-svn: 183385	2013-06-06 05:39:29 +00:00
NAKAMURA Takumi	4a8f079371	R600OptimizeVectorRegisters.cpp: Tweak a warning. [-Wsometimes-uninitialized] FIXME: Is it false alarm? llvm-svn: 183371	2013-06-06 02:15:12 +00:00
NAKAMURA Takumi	e5555fc238	R600OptimizeVectorRegisters.cpp: Suppress a warning. [-Wunused-variable] llvm-svn: 183370	2013-06-06 02:15:06 +00:00
NAKAMURA Takumi	372574d447	Trailing linefeed. llvm-svn: 183369	2013-06-06 02:15:00 +00:00
Bill Wendling	e410576865	Cast to the proper type. llvm-svn: 183365	2013-06-06 01:04:21 +00:00
Tom Stellard	acec99c948	R600: Replace predicate loop with predicate function llvm-svn: 183351	2013-06-05 23:39:50 +00:00
Vincent Lejeune	dec1875207	R600: Add a pass that merge Vector Register Previously committed @183279 but tests were failing, reverted @183286 It was broken because @183336 was missing, now it's there. llvm-svn: 183343	2013-06-05 21:38:04 +00:00
Vincent Lejeune	4b5b849753	R600: Schedule copy from phys register at beginning of block It allows regalloc pass to remove them by trivially assigning associated reg llvm-svn: 183336	2013-06-05 20:27:35 +00:00
Tom Stellard	aad5376fb6	R600: Make sure to schedule AR register uses and defs in the same clause Reviewed-by: vljn at ovi.com llvm-svn: 183294	2013-06-05 03:43:06 +00:00
Rafael Espindola	beef23fe21	Revert "R600: Add a pass that merge Vector Register" This reverts commit r183279. CodeGen/R600/texture-input-merge.ll was failing. llvm-svn: 183286	2013-06-05 01:48:30 +00:00
Vincent Lejeune	a45aafabfe	R600: Add a pass that merge Vector Register llvm-svn: 183279	2013-06-04 23:17:26 +00:00
Vincent Lejeune	c689679173	R600: Const/Neg/Abs can be folded to dot4 llvm-svn: 183278	2013-06-04 23:17:15 +00:00
Vincent Lejeune	276ceb8d5f	R600: Swizzle texture/export instructions llvm-svn: 183229	2013-06-04 15:04:53 +00:00
Aaron Ballman	19978553d4	Silencing an MSVC warning about mixing bool and unsigned int. llvm-svn: 183176	2013-06-04 01:03:03 +00:00
Tom Stellard	94593ee8c3	R600/SI: Add support for work item and work group intrinsics llvm-svn: 183138	2013-06-03 17:40:18 +00:00
Tom Stellard	ed882c2f1b	R600/SI: Add a calling convention for compute shaders llvm-svn: 183137	2013-06-03 17:40:11 +00:00
Tom Stellard	046039e81b	R600/SI: Custom lower i64 sign_extend llvm-svn: 183136	2013-06-03 17:40:03 +00:00
Tom Stellard	0518ff89ba	R600/SI: Adjust some instructions' out register class after ISel This is necessary to avoid generating VGPR to SGPR copies in some cases. llvm-svn: 183135	2013-06-03 17:39:58 +00:00
Tom Stellard	bad1f59212	R600/SI: Handle REG_SEQUENCE in fitsRegClass() llvm-svn: 183134	2013-06-03 17:39:54 +00:00
Tom Stellard	b5a97004fb	R600/SI: Handle nodes with glue results correctly SITargetLowering::foldOperands() llvm-svn: 183133	2013-06-03 17:39:50 +00:00
Tom Stellard	2183b70523	R600/SI: Fixup CopyToReg register class in PostprocessISelDAG() The CopyToReg nodes will sometimes try to copy a value from a VGPR to an SGPR. This kind of copy is not possible, so we need to detect VGPR->SGPR copies and do something else. The current strategy is to replace these copies with VGPR->VGPR copies and hope that all the users of CopyToReg can accept VGPRs as arguments. llvm-svn: 183132	2013-06-03 17:39:46 +00:00
Tom Stellard	07a10a3d3f	R600/SI: Add support for global loads llvm-svn: 183131	2013-06-03 17:39:43 +00:00
Tom Stellard	556d9aa841	R600/SI: Rework MUBUF store instructions The lowering of stores is now mostly handled in the tablegen files. No more BUFFER_STORE nodes I generated during legalization. llvm-svn: 183130	2013-06-03 17:39:37 +00:00
Vincent Lejeune	91a942b93e	R600: 3 op instructions have no write bit but the result are store in PV llvm-svn: 183111	2013-06-03 15:56:12 +00:00
Vincent Lejeune	eabf83e0a2	R600: CALL_FS consumes a stack size entry llvm-svn: 183108	2013-06-03 15:44:42 +00:00
Vincent Lejeune	f83df1f1cb	R600: use capital letter for PV channel llvm-svn: 183107	2013-06-03 15:44:35 +00:00
Vincent Lejeune	a09873dda7	R600: Constraints input regs of interp_xy,_zw llvm-svn: 183106	2013-06-03 15:44:16 +00:00
Ahmed Bougacha	b1a4d9da3b	Make SubRegIndex size mandatory, following r183020. This also makes TableGen able to compute sizes/offsets of synthesized indices representing tuples. llvm-svn: 183061	2013-05-31 23:45:26 +00:00
Patrik Hagglund	ae8faf2e9a	Temporary fix to get rid of gcc warning. llvm-svn: 182832	2013-05-29 07:32:08 +00:00
Andrew Trick	ef9de2a739	Track IR ordering of SelectionDAG nodes 2/4. Change SelectionDAG::getXXXNode() interfaces as well as call sites of these functions to pass in SDLoc instead of DebugLoc. llvm-svn: 182703	2013-05-25 02:42:55 +00:00
Tom Stellard	1b086cbcb8	R600: Fix R600ControlFlowFinalizer not considering VTX_READ 128 bit dst reg Patch by: Vincent Lejeune https://bugs.freedesktop.org/show_bug.cgi?id=64877 NOTE: This is a candidate for the 3.3 branch. llvm-svn: 182600	2013-05-23 18:26:42 +00:00
Benjamin Kramer	d78bb468bd	Move passes from namespace llvm into anonymous namespaces. Sort includes while there. llvm-svn: 182594	2013-05-23 17:10:37 +00:00
Benjamin Kramer	635e368e33	R600: Hide symbols of implementation details. Also removes an unused function. llvm-svn: 182587	2013-05-23 15:43:05 +00:00
Aaron Ballman	15f193a1a3	Setting the default value (fixes CRT assertions about uninitialized variable use when doing debug MSVC builds), and fixing coding style. llvm-svn: 182585	2013-05-23 14:55:00 +00:00
Rafael Espindola	00345fa97b	Fix 32 bit build in c++11 mode. The error was: error: non-constant-expression cannot be narrowed from type 'long long' to 'long' in initializer list [-Wc++11-narrowing] MI.getOperand(6).getImm() & 0x1F, llvm-svn: 182584	2013-05-23 13:22:30 +00:00
Rafael Espindola	39aca620db	Fix a leak on the r600 backend. This should bring the valgrind bot back to life. llvm-svn: 182561	2013-05-23 03:31:47 +00:00
Rafael Espindola	bd6847fbea	clang-format this file. llvm-svn: 182560	2013-05-23 03:28:39 +00:00
Rafael Espindola	e3d83fb8c3	Fix use after free (pr16103). llvm-svn: 182482	2013-05-22 15:31:11 +00:00
Rafael Espindola	ebd8e38849	Check that a function starts with llvm. before using GET_FUNCTION_RECOGNIZER. Fixes a use of uninitialized memory found by asan and valgrind. llvm-svn: 182480	2013-05-22 14:57:42 +00:00
NAKAMURA Takumi	4f328e1c2f	R600ISelLowering.cpp: Avoid "using namespace Intrinsic;" to appease MSC. Specify namespaces explicitly here. MSC is confused about "memcpy" between <cstring> and llvm::Intrinsic::memcpy, when llvm::Intrinsic were exposed. llvm-svn: 182452	2013-05-22 06:37:31 +00:00
NAKAMURA Takumi	18ca09c1cc	R600: Whitespace and untabify. llvm-svn: 182451	2013-05-22 06:37:25 +00:00
Owen Anderson	616852848a	Create an FPOW SDNode opcode def in the target independent .td file rather than in a specific backend. llvm-svn: 182450	2013-05-22 06:36:09 +00:00

729 Commits