llvm-project

Commit Graph

Author	SHA1	Message	Date
Tom Stellard	6b42f2d8aa	R600/SI: Remove unused register class llvm-svn: 231491	2015-03-06 17:00:16 +00:00
Mehdi Amini	46a43556db	Make DataLayout Non-Optional in the Module Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC, the string is no longer canonicalized, you can't rely on two "equals" DataLayout having the same string returned by getStringRepresentation(). Get rid of DataLayoutPass: the DataLayout is in the Module The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module. Make DataLayout Non-Optional in the Module Module->getDataLayout() will never return nullptr anymore. Reviewers: echristo Subscribers: resistor, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D7992 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231270	2015-03-04 18:43:29 +00:00
Marek Olsak	d2af89df10	R600/SI: Add an intrinsic for S_FLBIT_I32 / V_FFBH_I32 Required by OpenGL (ARB_gpu_shader5). llvm-svn: 231259	2015-03-04 17:33:45 +00:00
Jan Vesely	468e055f54	R600: Use c++11 style for loop Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 230987	2015-03-02 18:56:52 +00:00
Benjamin Kramer	7149aabf8b	Make some non-constant static variables non-static or fully const. Otherwise we have to emit thread-safe initialization for them. NFC. llvm-svn: 230894	2015-03-01 18:09:56 +00:00
Benjamin Kramer	f1362f6196	ArrayRefize memory operand folding. NFC. llvm-svn: 230846	2015-02-28 12:04:00 +00:00
Tom Stellard	aec94b3bf3	R600/SI: Add missing mubuf instructions llvm-svn: 230759	2015-02-27 14:59:46 +00:00
Tom Stellard	49282c92c5	R600/SI: Consistently put soffset before the offset operand for mubuf instructions This matches the assembly syntax. llvm-svn: 230758	2015-02-27 14:59:44 +00:00
Tom Stellard	1f9939fba6	R600/SI: Add slc, glc, and tfe to non-atomic _ADDR64 instructions llvm-svn: 230757	2015-02-27 14:59:41 +00:00
Tom Stellard	eb05c610b4	R600/SI: Remove M0 from DS assembly strings This matches the assembly syntax for the proprietary compiler. llvm-svn: 230645	2015-02-26 17:08:43 +00:00
Eric Christopher	23a3a7c871	Remove an argument-less call to getSubtargetImpl from TargetLoweringBase. This required plumbing a TargetRegisterInfo through computeRegisterProperties and into findRepresentativeClass which uses it for register class iteration. This required passing a subtarget into a few target specific initializations of TargetLowering. llvm-svn: 230583	2015-02-26 00:00:24 +00:00
Tom Stellard	ecc419c31d	R600/SI: Remove isel mubuf legalization We legalize mubuf instructions post-instruction selection, so this code is no longer needed. llvm-svn: 230352	2015-02-24 17:59:19 +00:00
Matt Arsenault	f07833057c	R600/SI: Use v_madmk_f32 llvm-svn: 230149	2015-02-21 21:29:10 +00:00
Matt Arsenault	0325d3d27f	R600/SI: Try to use v_madak_f32 This is a code size optimization when the constant only has one use. llvm-svn: 230148	2015-02-21 21:29:07 +00:00
Matt Arsenault	657b1cb739	R600/SI: Don't crash when getting immediate operand size llvm-svn: 230147	2015-02-21 21:29:04 +00:00
Matt Arsenault	70120fa813	R600/SI: Fix mad*k definitions llvm-svn: 230146	2015-02-21 21:29:00 +00:00
Tim Northover	3b6b7ca2bc	CodeGen: convert CCState interface to using ArrayRefs Everyone except R600 was manually passing the length of a static array at each callsite, calculated in a variety of interesting ways. Far easier to let ArrayRef handle that. There should be no functional change, but out of tree targets may have to tweak their calls as with these examples. llvm-svn: 230118	2015-02-21 02:11:17 +00:00
Matt Arsenault	20711b7bae	R600/SI: Remove v_sub_f64 pseudo The expansion code does the same thing. Since the operands were not defined with the correct types, this has the side effect of fixing operand folding since the expanded pseudo would never use SGPRs or inline immediates. llvm-svn: 230072	2015-02-20 22:10:45 +00:00
Matt Arsenault	8d6300346f	R600: Use new fmad node. This enables a few useful combines that used to only use fma. Also since v_mad_f32 apparently does not support denormals, disable the existing cases that are custom handled if they are requested. llvm-svn: 230071	2015-02-20 22:10:41 +00:00
Michael Kuperstein	efd7a96d2e	Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures. llvm-svn: 229841	2015-02-19 11:38:11 +00:00
Michael Kuperstein	ba5b04c798	Use std::bitset for SubtargetFeatures Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. Differential Revision: http://reviews.llvm.org/D7065 llvm-svn: 229831	2015-02-19 09:01:04 +00:00
Eric Christopher	0795a2ef0c	Remove a few more calls to TargetMachine::getSubtarget from the R600 port. llvm-svn: 229804	2015-02-19 01:10:55 +00:00
Eric Christopher	7edca437f5	Grab the subtarget off of the machine function for the R600 asm printer and clean up a bunch of uses. llvm-svn: 229803	2015-02-19 01:10:53 +00:00
Eric Christopher	96caeda730	Remove the DisasmEnabled AsmPrinter variable and just look it up on the subtarget where it's set anyhow than looking it up 2-3 times in the same place. llvm-svn: 229802	2015-02-19 01:10:49 +00:00
Eric Christopher	111de895a0	80-column fixups. llvm-svn: 229789	2015-02-19 00:15:33 +00:00
Marek Olsak	9b8f32eed1	R600/SI: Fix READLANE and WRITELANE lane select for VI VOP2 declares vsrc1, but VOP3 declares src1. We can't use the same "ins" if the operands have different names in VOP2 and VOP3 encodings. This fixes a hang in geometry shaders which spill M0 on VI. (BTW it doesn't look like M0 needs spilling and the spilling seems duplicated 3 times) llvm-svn: 229752	2015-02-18 22:12:45 +00:00
Marek Olsak	8eeebcccb5	R600/SI: Simplify verification of AMDGPU::OPERAND_REG_INLINE_C llvm-svn: 229751	2015-02-18 22:12:41 +00:00
Marek Olsak	b8c818337d	R600/SI: Remove explicit VOP operand checking This should be handled by the OperandType checking. llvm-svn: 229750	2015-02-18 22:12:37 +00:00
Tom Stellard	1ca873bbc5	R600/SI: Don't set isCodeGenOnly = 1 on all instructions We only need to set this on pseudo instructions which won't be used by the assembler. llvm-svn: 229689	2015-02-18 16:08:17 +00:00
Tom Stellard	c34c37ae66	R600/SI: Add missing VOP1 instructions llvm-svn: 229688	2015-02-18 16:08:15 +00:00
Tom Stellard	894b9883f4	R600/SI: Add missing VOP2 instructions llvm-svn: 229687	2015-02-18 16:08:14 +00:00
Tom Stellard	0c0008cb6e	R600/SI: Add definition for S_CBRANCH_G_FORK llvm-svn: 229686	2015-02-18 16:08:13 +00:00
Tom Stellard	ce449ade7e	R600/SI: Add missing SOP1 instructions llvm-svn: 229685	2015-02-18 16:08:11 +00:00
Tom Stellard	ee21faa029	R600/SI: Refactor SOP2 definitions llvm-svn: 229684	2015-02-18 16:08:09 +00:00
Matt Arsenault	0ba644b66b	R600/SI: Rename dst encoding field to be consistent with docs The docs call this vdst instead of just dst. llvm-svn: 229614	2015-02-18 02:15:37 +00:00
Matt Arsenault	e3dbcf6656	R600/SI: Consistently capitalize encoding field names Some formats capitalized these, but most didn't. Change them all to be consistently lowercase. Now, non-encoding fields and convenience bits are capitalized. Also remove weird looking empty line in some of the formats. llvm-svn: 229613	2015-02-18 02:15:35 +00:00
Matt Arsenault	1ecac06a6f	R600/SI: Set noNamedPositionallyEncodedOperands llvm-svn: 229612	2015-02-18 02:15:32 +00:00
Matt Arsenault	096ec1e10c	R600/SI: Fix src1_modifiers for class instructions src1 doesn't have modifiers, but the operand was missing resulting in an encoding build error when all fields are required. llvm-svn: 229611	2015-02-18 02:15:30 +00:00
Matt Arsenault	65fa1c425d	R600/SI: Fix not setting clamp / omod for v_cndmask_b32_e64 Rename the multiclass since it now applies to the output modifiers as well. llvm-svn: 229610	2015-02-18 02:15:27 +00:00
Matt Arsenault	284d7dfb53	R600: Fix operand encoding error llvm-svn: 229609	2015-02-18 02:10:42 +00:00
Matt Arsenault	1991f5e40b	R600/SI: Fix encoding error from glc bit on VI SMRD instructions llvm-svn: 229608	2015-02-18 02:10:40 +00:00
Matt Arsenault	e6c5241814	R600/SI: Fix operand encoding for flat instructions llvm-svn: 229607	2015-02-18 02:10:37 +00:00
Matt Arsenault	07e3bb153f	R600/SI: Fix error from vdst on no return atomics Set the ignored field to 0 so we can enable noNamedPositionallyEncodedOperands. llvm-svn: 229606	2015-02-18 02:10:35 +00:00
Matt Arsenault	caa1288fff	R600/SI: Add missing offset operand to buffer bothen llvm-svn: 229605	2015-02-18 02:04:38 +00:00
Matt Arsenault	2ad8bab7ee	R600/SI: Add missing soffset operand to global atomics llvm-svn: 229604	2015-02-18 02:04:35 +00:00
Matt Arsenault	3c34ae293c	R600/SI: Fix brace indentation llvm-svn: 229603	2015-02-18 02:04:31 +00:00
Tom Stellard	7b3aa88ac1	R600/SI: Fix asan errors in SIFoldOperands We were trying to fold into implicit uses, which led to out of bounds access of the MCInstrDesc::OpInfo array. llvm-svn: 229533	2015-02-17 20:11:54 +00:00
Tom Stellard	bc3776803b	R600/SI: Extend private extload pattern to include zext loads llvm-svn: 229507	2015-02-17 16:36:00 +00:00
Benjamin Kramer	6cd780ff21	Prefer SmallVector::append/insert over push_back loops. Same functionality, but hoists the vector growth out of the loop. llvm-svn: 229500	2015-02-17 15:29:18 +00:00
Andrew Trick	05938a5481	AArch64: Safely handle the incoming sret call argument. This adds a safe interface to the machine independent InputArg struct for accessing the index of the original (IR-level) argument. When a non-native return type is lowered, we generate the hidden machine-level sret argument on-the-fly. Before this fix, we were representing this argument as OrigArgIndex == 0, which is an outright lie. In particular this crashed in the AArch64 backend where we actually try to access the type of the original argument. Now we use a sentinel value for machine arguments that have no original argument index. AArch64, ARM, Mips, and PPC now check for this case before accessing the original argument. Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering llvm-svn: 229413	2015-02-16 18:10:47 +00:00
Aaron Ballman	f9a1897c72	Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition. llvm-svn: 229340	2015-02-15 22:54:22 +00:00
Matt Arsenault	0bbcd8ba2f	R600/SI: Implement correct f64 fdiv This version passes the OpenCL conformance test. llvm-svn: 229239	2015-02-14 04:30:08 +00:00
Matt Arsenault	044f1d19cf	R600/SI: Use complex operand folding for div_scale llvm-svn: 229238	2015-02-14 04:24:28 +00:00
Matt Arsenault	1bc9d95047	R600/SI: Fix implicit vcc operand to v_div_fmas_* This should allow finally fixing the f64 fdiv implementation. Test is disabled for VI since there seems to be a problem with one of the buffer load instructions on it. llvm-svn: 229236	2015-02-14 04:22:00 +00:00
Matt Arsenault	6e26b8d854	R600/SI: Fix schedule model for v_div_scale_{f32|f64} llvm-svn: 229235	2015-02-14 04:03:18 +00:00
Matt Arsenault	35733e2dec	R600/SI: Really fix size of VReg_1 llvm-svn: 229234	2015-02-14 03:54:32 +00:00
Matt Arsenault	1bcc8cba5a	R600/SI: Rename encoding field to match docs for VOP3b llvm-svn: 229233	2015-02-14 03:54:29 +00:00
Matt Arsenault	31ec598a2a	R600/SI: Fix not encoding src2 for v_div_scale_{f32|f64} This apparently got lost in the VI changes. llvm-svn: 229230	2015-02-14 03:40:35 +00:00
Matt Arsenault	692acf1438	R600/SI: Fix VOP3b encoding on VI llvm-svn: 229228	2015-02-14 03:02:23 +00:00
Matt Arsenault	95546b46ab	R600/SI: Fix phys reg copies in SIFoldOperands llvm-svn: 229227	2015-02-14 02:55:57 +00:00
Matt Arsenault	9998168982	R600/SI: Fix copies from SGPR to VCC This shows up without optimizations when vcc is required to be used. llvm-svn: 229226	2015-02-14 02:55:56 +00:00
Matt Arsenault	834b1aa806	R600/SI: Add hack to copy from a VGPR to VCC This hopefully should be fixed when VReg_1 is removed. llvm-svn: 229225	2015-02-14 02:55:54 +00:00
Matt Arsenault	f417ff8f2a	R600/SI: Fix size of VReg_1 This is really a 32-bit register, if we try to check the size of it, we want 32-bits. llvm-svn: 229223	2015-02-14 02:51:44 +00:00
Duncan P. N. Exon Smith	8480c87ce6	R600: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) llvm-svn: 229222	2015-02-14 02:45:45 +00:00
Tom Stellard	e1e4a2d310	R600/SI: Refactor SOP1 classes llvm-svn: 229152	2015-02-13 21:02:37 +00:00
Tom Stellard	6c65e9a99a	R600/SI: Lowercase register names llvm-svn: 229151	2015-02-13 21:02:36 +00:00
Tom Stellard	d09fa9cec8	R600/SI: Remove some unused TableGen classes llvm-svn: 229150	2015-02-13 21:02:33 +00:00
Matt Arsenault	774e20b42a	R600/SI: Remove handling of fpimm llvm-svn: 229136	2015-02-13 19:05:07 +00:00
Matt Arsenault	11a4d6774b	R600/SI: Allow f64 inline immediates in i64 operands This requires considering the size of the operand when checking immediate legality. llvm-svn: 229135	2015-02-13 19:05:03 +00:00
Chandler Carruth	30d69c2e36	[PM] Remove the old 'PassManager.h' header file at the top level of LLVM's include tree and the use of using declarations to hide the 'legacy' namespace for the old pass manager. This undoes the primary modules-hostile change I made to keep out-of-tree targets building. I sent an email inquiring about whether this would be reasonable to do at this phase and people seemed fine with it, so making it a reality. This should allow us to start bootstrapping with modules to a certain extent along with making it easier to mix and match headers in general. The updates to any code for users of LLVM are very mechanical. Switch from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h". Qualify the types which now produce compile errors with "legacy::". The most common ones are "PassManager", "PassManagerBase", and "FunctionPassManager". llvm-svn: 229094	2015-02-13 10:01:29 +00:00
Matt Arsenault	63bef0d177	R600/SI: Remove unnecessary check for fpimm llvm-svn: 229034	2015-02-13 02:47:22 +00:00
Benjamin Kramer	5f6a907288	MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros Update all callers. llvm-svn: 228930	2015-02-12 15:35:40 +00:00
Tom Stellard	0648588e7d	R600/SI: Disable subreg liveness This is temporary while we try to fix a crash in the register coalescer. llvm-svn: 228861	2015-02-11 18:24:53 +00:00
Tom Stellard	de5b7b180a	R600: Split AMDGPUPassConfig into R600PassConfig and GCNPassConfig llvm-svn: 228850	2015-02-11 17:11:51 +00:00
Tom Stellard	c65b36061a	R600: Create an R600TargetMachine for pre-gcn GPUs No functionality change. R600TargetMachine inherits from AMDGPUTargetMachine. llvm-svn: 228849	2015-02-11 17:11:50 +00:00
Tom Stellard	94b7231740	R600/SI: Store immediate offsets > 12-bits in soffset This will save us from having to extend these offsets to 64-bits and storing them in a pair of vgprs. llvm-svn: 228776	2015-02-11 00:34:35 +00:00
Tom Stellard	c53861ab84	R600/SI: Add soffset operand to mubuf addr64 instruction We were previously hard-coding soffset to 0. llvm-svn: 228775	2015-02-11 00:34:32 +00:00
Benjamin Kramer	970eac40bf	Make helper functions/classes/globals static. NFC. llvm-svn: 228410	2015-02-06 17:51:54 +00:00
Michel Danzer	d89a557f33	R600/SI: Don't enable WQM for V_INTERP_* instructions v2 Doesn't seem necessary anymore. I think this was mostly compensating for not enabling WQM for texture sampling instructions. v2: Add test coverage Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 228373	2015-02-06 02:51:25 +00:00
Michel Danzer	494391ba47	R600/SI: Also enable WQM for image opcodes which calculate LOD v3 If whole quad mode isn't enabled for these, the level of detail is calculated incorrectly for pixels along diagonal triangle edges, causing artifacts. v2: Use a TSFlag instead of lots of switch cases v3: Add test coverage Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88642 Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 228372	2015-02-06 02:51:20 +00:00
Tom Stellard	eea3f70432	R600/SI: Fix bug in TTI loop unrolling preferences We should be setting UnrollingPreferences::MaxCount to MAX_UINT instead of UnrollingPreferences::Count. Count is a 'forced unrolling factor', while MaxCount sets an upper limit to the unrolling factor. Setting Count to MAX_UINT was causing the loop in the testcase to be unrolled 15 times, when it only had a maximum of 4 iterations. llvm-svn: 228303	2015-02-05 15:32:18 +00:00
Tom Stellard	0f29de78e6	R600/SI: Fix bug from insertion of llvm.SI.end.cf into loop headers The llvm.SI.end.cf intrinsic is used to mark the end of if-then blocks, if-then-else blocks, and loops. It is responsible for updating the exec mask to re-enable threads that had been masked during the preceding control flow block. For example: s_mov_b64 exec, 0x3 ; Initial exec mask s_mov_b64 s[0:1], exec ; Saved exec mask v_cmpx_gt_u32 exec, s[2:3], v0, 0 ; llvm.SI.if do_stuff() s_or_b64 exec, exec, s[0:1] ; llvm.SI.end.cf The bug fixed by this patch was one where the llvm.SI.end.cf intrinsic was being inserted into the header of loops. This would happen when an if block terminated in a loop header and we would end up with code like this: s_mov_b64 exec, 0x3 ; Initial exec mask s_mov_b64 s[0:1], exec ; Saved exec mask v_cmpx_gt_u32 exec, s[2:3], v0, 0 ; llvm.SI.if do_stuff() LOOP: ; Start of loop header s_or_b64 exec, exec, s[0:1] ; llvm.SI.end.cf <-BUG: The exec mask has the same value at the beginning of each loop iteration. do_stuff(); s_cbranch_execnz LOOP The fix is to create a new basic block before the loop and insert the llvm.SI.end.cf there. This way the exec mask is restored before the start of the loop instead of at the beginning of each iteration. llvm-svn: 228302	2015-02-05 15:32:15 +00:00
Matt Arsenault	abd271b4e8	R600/SI: Fix i64 truncate to i1 llvm-svn: 228273	2015-02-05 06:05:13 +00:00
Tom Stellard	f6afc80cc0	R600/SI: Enable subreg liveness by default llvm-svn: 228228	2015-02-04 23:14:18 +00:00
Tom Stellard	33e64c66ac	R600/SI: Expand misaligned 16-bit memory accesses llvm-svn: 228190	2015-02-04 20:49:52 +00:00
Tom Stellard	c7e448c92e	R600/SI: Make more store operations legal v2i32, i32, trunc i32 to i16, and trunc i32 to i8 stores are legal for all address spaces. We had marked them as custom in order to lower them for the private address space, but this is no longer necessary. This enables lowering of misaligned stores of these types in the DAGLegalizer. llvm-svn: 228189	2015-02-04 20:49:51 +00:00
Tom Stellard	096b8c1e6d	R600: Don't promote i64 stores to v2i32 during DAG legalization We take care of this during instruction selection now. This fixes a potential infinite loop when lowering misaligned stores. llvm-svn: 228188	2015-02-04 20:49:49 +00:00
Marek Olsak	24ae2cda7c	R600/SI: Remove useless patterns in VALU which are already covered by SALU Also remove hasPostISelHook=1 from V_LSHL_B32. It's defined by InstSI already. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 228039	2015-02-03 21:53:08 +00:00
Marek Olsak	3ecf508734	R600/SI: Rewrite VOP1InstSI to contain a pseudo and _si opcode What this does is that if you accidentally select these instructions on VI, the code generation will fail, because the pseudo -> _vi mapping will be undefined. The idea is to be able to catch possible future bugs easily. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 228038	2015-02-03 21:53:05 +00:00
Marek Olsak	707a6d0c20	R600/SI: Fix B64 VALU shifts on VI SI only has standard versions. VI only has REV versions. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 228037	2015-02-03 21:53:01 +00:00
Marek Olsak	191507e0b7	R600/SI: Don't generate non-existent LSHL, LSHR, ASHR B32 variants on VI This can happen when a REV instruction is commuted. The trick is not to define the _vi versions of instructions, which has these consequences: - code generation will always fail if a pseudo cannot be lowered (very useful to catch bugs where an unsupported instruction somehow makes it to the printer) - ability to query if a pseudo can be lowered, which is done in commuteOpcode to prevent REV from commuting to non-REV on VI Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 227990	2015-02-03 17:38:12 +00:00
Marek Olsak	7585a29bd4	R600/SI: Remove VOP2_REV definitions from target-specific instructions The getCommute* functions are only used with pseudos, so this commit doesn't change anything. The issue with missing non-rev versions of shift instructions on VI will be fixed separately. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 227989	2015-02-03 17:38:05 +00:00
Marek Olsak	11057ee022	R600/SI: Trivial instruction definition corrections for VI (v2) - V_MAC_LEGACY_F32 exists on VI, but it's VOP3-only. - Define CVT_PK opcodes which are different between SI and VI. These are unused. The idea is to define all chip differences. v2: keep V_MUL_LO_U32 Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 227988	2015-02-03 17:38:01 +00:00
Marek Olsak	3db6ba8cfa	R600/SI: Determine target-specific encoding of READLANE and WRITELANE early v2 These are VOP2 on SI and VOP3 on VI, and their pseudos are neither, which can be a problem. In order to make isVOP2 and isVOP3 queries behave as expected, the encoding must be determined first. This doesn't fix any known issue, but better safe than sorry. v2: add and use getMCOpcodeFromPseudo Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 227987	2015-02-03 17:37:57 +00:00
Marek Olsak	1bd2463548	R600/SI: Fix dependency between instruction writing M0 and S_SENDMSG on VI (v2) This fixes a hang when using an empty geometry shader. v2: - don't add s_nop when followed by s_waitcnt - cosmetic changes Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 227986	2015-02-03 17:37:52 +00:00
Tom Stellard	c6b299c8c4	R600/SI: 64-bit and larger memory access must be at least 4-byte aligned This is true for SI only. CI+ supports unaligned memory accesses, but this requires driver support, so for now we disallow unaligned accesses for all GCN targets. llvm-svn: 227822	2015-02-02 18:02:28 +00:00
Chandler Carruth	ab5cb36c40	[multiversion] Remove the function parameter from the unrolling preferences interface on TTI now that all of TTI is per-function. llvm-svn: 227741	2015-02-01 14:31:23 +00:00
Chandler Carruth	c956ab6603	[multiversion] Switch the TTI queries from TargetMachine to Subtarget now that we have a correct and cached subtarget specific to the function. Also, finish providing a cached per-function subtarget in the core LLVMTargetMachine -- that layer hadn't switched over yet. The only use of the TargetMachine was to re-lookup a subtarget for a particular function to work around the fact that TTI was immutable. Now that it is per-function and we have a cached subtarget, use it. This still leaves a few interfaces with real warts on them where we were passing Function objects through the TTI interface. I'll remove these and clean their usage up in subsequent commits now that this isn't necessary. llvm-svn: 227738	2015-02-01 14:22:17 +00:00
Chandler Carruth	c340ca839c	[multiversion] Remove the cached TargetMachine pointer from the intermediate TTI implementation template and instead query up to the derived class for both the TargetMachine and the TargetLowering. Most of the derived types had a TLI cached already and there is no need to store a less precisely typed target machine pointer. This will in turn make it much cleaner to look up the TLI via a per-function subtarget instead of the generic subtarget, and it will pave the way toward pulling the subtarget used for unroll preferences into the same form once we are always using the function to look up the correct subtarget. llvm-svn: 227737	2015-02-01 14:01:15 +00:00
Chandler Carruth	8b04c0d26a	[multiversion] Switch all of the targets over to use the TargetIRAnalysis access path directly rather than implementing getTTI. This even removes getTTI from the interface. It's more efficient for each target to just register a precise callback that creates their specific TTI. As part of this, all of the targets which are building their subtargets individually per-function now build their TTI instance with the function and thus look up the correct subtarget and cache it. NVPTX, R600, and XCore currently don't leverage this functionality, but it's trivial for them to add it now. llvm-svn: 227735	2015-02-01 13:20:00 +00:00
Chandler Carruth	ee642690ea	[multiversion] Remove a false freedom to leave the TargetMachine pointer null. For some reason some of the original TTI code supported a null target machine. This seems to have been legacy, and I made matters worse when refactoring this code by spreading that pattern further through the various targets. The TargetMachine can't actually be null, and it doesn't make sense to support that use case. I've now consistently removed it and removed all of the code trying to cope with that situation. This is probably good, as several targets didn't cope with it being null despite the null default argument in their constructors. =] llvm-svn: 227734	2015-02-01 12:38:24 +00:00
Chandler Carruth	d8b3e9a420	[PM] Remove a bunch of stale TTI creation method declarations. I nuked their definitions, but forgot to clean up all the declarations which are in different files. llvm-svn: 227698	2015-02-01 00:22:15 +00:00
Matt Arsenault	25f61a6f89	Fix typo llvm-svn: 227697	2015-01-31 23:37:27 +00:00
Matt Arsenault	08ad328ae2	R600/SI: Only select cvt_flr/cvt_rpi with no NaNs. These have different behavior from cvt_i32_f32 on NaN. llvm-svn: 227693	2015-01-31 21:28:13 +00:00
Chandler Carruth	93dcdc47db	[PM] Switch the TargetMachine interface from accepting a pass manager base which it adds a single analysis pass to, to instead return the type erased TargetTransformInfo object constructed for that TargetMachine. This removes all of the pass variants for TTI. There is now a single TTI pass in the Analysis layer. All of the Analysis <-> Target communication is through the TTI's type erased interface itself. While the diff is large here, it is nothing more than code motion to make types available in a header file for use in a different source file within each target. I've tried to keep all the doxygen comments and file boilerplate in line with this move, but let me know if I missed anything. With this in place, the next step to making TTI work with the new pass manager is to introduce a really simple new-style analysis that produces a TTI object via a callback into this routine on the target machine. Once we have that, we'll have the building blocks necessary to accept a function argument as well. llvm-svn: 227685	2015-01-31 11:17:59 +00:00
Chandler Carruth	705b185f90	[PM] Change the core design of the TTI analysis to use a polymorphic type erased interface and a single analysis pass rather than an extremely complex analysis group. The end result is that the TTI analysis can contain a type erased implementation that supports the polymorphic TTI interface. We can build one from a target-specific implementation or from a dummy one in the IR. I've also factored all of the code into "mix-in"-able base classes, including CRTP base classes to facilitate calling back up to the most specialized form when delegating horizontally across the surface. These aren't as clean as I would like and I'm planning to work on cleaning some of this up, but I wanted to start by putting into the right form. There are a number of reasons for this change, and this particular design. The first and foremost reason is that an analysis group is complete overkill, and the chaining delegation strategy was so opaque, confusing, and high overhead that TTI was suffering greatly for it. Several of the TTI functions had failed to be implemented in all places because of the chaining-based delegation making there be no checking of this. A few other functions were implemented with incorrect delegation. The message to me was very clear working on this -- the delegation and analysis group structure was too confusing to be useful here. The other reason of course is that this is much more natural fit for the new pass manager. This will lay the ground work for a type-erased per-function info object that can look up the correct subtarget and even cache it. Yet another benefit is that this will significantly simplify the interaction of the pass managers and the TargetMachine. See the future work below. The downside of this change is that it is very, very verbose. I'm going to work to improve that, but it is somewhat an implementation necessity in C++ to do type erasure. =/ I discussed this design really extensively with Eric and Hal prior to going down this path, and afterward showed them the result. No one was really thrilled with it, but there doesn't seem to be a substantially better alternative. Using a base class and virtual method dispatch would make the code much shorter, but as discussed in the update to the programmer's manual and elsewhere, a polymorphic interface feels like the more principled approach even if this is perhaps the least compelling example of it. ;] Ultimately, there is still a lot more to be done here, but this was the huge chunk that I couldn't really split things out of because this was the interface change to TTI. I've tried to minimize all the other parts of this. The follow up work should include at least: 1) Improving the TargetMachine interface by having it directly return a TTI object. Because we have a non-pass object with value semantics and an internal type erasure mechanism, we can narrow the interface of the TargetMachine to just do what we need: build and return a TTI object that we can then insert into the pass pipeline. 2) Make the TTI object be fully specialized for a particular function. This will include splitting off a minimal form of it which is sufficient for the inliner and the old pass manager. 3) Add a new pass manager analysis which produces TTI objects from the target machine for each function. This may actually be done as part of #2 in order to use the new analysis to implement #2. 4) Work on narrowing the API between TTI and the targets so that it is easier to understand and less verbose to type erase. 5) Work on narrowing the API between TTI and its clients so that it is easier to understand and less verbose to forward. 6) Try to improve the CRTP-based delegation. I feel like this code is just a bit messy and exacerbating the complexity of implementing the TTI in each target. Many thanks to Eric and Hal for their help here. I ended up blocked on this somewhat more abruptly than I expected, and so I appreciate getting it sorted out very quickly. Differential Revision: http://reviews.llvm.org/D7293 llvm-svn: 227669	2015-01-31 03:43:40 +00:00
Eric Christopher	7792e32b64	Reuse a bunch of cached subtargets and remove getSubtarget calls without a Function argument. llvm-svn: 227638	2015-01-30 23:24:40 +00:00
Tom Stellard	b357c43b93	R600/SI: Handle SI_SPILL_V96_RESTORE in SIRegisterInfo::eliminateFrameIndex() This fixes a crash in Unigine Heaven. llvm-svn: 227618	2015-01-30 21:51:51 +00:00
Matt Arsenault	423bf3f64a	R600/SI: Implement enableAggressiveFMAFusion Add tests for the various combines. This should always be at least cycle neutral on all subtargets for f64, and faster on some. For f32 we should prefer selecting v_mad_f32 over v_fma_f32. llvm-svn: 227484	2015-01-29 19:34:32 +00:00
Matt Arsenault	b035a5740c	R600/SI: Add subtarget feature for if f32 fma is fast llvm-svn: 227483	2015-01-29 19:34:25 +00:00
Matt Arsenault	572d2301e2	R600/SI: Fix tonga's basic scheduling model llvm-svn: 227482	2015-01-29 19:34:18 +00:00
Rafael Espindola	ba31e27f0a	Compute the ELF SectionKind from the flags. Any code creating an MCSectionELF knows ELF and already provides the flags. SectionKind is an abstraction used by common code that uses a plain MCSection. Use the flags to compute the SectionKind. This removes a lot of guessing and boilerplate from the MCSectionELF construction. llvm-svn: 227476	2015-01-29 17:33:21 +00:00
Tom Stellard	b14ead55f4	R600/SI: Remove stray debug statements llvm-svn: 227462	2015-01-29 16:55:28 +00:00
Tom Stellard	83f0bcef7a	R600/SI: Define a schedule model and enable the generic machine scheduler The schedule model is not complete yet, and could be improved. llvm-svn: 227461	2015-01-29 16:55:25 +00:00
Tom Stellard	40ce8af4a5	R600: Move DataLayout to AMDGPUTargetMachine This is a follow up to r227113. It is now required to use the amdgcn target for SI and newer GPUs. llvm-svn: 227316	2015-01-28 16:04:26 +00:00
Tom Stellard	eba5648ad2	R600: Use a Southern Islands GPU as the default for the amdgcn target llvm-svn: 227314	2015-01-28 15:38:42 +00:00
Marek Olsak	794ff8392e	R600/SI: Fix MIN3/MAX3 on VI, define MED3 llvm-svn: 227213	2015-01-27 17:25:15 +00:00
Marek Olsak	367447c255	R600/SI: Don't set patterns for chip-specific instructions while having pseudos Only pseudos have patterns on them. Also don't set the asm string for VINTRP_Pseudo. All pseudos should have empty asm. This matches what all other multiclasses do. llvm-svn: 227212	2015-01-27 17:25:11 +00:00
Marek Olsak	0c1f8812f5	R600/SI: Add VI versions of LDS atomics Each class is split into two: one adds let statements around non-pseudos, and the other one specifies the parameters. llvm-svn: 227211	2015-01-27 17:25:07 +00:00
Marek Olsak	19d9e1f459	R600/SI: Add VI versions of MUBUF atomics llvm-svn: 227210	2015-01-27 17:25:02 +00:00
Marek Olsak	ee98b1177c	R600/SI: Add VI versions of MUBUF loads and stores This enables a lot of existing patterns for VI. llvm-svn: 227209	2015-01-27 17:24:58 +00:00
Marek Olsak	7ef6db49ac	R600/SI: Add pseudos for MUBUF loads and stores This defines the SI versions only, so it shouldn't change anything. There are no changes other than using the new multiclasses, adding missing mayLoad/mayStore, and formatting fixes. llvm-svn: 227208	2015-01-27 17:24:54 +00:00
Eric Christopher	8b7706517c	Move DataLayout back to the TargetMachine from TargetSubtargetInfo derived classes. Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be laid out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets(*) have had subtarget dependent code moved out and onto the TargetMachine. *One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features. llvm-svn: 227113	2015-01-26 19:03:15 +00:00
Tom Stellard	edd188c459	R600/SI: Emit .hsa.version section for amdhsa OS llvm-svn: 226970	2015-01-23 23:59:08 +00:00
Tom Stellard	20f6c0732f	R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select() We used to do this promotion during DAG legalization, but this caused an infinite loop in ExpandUnalignedLoad() because it assumed that i64 loads were legal if i64 was a legal type. It also seems better to report i64 loads as legal, since they actually are and we were just promoting them to simplify our tablegen files. llvm-svn: 226945	2015-01-23 22:05:45 +00:00
Jan Vesely	5f715d36a7	R600: Try to use lower types for 64bit division if possible v2: add and enable tests for SI Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226881	2015-01-22 23:42:43 +00:00
Jan Vesely	f7987ca5a7	R600: Simplify LowerUDIVREM optimizations can handle removing the Hi part operations. The generated code is identical for R600, ~10% icount reduction for SI v2: rebase Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226879	2015-01-22 23:42:39 +00:00
Matt Arsenault	b00554886f	R600/SI: Custom lower fround This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses 1 fewer instruction if BFI is available. llvm-svn: 226682	2015-01-21 18:18:25 +00:00
Tom Stellard	e99fb65d87	R600/SI: Add subtarget feature to enable VGPR spilling for all shader types This is disabled by default, but can be enabled with the subtarget feature: 'vgpr-spilling' llvm-svn: 226597	2015-01-20 19:33:04 +00:00
Tom Stellard	021053f500	R600/SI: Fix simple-loop.ll test llvm-svn: 226596	2015-01-20 19:33:02 +00:00
Tom Stellard	3a70d07f51	R600/SI: Remove stray debugging code from r226586 llvm-svn: 226591	2015-01-20 19:24:31 +00:00
Tom Stellard	95292bbfcd	R600/SI: Use external symbols for scratch buffer We were passing the scratch buffer address to the shaders via user sgprs, but now we use external symbols and have the driver patch the shader using reloc information. llvm-svn: 226586	2015-01-20 17:49:47 +00:00
Tom Stellard	8255af45cb	R600/SI: Add kill flag when copying scratch offset to a register This allows us to re-use the same register for the scratch offset when accessing large private arrays. llvm-svn: 226585	2015-01-20 17:49:45 +00:00
Tom Stellard	8058069529	R600/SI: Don't store scratch buffer frame index in MUBUF offset field We don't have a good way of legalizing this if the frame index offset is more than the 12-bits, which is size of MUBUF's offset field, so now we store the frame index in the vaddr field. llvm-svn: 226584	2015-01-20 17:49:43 +00:00
Tom Stellard	1106b1c662	R600/SI: Update SIInstrInfo::verifyInstruction() after r225662 Now that we have our own custom register operand types, we need to handle them in the verifier. llvm-svn: 226583	2015-01-20 17:49:41 +00:00
Rafael Espindola	2658554aec	Add r224985 back with fixes. The fixes are to note that AArch64 has additional restrictions on when local relocations can be used. In particular, ld64 requires that relocations to cstring/cfstrings use linker visible symbols. Original message: In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 226503	2015-01-19 21:11:14 +00:00
David Blaikie	9459832ebd	std::unique_ptrify the MCStreamer argument to createAsmPrinter llvm-svn: 226414	2015-01-18 20:29:04 +00:00
Matt Arsenault	eeb2a7e688	R600/SI: Add patterns for v_cvt_{flr|rpi}_i32_f32 llvm-svn: 226230	2015-01-15 23:58:35 +00:00
Matt Arsenault	268757ba60	R600/SI: Fix trailing comma with modifiers Instructions with 1 operand can still use source modifiers, so make sure we don't print an extra comma afterwards. llvm-svn: 226226	2015-01-15 23:17:03 +00:00
Marek Olsak	f0b130ace0	R600/SI: Unify VOP2 instructions which are VOP3-only on VI This removes some duplicated classes and definitions. These instructions are defined: _e32 // pseudo _e32_si _e64 // pseudo _e64_si _e64_vi llvm-svn: 226191	2015-01-15 18:43:06 +00:00
Marek Olsak	c536850526	R600/SI: Use 64-bit encoding by default for opcodes that are VOP3-only on VI llvm-svn: 226190	2015-01-15 18:43:01 +00:00
Marek Olsak	15e4a59899	R600/SI: Add V_READLANE_B32 and V_WRITELANE_B32 for VI These are VOP3-only on VI. The new multiclass doesn't define VOP3 versions of VOP2 instructions. llvm-svn: 226189	2015-01-15 18:42:55 +00:00
Marek Olsak	a93603d508	R600/SI: Don't shrink instructions whose e32 encoding doesn't exist v2: modify hasVALU32BitEncoding instead v3: - add pseudoToMCOpcode helper to AMDGPUInstInfo, which is used by both hasVALU32BitEncoding and AMDGPUMCInstLower::lower - report an error if a pseudo can't be lowered llvm-svn: 226188	2015-01-15 18:42:51 +00:00
Marek Olsak	dc4d202f10	R600/SI: Add common class VOPAnyCommon llvm-svn: 226187	2015-01-15 18:42:44 +00:00
Marek Olsak	eae20ab5fd	R600/SI: Don't select SI-only VOP3 opcodes on VI llvm-svn: 226186	2015-01-15 18:42:40 +00:00
Rafael Espindola	7244bb3c17	Revert "Add r224985 back with two fixes." This reverts commit r225644 while I debug a regression. llvm-svn: 226022	2015-01-14 19:07:23 +00:00
Tom Stellard	0febe685ed	R600/SI: Use IMPLICIT_DEF and KILL when failing to spill VGPRs This helps us avoid 'invalid register class for operand' verifier errors. llvm-svn: 225989	2015-01-14 15:42:34 +00:00
Tom Stellard	42fb60e1a7	R600/SI: Spill VGPRs to scratch space for compute shaders llvm-svn: 225988	2015-01-14 15:42:31 +00:00
Chandler Carruth	d9903888d9	[cleanup] Re-sort all the #include lines in LLVM using utils/sort_includes.py. I clearly haven't done this in a while, so more changed than usual. This even uncovered a missing include from the InstrProf library that I've added. No functionality changed here, just mechanical cleanup of the include order. llvm-svn: 225974	2015-01-14 11:23:27 +00:00
Matt Arsenault	e698663687	R600/SI: Fix bad code with unaligned byte vector loads Don't do the v4i8 -> v4f32 combine if the load will need to be expanded due to alignment. This stops adding instructions to repack into a single register that the v_cvt_ubyteN_f32 instructions read. llvm-svn: 225926	2015-01-14 01:35:22 +00:00

1710 Commits