llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	aac9b49325	AMDGPU: Make flat_scratch name consistent The printed name and the parsed assembler names weren't the same. I'm not sure which name SC prints these as, but I think it's this one. llvm-svn: 252010	2015-11-03 22:50:34 +00:00
Matt Arsenault	967c2f5dee	AMDGPU: Fix asserts on invalid register ranges If the requested SGPR was not actually aligned, it was accepted and rounded down instead of rejected. Also fix an assert if the range is an invalid size. llvm-svn: 252009	2015-11-03 22:50:32 +00:00
Matt Arsenault	3473c72aab	AMDGPU: Fix off by one error in register parsing If trying to use one past the end, this would assert. llvm-svn: 252008	2015-11-03 22:50:27 +00:00
Matt Arsenault	e8ed13d946	AMDGPU: s[102:103] is unavailable on VI llvm-svn: 252000	2015-11-03 22:39:52 +00:00
Matt Arsenault	192b282bf3	AMDGPU: Define correct number of SGPRs There are actually 104 so 2 were missing. More assembler tests with high register number tuples will be included in later patches. llvm-svn: 251999	2015-11-03 22:39:50 +00:00
Matt Arsenault	6c0674112a	AMDGPU: Make findUsedSGPR more readable Add more comments etc. llvm-svn: 251996	2015-11-03 22:30:15 +00:00
Matt Arsenault	782c03bb7e	AMDGPU: Initialize SIFixSGPRCopies so -print-after works llvm-svn: 251995	2015-11-03 22:30:13 +00:00
Matt Arsenault	d9d659aa23	AMDGPU: Alphabetize includes llvm-svn: 251994	2015-11-03 22:30:08 +00:00
Matthias Braun	93563e7032	ScheduleDAGInstrs: Remove IsPostRA flag; NFC ScheduleDAGInstrs doesn't behave differently before or after register allocation. It was only used in a method of MachineSchedulerBase which behaved differently in MachineScheduler/PostMachineScheduler. Change this to let MachineScheduler/PostMachineScheduler just pass in a parameter to that function. The order of the LiveIntervals* and bool RemoveKillFlags parameters has been switched to make out-of-tree code fail instead of unintentionally passing a value intended for the IsPostRA flag to the (previously following and default initialized) RemoveKillFlags. Differential Revision: http://reviews.llvm.org/D14245 llvm-svn: 251883	2015-11-03 01:53:29 +00:00
Matt Arsenault	f1aebbf33a	AMDGPU: Stop assuming vreg for build_vector This was causing a variety of test failures when v2i64 is added as a legal type. SIFixSGPRCopies should correctly handle the case of vector inputs to a scalar reg_sequence, so this isn't necessary anymore. This was hiding some deficiencies in how reg_sequence is handled later, but this shouldn't be a problem anymore since the register class copy of a reg_sequence is now done before the reg_sequence. llvm-svn: 251860	2015-11-02 23:30:48 +00:00
Matt Arsenault	d48da14269	AMDGPU: Error on graphics shaders with HSA I've found myself pointlessly debugging problems from running graphics tests with an HSA triple a few times, so stop this from happening again. llvm-svn: 251858	2015-11-02 23:23:02 +00:00
Matt Arsenault	0de924b76d	AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE Make the REG_SEQUENCE be a VGPR, and do the register class copy first. llvm-svn: 251855	2015-11-02 23:15:42 +00:00
Marek Olsak	6f6d318e16	AMDGPU/SI: handle undef for llvm.SI.packf16 llvm-svn: 251632	2015-10-29 15:29:09 +00:00
Marek Olsak	74d084f466	AMDGPU/SI: use S_OR for fneg (fabs f32) llvm-svn: 251631	2015-10-29 15:29:05 +00:00
Marek Olsak	f924dd6f3c	AMDGPU/SI: use S_AND for i1 trunc llvm-svn: 251630	2015-10-29 15:05:03 +00:00
Matt Arsenault	2ea0a23f18	AMDGPU: Print modifiers when dumping AMDGPUOperand llvm-svn: 251160	2015-10-24 00:12:56 +00:00
Matt Arsenault	382557ec72	AMDGPU: Fix parsing of 32-bit literals with sign bit set llvm-svn: 251132	2015-10-23 18:07:58 +00:00
Matt Arsenault	391be09ef3	AMDGPU: Fix adding redundant m0 uses BuildMI already adds these since they are defined correctly now. llvm-svn: 250961	2015-10-21 22:37:51 +00:00
Matt Arsenault	e8c0891e42	AMDGPU: Fix verifier error in SIFoldOperands There may be other use operands that also need their kill flags cleared. This happens in a few tests when SIFoldOperands is moved after PeepholeOptimizer. PeepholeOptimizer rewrites cases that look like: %vreg0 = ... %vreg1 = COPY %vreg0 use %vreg1<kill> %vreg2 = COPY %vreg0 use %vreg2<kill> to use the earlier source to %vreg0 = ... use %vreg0 use %vreg0 Currently SIFoldOperands sees the copied registers, so there is only one use. So far I haven't managed to come up with a test that currently has multiple uses of a foldable VGPR -> VGPR copy. llvm-svn: 250960	2015-10-21 22:37:50 +00:00
Matt Arsenault	b6fd98c7d9	AMDGPU: Split DiagnosticInfoUnsupported into its own file llvm-svn: 250959	2015-10-21 22:37:46 +00:00
Matt Arsenault	6005fcbe12	AMDGPU: Simplify VOP3 operand legalization. This was checking for a variety of situations that should never happen. This saves a tiny bit of compile time. We should not be selecting instructions with invalid operands in the first place. Most of the time for registers, copies are inserted to the correct operand register class. For VOP3, since all operand types are supported and literal constants never are, we just need to verify the constant bus requirements (all immediates should be legal inline ones). The only possibly tricky case to maybe worry about is when legalizing operands in moveToVALU with s_add_i32 and similar instructions. If the original s_add_i32 had a literal constant and we need to replace it with v_add_i32_e64 we would have an unsupported literal operand. However, I don't think we should worry about that because SIFoldOperands should handle folding literal constant operands into the SALU instructions based on the uses. At SIFoldOperands time, the legality and profitability of operand types is a bit different. llvm-svn: 250951	2015-10-21 21:51:02 +00:00
Matt Arsenault	e223cebd10	AMDGPU: Fix not checking implicit operands in verifyInstruction When verifying constant bus restrictions, this wasn't catching uses in implicit operands. llvm-svn: 250948	2015-10-21 21:15:01 +00:00
Matt Arsenault	3add6439d0	AMDGPU: Add MachineInstr overloads for instruction format tests llvm-svn: 250797	2015-10-20 04:35:43 +00:00
Matt Arsenault	8f18917a90	AMDGPU: Stop reserving v[254:255] This wasn't doing anything useful. They weren't explicitly used anywhere, and the RegScavenger ignores reserved registers. This for some reason caused a random scheduling change in the test. Getting the check lines to pass is too frustrating, and there's probably not too much value in checking the vector case's operands N times. llvm-svn: 250794	2015-10-20 03:59:58 +00:00
Craig Topper	2626094fa1	Make a bunch of static arrays const. llvm-svn: 250642	2015-10-18 05:15:34 +00:00
Artyom Skrobov	63471330d2	Don't pretend AMDGPU backend knows how to custom-lower UDIVREM for vector types; it can't Reviewers: arsenm, jvesely, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13734 llvm-svn: 250384	2015-10-15 09:18:47 +00:00
Duncan P. N. Exon Smith	a73371a9b7	AMDGPU: Remove implicit ilist iterator conversions, NFC One of the changes in lib/Target/AMDGPU/AMDGPUMCInstLower.cpp was a new one. Previously, bundle iterators and single-instruction iterators could be compared to each other (comparing on underlying pointers). I changed a comparison from using `MBB->end()` to using `MBB->instr_end()`, since both end iterators should point at the same place anyway. I don't think the implicit conversion between the two iterator types is a good idea since it's fairly easy to accidentally compare to the wrong thing (they aren't always end iterators). Otherwise I would have just added the conversion. Even with that, there should be no functionality change here. llvm-svn: 250218	2015-10-13 20:07:10 +00:00
Matt Arsenault	f0d9e47da2	AMDGPU: Refactor isVGPRToSGPRCopy It should now correctly handle physical registers and make it easier to identify the other direction. llvm-svn: 250132	2015-10-13 00:07:54 +00:00
Matt Arsenault	61dc235f20	DAGCombiner: Combine extract_vector_elt from build_vector This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. llvm-svn: 250129	2015-10-12 23:59:50 +00:00
Matt Arsenault	8c0ef8b36d	AMDGPU: Register some more passes so -print-before works llvm-svn: 250071	2015-10-12 17:43:59 +00:00
Justin Bogner	468c998031	CodeGen: print and verify after TargetPassConfig::insertPass by default In r224059, we started verifying after addPass, but missed doing so on insertPass. There isn't a good reason for the discrepancy, and skipping the verifier in these cases causes bugs. This also exposes a verifier error that was introduced in r249087, but the verifier doesn't run until after the register coalescer, when the issue happens to have been resolved. I've skipped the verifier after SIFixSGPRLiveRangesID to avoid the failures for now and will follow up with Matt for a proper fix. llvm-svn: 249643	2015-10-08 00:36:22 +00:00
Matt Arsenault	fc0ad42516	AMDGPU: Fix missing implicit m0 uses on movrel instructions llvm-svn: 249577	2015-10-07 17:46:32 +00:00
Matt Arsenault	10e6a61892	AMDGPU: Add comment for VOP2b operand class Because of the constant bus requirement, it is never legal to use a literal constant for these instructions despite the encoding allowing it. This was already doing the right thing, but note why. llvm-svn: 249500	2015-10-07 01:36:00 +00:00
Matt Arsenault	187276fa94	AMDGPU: Properly register passes llvm-svn: 249495	2015-10-07 00:42:53 +00:00
Matt Arsenault	284192730a	AMDGPU: Use explicit register size indirect pseudos This stops using an unknown reg class operand. Currently build_vector selection has a broken looking check where it tries to use a VGPR reg class and an SGPR one if it sees an SGPR use. When the source operand has an explicit VGPR class, illegal copies will be inserted that SIFixSGPRCopies will take care of normally later, which will allow removing the weird check of build_vector users. Without this, when removed v_movrels_b32 would still be emitted even though all of the values were only stored in SGPRs. llvm-svn: 249494	2015-10-07 00:42:51 +00:00
Matt Arsenault	922b7bf808	AMDGPU: Remove inferRegClassFromUses / inferRegClassFromDefs I'm not sure why this would be necessary, and no tests fail with them removed. Looking at the uses is suspect as well because the use reg classes will likely change when the users are moved as a result of moving this instruction. llvm-svn: 249493	2015-10-07 00:42:31 +00:00
Tom Stellard	0fbf899c0f	AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments() Summary: We currently ignore the calling convention, so there is no real reason to assert on the calling convention of functions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13367 llvm-svn: 249468	2015-10-06 21:16:34 +00:00
Tom Stellard	88e0b25181	AMDGPU/SI: Add 64-bit versions of v_nop and v_clrexcp Summary: The assembly printing of these is still missing the encoding size suffix, but this will be fixed in a later commit. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13436 llvm-svn: 249424	2015-10-06 15:57:53 +00:00
Tom Stellard	d585cd85a3	AMDGPU/SI: Add a helper for creating aliases for the _e32 instructions Summary: We are currently only using these aliases for VOPC instructions, but this helper will make it easier to use them everywhere. These aliases allow for the automatic matching of instructions with forced 32-bit encoding. Eventually, we should be able to remove the custom C++ logic we have for this in the assembler. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13396 llvm-svn: 249330	2015-10-05 17:57:39 +00:00
Tom Stellard	dc9088a10e	AMDGPU/SI: Remove unused tablegen multiclass Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13395 llvm-svn: 249221	2015-10-03 00:29:50 +00:00
Matt Arsenault	d092a068ba	AMDGPU/SI: Add verifier check for exec reads Make sure we aren't accidentally not setting these in the instruction definitions. llvm-svn: 249170	2015-10-02 18:58:37 +00:00
Matt Arsenault	b733f00510	AMDGPU: Fix unused variable warning in release build llvm-svn: 249091	2015-10-01 22:40:35 +00:00
Matt Arsenault	b87fc22915	AMDGPU: Move SIFixSGPRLiveRanges to be a regalloc pass Replace LiveInterval usage with LiveVariables. LiveIntervals computes far more information than is needed for this pass which just needs to find if an SGPR is live out of the defining block. LiveIntervals are not usually available that early, requiring computing them twice which is very expensive. The extra run of LiveIntervals/LiveVariables/SlotIndexes was costing in total about 5% of compile time. Continuing to use LiveIntervals is problematic. It seems there is an option (early-live-intervals) to run the analysis about where it should go to avoid recomputing LiveVariables, but it seems to be completely broken with subreg liveness enabled. There are also problems from trying to recompute LiveIntervals since this seems to undo LiveVariables and clearing kill flags, causing TwoAddressInstructions to make bad decisions. Insert the pass right after live variables and preserve it. The tricky case to worry about might be phis since LiveVariables doesn't count a register as live out if in the successor block it is only used in a phi, but I don't think this is a concern right now because SIFixSGPRCopies replaces SGPR phis. llvm-svn: 249087	2015-10-01 22:10:03 +00:00
Matt Arsenault	d2c7589f93	AMDGPU: Merge if and switch llvm-svn: 249082	2015-10-01 21:51:59 +00:00
Matt Arsenault	db7f0ef367	AMDGPU: Remove dead code There's no point in checking VReg_1 because all uses of it should already have been removed by SILowerI1Copies. llvm-svn: 249081	2015-10-01 21:51:57 +00:00
Matt Arsenault	d1d499aa56	AMDGPU: Make SIInsertWaits about a factor of 4 faster This was the slowest target custom pass and was spending 80% of the time in getMinimalPhysRegClass which was called for every register operand. Try to use the statically known register class when possible from the instruction's MCOperandInfo. There are a few pseudo instructions which are not well behaved with unknown register classes which still require the expensive physical register class search. There are a few other possibilities for making this even faster, such as not inspecting implicit operands. For now those are checked because it is technically possible to have a scalar load into exec or vcc which can be implicitly used. llvm-svn: 249079	2015-10-01 21:43:15 +00:00
Tom Stellard	e9f8b24985	AMDGPU/SI: Remove assert from AMDGPUOpenCLImageTypeLowering pass Summary: Instead of asserting when the kernel metadata is different than we expect, we should just skip lowering that function. This fixes assertion failures with OpenCL argument metadata from older LLVM releases. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13356 llvm-svn: 249073	2015-10-01 21:16:05 +00:00
Tom Stellard	e0e582c9aa	AMDGPU: Add MEM_RAT STORE_TYPED. v2: Add test (Matt). Fix capitalization of isEOP (Matt). Move pattern to class parameter (Matt). Make the instruction available to Cayman (Matt). Change name from MEM_RAT WRITE_TYPED to MEM_RAT STORE_TYPED. Patch by: Zoltan Gilian llvm-svn: 249042	2015-10-01 17:51:34 +00:00
Tom Stellard	c0f0fba2c4	AMDGPU: Factor out EOP query. v2: Fix brace placement and capitalization (Matt). Patch by: Zoltan Gilian llvm-svn: 249041	2015-10-01 17:51:29 +00:00
Tom Stellard	1f0e7bbc5b	AMDGPU/SI: Re-order PreloadedValue enum and number entries based on init order Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12451 llvm-svn: 248978	2015-10-01 02:02:46 +00:00

251 Commits