llvm-project

Author	SHA1	Message	Date
Jakob Stoklund Olesen	b24cb8c541	Add ATOMIC_LDR* pseudo-instructions to model atomic_load on ARM. It is not safe to use normal LDR instructions because they may be reordered by the scheduler. The ATOMIC_LDR pseudos have a mayStore flag that prevents reordering. Atomic loads are also prevented from participating in rematerialization and load folding. llvm-svn: 162713	2012-08-27 23:58:52 +00:00
Bill Wendling	988a47d7e5	Make sure we add the predicate after all of the registers are added. <rdar://problem/12183003> llvm-svn: 162703	2012-08-27 22:12:44 +00:00
Jakob Stoklund Olesen	74e6f9fc65	Add a missing def flag. * Bad machine code: Explicit definition marked as use * - function: test_cos - basic block: BB#0 L.entry (0x7ff2a2024fd0) - instruction: VSETLNi32 %D11, %D11<undef>, %R0, 0, pred:14, pred:%noreg, %Q5<imp-use,kill>, %Q5<imp-def> - operand 0: %D11 llvm-svn: 162247	2012-08-21 00:34:53 +00:00
Jakob Stoklund Olesen	7b1a2e8f02	Avoid folding ADD instructions with FI operands. PEI can't handle the pseudo-instructions. This can be removed when the pseudo-instructions are replaced by normal predicated instructions. Fixes PR13628. llvm-svn: 162130	2012-08-17 20:55:34 +00:00
Tim Northover	f66181530f	Implement NEON domain switching for scalar <-> S-register vmovs on ARM llvm-svn: 162094	2012-08-17 11:32:52 +00:00
Jakob Stoklund Olesen	0ea1fce6b4	Add ADD and SUB to the predicable ARM instructions. It is not my plan to duplicate the entire ARM instruction set with predicated versions. We need a way of representing predicated instructions in SSA form without requiring a separate opcode. Then the pseudo-instructions can go away. llvm-svn: 162061	2012-08-16 23:21:55 +00:00
Jakob Stoklund Olesen	c19bf0282d	Handle ARM MOVCC optimization in PeepholeOptimizer. Use the target independent select analysis hooks. llvm-svn: 162060	2012-08-16 23:14:20 +00:00
Jakob Stoklund Olesen	6cb96120f1	Fold predicable instructions into MOVCC / t2MOVCC. The ARM select instructions are just predicated moves. If the select is the only use of an operand, the instruction defining the operand can be predicated instead, saving one instruction and decreasing register pressure. This implementation can turn AND/ORR/EOR instructions into their corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to predicate any instruction, but we don't yet support predicated instructions in SSA form. llvm-svn: 161994	2012-08-15 22:16:39 +00:00
Anton Korobeynikov	3a4fdfeceb	Recognize vst1.64 / vld1.64 with 3 and 4 regs as load from / store to stack stuff (this corresponds to spilling/reloading regs in DTriple / DQuad reg classes). No testcase, found by inspection. llvm-svn: 161300	2012-08-04 13:22:14 +00:00
Anton Korobeynikov	218aaf6d04	Add stack spill / reload instructions for DTriple and DQuad register classes, which were missed for no reason. This fixes PR13377 llvm-svn: 161299	2012-08-04 13:16:12 +00:00
Sylvestre Ledru	35521e2310	Fix a typo (the the => the) llvm-svn: 160621	2012-07-23 08:51:15 +00:00
Manman Ren	88a0d3313b	ARM: fix typo in comments llvm-svn: 160093	2012-07-11 23:47:00 +00:00
Manman Ren	34cb93e192	ARM: Fix optimizeCompare to correctly check safe condition. It is safe if CPSR is killed or re-defined. When we are done with the basic block, check whether CPSR is live-out. Do not optimize away cmp if CPSR is live-out. llvm-svn: 160090	2012-07-11 22:51:44 +00:00
Andrew Trick	21cca97d95	Revert accidental checkin. My last checkin was apparently not the branch I intended. It was missing one change (added by chandlerc), and contained a spurious change. llvm-svn: 159548	2012-07-02 19:12:29 +00:00
Andrew Trick	f161e391f8	Reapply "Make NumMicroOps a variable in the subtarget's instruction itinerary." Reapplies r159406 with minor cleanup. The regressions appear to have been spurious. llvm-svn: 159541	2012-07-02 18:10:42 +00:00
Manman Ren	b1b3db6802	ARM: Clean up optimizeCompare in peephole, no functional change. Use getUniqueVRegDef. Replace a loop with existing interfaces: modifiesRegister and readsRegister. Factor out code into inline functions and simplify the code. llvm-svn: 159470	2012-06-29 22:06:19 +00:00
Manman Ren	6fa76dc0e0	Add SrcReg2 to analyzeCompare and optimizeCompareInstr to handle Compare instructions with two register operands. llvm-svn: 159465	2012-06-29 21:33:59 +00:00
Andrew Trick	51a8cf77b8	Revert "Make NumMicroOps a variable in the subtarget's instruction itinerary." This reverts commit r159406. I noticed a performance regression so I'll back out for now. llvm-svn: 159411	2012-06-29 07:10:41 +00:00
Andrew Trick	1f50152b2d	Make NumMicroOps a variable in the subtarget's instruction itinerary. The TargetInstrInfo::getNumMicroOps API does not change, but soon it will be used by MachineScheduler. Now each subtarget can specify the number of micro-ops per itinerary class. For ARM, this is currently always dynamic (-1), because it is used for load/store multiple which depends on the number of register operands. Zero is now a valid number of micro-ops. This can be used for nop pseudo-instructions or instructions that the hardware can squash during dispatch. llvm-svn: 159406	2012-06-29 03:23:18 +00:00
Evan Cheng	a75127871c	Add a missing check to avoid dereferencing null. No sensible test case possible. Sorry. rdar://11745134 llvm-svn: 159236	2012-06-26 22:54:59 +00:00
Manman Ren	606953fbe7	ARM: update peephole optimization. More condition codes are included when deciding whether to remove cmp after a sub instruction. Specifically, we extend from GE|LT|GT|LE to GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we should be able to replace with "sub a, b; movls". rdar: 11725965 llvm-svn: 159166	2012-06-25 21:49:38 +00:00
Andrew Trick	77d0b88999	ARM scheduling fix: don't guess at implicit operand latency. This is a minor drive-by fix with no robust way to unit test. As an example see neon-div.ll: SU(16): %Q8<def> = VMOVLsv4i32 %D17, pred:14, pred:%noreg, %Q8<imp-use,kill> val SU(1): Latency=2 Reg=%Q8 ...should be latency=1 llvm-svn: 158960	2012-06-22 02:50:33 +00:00
Andrew Trick	3ccb1b8cf9	ARM scheduling fix: compute predicated implicit use properly. Minor drive by fix to cleanup latency computation. Calling getOperandLatency with a deliberately incorrect operand index does not give you the latency you want. llvm-svn: 158959	2012-06-22 02:50:31 +00:00
Andrew Trick	a5d24ca453	Continue factoring computeOperandLatency. Use it for ARM hasHighOperandLatency. llvm-svn: 158164	2012-06-07 19:42:04 +00:00
Andrew Trick	5b1cadf9f7	ARM getOperandLatency rewrite. Match expectations of the new latency API. Cleanup and make the logic consistent. llvm-svn: 158163	2012-06-07 19:42:00 +00:00
Andrew Trick	3564bdfa61	ARM getOperandLatency should return -1 for unknown, consistent with API llvm-svn: 158162	2012-06-07 19:41:58 +00:00
Andrew Trick	fb1a74c2b2	Fix ARM getInstrLatency logic to work with the current API. llvm-svn: 158161	2012-06-07 19:41:55 +00:00
Andrew Trick	4544606c71	misched: API for minimum vs. expected latency. Minimum latency determines per-cycle scheduling groups. Expected latency determines critical path and cost. llvm-svn: 158021	2012-06-05 21:11:27 +00:00
Craig Topper	2fbd130a79	Mark a static table as const. Shrink opcode size in static tables to uint16_t. Simplify loop iterating over one of those tables. No functional change intended. llvm-svn: 157367	2012-05-24 03:59:11 +00:00
David Blaikie	81a84bd841	Fix use of uninitialized variable. Found by GCC's maybe-uninitialized. llvm-svn: 156780	2012-05-14 21:48:19 +00:00
Manman Ren	0d5ec28ccc	Add space before an open parenthesis in control flow statements. llvm-svn: 156620	2012-05-11 15:36:46 +00:00
Manman Ren	dc8ad0058f	ARM: peephole optimization to remove cmp instruction. This patch will optimize the following cases: sub r1, r3 | sub r1, imm cmp r3, r1 or cmp r1, r3 | cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156599	2012-05-11 01:30:47 +00:00
Manman Ren	b555b382bd	Revert: 156550 "ARM: peephole optimization to remove cmp instruction" This commit broke an external linux bot and gave a compile-time warning. llvm-svn: 156556	2012-05-10 18:49:43 +00:00
Manman Ren	c860887b2d	ARM: peephole optimization to remove cmp instruction. This patch will optimize the following cases: sub r1, r3 | sub r1, imm cmp r3, r1 or cmp r1, r3 | cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156550	2012-05-10 16:48:21 +00:00
Jakob Stoklund Olesen	0a5b72f0e4	Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr. A MOVCCr instruction can be commuted by inverting the condition. This can help reduce register pressure and remove unnecessary copies in some cases. <rdar://problem/11182914> llvm-svn: 154033	2012-04-04 18:23:42 +00:00
Jakob Stoklund Olesen	caa6bd273f	Handle register copies for the new ARM register classes. ARM recently gained DPair, DTriple, and DQuad register classes. Update copyPhysReg() to handle copies in these register classes. No test case, it is difficult to make the register allocator emit the odd copies reliably. The missing DPair copy caused a failure on partialsums in the nightly test suite. <rdar://problem/11147997> llvm-svn: 153686	2012-03-29 21:10:40 +00:00
Jakob Stoklund Olesen	9e512120b7	Spill DPair registers, not just QPR. The arm_neon intrinsics can create virtual registers from the DPair register class which allows both even-odd and odd-even D-register pairs. This fixes PR12389. llvm-svn: 153603	2012-03-28 21:20:32 +00:00
Evan Cheng	a2b48d985b	ARM has a peephole optimization which looks for a def / use pair. The def produces a 32-bit immediate which is consumed by the use. It tries to fold the immediate by breaking it into two parts and folding them into the immediate fields of two uses. e.g. movw r2, #40885 movt r3, #46540 add r0, r0, r3 => add.w r0, r0, #3019898880 add.w r0, r0, #30146560 ; However, this transformation is incorrect if the user produces a flag. e.g. movw r2, #40885 movt r3, #46540 adds r0, r0, r3 => add.w r0, r0, #3019898880 adds.w r0, r0, #30146560 Note the adds.w may not set the carry flag even if the original sequence would. rdar://11116189 llvm-svn: 153484	2012-03-26 23:31:00 +00:00
Craig Topper	5fa0caafc0	Prune includes and replace uses of ARMRegisterInfo.h with ARMBaseRegisterInfo.h llvm-svn: 153422	2012-03-26 00:45:15 +00:00
Jim Grosbach	13a292cc74	ARM refactor more NEON VLD/VST instructions to use composite physregs Register pair VLD1/VLD2 all-lanes instructions. Kill off more of the pseudos as a result. llvm-svn: 152150	2012-03-06 22:01:44 +00:00
Jakob Stoklund Olesen	d9b427ee65	Add <imp-def> operands when reloading into physregs. When an instruction only writes sub-registers, it is still necessary to add an <imp-def> operand for the super-register. When reloading into a virtual register, rewriting will add the operand, but when loading directly into a physical register, the <imp-def> operand is still necessary. llvm-svn: 152095	2012-03-06 02:48:17 +00:00
Jim Grosbach	c988e0c521	ARM refactor away a bunch of VLD/VST pseudo instructions. With the new composite physical registers to represent arbitrary pairs of DPR registers, we don't need the pseudo-registers anymore. Get rid of a bunch of them that use DPR register pairs and just use the real instructions directly instead. llvm-svn: 152045	2012-03-05 19:33:30 +00:00
Jakob Stoklund Olesen	f729ceae04	Use <def,undef> operands when spilling NEON bundles. MachineOperands that define part of a virtual register must have an <undef> flag if they are not intended as read-modify-write operands. The old trick of adding an <imp-def> operand doesn't work any longer. Fixes PR12177. llvm-svn: 152008	2012-03-04 18:40:30 +00:00
Jim Grosbach	617f84ddbd	ARM implement TargetInstrInfo::getNoopForMachoTarget() Without this hook, functions w/ a completely empty body (including no epilogue) will cause an MCEmitter assertion failure. For example, define internal fastcc void @empty_function() { unreachable } rdar://10947471 llvm-svn: 151673	2012-02-28 23:53:30 +00:00
Jakob Stoklund Olesen	5f37f1c39d	Clarify ARM calling conventions. llvm-svn: 151113	2012-02-22 01:07:19 +00:00
Jakob Stoklund Olesen	6909faaf35	Calls don't really change the stack pointer. Even if a call instruction has %SP<imp-def> operands, it doesn't change the value of the stack pointer. llvm-svn: 151104	2012-02-21 23:47:43 +00:00
Jia Liu	b22310fda6	Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878	2012-02-18 12:03:15 +00:00
Jakob Stoklund Olesen	4fad5b2b9e	Handle regmask operands in ARMInstrInfo. llvm-svn: 150833	2012-02-17 19:23:15 +00:00
Jakob Stoklund Olesen	96732a438d	Fix ARMBaseInstrInfo::getInstrLatency for calls. Calls always clobber CPSR. llvm-svn: 150831	2012-02-17 19:07:59 +00:00
Craig Topper	e55c556a24	Convert assert(0) to llvm_unreachable llvm-svn: 149961	2012-02-07 02:50:20 +00:00
Evan Cheng	613d6d3b43	DefinesPredicate should only look for def operands. Patch by Ludwig Meier. llvm-svn: 149846	2012-02-05 19:55:04 +00:00
David Blaikie	46a9f016c5	More dead code removal (using -Wunreachable-code) llvm-svn: 148578	2012-01-20 21:51:11 +00:00
Jakob Stoklund Olesen	d110e2a83f	Reapply r146997, "Heed spill slot alignment on ARM." Now that canRealignStack() understands frozen reserved registers, it is safe to use it for aligned spill instructions. It will only return true if the registers reserved at the beginning of register allocation allow for dynamic stack realignment. <rdar://problem/10625436> llvm-svn: 147579	2012-01-05 00:26:57 +00:00
Jakob Stoklund Olesen	1b7f2a7638	Revert r146997, "Heed spill slot alignment on ARM." This patch caused a miscompilation of oggenc because a frame pointer was suddenly needed halfway through register allocation. <rdar://problem/10625436> llvm-svn: 147487	2012-01-03 22:34:35 +00:00
Jim Grosbach	c80a264386	ARM NEON assembly parsing for VLD2 to all lanes instructions. llvm-svn: 147069	2011-12-21 19:40:55 +00:00
Jakob Stoklund Olesen	b95c102c2f	Heed spill slot alignment on ARM. Use the spill slot alignment as well as the local variable alignment to determine when the stack needs to be realigned. This works now that the ARM target can always realign the stack by using a base pointer. Still respect the ARMBaseRegisterInfo::canRealignStack() function vetoing a realigned stack. Don't use aligned spill code in that case. llvm-svn: 146997	2011-12-20 22:15:04 +00:00
Evan Cheng	da103bf9ec	Model ARM predicated write as read-mod-write. e.g. r0 = mov #0 r0 = moveq #1 Then the second instruction has an implicit data dependency on the first instruction. Sadly I have yet to come up with a small test case that demonstrates the post-ra scheduler taking advantage of this. llvm-svn: 146583	2011-12-14 20:00:08 +00:00
Evan Cheng	7fae11b231	- Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function to finalize MI bundles (i.e. add BUNDLE instruction and computing register def and use lists of the BUNDLE instruction) and a pass to unpack bundles. - Teach more of MachineBasic and MachineInstr methods to be bundle aware. - Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to prevent IT blocks from being broken apart. llvm-svn: 146542	2011-12-14 02:11:42 +00:00
Jim Grosbach	d146a02c79	ARM assembly parsing and encoding for VLD2 with writeback. Refactor the instructions into fixed writeback and register-stride writeback variants to simplify the offset operand (no more optional register operand using reg0). This is a simpler representation and allows the assembly parser to more easily handle these instructions. Add tests for the instruction variants now supported. llvm-svn: 146278	2011-12-09 21:28:25 +00:00
Evan Cheng	7f8e563a69	Add bundle aware API for querying instruction properties and switch the code generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if all of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. llvm-svn: 146026	2011-12-07 07:15:52 +00:00
Jakob Stoklund Olesen	cc6bfa8e79	Revert r145971: "Use conservative size estimate for tBR_JTr." This caused more offset errors. llvm-svn: 145980	2011-12-06 22:41:31 +00:00
Evan Cheng	2a81dd4a3c	First chunk of MachineInstr bundle support. 1. Added opcode BUNDLE 2. Taught MachineInstr class to deal with bundled MIs 3. Changed MachineBasicBlock iterator to skip over bundled MIs; added an iterator to walk all the MIs 4. Taught MachineBasicBlock methods about bundled MIs llvm-svn: 145975	2011-12-06 22:12:01 +00:00
Jakob Stoklund Olesen	33fe130e12	Use conservative size estimate for tBR_JTr. This pseudo-instruction contains a .align directive in its expansion, so the total size may vary by 2 bytes. It is too difficult to accurately keep track of this alignment directive, just use the worst-case size instead. llvm-svn: 145971	2011-12-06 21:55:39 +00:00
Jim Grosbach	a68c9a847e	ARM parsing for VLD1 all lanes, with writeback. llvm-svn: 145510	2011-11-30 19:35:44 +00:00
Jakob Stoklund Olesen	653183fd5c	Enable -widen-vmovs by default. This will widen 32-bit register vmov instructions to 64-bit when possible. The 64-bit vmovd instructions can then be translated to NEON vorr instructions by the execution dependency fix pass. The copies are only widened if they are marked as clobbering the whole D-register. llvm-svn: 144734	2011-11-15 23:53:18 +00:00
Jay Foad	465101bb0e	Make use of MachinePointerInfo::getFixedStack. This removes all mention of PseudoSourceValue from lib/Target/. llvm-svn: 144632	2011-11-15 07:34:52 +00:00
Jim Grosbach	17ec1a19e5	ARM assembly parsing and encoding for VLD1 with writeback. Four entry register lists. llvm-svn: 142882	2011-10-25 00:14:01 +00:00
Jim Grosbach	30c39c8bf2	Nuke dead code. Nothing generates the VLD1d64QPseudo_UPD instruction. llvm-svn: 142877	2011-10-24 23:40:46 +00:00
Jim Grosbach	92fd05ecdc	ARM assembly parsing and encoding for VLD1 w/ writeback. Three entry register list variation. llvm-svn: 142876	2011-10-24 23:26:05 +00:00
Jim Grosbach	2098cb1e6f	ARM refactor am6offset usage for VLD1. Split am6offset into fixed and register offset variants so the instruction encodings are explicit rather than relying on a magic reg0 marker. Needed to be able to parse these. llvm-svn: 142853	2011-10-24 21:45:13 +00:00
Andrew Trick	88b2450adc	Use ARM/t2PseudoInst class from ARM/Thumb2 special adds/subs patterns. Clean up the patterns, fix comments, and avoid confusing both tools and coders. Note that the special adds/subs SelectionDAG nodes no longer have the dummy cc_out operand. llvm-svn: 142397	2011-10-18 19:18:52 +00:00
Jakob Stoklund Olesen	39c31a77b8	Fix -widen-vmovs liveness issues. When widening a copy, we are reading a larger register that may not be live. Use an <undef> flag to tell the register scavenger and machine code verifier that we know the value isn't defined. We now widen: %S6<def> = COPY %S4<kill>, %D3<imp-def> into: %D3<def> = VMOVD %D2<undef>, pred:14, pred:%noreg, %S4<imp-use,kill> This also keeps the <kill> flag on %S4 so we don't inadvertently kill a live value in %S5. Finally, ensure that ARMBaseInstrInfo::setExecutionDomain() preserves the <undef> flag when converting VMOVD to VORR. llvm-svn: 141746	2011-10-12 00:06:23 +00:00
Jakob Stoklund Olesen	da7c0f8f7d	Move -widen-vmovs to ARMBaseInstrInfo::expandPostRAPseudo(). The VMOVS widening needs to look at the implicit COPY operands. Trying to dig out the COPY instruction from an iterator in copyPhysReg() is the wrong approach. The expandPostRAPseudo() hook gets to look at COPY instructions before they are converted to copyPhysReg() calls. llvm-svn: 141619	2011-10-11 00:59:06 +00:00
Bill Wendling	4a4772fae2	Use the ARMConstantPoolMBB class to handle the MBB values. llvm-svn: 140943	2011-10-01 09:30:42 +00:00
Bill Wendling	c214cb055d	Use the new ARMConstantPoolSymbol class to handle external symbols. llvm-svn: 140939	2011-10-01 08:58:29 +00:00
Bill Wendling	7753d66468	Switch over to using ARMConstantPoolConstant for global variables, functions, and block addresses. llvm-svn: 140936	2011-10-01 08:00:54 +00:00
Bill Wendling	69bc3de4fc	Create a machine basic block in the constant pool and retrieve the symbol for an MBB. llvm-svn: 140824	2011-09-29 23:50:42 +00:00
Jakob Stoklund Olesen	f7ad189033	Use ExecutionDepsFix instead of NEONMoveFix. This enables NEON domain tracking across basic blocks, but should otherwise do the same thing. llvm-svn: 140772	2011-09-29 02:48:41 +00:00
Jakob Stoklund Olesen	f9b71a2e01	Implement TII::get/setExecutionDomain() for ARM. llvm-svn: 140653	2011-09-27 22:57:21 +00:00
Andrew Trick	924123acb3	Lower ARM adds/subs to add/sub after adding optional CPSR operand. This is still a hack until we can teach tblgen to generate the optional CPSR operand rather than an implicit CPSR def. But the strangeness is now limited to the selection DAG. ADD/SUB MI's no longer have implicit CPSR defs, nor do we allow flag setting variants of these opcodes in machine code. There are several corner cases to consider, and getting one wrong would previously lead to nasty miscompilation. It's not the first time I've debugged one, so this time I added enough verification to ensure it won't happen again. llvm-svn: 140228	2011-09-21 02:20:46 +00:00
Andrew Trick	3f1fdf1b31	whitespace llvm-svn: 140227	2011-09-21 02:17:37 +00:00
Owen Anderson	eb3f0fbdce	Fix an ambiguously nested if. llvm-svn: 139431	2011-09-09 23:13:02 +00:00
Owen Anderson	29cfe6c368	Thumb unconditional branches are allowed in IT blocks, and therefore should have a predicate operand, unlike conditional branches. llvm-svn: 139415	2011-09-09 21:48:23 +00:00
Jakob Stoklund Olesen	cd893390f5	Put VMOVS widening under a command line option, off by default. It appears that our use of the imp-use and imp-def flags with sub-registers is not yet robust enough to support this. The failing test case is complicated, I am working on a reduction. <rdar://problem/10044201> llvm-svn: 138861	2011-08-31 17:00:02 +00:00
Jim Grosbach	e364ad540a	Clean up Thumb load/store multiple definitions. There is no non-writeback store multiple instruction in Thumb1, so don't define one. As a result load multiple is the only instantiation of the multiclass, so refactor that away entirely. llvm-svn: 138338	2011-08-23 17:41:15 +00:00
Chad Rosier	61f92efb5c	Remove the VMOVQQ pseudo instruction. llvm-svn: 138177	2011-08-20 00:52:40 +00:00
Jakob Stoklund Olesen	59015c8b17	Add <imp-def> operands to QQ and QQQQ stack loads. This pleases the register scavenger and brings test/CodeGen/ARM/2011-08-12-vmovqqqq-pseudo.ll a little closer to working with -verify-machineinstrs. llvm-svn: 138164	2011-08-20 00:17:45 +00:00
Chad Rosier	be7625161e	VMOVQQQQs pseudo instructions are only created by ARMBaseInstrInfo::copyPhysReg. Therefore, rather than generating a pseudo instruction, which is later expanded, generate the necessary instructions in place. llvm-svn: 138163	2011-08-20 00:17:25 +00:00
Owen Anderson	732f82c463	Rewrite some ARM InstrInfo functions to be most accepting of arbitrary register subclasses. Hopefully this fixes some buildbots. llvm-svn: 137223	2011-08-10 17:21:20 +00:00
Jakob Stoklund Olesen	6a14dc01ff	Promote VMOVS to VMOVD when possible. On Cortex-A8, we use the NEON v2f32 instructions for f32 arithmetic. For better latency, we also send D-register copies down the NEON pipeline by translating them to vorr instructions. This patch promotes even S-register copies to D-register copies when possible so they can also go down the NEON pipeline. Example: vldr.32 s0, LCPI0_0 loop: vorr d1, d0, d0 loop2: ... vadd.f32 d1, d1, d16 The vorr instruction looked like this after regalloc: %S2<def> = COPY %S0, %D1<imp-def> Copies involving odd S-registers, and copies that don't define the full D-register are left alone. llvm-svn: 137182	2011-08-09 23:41:44 +00:00
Jakob Stoklund Olesen	c04a66b48e	Implement isLoadFromStackSlotPostFE and isStoreToStackSlotPostFE for ARM. They improve the verbose assembly. llvm-svn: 137069	2011-08-08 21:45:32 +00:00
Owen Anderson	b595ed0085	Split up the ARM so_reg ComplexPattern into so_reg_reg and so_reg_imm, allowing us to distinguish the encodings that use shifted registers from those that use shifted immediates. This is necessary to allow the fixed-length decoder to distinguish things like BICS vs LDRH. llvm-svn: 135693	2011-07-21 18:54:16 +00:00
Evan Cheng	a20cde31e7	Sink ARMMCExpr and ARMAddressingModes into MC layer. First step to separate ARM MC code from target. llvm-svn: 135636	2011-07-20 23:34:39 +00:00
Owen Anderson	454e1c7abb	Remove VMOVDneon and VMOVQ, which are just aliases for VORR. This continues to simplify the path towards an auto-generated disassembler. llvm-svn: 135290	2011-07-15 18:46:47 +00:00
Evan Cheng	bc153d49b7	Next round of MC refactoring. This patch factors MC table instantiations, MC registration and creation code into XXXMCDesc libraries. llvm-svn: 135184	2011-07-14 20:59:42 +00:00
Owen Anderson	651b230ca0	Add a target-independent entry to MCInstrDesc to describe the encoded size of an opcode. Switch ARM over to using that rather than its own special MCInstrDesc bits. llvm-svn: 135106	2011-07-13 23:22:26 +00:00
Jakub Staszak	9b07c0ab6b	Use BranchProbability instead of floating points in IfConverter. llvm-svn: 134858	2011-07-10 02:58:07 +00:00
Evan Cheng	703a0fbf39	Hide the call to InitMCInstrInfo into tblgen generated ctor. llvm-svn: 134244	2011-07-01 17:57:27 +00:00
Jim Grosbach	d86f34d631	Refactor away tSpill and tRestore pseudos in ARM backend. The tSpill and tRestore instructions are just copies of the tSTRspi and tLDRspi instructions, respectively. Just use those directly instead. llvm-svn: 134092	2011-06-29 20:26:39 +00:00
Evan Cheng	194c3dc01f	Move CallFrameSetupOpcode and CallFrameDestroyOpcode to TargetInstrInfo. llvm-svn: 134030	2011-06-28 21:14:33 +00:00
Evan Cheng	1e210d08d8	Merge XXXGenRegisterNames.inc into XXXGenRegisterInfo.inc llvm-svn: 134024	2011-06-28 20:07:07 +00:00
Evan Cheng	6cc775f905	- Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Change TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021	2011-06-28 19:10:37 +00:00
Chris Lattner	1d0c25756e	use the MachineInstrBuilder operator-> to simplify some code. There are probably more instances of this floating around. llvm-svn: 130474	2011-04-29 05:24:29 +00:00
Evan Cheng	7d6cd4902e	Change A9 scheduling itineraries VLD* / VST* entries default to "aligned". That is, it assumes addresses are 64-bit aligned (which should be the more common case). If the alignment is found not to be aligned, then getOperandLatency() would adjust the operand latency computation by one to compensate for it. rdar://9294833 llvm-svn: 129742	2011-04-19 01:21:49 +00:00
Cameron Zwarich	9c65e4d69c	Add ORR and EOR to the CMP peephole optimizer. It's hard to get isel to generate a case involving EOR, so I only added a test for ORR. llvm-svn: 129610	2011-04-15 21:24:38 +00:00
Cameron Zwarich	0829b3065a	The AND instruction leaves the V flag unmodified, so it falls victim to the same problem as all of the other instructions we fold with CMPs. llvm-svn: 129602	2011-04-15 20:45:00 +00:00
Cameron Zwarich	93eae1571c	Add missing register forms of instructions to the ARM CMP-folding code. This fixes <rdar://problem/9287901>. llvm-svn: 129599	2011-04-15 20:28:28 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Cameron Zwarich	8001850ee8	Fix a typo. llvm-svn: 129429	2011-04-13 06:39:16 +00:00
Owen Anderson	bdff1c997a	Teach the ARM peephole optimizer that RSB, RSC, ADC, and SBC can be used for folded comparisons, just like ADD and SUB. llvm-svn: 129038	2011-04-06 23:35:59 +00:00
Owen Anderson	d6c5a741b5	Get rid of the non-writeback versions VLDMDB and VSTMDB, which don't actually exist. llvm-svn: 128461	2011-03-29 16:45:53 +00:00
Evan Cheng	f098bf1199	Nasty bug in ARMBaseInstrInfo::produceSameValue(). The MachineConstantPoolEntry entries being compared may not be ARMConstantPoolValue. Without checking whether they are ARMConstantPoolValue first, and if the stars and moons are aligned properly, the equality test may return true (when the first few words of two Constants' values happen to be identical) and very bad things can happen. rdar://9125354 llvm-svn: 128203	2011-03-24 06:20:03 +00:00
Evan Cheng	425489d397	Cmp peephole optimization isn't always safe for signed arithmetics. int tries = INT_MAX; while (tries > 0) { tries--; } The check should be: subs r4, #1 cmp r4, #0 bgt LBB0_1 The subs can set the overflow V bit when r4 is INT_MAX+1 (which loop canonicalization apparently does in this case). cmp #0 would have cleared it while not changing the N and Z bits. Since BGT is dependent on the V bit, i.e. (N == V) && !Z, it is not safe to eliminate the cmp #0. rdar://9172742 llvm-svn: 128179	2011-03-23 22:52:04 +00:00
Anton Korobeynikov	e7410dd0d5	Preliminary support for ARM frame save directives emission via MI flags. This is just very first approximation how the stuff should be done (e.g. ARM-only for now). More to follow. llvm-svn: 127101	2011-03-05 18:43:32 +00:00
Evan Cheng	2f2435d026	Last round of fixes for movw + movt global address codegen. 1. Fixed ARM pc adjustment. 2. Fixed dynamic-no-pic codegen 3. CSE of pc-relative load of global addresses. It's now enabled by default for Darwin. llvm-svn: 123991	2011-01-21 18:55:51 +00:00
Andrew Trick	47ff14b091	Convert -enable-sched-cycles and -enable-sched-hazard to -disable flags. They are still not enabled in this revision. Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with the scheduler's model of operand latency in the selection DAG. Generalized unit tests to work with sched-cycles. llvm-svn: 123969	2011-01-21 05:51:33 +00:00
Evan Cheng	028ccbfcbf	Don't be overly aggressive with CSE of "ldr constantpool". If it's a pc-relative value, the "add pc" must be CSE'ed at the same time. We could follow the same approach as T2 by adding pseudo instructions that combine the ldr + "add pc". But the better approach is to use movw + movt (which I will enable soon), so I'll leave this as a TODO. llvm-svn: 123949	2011-01-20 23:55:07 +00:00
Evan Cheng	b8b0ad80a8	Sorry, several patches in one. TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevented CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look past some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allow machine LICM to hoist the set of instructions out of the loop and make it possible to CSE them. It's a bit hacky, but it significantly improves code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperforms the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. llvm-svn: 123905	2011-01-20 08:34:58 +00:00
Evan Cheng	dfce83c8f5	Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g. movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4)) movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r0, pc, r0 It's not yet enabled by default as some tests are failing. I suspect bugs in down stream tools. llvm-svn: 123619	2011-01-17 08:03:18 +00:00
Jakob Stoklund Olesen	2fb5b31578	Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic. These functions no longer assert when passed 0, but simply return false instead. No functional change intended. llvm-svn: 123155	2011-01-10 02:58:51 +00:00
Evan Cheng	078b0b095e	Recognize inline asm 'rev $0, $1' as a bswap intrinsic call. llvm-svn: 123048	2011-01-08 01:24:27 +00:00
Andrew Trick	10ffc2b6c2	Various bits of framework needed for precise machine-level selection DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about its SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541	2010-12-24 05:03:26 +00:00
Andrew Trick	c416ba612b	whitespace llvm-svn: 122539	2010-12-24 04:28:06 +00:00
Bob Wilson	651eaa02b8	Remove the rest of the _sfp Neon instruction patterns. Use the same COPY_TO_REGCLASS approach as for the 2-register _sfp instructions. This change made a big difference in the code generated for the CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing a fine job, but some instructions that were previously moved outside the loop are not moved now. It's using fewer VFP registers now, which is generally a good thing, so I think the estimates for register pressure changed and that affected the LICM behavior. Since that isn't obviously wrong, I've just changed the test file. This completes the work for Radar 8711675. llvm-svn: 121730	2010-12-13 23:02:37 +00:00
Jim Grosbach	327cf8ee5f	Refactor the ARM CMPz* patterns to just use the normal CMP instructions when possible. They were duplicates for everything except the source pattern before. llvm-svn: 121179	2010-12-07 20:41:06 +00:00
Evan Cheng	62c7b5bf76	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960	2010-12-05 22:04:16 +00:00
Jim Grosbach	81af4f9eb1	Rename t2 TBB and TBH instructions to reference that they encode the jump table data. Next up, pseudo-izing them. llvm-svn: 120320	2010-11-29 21:28:32 +00:00
Anton Korobeynikov	d08fbd19f5	Move callee-saved regs spills / reloads to TFI llvm-svn: 120228	2010-11-27 23:05:03 +00:00
Eric Christopher	b006fc9c07	Rewrite stack callee saved spills and restores to use push/pop instructions. Remove movePastCSLoadStoreOps and associated code for simple pointer increments. Update routines that depended upon other opcodes for save/restore. Adjust all testcases accordingly. llvm-svn: 119725	2010-11-18 19:40:05 +00:00
Evan Cheng	2d4e42fba6	Silence compiler warnings. llvm-svn: 119610	2010-11-18 01:43:23 +00:00
Evan Cheng	7f8ab6ee8b	Remove ARM isel hacks that fold large immediates into a pair of add, sub, and, and xor. The 32-bit move immediates can be hoisted out of loops by machine LICM but the isel hacks were preventing them. Instead, let peephole optimization pass recognize registers that are defined by immediates and the ARM target hook will fold the immediates in. Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ instructions if there are multiple uses. This happens when the 'and' is live out, machine sink would have sinked the computation and that ends up pessimizing code. The peephole pass would recognize situations where the 'and' can be toggled to define CPSR and eliminate the comparison anyway. 2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking important optimizations. rdar://8663787, rdar://8241368 llvm-svn: 119548	2010-11-17 20:13:28 +00:00
Evan Cheng	655364797e	Simplify code that toggles optional operand to ARM::CPSR. llvm-svn: 119484	2010-11-17 08:06:50 +00:00
Bill Wendling	a68e3a5397	Encode the multi-load/store instructions with their respective modes ('ia', 'db', 'ib', 'da') instead of having that mode as a separate field in the instruction. It's more convenient for the asm parser and much more readable for humans. <rdar://problem/8654088> llvm-svn: 119310	2010-11-16 01:16:36 +00:00
Evan Cheng	2ce016c7f8	Code clean up. The peephole pass should be the one updating the instruction iterator, not TII->OptimizeCompareInstr. llvm-svn: 119186	2010-11-15 21:20:45 +00:00
Eric Christopher	b90f7004cf	Revert this temporarily. llvm-svn: 118827	2010-11-11 19:47:02 +00:00
Eric Christopher	e6283f950d	Change the prologue and epilogue to use push/pop for the low ARM registers. llvm-svn: 118823	2010-11-11 19:26:03 +00:00
Evan Cheng	debf9c502a	Two sets of changes. Sorry they are intermingled. 1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops, since multi-latency instructions are completely executed even when the predicate is false. Also, some instructions will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135	2010-11-03 00:45:17 +00:00
Bill Wendling	c6627eec13	When we look at instructions to convert to setting the 's' flag, we need to look at more than those which define CPSR. You can have this situation: (1) subs ... (2) sub r6, r5, r4 (3) movge ... (4) cmp r6, 0 (5) movge ... We cannot convert (2) to "subs" because (3) is using the CPSR set by (1). There's an analogous situation here: (1) sub r1, r2, r3 (2) sub r4, r5, r6 (3) cmp r4, ... (5) movge ... (6) cmp r1, ... (7) movge ... We cannot convert (1) to "subs" because of the intervening use of CPSR. llvm-svn: 117950	2010-11-01 20:41:43 +00:00
Evan Cheng	99cce36cf5	Fix fpscr <-> GPR latency info. llvm-svn: 117737	2010-10-29 23:16:55 +00:00
Evan Cheng	6c1414f9c2	Avoiding overly aggressive latency scheduling. If the two nodes share an operand and one of them has a single use that is a live out copy, favor the one that is live out. Otherwise it will be difficult to eliminate the copy if the instruction is a loop induction variable update. e.g. BB: sub r1, r3, #1 str r0, [r2, r3] mov r3, r1 cmp bne BB => BB: str r0, [r2, r3] sub r3, r3, #1 cmp bne BB This fixed the recent 256.bzip2 regression. llvm-svn: 117675	2010-10-29 18:09:28 +00:00
Evan Cheng	ff310737e5	Re-commit 117518 and 117519 now that ARM MC test failures are out of the way. llvm-svn: 117531	2010-10-28 06:47:08 +00:00
Evan Cheng	e2c211c1b9	Revert 117518 and 117519 for now. They changed scheduling and cause MC tests to fail. Ugh. llvm-svn: 117520	2010-10-28 02:00:25 +00:00
Evan Cheng	ff1c862f8e	- Assign load / store with shifter op address modes the right itinerary classes. - For now, loads of [r, r] addressing mode is the same as the [r, r lsl/lsr/asr #] variants. ARMBaseInstrInfo::getOperandLatency() should identify the former case and reduce the output latency by 1. - Also identify [r, r << 2] case. This special form of shifter addressing mode is "free". llvm-svn: 117519	2010-10-28 01:49:06 +00:00
Jim Grosbach	338de3ee56	Refactor ARM STR/STRB instruction patterns into STR{B}i12 and STR{B}rs, like the LDR instructions have. This makes the literal/register forms of the instructions explicit and allows us to assign scheduling itineraries appropriately. rdar://8477752 llvm-svn: 117505	2010-10-27 23:12:14 +00:00
Jim Grosbach	8bf1483a3d	The immediate operands of an LDRi12 instruction don't need the addrmode2 encoding tricks. Handle the 'imm doesn't fit in the insn' case. llvm-svn: 117454	2010-10-27 16:50:31 +00:00
Jim Grosbach	9d2d1f0f00	LDRi12 machine instructions handle negative offset operands normally (simple integer values), not with the addrmode2 encoding. llvm-svn: 117429	2010-10-27 01:19:41 +00:00
Jim Grosbach	5a7c715470	Split ARM::LDRB into LDRBi12 and LDRBrs. Adjust accordingly. Continuing on rdar://8477752. llvm-svn: 117419	2010-10-27 00:19:44 +00:00
Jim Grosbach	1e4d9a17c2	First part of refactoring ARM addrmode2 (load/store) instructions to be more explicit about the operands. Split out the different variants into separate instructions. This gives us the ability to, among other things, assign different scheduling itineraries to the variants. rdar://8477752. llvm-svn: 117409	2010-10-26 22:37:02 +00:00
Evan Cheng	e96b8d7ab6	Use instruction itinerary to determine what instructions are 'cheap'. llvm-svn: 117348	2010-10-26 02:08:50 +00:00
Chandler Carruth	82058c05f8	Move the remaining attribute macros to systematic names based on the attribute name and prefixed with 'LLVM_'. llvm-svn: 117203	2010-10-23 08:40:19 +00:00

441 Commits