llvm-project

Commit Graph

Author	SHA1	Message	Date
Evan Cheng	6cc775f905	- Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Change TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021	2011-06-28 19:10:37 +00:00
Chris Lattner	1d0c25756e	use the MachineInstrBuilder operator-> to simplify some code. There are probably more instances of this floating around. llvm-svn: 130474	2011-04-29 05:24:29 +00:00
Evan Cheng	7d6cd4902e	Change A9 scheduling itineraries VLD* / VST* entries default to "aligned". That is, it assumes addresses are 64-bit aligned (which should be the more common case). If the alignment is found not to be aligned, then getOperandLatency() would adjust the operand latency computation by one to compensate for it. rdar://9294833 llvm-svn: 129742	2011-04-19 01:21:49 +00:00
Cameron Zwarich	9c65e4d69c	Add ORR and EOR to the CMP peephole optimizer. It's hard to get isel to generate a case involving EOR, so I only added a test for ORR. llvm-svn: 129610	2011-04-15 21:24:38 +00:00
Cameron Zwarich	0829b3065a	The AND instruction leaves the V flag unmodified, so it falls victim to the same problem as all of the other instructions we fold with CMPs. llvm-svn: 129602	2011-04-15 20:45:00 +00:00
Cameron Zwarich	93eae1571c	Add missing register forms of instructions to the ARM CMP-folding code. This fixes <rdar://problem/9287901>. llvm-svn: 129599	2011-04-15 20:28:28 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Cameron Zwarich	8001850ee8	Fix a typo. llvm-svn: 129429	2011-04-13 06:39:16 +00:00
Owen Anderson	bdff1c997a	Teach the ARM peephole optimizer that RSB, RSC, ADC, and SBC can be used for folded comparisons, just like ADD and SUB. llvm-svn: 129038	2011-04-06 23:35:59 +00:00
Owen Anderson	d6c5a741b5	Get rid of the non-writeback versions VLDMDB and VSTMDB, which don't actually exist. llvm-svn: 128461	2011-03-29 16:45:53 +00:00
Evan Cheng	f098bf1199	Nasty bug in ARMBaseInstrInfo::produceSameValue(). The MachineConstantPoolEntry entries being compared may not be ARMConstantPoolValue. Without checking whether they are ARMConstantPoolValue first, and if the stars and moons are aligned properly, the equality test may return true (when the first few words of two Constants' values happen to be identical) and very bad things can happen. rdar://9125354 llvm-svn: 128203	2011-03-24 06:20:03 +00:00
Evan Cheng	425489d397	Cmp peephole optimization isn't always safe for signed arithmetics. int tries = INT_MAX; while (tries > 0) { tries--; } The check should be: subs r4, #1 cmp r4, #0 bgt LBB0_1 The subs can set the overflow V bit when r4 is INT_MAX+1 (which loop canonicalization apparently does in this case). cmp #0 would have cleared it while not changing the N and Z bits. Since BGT is dependent on the V bit, i.e. (N == V) && !Z, it is not safe to eliminate the cmp #0. rdar://9172742 llvm-svn: 128179	2011-03-23 22:52:04 +00:00
Anton Korobeynikov	e7410dd0d5	Preliminary support for ARM frame save directives emission via MI flags. This is just very first approximation how the stuff should be done (e.g. ARM-only for now). More to follow. llvm-svn: 127101	2011-03-05 18:43:32 +00:00
Evan Cheng	2f2435d026	Last round of fixes for movw + movt global address codegen. 1. Fixed ARM pc adjustment. 2. Fixed dynamic-no-pic codegen 3. CSE of pc-relative load of global addresses. It's now enabled by default for Darwin. llvm-svn: 123991	2011-01-21 18:55:51 +00:00
Andrew Trick	47ff14b091	Convert -enable-sched-cycles and -enable-sched-hazard to -disable flags. They are still not enabled in this revision. Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with the scheduler's model of operand latency in the selection DAG. Generalized unit tests to work with sched-cycles. llvm-svn: 123969	2011-01-21 05:51:33 +00:00
Evan Cheng	028ccbfcbf	Don't be overly aggressive with CSE of "ldr constantpool". If it's a pc-relative value, the "add pc" must be CSE'ed at the same time. We could follow the same approach as T2 by adding pseudo instructions that combine the ldr + "add pc". But the better approach is to use movw + movt (which I will enable soon), so I'll leave this as a TODO. llvm-svn: 123949	2011-01-20 23:55:07 +00:00
Evan Cheng	b8b0ad80a8	Sorry, several patches in one. TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevented CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look past some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allow machine LICM to hoist the set of instructions out of the loop and make it possible to CSE them. It's a bit hacky, but it significantly improves code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperforms the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. llvm-svn: 123905	2011-01-20 08:34:58 +00:00
Evan Cheng	dfce83c8f5	Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g. movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4)) movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r0, pc, r0 It's not yet enabled by default as some tests are failing. I suspect bugs in down stream tools. llvm-svn: 123619	2011-01-17 08:03:18 +00:00
Jakob Stoklund Olesen	2fb5b31578	Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic. These functions no longer assert when passed 0, but simply return false instead. No functional change intended. llvm-svn: 123155	2011-01-10 02:58:51 +00:00
Evan Cheng	078b0b095e	Recognize inline asm 'rev $0, $1' as a bswap intrinsic call. llvm-svn: 123048	2011-01-08 01:24:27 +00:00
Andrew Trick	10ffc2b6c2	Various bits of framework needed for precise machine-level selection DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about it's SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541	2010-12-24 05:03:26 +00:00
Andrew Trick	c416ba612b	whitespace llvm-svn: 122539	2010-12-24 04:28:06 +00:00
Bob Wilson	651eaa02b8	Remove the rest of the _sfp Neon instruction patterns. Use the same COPY_TO_REGCLASS approach as for the 2-register _sfp instructions. This change made a big difference in the code generated for the CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing a fine job, but some instructions that were previously moved outside the loop are not moved now. It's using fewer VFP registers now, which is generally a good thing, so I think the estimates for register pressure changed and that affected the LICM behavior. Since that isn't obviously wrong, I've just changed the test file. This completes the work for Radar 8711675. llvm-svn: 121730	2010-12-13 23:02:37 +00:00
Jim Grosbach	327cf8ee5f	Refactor the ARM CMPz* patterns to just use the normal CMP instructions when possible. They were duplicates for everything except the source pattern before. llvm-svn: 121179	2010-12-07 20:41:06 +00:00
Evan Cheng	62c7b5bf76	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by vmla is a special case. Obviously issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960	2010-12-05 22:04:16 +00:00
Jim Grosbach	81af4f9eb1	Rename t2 TBB and TBH instructions to reference that they encode the jump table data. Next up, pseudo-izing them. llvm-svn: 120320	2010-11-29 21:28:32 +00:00
Anton Korobeynikov	d08fbd19f5	Move callee-saved regs spills / reloads to TFI llvm-svn: 120228	2010-11-27 23:05:03 +00:00
Eric Christopher	b006fc9c07	Rewrite stack callee saved spills and restores to use push/pop instructions. Remove movePastCSLoadStoreOps and associated code for simple pointer increments. Update routines that depended upon other opcodes for save/restore. Adjust all testcases accordingly. llvm-svn: 119725	2010-11-18 19:40:05 +00:00
Evan Cheng	2d4e42fba6	Silence compiler warnings. llvm-svn: 119610	2010-11-18 01:43:23 +00:00
Evan Cheng	7f8ab6ee8b	Remove ARM isel hacks that fold large immediates into a pair of add, sub, and, and xor. The 32-bit move immediates can be hoisted out of loops by machine LICM but the isel hacks were preventing them. Instead, let peephole optimization pass recognize registers that are defined by immediates and the ARM target hook will fold the immediates in. Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ instructions if there are multiple uses. This happens when the 'and' is live out, machine sink would have sinked the computation and that ends up pessimizing code. The peephole pass would recognize situations where the 'and' can be toggled to define CPSR and eliminate the comparison anyway. 2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking important optimizations. rdar://8663787, rdar://8241368 llvm-svn: 119548	2010-11-17 20:13:28 +00:00
Evan Cheng	655364797e	Simplify code that toggle optional operand to ARM::CPSR. llvm-svn: 119484	2010-11-17 08:06:50 +00:00
Bill Wendling	a68e3a5397	Encode the multi-load/store instructions with their respective modes ('ia', 'db', 'ib', 'da') instead of having that mode as a separate field in the instruction. It's more convenient for the asm parser and much more readable for humans. <rdar://problem/8654088> llvm-svn: 119310	2010-11-16 01:16:36 +00:00
Evan Cheng	2ce016c7f8	Code clean up. The peephole pass should be the one updating the instruction iterator, not TII->OptimizeCompareInstr. llvm-svn: 119186	2010-11-15 21:20:45 +00:00
Eric Christopher	b90f7004cf	Revert this temporarily. llvm-svn: 118827	2010-11-11 19:47:02 +00:00
Eric Christopher	e6283f950d	Change the prologue and epilogue to use push/pop for the low ARM registers. llvm-svn: 118823	2010-11-11 19:26:03 +00:00
Evan Cheng	debf9c502a	Two sets of changes. Sorry they are intermingled. 1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions are completely executed even when the predicate is false. Also, some instructions will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135	2010-11-03 00:45:17 +00:00
Bill Wendling	c6627eec13	When we look at instructions to convert to setting the 's' flag, we need to look at more than those which define CPSR. You can have this situation: (1) subs ... (2) sub r6, r5, r4 (3) movge ... (4) cmp r6, 0 (5) movge ... We cannot convert (2) to "subs" because (3) is using the CPSR set by (1). There's an analogous situation here: (1) sub r1, r2, r3 (2) sub r4, r5, r6 (3) cmp r4, ... (5) movge ... (6) cmp r1, ... (7) movge ... We cannot convert (1) to "subs" because of the intervening use of CPSR. llvm-svn: 117950	2010-11-01 20:41:43 +00:00
Evan Cheng	99cce36cf5	Fix fpscr <-> GPR latency info. llvm-svn: 117737	2010-10-29 23:16:55 +00:00
Evan Cheng	6c1414f9c2	Avoiding overly aggressive latency scheduling. If the two nodes share an operand and one of them has a single use that is a live out copy, favor the one that is live out. Otherwise it will be difficult to eliminate the copy if the instruction is a loop induction variable update. e.g. BB: sub r1, r3, #1 str r0, [r2, r3] mov r3, r1 cmp bne BB => BB: str r0, [r2, r3] sub r3, r3, #1 cmp bne BB This fixed the recent 256.bzip2 regression. llvm-svn: 117675	2010-10-29 18:09:28 +00:00
Evan Cheng	ff310737e5	Re-commit 117518 and 117519 now that ARM MC test failures are out of the way. llvm-svn: 117531	2010-10-28 06:47:08 +00:00
Evan Cheng	e2c211c1b9	Revert 117518 and 117519 for now. They changed scheduling and cause MC tests to fail. Ugh. llvm-svn: 117520	2010-10-28 02:00:25 +00:00
Evan Cheng	ff1c862f8e	- Assign load / store with shifter op address modes the right itinerary classes. - For now, loads of [r, r] addressing mode is the same as the [r, r lsl/lsr/asr #] variants. ARMBaseInstrInfo::getOperandLatency() should identify the former case and reduce the output latency by 1. - Also identify [r, r << 2] case. This special form of shifter addressing mode is "free". llvm-svn: 117519	2010-10-28 01:49:06 +00:00
Jim Grosbach	338de3ee56	Refactor ARM STR/STRB instruction patterns into STR{B}i12 and STR{B}rs, like the LDR instructions have. This makes the literal/register forms of the instructions explicit and allows us to assign scheduling itineraries appropriately. rdar://8477752 llvm-svn: 117505	2010-10-27 23:12:14 +00:00
Jim Grosbach	8bf1483a3d	The immediate operands of an LDRi12 instruction doesn't need the addrmode2 encoding tricks. Handle the 'imm doesn't fit in the insn' case. llvm-svn: 117454	2010-10-27 16:50:31 +00:00
Jim Grosbach	9d2d1f0f00	LDRi12 machine instructions handle negative offset operands normally (simple integer values), not with the addrmode2 encoding. llvm-svn: 117429	2010-10-27 01:19:41 +00:00
Jim Grosbach	5a7c715470	Split ARM::LDRB into LDRBi12 and LDRBrs. Adjust accordingly. Continuing on rdar://8477752. llvm-svn: 117419	2010-10-27 00:19:44 +00:00
Jim Grosbach	1e4d9a17c2	First part of refactoring ARM addrmode2 (load/store) instructions to be more explicit about the operands. Split out the different variants into separate instructions. This gives us the ability to, among other things, assign different scheduling itineraries to the variants. rdar://8477752. llvm-svn: 117409	2010-10-26 22:37:02 +00:00
Evan Cheng	e96b8d7ab6	Use instruction itinerary to determine what instructions are 'cheap'. llvm-svn: 117348	2010-10-26 02:08:50 +00:00
Chandler Carruth	82058c05f8	Move the remaining attribute macros to systematic names based on the attribute name and prefixed with 'LLVM_'. llvm-svn: 117203	2010-10-23 08:40:19 +00:00
Evan Cheng	ad79526471	Latency between CPSR def and branch is zero. llvm-svn: 117192	2010-10-23 02:04:38 +00:00
Evan Cheng	63c7608c34	Re-enable register pressure aware machine licm with fixes. Hoist() may have erased the instruction during LICM so UpdateRegPressureAfter() should not reference it afterwards. llvm-svn: 116845	2010-10-19 18:58:51 +00:00
Daniel Dunbar	418204e523	Revert r116781 "- Add a hook for target to determine whether an instruction def is", which breaks some nightly tests. llvm-svn: 116816	2010-10-19 17:14:24 +00:00
Evan Cheng	8249dfe6ce	- Add a hook for target to determine whether an instruction def is "long latency" enough to hoist even if it may increase spilling. Reloading a value from spill slot is often cheaper than performing an expensive computation in the loop. For X86, that means machine LICM will hoist SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON instructions. - Enable register pressure aware machine LICM by default. llvm-svn: 116781	2010-10-19 00:55:07 +00:00
Bill Wendling	337a31133b	Don't recompute MachineRegisterInfo in the Optimize* method. llvm-svn: 116750	2010-10-18 21:22:31 +00:00
Bill Wendling	59ebe44049	Check to make sure that the iterator isn't at the beginning of the basic block before decrementing. <rdar://problem/8529919> llvm-svn: 116126	2010-10-09 00:03:48 +00:00
Evan Cheng	412e37bd34	Code refactoring. llvm-svn: 116002	2010-10-07 23:12:15 +00:00
Evan Cheng	1958cefd69	Model operand cycles of vldm / vstm; also fixes scheduling itineraries of vldr / vstr, etc. llvm-svn: 115898	2010-10-07 01:50:48 +00:00
Jim Grosbach	24ab1ce8c2	Clean up MOVi32imm and t2MOVi32imm pseudo instruction definitions. llvm-svn: 115853	2010-10-06 22:01:26 +00:00
Evan Cheng	49d4c0bd18	- Add TargetInstrInfo::getOperandLatency() to compute operand latencies. This allow target to correctly compute latency for cases where static scheduling itineraries isn't sufficient. e.g. variable_ops instructions such as ARM::ldm. This also allows target without scheduling itineraries to compute operand latencies. e.g. X86 can return (approximated) latencies for high latency instructions such as division. - Compute operand latencies for those defined by load multiple instructions, e.g. ldm and those used by store multiple instructions, e.g. stm. llvm-svn: 115755	2010-10-06 06:27:31 +00:00
Michael J. Spencer	70ac5fa42c	fix MSVC 2010 build. llvm-svn: 115594	2010-10-05 06:00:43 +00:00
Michael J. Spencer	e7f00cbb7c	Cleanup Whitespace. llvm-svn: 115593	2010-10-05 06:00:33 +00:00
Owen Anderson	f31f33ea89	Thread the determination of branch prediction hit rates back through the if-conversion heuristic APIs. For now, stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide more nuanced estimates in the future. llvm-svn: 115364	2010-10-01 22:45:50 +00:00
Owen Anderson	2ecba4a07e	Make the spelling of the flags for old-style if-conversion heuristics consistent between ARM and Thumb2. llvm-svn: 115341	2010-10-01 20:33:47 +00:00
Owen Anderson	b9b63ee031	Temporarily add a flag to make it easier to compare the new-style ARM if conversion heuristics to the old-style ones. llvm-svn: 115239	2010-09-30 23:48:38 +00:00
Gabor Greif	d36e3e8850	improve heuristics to find the 'and' corresponding to 'tst' to also catch opportunities on thumb2 added some doxygen on the way llvm-svn: 115033	2010-09-29 10:12:08 +00:00
Owen Anderson	a3181e2d79	Add a subtarget hook for reporting the misprediction penalty. Use this to provide more precise cost modeling for if-conversion. Now if only we had a way to estimate the misprediction probability. Adjust CodeGen/ARM/ifcvt10.ll. The pipeline on Cortex-A8 is long enough that it is still profitable to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable. llvm-svn: 114995	2010-09-28 21:57:50 +00:00
Owen Anderson	88af7d00fc	Part one of switching to using a more sane heuristic for determining if-conversion profitability. Rather than having arbitrary cutoffs, actually try to cost model the conversion. For now, the constants are tuned to more or less match our existing behavior, but these will be changed to reflect realistic values as this work proceeds. llvm-svn: 114973	2010-09-28 18:32:13 +00:00
Eric Christopher	bf86fd3c47	80-col fixups. llvm-svn: 114943	2010-09-28 04:18:29 +00:00
Evan Cheng	1596f7f6f3	Fix r114632. Return if the only terminator is an unconditional branch after the redundant ones are deleted. llvm-svn: 114688	2010-09-23 19:42:03 +00:00
Evan Cheng	66c8cd2b32	If there are multiple unconditional branches terminating a block, eliminate all but the first one. Those will never be executed. There was logic to do this but it was faulty. llvm-svn: 114632	2010-09-23 06:54:40 +00:00
Evan Cheng	d757c88bba	OptimizeCompareInstr should avoid iterating past the beginning of the MBB when the 'and' instruction is after the comparison. llvm-svn: 114506	2010-09-21 23:49:07 +00:00
Gabor Greif	1a25ae88ff	Fix buglet when the TST instruction directly uses the AND result. I am unable to write a test for this case, help is solicited, though... What I did is to tickle the code in the debugger and verify that we do the right thing. llvm-svn: 114430	2010-09-21 13:30:57 +00:00
Gabor Greif	adbbb93d3d	Move the search for the appropriate AND instruction into OptimizeCompareInstr. This necessitates the passing of CmpValue around, so widen the virtual functions to accommodate. No functionality changes. llvm-svn: 114428	2010-09-21 12:01:15 +00:00
Chris Lattner	e3d864b857	convert targets to the new MF.getMachineMemOperand interface. llvm-svn: 114391	2010-09-21 04:39:43 +00:00
Jakob Stoklund Olesen	44857a38fa	Remember VLDMQ. llvm-svn: 114026	2010-09-15 21:40:11 +00:00
Jakob Stoklund Olesen	b929c7173d	Add missing break. llvm-svn: 114025	2010-09-15 21:40:09 +00:00
Jakob Stoklund Olesen	33005d1327	Recognize VST1q64Pseudo and VSTMQ as stack slot stores. Recognize VLD1q64Pseudo as a stack slot load. Reject these if they are loading or storing a subregister. The API (and VirtRegRewriter) doesn't know how to deal with that. llvm-svn: 113985	2010-09-15 17:27:09 +00:00
Bob Wilson	660d7ecf32	Reapply Gabor's 113839, 113840, and 113876 with a fix for a problem encountered while building llvm-gcc for arm. This is probably the same issue that the ppc buildbot hit. llvm::prior works on a MachineBasicBlock::iterator, not a plain MachineInstr. llvm-svn: 113983	2010-09-15 17:12:08 +00:00
Gabor Greif	9ae4b271f2	the darwin9-powerpc buildbot keeps consistently crashing, backing out following to get it back to green, so I can investigate in peace: svn merge -c -113840 llvm/test/CodeGen/ARM/arm-and-tst-peephole.ll svn merge -c -113876 -c -113839 llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp llvm-svn: 113980	2010-09-15 16:53:07 +00:00
Jakob Stoklund Olesen	11f5be3b86	Move ARM is{LoadFrom,StoreTo}StackSlot closer to their siblings so they won't be forgotten in the future. Coalesce identical cases in switch. No functional changes intended. llvm-svn: 113979	2010-09-15 16:36:26 +00:00
Bob Wilson	2c00b5098a	Spelling fix. llvm-svn: 113978	2010-09-15 16:28:21 +00:00
Bob Wilson	b1e9d4bff1	Use VLD1/VST1 pseudo instructions for loadRegFromStackSlot and storeRegToStackSlot. llvm-svn: 113918	2010-09-15 01:48:05 +00:00
Gabor Greif	b54e9387ab	an attempt to salvage the darwin9-powerpc buildbot, which could be miscompiling this line llvm-svn: 113876	2010-09-14 22:25:16 +00:00
Gabor Greif	d0cef1e2ef	Eliminate a 'tst' that immediately follows an 'and' by morphing the 'and' to its recording form 'andS'. This is basically a test commit into this area, to see whether the bots like me. Several generalizations can be applied and various avenues of code simplification are open. I'll introduce those as I go. I am aware of stylistic input from Bill Wendling, about where put the analysis complexity, but I am positive that we can move things around easily and will find a satisfactory solution. llvm-svn: 113839	2010-09-14 09:23:22 +00:00
Bill Wendling	27dddd1fd1	Rename ConvertToSetZeroFlag to something more general. llvm-svn: 113670	2010-09-11 00:13:50 +00:00
Bill Wendling	d0a5f4e238	No need to recompute the SrcReg and CmpValue. llvm-svn: 113666	2010-09-10 23:46:12 +00:00
Bill Wendling	041230014c	Move some of the decision logic for converting an instruction into one that sets the 'zero' bit down into the back-end. There are other cases where this logic isn't sufficient, so they should be handled separately. llvm-svn: 113665	2010-09-10 23:34:19 +00:00
Bill Wendling	aee679bf35	Modify the comparison optimizations in the peephole optimizer to update the iterator when an optimization took place. This allows us to do more insane things with the code than just remove an instruction or two. llvm-svn: 113640	2010-09-10 21:55:43 +00:00
Jim Grosbach	1f77ee5691	Add a missing case to duplicateCPV() for LSDA constants. Add a FIXME. rdar://8302157 llvm-svn: 113637	2010-09-10 21:38:22 +00:00
Evan Cheng	bf4070756f	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570	2010-09-10 01:29:16 +00:00
Evan Cheng	367a5df8cf	For each instruction itinerary class, specify the number of micro-ops each instruction in the class would be decoded to. Or zero if the number of uOPs must be determined dynamically. This will be used to determine the cost-effectiveness of predicating a micro-coded instruction. llvm-svn: 113513	2010-09-09 18:18:55 +00:00
Jim Grosbach	19cb2f4c67	remove obsolete comment llvm-svn: 113337	2010-09-08 03:51:44 +00:00
Jim Grosbach	136d035e45	correct spill code to properly determine if dynamic stack realignment is present in the function and thus whether aligned load/store instructions can be used. llvm-svn: 113323	2010-09-08 00:26:59 +00:00
Bob Wilson	13ce07fa92	Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like all the other LDM/STM instructions. This fixes asm printer crashes when compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run with -O0 to check this in the future. Prior to this change VLDM/VSTM used addressing mode #5, but not really. The offset field was used to hold a count of the number of registers being loaded or stored, and the AM5 opcode field was expanded to specify the IA or DB mode, instead of the standard ADD/SUB specifier. Much of the backend was not aware of these special cases. The crashes occurred when rewriting a frameindex caused the AM5 offset field to be changed so that it did not have a valid submode. I don't know exactly what changed to expose this now. Maybe we've never done much with -O0 and NEON. Regardless, there's no longer any reason to keep a count of the VLDM/VSTM registers, so we can use addressing mode #4 and clean things up in a lot of places. llvm-svn: 112322	2010-08-27 23:18:17 +00:00
Bill Wendling	ad2aa57774	Minor simplification. Gets rid of a needless temporary. llvm-svn: 111430	2010-08-18 21:32:07 +00:00
Bill Wendling	79553bad50	Handle ARM compares as well as converting for ARM adds, subs, and thumb2's adds. llvm-svn: 110762	2010-08-11 00:23:00 +00:00
Bill Wendling	0757820f8f	Turn optimize compares back on with fix. We needed to test that a machine op was a register before checking if it was defined. llvm-svn: 110733	2010-08-10 21:38:11 +00:00
Bill Wendling	798617b1ab	Use the "isCompare" machine instruction attribute instead of calling the relatively expensive comparison analyzer on each instruction. Also rename the comparison analyzer method to something more in line with what it actually does. This pass will eventually be folded into the Machine CSE pass. llvm-svn: 110539	2010-08-08 05:04:59 +00:00
Bill Wendling	7de9d52c13	Add the Optimize Compares pass (disabled by default). This pass tries to remove comparison instructions when possible. For instance, if you have this code: sub r1, 1 cmp r1, 0 bz L1 and "sub" either sets the same flag as the "cmp" instruction or could be converted to set the same flag, then we can eliminate the "cmp" instruction altogether. This is important for ARM where the ALU instructions could set the CPSR flag, but need a special suffix ('s') to do so. llvm-svn: 110423	2010-08-06 01:32:48 +00:00
Jim Grosbach	d343166a0b	Many Thumb2 instructions can reference the full ARM register set (i.e., have 4 bits per register in the operand encoding), but have undefined behavior when the operand value is 13 or 15 (SP and PC, respectively). The trivial coalescer in linear scan sometimes will merge a copy from SP into a subsequent instruction which uses the copy, and if that instruction cannot legally reference SP, we get bad code such as: mls r0,r9,r0,sp instead of: mov r2, sp mls r0, r9, r0, r2 This patch adds a new register class for use by Thumb2 that excludes the problematic registers (SP and PC) and is used instead of GPR for those operands which cannot legally reference PC or SP. The trivial coalescer explicitly requires that the register class of the destination for the COPY instruction contain the source register for the COPY to be considered for coalescing. This prevents errant instructions like that above. PR7499 llvm-svn: 109842	2010-07-30 02:41:01 +00:00
Chris Lattner	cbe9856fce	prune #includes a little. llvm-svn: 108929	2010-07-20 21:17:29 +00:00
Jakob Stoklund Olesen	8289f78569	Remove the isMoveInstr() hook. llvm-svn: 108567	2010-07-16 22:35:46 +00:00
Bill Wendling	499f797cdd	Rename DBG_LABEL PROLOG_LABEL, because it's only used during prolog emission and thus is a much more meaningful name. llvm-svn: 108563	2010-07-16 22:20:36 +00:00
Jakob Stoklund Olesen	0961c55161	RISC architectures get their memory operand folding for free. The only folding these load/store architectures can do is converting COPY into a load or store, and the target independent part of foldMemoryOperand already knows how to do that. llvm-svn: 108099	2010-07-11 19:19:13 +00:00
Jakob Stoklund Olesen	d7b33002dd	Replace copyRegToReg with copyPhysReg for ARM. llvm-svn: 108078	2010-07-11 06:33:54 +00:00
Jakob Stoklund Olesen	7a7b55eb67	Automatically fold COPY instructions into stack load/store. llvm-svn: 108012	2010-07-09 20:43:13 +00:00
Bob Wilson	1eade1a327	For big-endian systems, VLD2/VST2 with 32-bit vector elements will swap the words within the 64-bit D registers. Use VLD1/VST1 with 64-bit elements instead. llvm-svn: 107890	2010-07-08 17:44:00 +00:00
Bob Wilson	4c1ca29039	Represent NEON load/store alignments in bytes, not bits. llvm-svn: 107701	2010-07-06 21:26:18 +00:00
Rafael Espindola	7c510aa7bc	Don't create neon moves in CopyRegToReg. NEONMoveFixPass will do the conversion if profitable. llvm-svn: 107673	2010-07-06 16:24:34 +00:00
Rafael Espindola	38a7d7cbc3	Add a VT argument to getMinimalPhysRegClass and replace the copy related uses of getPhysicalRegisterRegClass with it. If we want to make a copy (or estimate its cost), it is better to use the smallest class as more efficient operations might be possible. llvm-svn: 107140	2010-06-29 14:02:34 +00:00
Evan Cheng	02b184de5b	Change if-conversion block size limit checks to add some flexibility. llvm-svn: 106901	2010-06-25 22:42:03 +00:00
Jim Grosbach	ba3ece6f27	IT instructions are considered to be scheduling hazards, but are scheduled with the following instructions. This is done via trickery by considering the instruction preceding the IT to be the hazard. Care must be taken to ensure it's the first non-debug instruction, or the presence of debug info will affect codegen. Part of the continuing work for rdar://7797940, making ARM code-gen unaffected by the presence of debug information. llvm-svn: 106871	2010-06-25 18:43:14 +00:00
Bill Wendling	f470747a36	We are missing opportunities to use ldm. Take code like this: void t(int *cp0, int *cp1, int *dp, int fmd) { int c0, c1, d0, d1, d2, d3; c0 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000); c1 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000); /* ... */ } It code gens into something pretty bad. But with this change (analogous to the X86 back-end), it will use ldm and generate few instructions. llvm-svn: 106693	2010-06-23 23:00:16 +00:00
Evan Cheng	2d51c7c592	Allow ARM if-converter to be run after post allocation scheduling. - This fixed a number of bugs in if-converter, tail merging, and post-allocation scheduler. If-converter now runs branch folding / tail merging first to maximize if-conversion opportunities. - Also changed the t2IT instruction slightly. It now defines the ITSTATE register which is read by instructions in the IT block. - Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't change the instruction ordering in the IT block (since IT mask has been finalized). It also ensures no other instructions can be scheduled between instructions in the IT block. This is not yet enabled. llvm-svn: 106344	2010-06-18 23:09:54 +00:00
Bob Wilson	a92e41a50a	Rewrite chained if's as switches and replace assertions with llvm_unreachable (as suggested in radar 8104405). llvm-svn: 106318	2010-06-18 21:32:42 +00:00
Stuart Hastings	0125b6410a	Add a DebugLoc parameter to TargetInstrInfo::InsertBranch(). This addresses a longstanding deficiency noted in many FIXMEs scattered across all the targets. This effectively moves the problem up one level, replacing eleven FIXMEs in the targets with eight FIXMEs in CodeGen, plus one path through FastISel where we actually supply a DebugLoc, fixing Radar 7421831. llvm-svn: 106243	2010-06-17 22:43:56 +00:00
Dale Johannesen	44f9dfc9cf	Next round of tail call changes. Register used in a tail call must not be callee-saved; following x86, add a new regclass to represent this. Also fixes a couple of bugs. Still disabled by default; Thumb doesn't work yet. llvm-svn: 106053	2010-06-15 22:08:33 +00:00
Bob Wilson	1478142485	VMOVQQ and VMOVQQQQ are pseudo instructions and not predicable. llvm-svn: 105990	2010-06-15 05:51:27 +00:00
Bruno Cardoso Lopes	c2f87b7bb2	Reapply r105521, this time appending "LLU" to 64 bit immediates to avoid breaking the build. llvm-svn: 105652	2010-06-08 22:51:23 +00:00
Chris Lattner	fdd2614330	revert r105521, which is breaking the buildbots with stuff like this: In file included from X86InstrInfo.cpp:16: X86GenInstrInfo.inc:2789: error: integer constant is too large for 'long' type X86GenInstrInfo.inc:2790: error: integer constant is too large for 'long' type X86GenInstrInfo.inc:2792: error: integer constant is too large for 'long' type X86GenInstrInfo.inc:2793: error: integer constant is too large for 'long' type X86GenInstrInfo.inc:2808: error: integer constant is too large for 'long' type X86GenInstrInfo.inc:2809: error: integer constant is too large for 'long' type X86GenInstrInfo.inc:2816: error: integer constant is too large for 'long' type X86GenInstrInfo.inc:2817: error: integer constant is too large for 'long' type llvm-svn: 105524	2010-06-05 04:17:30 +00:00
Bruno Cardoso Lopes	594fa26317	Initial AVX support for some instructions. No patterns matched yet, only assembly encoding support. llvm-svn: 105521	2010-06-05 03:53:24 +00:00
Jakob Stoklund Olesen	a8ad97743d	Slightly change the meaning of the reMaterialize target hook when the original instruction defines subregisters. Any existing subreg indices on the original instruction are preserved or composed with the new subreg index. Also substitute multiple operands mentioning the original register by using the new MachineInstr::substituteRegister() function. This is necessary because there will soon be <imp-def> operands added to non read-modify-write partial definitions. This instruction: %reg1234:foo = FLAP %reg1234<imp-def> will reMaterialize(%reg3333, bar) like this: %reg3333:bar-foo = FLAP %reg3333:bar<imp-def> Finally, replace the TargetRegisterInfo pointer argument with a reference to indicate that it cannot be NULL. llvm-svn: 105358	2010-06-02 22:47:25 +00:00
Jim Grosbach	84511e1526	Clean up 80 column violations. No functional change. llvm-svn: 105350	2010-06-02 21:53:11 +00:00
Rafael Espindola	f2dffcef82	Remove the TargetRegisterClass member from CalleeSavedInfo llvm-svn: 105344	2010-06-02 20:02:30 +00:00
Jim Grosbach	faa3abbe39	Update the saved stack pointer in the sjlj function context following either an alloca() or an llvm.stackrestore(). rdar://8031573 llvm-svn: 104900	2010-05-27 23:49:24 +00:00
Jakob Stoklund Olesen	6c47d6423c	Switch ARMRegisterInfo.td to use SubRegIndex and eliminate the parallel enums from ARMRegisterInfo.h llvm-svn: 104508	2010-05-24 16:54:32 +00:00
Evan Cheng	168ced94d8	Implement @llvm.returnaddress. rdar://8015977. llvm-svn: 104421	2010-05-22 01:47:14 +00:00
Jim Grosbach	bd9485db63	Implement eh.sjlj.longjmp for ARM. Clean up the intrinsic a bit. Followups: docs patch for the builtin and eh.sjlj.setjmp cleanup to match longjmp. llvm-svn: 104419	2010-05-22 01:06:18 +00:00
Evan Cheng	cd67c21407	Added a QQQQ register file to model 4-consecutive Q registers. llvm-svn: 103760	2010-05-14 02:13:41 +00:00
Evan Cheng	9de7cfe3f4	Bring back VLD1q and VST1q and use them for reloading / spilling Q registers. This allows folding loads and stores into VMOVQ. llvm-svn: 103692	2010-05-13 01:12:06 +00:00
Evan Cheng	86eb22976f	Use VLD2q32 / VST2q32 to reload / spill QQ (pair of Q) registers when stack slot is sufficiently aligned. Use VLDMD / VSTMD otherwise. llvm-svn: 103235	2010-05-07 02:04:02 +00:00
Evan Cheng	04d47e8efa	Use VSTMD / VLDMD for spills and reloads of Q registers instead of VSTMQ / VLDQ. The latter are aliases which ought to be eliminated but we can't because they are used for storing and loading v2f64 values. llvm-svn: 103234	2010-05-07 01:54:08 +00:00
Evan Cheng	ddc93c7e04	Remove VLD1q and VST1q for reloading and spilling Q registers. Just use VLD1q64 / VST1q64 and reference sub-registers. llvm-svn: 103218	2010-05-07 00:24:52 +00:00
Dan Gohman	779c69bbc5	Add a DebugLoc argument to TargetInstrInfo::copyRegToReg, so that it doesn't have to guess. llvm-svn: 103194	2010-05-06 20:33:48 +00:00
Evan Cheng	efb126a665	Add argument TargetRegisterInfo to loadRegFromStackSlot and storeRegToStackSlot. llvm-svn: 103193	2010-05-06 19:06:44 +00:00
Evan Cheng	31cdcd46d6	Re-apply 103156 and 103157. 103156 didn't break anything. 103157 exposed a coalescer bug that's fixed by 103170. llvm-svn: 103172	2010-05-06 06:36:08 +00:00
Dan Gohman	77c71811f5	Revert r103157, which broke test/CodeGen/ARM/2009-11-30-LiveVariablesBug.ll. llvm-svn: 103163	2010-05-06 05:08:57 +00:00
Eric Christopher	9feb1bb117	Revert r103156 since it was breaking the build bots. Reverse-merging r103156 into '.': U lib/Target/ARM/ARMInstrNEON.td U lib/Target/ARM/ARMRegisterInfo.h U lib/Target/ARM/ARMBaseRegisterInfo.cpp U lib/Target/ARM/ARMBaseInstrInfo.cpp U lib/Target/ARM/ARMRegisterInfo.td llvm-svn: 103159	2010-05-06 02:29:06 +00:00
Evan Cheng	8fd7b510d6	Fix an obvious bug in isMoveInstr. It needs to return sub-register indices. llvm-svn: 103157	2010-05-06 01:54:03 +00:00
Evan Cheng	8f99a1c6b4	Adding pseudo 256-bit registers QQ0 . . . QQ7 to represent pairs of Q registers. These will be used to model VLD2 / VST2 instructions in order to get substantially better codegen for them. llvm-svn: 103156	2010-05-06 01:52:03 +00:00
Evan Cheng	9d768f4445	Cosmetic changes. llvm-svn: 103155	2010-05-06 01:34:11 +00:00
Evan Cheng	718ff448df	storeRegToStackSlot has forgotten about QPR_8 register class. llvm-svn: 103154	2010-05-06 01:32:54 +00:00
Evan Cheng	250e917e9d	Frame index can be negative. llvm-svn: 102577	2010-04-29 01:13:30 +00:00
Jim Grosbach	04cbcca319	Add sizes non-floating point versions for the eh sjlj intrinsic expansions. rdar://7895451 llvm-svn: 102526	2010-04-28 20:33:09 +00:00
Evan Cheng	bcb99ecc18	Add ARM specific emitFrameIndexDebugValue. llvm-svn: 102324	2010-04-26 07:39:25 +00:00
Dale Johannesen	60b289709e	Educate GetInstrSizeInBytes implementations that DBG_VALUE does not generate code. llvm-svn: 100681	2010-04-07 19:51:44 +00:00
Chris Lattner	6f306d7d30	use DebugLoc default ctor instead of DebugLoc::getUnknownLoc() llvm-svn: 100214	2010-04-02 20:16:16 +00:00
Dale Johannesen	4244d12769	Teach AnalyzeBranch, RemoveBranch and the branch folder to be tolerant of debug info following the branch(es) at the end of a block. llvm-svn: 100168	2010-04-02 01:38:09 +00:00
Bob Wilson	59f75bba24	Fix VLDMQ and VSTMQ instructions to use the correct encoding and address modes. These instructions are only needed for codegen, so I've removed all the explicit encoding bits for now; they should be set in the same way as for VLDMD and VSTMD whenever we add encodings for VFP. The use of addrmode5 requires that the instructions be custom-selected so that the number of registers can be set in the AM5Opc value. llvm-svn: 99309	2010-03-23 18:54:46 +00:00
Bob Wilson	9b680e21c0	Rename some instructions to match the corresponding NEON opcode. llvm-svn: 99266	2010-03-23 06:26:18 +00:00
Bob Wilson	cc0a2a75a0	Change VST1 instructions for storing Q register values to operate on pairs of D registers. Add a separate VST1q instruction with a Q register source operand for use by storeRegToStackSlot. llvm-svn: 99265	2010-03-23 06:20:33 +00:00
Bob Wilson	340861d29e	Change VLD1 instructions for loading Q register values to operate on pairs of D registers. Add a separate VLD1q instruction with a Q register destination operand for use by loadRegFromStackSlot. llvm-svn: 99261	2010-03-23 05:25:43 +00:00
Bob Wilson	ae08a736d6	Re-commit r98683 ("remove redundant writeback flag from ARM address mode 6") with changes to add a separate optional register update argument. Change all the NEON instructions with address register writeback to use it. llvm-svn: 99095	2010-03-20 22:13:40 +00:00
Anton Korobeynikov	422dd6608a	Refactor Reg-Reg copy emission routine for ARM. This makes cross-regclass copies weirdness more straightforward. Also, add GPR <-> SPR copy support. llvm-svn: 98887	2010-03-18 22:35:02 +00:00
Bob Wilson	c7ba918b84	Revert 98683. It is breaking something in the disassembler. llvm-svn: 98692	2010-03-16 23:01:13 +00:00
Bob Wilson	c953bca10b	Remove redundant writeback flag from ARM address mode 6. Also remove the optional register update argument, which is currently unused -- when we add support for that, it can just be a separate operand. llvm-svn: 98683	2010-03-16 21:44:40 +00:00
Evan Cheng	e9c46c25a1	- Change MachineInstr::isIdenticalTo to take a new option that determines whether it should skip checking defs or at least virtual register defs. This subsumes part of the TargetInstrInfo::isIdentical functionality. - Eliminate TargetInstrInfo::isIdentical and replace it with produceSameValue. In the default case, produceSameValue just checks whether two machine instructions are identical (except for virtual register defs). But targets may override it to check for unusual cases (e.g. ARM pic loads from constant pools). llvm-svn: 97628	2010-03-03 01:44:33 +00:00
Bob Wilson	37f106e18c	Handle tGPR register class in a few more places. This fixes some llvm-gcc build failures due to my fix for pr6111. llvm-svn: 96402	2010-02-16 22:01:59 +00:00
Bob Wilson	70aa8d0745	Fix pr6111: Avoid using the LR register for the target address of an indirect branch in ARM v4 code, since it gets clobbered by the return address before it is used. Instead of adding a new register class containing all the GPRs except LR, just use the existing tGPR class. llvm-svn: 96360	2010-02-16 17:24:15 +00:00
Chris Lattner	b06015aa69	move target-independent opcodes out of TargetInstrInfo into TargetOpcodes.h. #include the new TargetOpcodes.h into MachineInstr. Add new inline accessors (like isPHI()) to MachineInstr, and start using them throughout the codebase. llvm-svn: 95687	2010-02-09 19:54:29 +00:00
Jim Grosbach	a570d05228	tighten up eh.setjmp sequence a bit. llvm-svn: 95603	2010-02-08 23:22:00 +00:00
Jim Grosbach	a3575ca846	Adjust setjmp instruction sequence to not need 32-bit alignment padding llvm-svn: 94627	2010-01-27 00:07:20 +00:00
Chris Lattner	a14ac3fd80	prep work to support a future where getJumpTableInfo will return a null pointer for functions with no jump tables. No functionality change. llvm-svn: 94469	2010-01-25 23:22:00 +00:00
Jim Grosbach	04770f2aa1	For aligned load/store instructions, it's only required to know whether a function can support dynamic stack realignment. That's a much easier question to answer at instruction selection stage than whether the function actually will have dynamic alignment prologue. This allows the removal of the stack alignment heuristic pass, and improves code quality for cases where the heuristic would result in dynamic alignment code being generated when it was not strictly necessary. llvm-svn: 93885	2010-01-19 18:31:11 +00:00
Jakob Stoklund Olesen	29a64c9575	Add Target hook to duplicate machine instructions. Some instructions refer to unique labels, and so cannot be trivially cloned with CloneMachineInstr. llvm-svn: 92873	2010-01-06 23:47:07 +00:00
Bill Wendling	a91b8207cc	Remove dead variable. llvm-svn: 92193	2009-12-28 01:57:39 +00:00
Jim Grosbach	5f9f721e95	remove out of date FIXME. llvm-svn: 90490	2009-12-03 21:55:01 +00:00
Chris Lattner	c831fac043	fix a build problem with VC++, PR5664, patch by Alp Toker! llvm-svn: 90419	2009-12-03 06:58:32 +00:00
Jim Grosbach	36d4dec28a	Thumb1 exception handling setjmp llvm-svn: 90246	2009-12-01 18:10:36 +00:00
Bob Wilson	505ddaa4dc	Remove isProfitableToDuplicateIndirectBranch target hook. It is profitable for all the processors where I have tried it, and even when it might not help performance, the cost is quite low. The opportunities for duplicating indirect branches are limited by other factors so code size does not change much due to tail duplicating indirect branches aggressively. llvm-svn: 90144	2009-11-30 18:35:03 +00:00
Bob Wilson	d4d40670e8	Refactor target hook for tail duplication as requested by Chris. Make tail duplication of indirect branches much more aggressive (for targets that indicate that it is profitable), based on further experience with this transformation. I compiled 3 large applications with and without this more aggressive tail duplication and measured minimal changes in code size. ("size" on Darwin seems to round the text size up to the nearest page boundary, so I can only say that any code size increase was less than one 4k page.) Radar 7421267. llvm-svn: 89814	2009-11-24 23:35:49 +00:00
Evan Cheng	184ec26fcd	Enable predication of NEON instructions in Thumb2 mode. llvm-svn: 89748	2009-11-24 08:06:15 +00:00
Evan Cheng	a33fc86be3	Add predicate operand to NEON instructions. Fix lots (but not all) 80 col violations in ARMInstrNEON.td. llvm-svn: 89542	2009-11-21 06:21:52 +00:00
Evan Cheng	bbd50b0f78	Also CSE non-pic load from constant pools. llvm-svn: 89440	2009-11-20 02:10:27 +00:00
Bob Wilson	290e9a47a9	Add a target hook to allow changing the tail duplication limit based on the contents of the block to be duplicated. Use this for ARM Cortex A8/9 to be more aggressive tail duplicating indirect branches, since it makes it much more likely that they will be predicted in the branch target buffer. Testcase coming soon. llvm-svn: 89187	2009-11-18 03:34:27 +00:00
Jim Grosbach	01c1cae34d	Detect need for autoalignment of the stack earlier to catch spills more conservatively. eliminateFrameIndex() machinery adjusted to handle addr mode 6 (vld1/vst1) used for spills. Fix tests to expect aligned Q-reg spilling llvm-svn: 88874	2009-11-15 21:45:34 +00:00
Jim Grosbach	74ae3e5b0e	set the def of the VLD1q64 properly llvm-svn: 88873	2009-11-15 21:05:07 +00:00
Evan Cheng	6ad7da96fe	- Change TargetInstrInfo::reMaterialize to pass in TargetRegisterInfo. - If destination is a physical register and it has a subreg index, use the sub-register instead. This fixes PR5423. llvm-svn: 88745	2009-11-14 02:55:43 +00:00
Jim Grosbach	d7cf55cd0e	Use Unified Assembly Syntax for the ARM backend. llvm-svn: 86494	2009-11-09 00:11:35 +00:00
Jim Grosbach	a15c3b7124	Use aligned load/store instructions for spilling Q registers when we know the stack slot is 128 bit aligned llvm-svn: 86425	2009-11-08 00:27:19 +00:00
Evan Cheng	fe864425cb	Refactor code. llvm-svn: 86423	2009-11-08 00:15:23 +00:00
Jim Grosbach	4e9f379554	80-column cleanup of file header comments llvm-svn: 86408	2009-11-07 22:00:39 +00:00
Evan Cheng	a8e8a7c976	Refactor code. Fix a potential missing check. Teach isIdentical() about tLDRpci_pic. llvm-svn: 86330	2009-11-07 04:04:34 +00:00
Evan Cheng	b376ce0169	Fix t2Int_eh_sjlj_setjmp. Immediate form of orr is a 32-bit instruction. So it should be 22 bytes instead of 20 bytes long. llvm-svn: 85965	2009-11-03 23:13:34 +00:00
Evan Cheng	31c2f4701b	Trim unnecessary include. llvm-svn: 85878	2009-11-03 07:08:08 +00:00
Evan Cheng	23c009f125	Clean up copyRegToReg. llvm-svn: 85870	2009-11-03 05:51:39 +00:00
Anton Korobeynikov	d195f9e5c3	Turn neon reg-reg moves fixup code into separate pass. This should reduce the compile time. llvm-svn: 85850	2009-11-03 01:04:26 +00:00
Evan Cheng	1708b06c0e	Unbreak ARMBaseRegisterInfo::copyRegToReg. llvm-svn: 85787	2009-11-02 04:44:55 +00:00
Anton Korobeynikov	14635da94b	Use NEON reg-reg moves, where profitable. This reduces "domain-cross" stalls, when we used to mix vfp and neon code (the former were used for reg-reg moves) llvm-svn: 85764	2009-11-02 00:10:38 +00:00
Bob Wilson	73789b848d	Add a Thumb BRIND pattern. Change the ARM BRIND assembly to separate the opcode and operand with a tab. Check for these instructions in the usual places. llvm-svn: 85411	2009-10-28 18:26:41 +00:00
Evan Cheng	5d1b849658	Don't forget subreg indices when folding load / store. llvm-svn: 85048	2009-10-25 07:52:27 +00:00
Evan Cheng	46ed1f8341	80 col violation. llvm-svn: 84986	2009-10-24 02:07:42 +00:00
Evan Cheng	0e9d9ca855	- Revert parts of 84326 and 84411. Distinguishing between fixed and non-fixed stack slots and giving them different PseudoSourceValue's did not fix the problem of post-alloc scheduling miscompiling llvm itself. - Apply Dan's conservative workaround by assuming any non fixed stack slots can alias other memory locations. This means a load from spill slot #1 cannot move above a store of spill slot #2. - Enable post-alloc scheduling for x86 at optimization level Default and above. llvm-svn: 84424	2009-10-18 18:16:27 +00:00
Evan Cheng	4729191bb2	Distinguish stack slots from other stack objects. They (and fixed objects) get FixedStack PseudoSourceValues. llvm-svn: 84326	2009-10-17 09:20:14 +00:00
Evan Cheng	8759585aba	Revert 84315 for now. Re-thinking the patch. llvm-svn: 84321	2009-10-17 07:53:04 +00:00
Evan Cheng	0818d87ed1	Rename getFixedStack to getStackObject. The stack objects represented are not necessarily fixed. Only those with negative frame indices are "fixed." llvm-svn: 84315	2009-10-17 06:22:26 +00:00
Anton Korobeynikov	75b59fb055	Add PseudoSourceValues for constpool stuff on ELF (Darwin should use something similar) and register spills. llvm-svn: 83435	2009-10-07 00:06:35 +00:00
Jakob Stoklund Olesen	dc9efe8078	Introduce the TargetInstrInfo::KILL machine instruction and get rid of the unused DECLARE instruction. KILL is not yet used anywhere, it will replace TargetInstrInfo::IMPLICIT_DEF in the places where IMPLICIT_DEF is just used to alter liveness of physical registers. llvm-svn: 83006	2009-09-28 20:32:26 +00:00
Evan Cheng	83e0d481ae	Make ARM and Thumb2 32-bit immediate materialization into a single 32-bit pseudo instruction. This makes it re-materializable. Thumb2 will split it back out into two instructions so IT pass will generate the right mask. Also, this exposes opportunities to optimize the movw to a 16-bit move. llvm-svn: 82982	2009-09-28 09:14:39 +00:00
Anton Korobeynikov	8d0fbebb9f	Add QPR_VFP2 regclass and add copy_to_regclass nodes, where needed to constrain the register usage. llvm-svn: 81635	2009-09-12 22:21:08 +00:00

390 Commits