llvm-project

Commit Graph

Author	SHA1	Message	Date
Bill Wendling	26feb849a4	Testcase for r110248. llvm-svn: 110249	2010-08-04 21:56:30 +00:00
Stuart Hastings	cba0d06b7c	call-imm.ll test case regex fix. Patch by Dimitry Andric! llvm-svn: 110199	2010-08-04 15:31:35 +00:00
Kalle Raiskila	8b2f70125f	Make SPU backend handle insertelement and store for "half vectors" llvm-svn: 110198	2010-08-04 13:59:48 +00:00
Bob Wilson	79daf7e0ae	Combine NEON VABD (absolute difference) intrinsics with ADDs to make VABA (absolute difference with accumulate) intrinsics. Radar 8228576. llvm-svn: 110170	2010-08-04 00:12:08 +00:00
Jakob Stoklund Olesen	011ff9bec9	OK, that's it. This test is going away now. But don't worry, I am taking it to a nice farm in the country where it can play with other tests. And bunnies. It is not clear what is being tested, and the revision history shows a bunch of random changes to the expected instruction count. Clearly, we are just fudging it to pass whenever it fails. llvm-svn: 110118	2010-08-03 17:21:14 +00:00
Kalle Raiskila	77558b7d13	More SPU v2f32 stuff added: insertelement and shuffle. llvm-svn: 110038	2010-08-02 11:22:10 +00:00
Kalle Raiskila	68b3886678	Add preliminary v2f32 support for SPU. Like with v2i32, we just duplicate the instructions and operate on half vectors. Also reorder code in SPUInstrInfo.td for better coherency. llvm-svn: 110037	2010-08-02 10:25:47 +00:00
Kalle Raiskila	622f8eb981	Add preliminary v2i32 support for SPU backend. As there are no such registers in SPU, this support boils down to "emulating" them by duplicating instructions on the general purpose registers. This adds the most basic operations on v2i32: passing parameters, addition, subtraction, multiplication and a few others. llvm-svn: 110035	2010-08-02 08:54:39 +00:00
Eli Friedman	7595ce05a2	PR7781: Fix incorrect shifting in PPCTargetLowering::LowerBUILD_VECTOR. llvm-svn: 109998	2010-08-02 00:18:19 +00:00
Eli Friedman	1b2bc1b844	PR7774: Fix undefined shifts in Alpha backend. As a bonus, this actually improves the generated code in some cases. llvm-svn: 109985	2010-08-01 21:13:28 +00:00
Bob Wilson	66161f5eb4	Revert new AVX intrinsic tests. They are breaking buildbots and Bruno is away from a computer now. --- Reverse-merging r109881 into '.': D test/CodeGen/X86/avx-intrinsics-x86.ll D test/CodeGen/X86/avx-intrinsics-x86_64.ll llvm-svn: 109959	2010-07-31 22:36:03 +00:00
Bruno Cardoso Lopes	92941fdb26	A bunch of tests for AVX intrinsics llvm-svn: 109881	2010-07-30 19:57:56 +00:00
Eli Friedman	ffe64c06ef	Fix for bug reported by Evzen Muller on llvm-commits: make sure to correctly check the range of the constant when optimizing a comparison between a constant and a sign_extend_inreg node. llvm-svn: 109854	2010-07-30 06:44:31 +00:00
Jim Grosbach	d343166a0b	Many Thumb2 instructions can reference the full ARM register set (i.e., have 4 bits per register in the operand encoding), but have undefined behavior when the operand value is 13 or 15 (SP and PC, respectively). The trivial coalescer in linear scan sometimes will merge a copy from SP into a subsequent instruction which uses the copy, and if that instruction cannot legally reference SP, we get bad code such as: mls r0,r9,r0,sp instead of: mov r2, sp mls r0, r9, r0, r2 This patch adds a new register class for use by Thumb2 that excludes the problematic registers (SP and PC) and is used instead of GPR for those operands which cannot legally reference PC or SP. The trivial coalescer explicitly requires that the register class of the destination for the COPY instruction contain the source register for the COPY to be considered for coalescing. This prevents errant instructions like that above. PR7499 llvm-svn: 109842	2010-07-30 02:41:01 +00:00
Dale Johannesen	2bff50546c	Implement vector constants which are splat of integers with mov + vdup. 8003375. This is currently disabled by default because LICM will not hoist a VDUP, so it pessimizes the code if the construct occurs inside a loop (8248029). llvm-svn: 109799	2010-07-29 20:10:08 +00:00
Nate Begeman	53afc8f06a	Implement a vectorized algorithm for <16 x i8> << <16 x i8> This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566	2010-07-28 00:21:48 +00:00
Nate Begeman	269a6da023	~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches. For: define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp { entry: %shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1] %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1] ret <2 x i64> %tmp2 } We get: _shl: ## @shl pslld $23, %xmm1 paddd LCPI0_0, %xmm1 cvttps2dq %xmm1, %xmm1 pmulld %xmm1, %xmm0 ret Instead of: _shl: ## @shl pshufd $3, %xmm0, %xmm2 movd %xmm2, %eax pshufd $3, %xmm1, %xmm2 movd %xmm2, %ecx shll %cl, %eax movd %eax, %xmm2 pshufd $1, %xmm0, %xmm3 movd %xmm3, %eax pshufd $1, %xmm1, %xmm3 movd %xmm3, %ecx shll %cl, %eax movd %eax, %xmm3 punpckldq %xmm2, %xmm3 movd %xmm0, %eax movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm2 movhlps %xmm0, %xmm0 movd %xmm0, %eax movhlps %xmm1, %xmm1 movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm0 punpckldq %xmm0, %xmm2 movdqa %xmm2, %xmm0 punpckldq %xmm3, %xmm0 ret llvm-svn: 109549	2010-07-27 22:37:06 +00:00
Nate Begeman	317b969ac5	Fix a crash in the dag combiner caused by ConstantFoldBIT_CONVERTofBUILD_VECTOR calling itself recursively and returning a SCALAR_TO_VECTOR node, but assuming the input was always a BUILD_VECTOR. llvm-svn: 109519	2010-07-27 18:02:18 +00:00
Anton Korobeynikov	6bcea068db	Currently EH lowering code expects typeinfo to be global only. This assumption is not satisfied due to global mergeing. Workaround the issue by temporary disablinge mergeing of const globals. Also, ignore LLVM "special" globals. This fixes PR7716 llvm-svn: 109423	2010-07-26 18:45:39 +00:00
Evan Cheng	df907f4594	- Allow target to specify when is register pressure "too high". In most cases, it's too late to start backing off aggressive latency scheduling when most of the registers are in use so the threshold should be a bit tighter. - Correctly handle live out's and extract_subreg etc. - Enable register pressure aware scheduling by default for hybrid scheduler. For ARM, this is almost always a win on # of instructions. It's runtime neutral for most of the tests. But for some kernels with high register pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by 54 and sped up by 20%. llvm-svn: 109279	2010-07-23 22:39:59 +00:00
Dan Gohman	55e244698a	Use the proper type for shift counts. This fixes a bootstrap error. llvm-svn: 109265	2010-07-23 21:08:12 +00:00
Dan Gohman	0818684a70	DAGCombine (shl (anyext x, c)) to (anyext (shl x, c)) if the high bits are not demanded. This often allows the anyext to be folded away. llvm-svn: 109242	2010-07-23 18:03:30 +00:00
Eric Christopher	9a77382685	Custom lower the memory barrier instructions and add support for lowering without sse2. Add a couple of new testcases. Fixes a few libgomp tests and latent bugs. Remove a few todos. llvm-svn: 109078	2010-07-22 02:48:34 +00:00
Evan Cheng	285903853f	More register pressure aware scheduling work. llvm-svn: 109064	2010-07-21 23:53:58 +00:00
Eric Christopher	84bdfd80df	Baby steps towards ARM fast-isel. llvm-svn: 109047	2010-07-21 22:26:11 +00:00
Rafael Espindola	4277e14dc4	Fix calling convention on ARM if vfp2+ is enabled. llvm-svn: 109009	2010-07-21 11:38:30 +00:00
Dan Gohman	625fd2292d	Fix SCEV denormalization of expressions where the exit value from one loop is involved in the increment of an addrec for another loop. This fixes rdar://8168938. llvm-svn: 108863	2010-07-20 17:06:20 +00:00
Jim Grosbach	badf087e45	update tests for smarter BIC usage llvm-svn: 108846	2010-07-20 16:16:48 +00:00
Duncan Sands	2e839de377	The same problem was being tracked in PR7652. llvm-svn: 108843	2010-07-20 15:52:32 +00:00
Bruno Cardoso Lopes	160695fecb	Fix PR7174, a couple o Mips fixes: - Fix a typo for PIC check during jmp table lowering - Also fix the "first jump table basic block is not considered only reachable by fall through" problem, use this ad-hoc solution until I come up with something better. Patch by stetorvs@gmail.com llvm-svn: 108820	2010-07-20 08:37:04 +00:00
Bruno Cardoso Lopes	ea7863647b	Fix Mips PR7473. Patch by stetorvs@gmail.com llvm-svn: 108816	2010-07-20 07:58:51 +00:00
Dan Gohman	b5e918dc05	After a custom inserter, in a block which has constant instructions, update the current basic block in addition to the current insert position, so that they remain consistent. This fixes rdar://8204072. llvm-svn: 108765	2010-07-19 22:48:56 +00:00
Owen Anderson	9c271e2835	Remove r108639 now that it is handled by InstCombine instead. llvm-svn: 108688	2010-07-19 08:10:24 +00:00
Owen Anderson	41670a11a8	Add a testcase for r108639. llvm-svn: 108640	2010-07-18 08:57:19 +00:00
Jim Grosbach	b97e2bbe32	Add combiner patterns to more effectively utilize the BFI (bitfield insert) instruction for non-constant operands. This includes the case referenced in the README.txt regarding a bitfield copy. llvm-svn: 108608	2010-07-17 03:30:54 +00:00
Jim Grosbach	11013eda5a	Add basic support to code-gen the ARM/Thumb2 bit-field insert (BFI) instruction and a combine pattern to use it for setting a bit-field to a constant value. More to come for non-constant stores. llvm-svn: 108570	2010-07-16 23:05:05 +00:00
Bill Wendling	bf8370ff36	Consider this function: void foo() { __builtin_unreachable(); } It will output the following on Darwin X86: _func1: Leh_func_begin0: pushq %rbp Ltmp0: movq %rsp, %rbp Ltmp1: Leh_func_end0: This prolog adds a new Call Frame Information (CFI) row to the FDE with an address that is not within the address range of the code it describes -- part is equal to the end of the function -- and therefore results in an invalid EH frame. If we emit a nop in this situation, then the CFI row is now within the address range. llvm-svn: 108568	2010-07-16 22:51:10 +00:00
Jakob Stoklund Olesen	c30b4ddc58	Remove the X86::FP_REG_KILL pseudo-instruction and the X86FloatingPointRegKill pass that inserted it. It is no longer necessary to limit the live ranges of FP registers to a single basic block. llvm-svn: 108536	2010-07-16 17:41:44 +00:00
Benjamin Kramer	50729ad717	Feed the right output into FileCheck. llvm-svn: 108523	2010-07-16 10:58:02 +00:00
Jakob Stoklund Olesen	37c42a3d02	Remove many calls to TII::isMoveInstr. Targets should be producing COPY anyway. TII::isMoveInstr is going tobe completely removed. llvm-svn: 108507	2010-07-16 04:45:42 +00:00
Jakob Stoklund Olesen	b1671271ab	Add forgotten test case. llvm-svn: 108506	2010-07-16 04:45:35 +00:00
Dan Gohman	103c4ebea5	Use the source-order scheduler instead of the "fast" scheduler at -O0, because it's more likely to keep debug line information in its original order. llvm-svn: 108496	2010-07-16 02:01:19 +00:00
Dale Johannesen	bfd4fd7bb7	The SelectionDAGBuilder's handling of debug info, on rare occasions, caused code to be generated in a different order. All cases I've seen involved float softening in the type legalizer, and this could be perhaps be fixed there, but it's better not to generate things differently in the first place. 7797940 (6/29/2010..7/15/2010). llvm-svn: 108484	2010-07-16 00:02:08 +00:00
Bill Wendling	4bda1c8e68	Revert. This isn't the correct way to go. llvm-svn: 108478	2010-07-15 23:42:21 +00:00
Bill Wendling	973dc3b1d8	Handle code gen for the unreachable instruction if it's the only instruction in the function. We'll just turn it into a "trap" instruction instead. The problem with not handling this is that it might generate a prologue without the equivalent epilogue to go with it: $ cat t.ll define void @foo() { entry: unreachable } $ llc -o - t.ll -relocation-model=pic -disable-fp-elim -unwind-tables .section __TEXT,__text,regular,pure_instructions .globl _foo .align 4, 0x90 _foo: ## @foo Leh_func_begin0: ## BB#0: ## %entry pushq %rbp Ltmp0: movq %rsp, %rbp Ltmp1: Leh_func_end0: ... The unwind tables then have bad data in them causing all sorts of problems. Fixes <rdar://problem/8096481>. llvm-svn: 108473	2010-07-15 23:32:40 +00:00
Evan Cheng	55f0c6b9fc	Split -enable-finite-only-fp-math to two options: -enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN. llvm-svn: 108465	2010-07-15 22:07:12 +00:00
Chris Lattner	60b131654b	fix the definitions of ConstTextCoalSection/ConstDataCoalSection to keep "Text" in sync with the "pure instructions" section attribute. Lack of this attribute was preventing the assembler from emitting multibyte noops instructions for templates (and inlines, and other coalesced stuff) and was causing the assembler to mismatch .o files. This fixes rdar://8018335 llvm-svn: 108461	2010-07-15 21:22:00 +00:00
Devang Patel	df09db62e2	Fix crash reported in PR7653. llvm-svn: 108441	2010-07-15 18:45:27 +00:00
Dan Gohman	4afd412d6b	Watch out for a constant offset cancelling out a base register, forming a zero. This situation arrises in Fortran code with induction variables that start at 1 instead of 0. This fixes PR7651. llvm-svn: 108424	2010-07-15 15:14:45 +00:00
Devang Patel	29168baf4b	Make it a .ll test case. llvm-svn: 108370	2010-07-14 23:12:52 +00:00
Jim Grosbach	a90af1ba38	Improve 64-subtraction of immediates when parts of the immediate can fit in the literal field of an instruction. E.g., long long foo(long long a) { return a - 734439407618LL; } rdar://7038284 llvm-svn: 108339	2010-07-14 17:45:16 +00:00
Dan Gohman	042523340b	Delete fast-isel's trivial load optimization; it breaks debugging because it can look past points where a debugger might modify user variables. llvm-svn: 108336	2010-07-14 17:25:37 +00:00
Bob Wilson	bb57896f8e	Fix test to appease the buildbots. llvm-svn: 108334	2010-07-14 16:43:47 +00:00
Evan Cheng	a8e8874552	Fix for PR7193 was overly conservative. The only case where sibcall callee address cannot be allocated a register is in 32-bit mode where the first three arguments are marked inreg. In that case EAX, EDX, and ECX will be used for argument passing. This fixes PR7610. llvm-svn: 108327	2010-07-14 06:44:01 +00:00
Bob Wilson	bad47f62f6	Add support for NEON VMVN immediate instructions. llvm-svn: 108324	2010-07-14 06:31:50 +00:00
Evan Cheng	c893115312	Re-enable the test with fix. llvm-svn: 108319	2010-07-14 05:49:23 +00:00
Chris Lattner	711338fb04	temporarily disable to test to fix buildbots. llvm-svn: 108310	2010-07-14 02:21:59 +00:00
Evan Cheng	d542414945	Teach ProcessImplicitDefs to transform more COPY instructions into IMPLICIT_DEF (and subsequently eliminate them). This allows machine LICM to hoist IMPLICIT_DEF's. PR7620. llvm-svn: 108304	2010-07-14 01:22:19 +00:00
Bob Wilson	103a0dcfe1	Add an ARM-specific DAG combining to avoid redundant VDUPLANE nodes. Radar 7373643. llvm-svn: 108303	2010-07-14 01:22:12 +00:00
Bob Wilson	a3f1901531	Use a target-specific VMOVIMM DAG node instead of BUILD_VECTOR to represent NEON VMOV-immediate instructions. This simplifies some things. llvm-svn: 108275	2010-07-13 21:16:48 +00:00
Dale Johannesen	caca5488dc	In inline asm treat indirect 'X' constraint as 'm'. This may not be right in all cases, but it's better than asserting which it was doing before. PR 7528. llvm-svn: 108268	2010-07-13 20:17:05 +00:00
Evan Cheng	0cc4ad983d	Extend the r107852 optimization which turns some fp compare to code sequence using only i32 operations. It now optimize some f64 compares when fp compare is exceptionally slow (e.g. cortex-a8). It also catches comparison against 0.0. llvm-svn: 108258	2010-07-13 19:27:42 +00:00
Evan Cheng	f43961007c	-enable-unsafe-fp-math should not imply -enable-finite-only-fp-math. llvm-svn: 108254	2010-07-13 18:46:14 +00:00
Dale Johannesen	f241d4626c	Fix PR number. llvm-svn: 108251	2010-07-13 18:14:47 +00:00
Dan Gohman	51e6d9bbf6	Apply the SSE dependence idiom for SSE unary operations to SD instructions too, in addition to SS instructions. And add a comment about it. llvm-svn: 108191	2010-07-12 20:46:04 +00:00
Jakob Stoklund Olesen	c4227f1362	Remove TargetInstrInfo::copyRegToReg entirely. Targets must now implement TargetInstrInfo::copyPhysReg instead. There is no longer a default implementation forwarding to copyRegToReg. llvm-svn: 108095	2010-07-11 17:01:17 +00:00
Rafael Espindola	a76eccf815	Fix va_arg for doubles. With this patch VAARG nodes always contain the correct alignment information, which simplifies ExpandRes_VAARG a bit. The patch introduces a new alignment information to TargetLoweringInfo. This is needed since the two natural candidates cannot be used: * The 's' in target data: If this is set to the minimal alignment of any argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for example. * The getTransientStackAlignment method. It is possible for an architecture to have argument less aligned than what we maintain the stack pointer. llvm-svn: 108072	2010-07-11 04:01:49 +00:00
Dan Gohman	79be2b9be5	Fix this test. llvm-svn: 108059	2010-07-10 22:42:12 +00:00
Jakob Stoklund Olesen	c4b3bcc051	FileCheckize inline asm FP stack tests llvm-svn: 108046	2010-07-10 16:30:25 +00:00
Dan Gohman	d7b5ce3312	Reapply bottom-up fast-isel, with several fixes for x86-32: - Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up the top of the entry block. llvm-svn: 108039	2010-07-10 09:00:22 +00:00
Jakob Stoklund Olesen	51702ec46b	Fix a few tests llvm-svn: 108011	2010-07-09 20:43:09 +00:00
Jim Grosbach	2a5725b1a3	In the presence of variable sized objects, allocate an emergency spill slot. rdar://8131327 llvm-svn: 108008	2010-07-09 20:27:06 +00:00
Dan Gohman	ea9ae3e6ed	Add a target triple. llvm-svn: 108003	2010-07-09 19:17:36 +00:00
Dan Gohman	7929c448fc	Fix MachineLICM to actually visit inner loops. llvm-svn: 108001	2010-07-09 18:49:45 +00:00
Bob Wilson	6586e9b203	--- Reverse-merging r107947 into '.': U utils/TableGen/FastISelEmitter.cpp --- Reverse-merging r107943 into '.': U test/CodeGen/X86/fast-isel.ll U test/CodeGen/X86/fast-isel-loads.ll U include/llvm/Target/TargetLowering.h U include/llvm/Support/PassNameParser.h U include/llvm/CodeGen/FunctionLoweringInfo.h U include/llvm/CodeGen/CallingConvLower.h U include/llvm/CodeGen/FastISel.h U include/llvm/CodeGen/SelectionDAGISel.h U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/CallingConvLower.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp U lib/CodeGen/SelectionDAG/FastISel.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp U lib/CodeGen/SelectionDAG/InstrEmitter.cpp U lib/CodeGen/SelectionDAG/TargetLowering.cpp U lib/Target/XCore/XCoreISelLowering.cpp U lib/Target/XCore/XCoreISelLowering.h U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86ISelLowering.h llvm-svn: 107987	2010-07-09 16:37:18 +00:00
Jakob Stoklund Olesen	a57965827f	Fix test to be less sensitive of regalloc accidents llvm-svn: 107951	2010-07-09 01:32:11 +00:00
Bob Wilson	88a4e6dc0e	Print "dregpair" NEON operands with a space between them, for readability and consistency with other instructions that have lists of register operands. llvm-svn: 107944	2010-07-09 00:47:20 +00:00
Dan Gohman	0b5aa1cdd3	Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943	2010-07-09 00:39:23 +00:00
Bob Wilson	21eed476e8	Reenable DAG combining for vector shuffles. It looks like it was temporarily disabled and then never turned back on again. Adjust some tests, one because this change avoids an unnecessary instruction, and the other to make it continue testing what it was intended to test. llvm-svn: 107941	2010-07-09 00:38:12 +00:00
Bill Wendling	a992445ff2	Extension of r107506. Make sure that we don't mark a function as having a call if the inline ASM doesn't need a stack frame. llvm-svn: 107922	2010-07-08 22:38:02 +00:00
Evan Cheng	0f54854a1d	Check for FiniteOnlyFPMath as well. llvm-svn: 107904	2010-07-08 20:12:24 +00:00
Eric Christopher	e796253217	A slight reworking of the custom patterns for x86-64 tpoff codegen and correct the testcase for valid assembly. Needs more tests. llvm-svn: 107860	2010-07-08 07:36:46 +00:00
Evan Cheng	be1f7a931e	r107852 is only safe with -enable-unsafe-fp-math to account for +0.0 == -0.0. llvm-svn: 107856	2010-07-08 06:01:49 +00:00
Evan Cheng	25f9364cbd	Optimize some vfp comparisons to integer ones. This patch implements the simplest case when the following conditions are met: 1. The arguments are f32. 2. The arguments are loads and they have no uses other than the comparison. 3. The comparison code is EQ or NE. e.g. vldr.32 s0, [r1] vldr.32 s1, [r0] vcmpe.f32 s1, s0 vmrs apsr_nzcv, fpscr beq LBB0_2 => ldr r1, [r1] ldr r0, [r0] cmp r0, r1 beq LBB0_2 More complicated cases will be implemented in subsequent patches. llvm-svn: 107852	2010-07-08 02:08:50 +00:00
Dale Johannesen	e2289285ae	Changes to ARM tail calls, mostly cosmetic. Add explicit testcases for tail calls within the same module. Duplicate some code to humor those who think .w doesn't apply on ARM. Leave this disabled on Thumb1, and add some comments explaining why it's hard and won't gain much. llvm-svn: 107851	2010-07-08 01:18:23 +00:00
Dan Gohman	e75704369d	Revert 107840 107839 107813 107804 107800 107797 107791. Debug info intrinsics win for now. llvm-svn: 107850	2010-07-08 01:00:56 +00:00
Jakob Stoklund Olesen	ddaf0099a5	Allow copies between GR8_ABCD_L and GR8_ABCD_H. This fixes PR7540. llvm-svn: 107809	2010-07-07 20:33:27 +00:00
Dan Gohman	e7ccc51cc1	Implement bottom-up fast-isel. This has the advantage of not requiring a separate DCE pass over MachineInstrs. llvm-svn: 107804	2010-07-07 19:20:32 +00:00
Dan Gohman	2d4d01d0de	Add X86FastISel support for return statements. This entails refactoring a bunch of stuff, to allow the target-independent calling convention logic to be employed. llvm-svn: 107800	2010-07-07 18:32:53 +00:00
Dale Johannesen	ce65663330	Accept RIP-relative symbols with 'i' constraint, and print the (%rip) only if the 'a' modifier is present. PR 7528. llvm-svn: 107727	2010-07-06 23:27:00 +00:00
Dale Johannesen	6f01541ae6	Make test not hang waiting for input. llvm-svn: 107721	2010-07-06 23:06:58 +00:00
Jakob Stoklund Olesen	a64c0a3d22	Be more forgiving when calculating alias interference for physreg coalescing. It is OK for an alias live range to overlap if there is a copy to or from the physical register. CoalescerPair can work out if the copy is coalescable independently of the alias. This means that we can join with the actual destination interval instead of using the getOrigDstReg() hack. It is no longer necessary to merge clobber ranges into subregisters. llvm-svn: 107695	2010-07-06 20:31:51 +00:00
Devang Patel	23a7593534	Fix PR7545 crash. llvm-svn: 107678	2010-07-06 18:18:32 +00:00
Rafael Espindola	7c510aa7bc	Don't create neon moves in CopyRegToReg. NEONMoveFixPass will do the conversion if profitable. llvm-svn: 107673	2010-07-06 16:24:34 +00:00
Eric Christopher	8f06b4a294	Remove mistakenly added test. llvm-svn: 107641	2010-07-06 05:20:13 +00:00
Eric Christopher	2ad0c779c3	Fix up -fstack-protector on linux to use the segment registers. Split out testcases per architecture and os now. Patch from Nelson Elhage. llvm-svn: 107640	2010-07-06 05:18:56 +00:00
Chris Lattner	60db4557cd	another v2f32 case, in this case showing poor codegen. llvm-svn: 107614	2010-07-05 05:52:56 +00:00
Chris Lattner	431e81f2fb	fix test on non-x86 hosts. llvm-svn: 107608	2010-07-05 03:56:55 +00:00
Chris Lattner	45cc4d74a3	Just rip v2f32 support completely out of the X86 backend. In the example in the testcase, we now generate: _test1: ## @test1 movss 4(%esp), %xmm0 addss 8(%esp), %xmm0 movl 12(%esp), %eax movss %xmm0, (%eax) ret instead of: _test1: ## @test1 subl $20, %esp movl 24(%esp), %eax movq %mm0, (%esp) movq %mm0, 8(%esp) movss (%esp), %xmm0 addss 12(%esp), %xmm0 movss %xmm0, (%eax) addl $20, %esp ret v2f32 support did not work reliably because most of the X86 backend didn't know it was legal. It was apparently only added to support returning source-level v2f32 values in MMX registers in x86-32 mode. If ABI compatibility is important on this GCC-extended-vector type for some reason, then the frontend should generate IR that returns v2i32 instead of v2f32. However, we generally don't try very hard to be abi compatible on gcc extended vectors. llvm-svn: 107601	2010-07-04 23:07:25 +00:00
Chris Lattner	681b926d54	fix PR7518 - terrible codegen of <2 x float>, by only marking v2f32 as legal in 32-bit mode. It is just as terrible there, but I just care about x86-64 and noone claims it is valuable in 64-bit mode. llvm-svn: 107600	2010-07-04 22:57:10 +00:00
Evan Cheng	0ce84486c3	- Two-address pass should not assume unfolding is always successful. - X86 unfolding should check if the instructions being unfolded has memoperands. If there is no memoperands, then it must assume conservative alignment. If this would introduce an expensive sse unaligned load / store, then unfoldMemoryOperand etc. should not unfold the instruction. llvm-svn: 107509	2010-07-02 20:36:18 +00:00
Dale Johannesen	4d887f7ca7	Propagate the AlignStack bit in InlineAsm's to the PrologEpilog code, and use it to determine whether the asm forces stack alignment or not. gcc consistently does not do this for GCC-style asms; Apple gcc inconsistently sometimes does it for asm blocks. There is no convenient place to put a bit in either the SDNode or the MachineInstr form, so I've added an extra operand to each; unlovely, but it does allow for expansion for more bits, should we need it. PR 5125. Some existing testcases are affected. The operand lists of the SDNode and MachineInstr forms are indexed with awesome mnemonics, like "2"; I may fix this someday, but not now. I'm not making it any worse. If anyone is inspired I think you can find all the right places from this patch. llvm-svn: 107506	2010-07-02 20:16:09 +00:00
Bob Wilson	771d04b969	Fix incorrect asm-printing of some NEON immediates. Fix weak testcase so that it checks the immediate values, not just the instructions opcodes. Radar 8110263. llvm-svn: 107487	2010-07-02 17:23:44 +00:00
Bob Wilson	8a99b730a9	ARM function alignments were off by a power of two. svn 83242 changed getFunctionAlignment and the corresponding use of that value in the ARM asm printer, but now we're using the standard asm printer. The result of this was that function alignments were dropped completely for Thumb functions. Radar 8143571. llvm-svn: 107435	2010-07-01 22:26:26 +00:00
Bill Wendling	03bcd6ecc8	Implement the "linker_private_weak" linkage type. This will be used for Objective-C metadata types which should be marked as "weak", but which the linker will remove upon final linkage. However, this linkage isn't specific to Objective-C. For example, the "objc_msgSend_fixup_alloc" symbol is defined like this: .globl l_objc_msgSend_fixup_alloc .weak_definition l_objc_msgSend_fixup_alloc .section __DATA, __objc_msgrefs, coalesced .align 3 l_objc_msgSend_fixup_alloc: .quad _objc_msgSend_fixup .quad L_OBJC_METH_VAR_NAME_1 This is different from the "linker_private" linkage type, because it can't have the metadata defined with ".weak_definition". Currently only supported on Darwin platforms. llvm-svn: 107433	2010-07-01 21:55:59 +00:00
Dan Gohman	d2965c10a1	Temporarily disable on-demand fast-isel. llvm-svn: 107393	2010-07-01 12:15:30 +00:00
Dan Gohman	aef3d140b7	Teach fast-isel to avoid loading a value from memory when it's already available in a register. This is pretty primitive, but it reduces the number of instructions in common testcases by 4%. llvm-svn: 107380	2010-07-01 03:49:38 +00:00
Dan Gohman	722f5fc567	Enable on-demand fast-isel. llvm-svn: 107377	2010-07-01 02:58:57 +00:00
Dan Gohman	7937d5606d	Teach X86FastISel to fold constant offsets and scaled indices in the same address. llvm-svn: 107373	2010-07-01 02:27:15 +00:00
Jakob Stoklund Olesen	dadea5b178	Fix the handling of partial redefines in the fast register allocator. A partial redefine needs to be treated like a tied operand, and the register must be reloaded while processing use operands. This fixes a bug where partially redefined registers were processed as normal defs with a reload added. The reload could clobber another use operand if it was a kill that allowed register reuse. llvm-svn: 107193	2010-06-29 19:15:30 +00:00
Bob Wilson	d91d5bfc95	Fix a register scavenger crash when dealing with undefined subregs. The LowerSubregs pass needs to preserve implicit def operands attached to EXTRACT_SUBREG instructions when it replaces those instructions with copies. llvm-svn: 107189	2010-06-29 18:42:49 +00:00
Rafael Espindola	38a7d7cbc3	Add a VT argument to getMinimalPhysRegClass and replace the copy related uses of getPhysicalRegisterRegClass with it. If we want to make a copy (or estimate its cost), it is better to use the smallest class as more efficient operations might be possible. llvm-svn: 107140	2010-06-29 14:02:34 +00:00
Evan Cheng	b59dd8f10a	PR7503: uxtb16 is not available for ARMv7-M. Patch by Brian G. Lucas. llvm-svn: 107122	2010-06-29 05:38:36 +00:00
Bob Wilson	1e5da550e5	Reapply my if-conversion cleanup from svn r106939 with fixes. There are 2 changes relative to the previous version of the patch: 1) For the "simple" if-conversion case, there's no need to worry about RemoveExtraEdges not handling an unanalyzable branch. Predicated terminators are ignored in this context, so RemoveExtraEdges does the right thing. This might break someday if we ever treat indirect branches (BRIND) as predicable, but for now, I just removed this part of the patch, because in the case where we do not add an unconditional branch, we rely on keeping the fall-through edge to CvtBBI (which is empty after this transformation). The change relative to the previous patch is: @@ -1036,10 +1036,6 @@ IterIfcvt = false; } - // RemoveExtraEdges won't work if the block has an unanalyzable branch, - // which is typically the case for IfConvertSimple, so explicitly remove - // CvtBBI as a successor. - BBI.BB->removeSuccessor(CvtBBI->BB); RemoveExtraEdges(BBI); // Update block info. BB can be iteratively if-converted. 2) My patch exposed a bug in the code for merging the tail of a "diamond", which had previously never been exercised. The code was simply checking that the tail had a single predecessor, but there was a case in MultiSource/Benchmarks/VersaBench/dbms where that single predecessor was neither edge of the diamond. I added the following change to check for that: @@ -1276,7 +1276,18 @@ // tail, add a unconditional branch to it. if (TailBB) { BBInfo TailBBI = BBAnalysis[TailBB->getNumber()]; - if (TailBB->pred_size() == 1 && !TailBBI.HasFallThrough) { + bool CanMergeTail = !TailBBI.HasFallThrough; + // There may still be a fall-through edge from BBI1 or BBI2 to TailBB; + // check if there are any other predecessors besides those. + unsigned NumPreds = TailBB->pred_size(); + if (NumPreds > 1) + CanMergeTail = false; + else if (NumPreds == 1 && CanMergeTail) { + MachineBasicBlock::pred_iterator PI = TailBB->pred_begin(); + if (PI != BBI1->BB && PI != BBI2->BB) + CanMergeTail = false; + } + if (CanMergeTail) { MergeBlocks(BBI, TailBBI); TailBBI.IsDone = true; } else { With these fixes, I was able to run all the SingleSource and MultiSource tests successfully. llvm-svn: 107110	2010-06-29 00:55:23 +00:00
Bob Wilson	269a89fd3a	Unlike other targets, ARM now uses BUILD_VECTORs post-legalization so they can't be changed arbitrarily by the DAGCombiner without checking if it is running after legalization. llvm-svn: 107097	2010-06-28 23:40:25 +00:00
Dale Johannesen	17feb07c53	In asm's, output operands with matching input constraints have to be registers, per gcc documentation. This affects the logic for determining what "g" should lower to. PR 7393. A couple of existing testcases are affected. llvm-svn: 107079	2010-06-28 22:09:45 +00:00
Jakob Stoklund Olesen	fde9c348e9	Don't write temporary files in test directory llvm-svn: 107049	2010-06-28 20:01:15 +00:00
Jakob Stoklund Olesen	0117091c16	Add a triple so test runs on Linux as well. llvm-svn: 107045	2010-06-28 19:31:15 +00:00
Jakob Stoklund Olesen	0d94d7af78	Add more special treatment for inline asm in RegAllocFast. When an instruction has tied operands and physreg defines, we must take extra care that the tied operands conflict with neither physreg defs nor uses. The special treatment is given to inline asm and instructions with tied operands / early clobbers and physreg defines. This fixes PR7509. llvm-svn: 107043	2010-06-28 18:34:34 +00:00
Benjamin Kramer	3bbc52ce3e	Fix some tests that didn't test anything. llvm-svn: 106954	2010-06-26 20:05:06 +00:00
Rafael Espindola	2041abd958	When splitting a VAARG, remember its alignment. This produces terrible but correct code. llvm-svn: 106952	2010-06-26 18:22:20 +00:00
Bob Wilson	418e64a385	Revert my if-conversion cleanup since it caused a bunch of nightly test regressions. --- Reverse-merging r106939 into '.': U test/CodeGen/Thumb2/thumb2-ifcvt3.ll U lib/CodeGen/IfConversion.cpp llvm-svn: 106951	2010-06-26 17:47:06 +00:00
Eli Friedman	b9bdc5a52d	Remove bogus test. llvm-svn: 106941	2010-06-26 04:59:56 +00:00
Bob Wilson	c72da6bb56	Clean up some problems with extra CFG edges being introduced during if-conversion. The RemoveExtraEdges function doesn't work for blocks that end with unanalyzable branches, so in those cases, the "extra" edges must be explicitly removed. The CopyAndPredicateBlock and MergeBlocks methods can also avoid copying successor edges due to branches that have already been removed. The latter case is especially helpful when MergeBlocks is called for handling "diamond" if-conversions, where otherwise you can end up with some weird intermediate states in the CFG. Unfortunately I've been unable to find cases where this cleanup actually makes a significant difference in the code. There is one test where we manage to remove an empty block at the end of a function. Radar 6911268. llvm-svn: 106939	2010-06-26 04:27:33 +00:00
Jakob Stoklund Olesen	d7d0d4e882	When creating X86 MUL8 and DIV8 instructions, make sure we don't produce CopyFromReg nodes for aliasing registers (AX and AL). This confuses the fast register allocator. Instead of CopyFromReg(AL), use ExtractSubReg(CopyFromReg(AX), sub_8bit). This fixes PR7312. llvm-svn: 106934	2010-06-26 00:39:23 +00:00
Daniel Dunbar	acbdf53db4	Thumb2ITBlockPass: Fix a possible dereference of an invalid iterator. This was introduced in r106343, but only showed up recently (with a particular compiler & linker combination) because of the particular check, and because we have no builtin checking for dereferencing the end of an array, which is truly unfortunate. llvm-svn: 106908	2010-06-25 23:14:54 +00:00
Evan Cheng	02b184de5b	Change if-conversion block size limit checks to add some flexibility. llvm-svn: 106901	2010-06-25 22:42:03 +00:00
Dale Johannesen	ce97d55ad9	The hasMemory argument is irrelevant to how the argument for an "i" constraint should get lowered; PR 6309. While this argument was passed around a lot, this is the only place it was used, so it goes away from a lot of other places. llvm-svn: 106893	2010-06-25 21:55:36 +00:00
Dan Gohman	8de1fe3ccf	pcmpeqd and friends are Commutable. llvm-svn: 106886	2010-06-25 21:05:35 +00:00
Bill Wendling	e41e40f689	- Reapply r106066 now that the bzip2 build regression has been fixed. - 2010-06-25-CoalescerSubRegDefDead.ll is the testcase for r106878. llvm-svn: 106880	2010-06-25 20:48:10 +00:00
Dan Gohman	600658a4ba	Don't write an output file to cwd, and put an rdar prefix on an rdar number. llvm-svn: 106810	2010-06-24 23:45:15 +00:00
Dan Gohman	9a2f0473b2	Teach EmitLiveInCopies to omit copies for unused virtual registers, and to clean up unused incoming physregs from the live-in list. llvm-svn: 106805	2010-06-24 22:23:02 +00:00
Bill Wendling	2d3c490026	It's possible that a flag is added to the SDNode that points back to the original SDNode. This is badness. Also, this function allows one SDNode to point multiple flags to another SDNode. Badness as well. llvm-svn: 106793	2010-06-24 22:00:37 +00:00
Dale Johannesen	5ad5226c58	Disallow matching "i" constraint to symbol addresses when address requires a register or secondary load to compute (most PIC modes). This improves "g" constraint handling. 8015842. The test from 2007 is attempting to test the fix for PR1761, but since -relocation-model=static doesn't work on Darwin x86-64, it was not testing what it was supposed to be testing and was passing erroneously. Fixed to use Linux x86-64. llvm-svn: 106779	2010-06-24 20:14:51 +00:00
Jakob Stoklund Olesen	45230239e4	Replace a big gob of old coalescer logic with the new CoalescerPair class. CoalescerPair can determine if a copy can be coalesced, and which register gets merged away. The old logic in SimpleRegisterCoalescing had evolved into something a bit too convoluted. This second attempt fixes some crashes that only occurred Linux. llvm-svn: 106769	2010-06-24 18:15:01 +00:00
Bob Wilson	279e55fb2e	PR7458: Try commuting Thumb2 instruction operands to put them into 2-address form so they can be narrowed to 16-bit instructions. llvm-svn: 106762	2010-06-24 16:50:20 +00:00
Dan Gohman	463f26b4be	Eliminate the other half of the BRCOND optimization, and update as many tests as possible. llvm-svn: 106749	2010-06-24 15:24:03 +00:00
Dan Gohman	df6b33e778	Eliminate the first have of the optimization which eliminates BRCOND when the condition is constant. This optimization shouldn't be necessary, because codegen shouldn't be able to find dead control paths that the IR-level optimizer can't find. And it's undesirable, because it encourages bugpoint to leave "br i1 false" branches in its output. And it wasn't updating the CFG. I updated all the tests I could, but some tests are too reduced and I wasn't able to meaningfully preserve them. llvm-svn: 106748	2010-06-24 15:04:11 +00:00
Dan Gohman	600f62b3ba	Reapply r106634, now that the bug it exposed is fixed. llvm-svn: 106746	2010-06-24 14:30:44 +00:00
Dan Gohman	0695e09b09	Optimize the "bit test" code path for switch lowering in the case where the bit mask has exactly one bit. llvm-svn: 106716	2010-06-24 02:06:24 +00:00
Jakob Stoklund Olesen	dbb58d2974	Revert "Replace a big gob of old coalescer logic with the new CoalescerPair class." Whiny buildbots. llvm-svn: 106710	2010-06-24 00:52:22 +00:00
Jakob Stoklund Olesen	f38e6720cc	Replace a big gob of old coalescer logic with the new CoalescerPair class. CoalescerPair can determine if a copy can be coalesced, and which register gets merged away. The old logic in SimpleRegisterCoalescing had evolved into something a bit too convoluted. llvm-svn: 106701	2010-06-24 00:12:39 +00:00
Bill Wendling	f470747a36	We are missing opportunites to use ldm. Take code like this: void t(int cp0, int cp1, int dp, int fmd) { int c0, c1, d0, d1, d2, d3; c0 = (cp0++ & 0xffff) \| ((cp1++ << 16) & 0xffff0000); c1 = (cp0++ & 0xffff) \| ((cp1++ << 16) & 0xffff0000); / ... */ } It code gens into something pretty bad. But with this change (analogous to the X86 back-end), it will use ldm and generate few instructions. llvm-svn: 106693	2010-06-23 23:00:16 +00:00
Dale Johannesen	fc40f0a1ab	Reinstate correct test, remove the real invalidated test. llvm-svn: 106664	2010-06-23 18:56:06 +00:00
Dale Johannesen	6effb503f5	Remove tests invalidated by previous checkin. llvm-svn: 106663	2010-06-23 18:53:12 +00:00
Bill Wendling	a136521a17	MorphNodeTo doesn't preserve the memory operands. Because we're morphing a node into the same node, but with different non-memory operands, we need to replace the memory operands after it's finished morphing. llvm-svn: 106643	2010-06-23 18:16:24 +00:00
Daniel Dunbar	4df321b7ad	Revert r106263, "Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass,"... it was causing both 'file' (with clang) and 176.gcc (with llvm-gcc) to be miscompiled. llvm-svn: 106634	2010-06-23 17:09:26 +00:00
Daniel Dunbar	ef5a4383ad	Revert r106066, "Create a more targeted fix for not sinking instructions into a range where it"... it causes bzip2 to be miscompiled by Clang. Conflicts: lib/CodeGen/MachineSink.cpp llvm-svn: 106614	2010-06-23 00:48:25 +00:00
Dan Gohman	f1cf963c64	Loosen up this test so that it doesn't depend as much on register allocation details. llvm-svn: 106599	2010-06-22 23:32:47 +00:00
Dan Gohman	1081f1a0f5	Fix OptimizeMax to handle an odd case where one of the max operands is another max which folds. This fixes PR7454. llvm-svn: 106594	2010-06-22 23:07:13 +00:00
Bob Wilson	c5d712232d	Thumb1 functions using @llvm.returnaddress were not saving the incoming LR. Radar 8031193. llvm-svn: 106582	2010-06-22 22:04:24 +00:00
Dale Johannesen	6d4802ba6c	Add SSE so these actually pass on non-X86 hosts. llvm-svn: 106575	2010-06-22 20:54:03 +00:00
Mon P Wang	825639e849	Move v-binop-widen tests to X86 since they don't work on all platforms llvm-svn: 106562	2010-06-22 19:40:50 +00:00
Jakob Stoklund Olesen	9c47dac677	Remove the SimpleJoin optimization from SimpleRegisterCoalescing. Measurements show that it does not speed up coalescing, so there is no reason the keep the added complexity around. Also clean out some unused methods and static functions. llvm-svn: 106548	2010-06-22 16:13:57 +00:00
Evan Cheng	37bb617f8a	Tail merging pass shall not break up IT blocks. rdar://8115404 llvm-svn: 106517	2010-06-22 01:18:16 +00:00
Dan Gohman	3c1b3c61e9	Teach two-address lowering how to unfold a load to open up commuting opportunities. For example, this lets it emit this: movq (%rax), %rcx addq %rdx, %rcx instead of this: movq %rdx, %rcx addq (%rax), %rcx in the case where %rdx has subsequent uses. It's the same number of instructions, and usually the same encoding size on x86, but it appears faster, and in general, it may allow better scheduling for the load. llvm-svn: 106493	2010-06-21 22:17:20 +00:00
Evan Cheng	1fb4de8ec5	Fix PR7421: bug in kill transferring logic. It was ignoring loads / stores which have already been processed. llvm-svn: 106481	2010-06-21 21:21:14 +00:00
Dan Gohman	2dd1d3d182	Make this test more robust in case LLVM ever decides to align the global variable differently. llvm-svn: 106454	2010-06-21 19:56:27 +00:00
Dale Johannesen	dd471bbb10	Add missing FileCheck call. llvm-svn: 106443	2010-06-21 18:46:08 +00:00
Dale Johannesen	d5c58b76ab	Fix PR 7433. Silly typo in non-Darwin ARM tail call handling, plus correct R9 handling in that mode. llvm-svn: 106434	2010-06-21 18:21:49 +00:00
Eric Christopher	bf572c7cea	Add some codegen patterns for x86_64-linux-gnu tls codegen matching. Based on a patch by Patrick Marlier! llvm-svn: 106433	2010-06-21 18:21:27 +00:00
Kalle Raiskila	df071b7e42	Add the check to the testcase of r106419. llvm-svn: 106421	2010-06-21 15:11:51 +00:00
Kalle Raiskila	0ab5a02579	Mark the SPU 'lr' instruction to never have side effects. This allows the fast regiser allocator to remove redundant register moves. Update a set of tests that depend on the register allocator to be linear scan. llvm-svn: 106420	2010-06-21 15:08:16 +00:00
Kalle Raiskila	d7f50c118a	Fix the lowering of VECTOR_SHUFFLE on SPU to handle splats. llvm-svn: 106419	2010-06-21 14:42:19 +00:00
Kalle Raiskila	6f58190f6f	Fix lowering of VECTOR_SHUFFLE on SPU. Old algorithm used to choke llc with the attached test. llvm-svn: 106411	2010-06-21 10:17:36 +00:00
Evan Cheng	884a8fe5fa	Fix a crash caused by dereference of MBB.end(). rdar://8110842 llvm-svn: 106399	2010-06-20 00:54:38 +00:00
Dan Gohman	51d00092b6	Include the use kind along with the expression in the key of the use sharing map. The reconcileNewOffset logic already forces a separate use if the kinds differ, so incorporating the kind in the key means we can track more sharing opportunities. More sharing means fewer total uses to track, which means smaller problem sizes, which means the conservative throttles don't kick in as often. llvm-svn: 106396	2010-06-19 21:29:59 +00:00
Evan Cheng	f3c01f3ef6	Disable sibcall optimization for Thumb1 for now since Thumb1RegisterInfo::emitEpilogue is not expecting them. llvm-svn: 106368	2010-06-19 01:01:32 +00:00
Evan Cheng	119824ed4d	Move ARM if-conversion before post-ra scheduling. llvm-svn: 106355	2010-06-18 23:32:07 +00:00
Evan Cheng	2d51c7c592	Allow ARM if-converter to be run after post allocation scheduling. - This fixed a number of bugs in if-converter, tail merging, and post-allocation scheduler. If-converter now runs branch folding / tail merging first to maximize if-conversion opportunities. - Also changed the t2IT instruction slightly. It now defines the ITSTATE register which is read by instructions in the IT block. - Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't change the instruction ordering in the IT block (since IT mask has been finalized). It also ensures no other instructions can be scheduled between instructions in the IT block. This is not yet enabled. llvm-svn: 106344	2010-06-18 23:09:54 +00:00
Jakob Stoklund Olesen	07f4fa8198	TwoAddressInstructionPass::CoalesceExtSubRegs can insert INSERT_SUBREG instructions, but it doesn't really understand live ranges, so the first INSERT_SUBREG uses an implicitly defined register. Fix it in LiveVariableAnalysis by adding the <undef> flag. llvm-svn: 106333	2010-06-18 22:29:44 +00:00
Evan Cheng	cf9e8a987f	Fix an inverted condition. llvm-svn: 106330	2010-06-18 22:17:13 +00:00
Jakob Stoklund Olesen	22a212f97c	When using ADDri to get the address of a stack object, 255 is a conservative limit on the offset that can be materialized without using the register scavenger. llvm-svn: 106312	2010-06-18 20:59:25 +00:00
Dale Johannesen	c1570dda5c	Enable tail calls on ARM by default, with some basic tests. This has been well tested on Darwin but not elsewhere. It should work provided the linker correctly resolves B.W <label in other function> which it has not seen before, at least from llvm-based compilers. I'm leaving the arm-tail-calls switch in until I see if there's any problems because of that; it might need to be disabled for some environments. llvm-svn: 106299	2010-06-18 19:00:18 +00:00
Jakob Stoklund Olesen	b9f91667e1	Treat the ARM inline asm {cc} constraint as a physreg (%CPSR), just like X86 does for {flags}. If we create virtual registers of the CCR class, RegAllocFast may try to spill them, and we can't do that. llvm-svn: 106289	2010-06-18 16:49:33 +00:00
Dan Gohman	99ba4dac59	Don't maintain a set of deleted nodes; instead, use a HandleSDNode to track a node over CSE events. This fixes PR7368. llvm-svn: 106266	2010-06-18 01:24:29 +00:00
Dan Gohman	b92156d5e4	Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass, which is faster, simpler, and less surprising. llvm-svn: 106263	2010-06-18 01:05:21 +00:00
Dan Gohman	30d7a51d6c	Make this test less fragile. llvm-svn: 106255	2010-06-18 00:06:03 +00:00
Rafael Espindola	29dda21e96	Remove arm_apcscc from the test files. It is the default and doing this matches what llvm-gcc and clang now produce. llvm-svn: 106221	2010-06-17 15:18:27 +00:00
Jakob Stoklund Olesen	207cd4bbd7	Allow a register to be redefined multiple times in a basic block. LiveVariableAnalysis was a bit picky about a register only being redefined once, but that really isn't necessary. Here is an example of chained INSERT_SUBREGs that we can handle now: 68 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1028<kill>, 14 register: %reg1040 +[70,134:0) 76 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1029<kill>, 13 register: %reg1040 replace range with [70,78:1) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,134:0) 0@78-(134) 1@70-(78) 84 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1030<kill>, 12 register: %reg1040 replace range with [78,86:2) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,134:0) 0@86-(134) 1@70-(78) 2@78-(86) 92 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1031<kill>, 11 register: %reg1040 replace range with [86,94:3) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,94:3)[94,134:0) 0@94-(134) 1@70-(78) 2@78-(86) 3@86-(94) rdar://problem/8096390 llvm-svn: 106152	2010-06-16 21:29:40 +00:00
Jim Grosbach	2c8b829238	modify so the test doesn't drop an output file in the test source directory. The test should also likely have some FileCheck bits to validate the output(?). llvm-svn: 106146	2010-06-16 21:07:06 +00:00
Evan Cheng	f128bdcb55	Make post-ra scheduling, anti-dep breaking, and register scavenger (conservatively) aware of predicated instructions. This enables ARM to move if-conversion before post-ra scheduler. llvm-svn: 106091	2010-06-16 07:35:02 +00:00
Bill Wendling	8c0cf0994d	Create a more targeted fix for not sinking instructions into a range where it will conflict with another live range. The place which creates this scenerio is the code in X86 that lowers a select instruction by splitting the MBBs. This eliminates the need to check from the bottom up in an MBB for live pregs. llvm-svn: 106066	2010-06-15 23:46:31 +00:00
Jakob Stoklund Olesen	ec2e964fd6	Remove the local register allocator. Please use the fast allocator instead. llvm-svn: 106051	2010-06-15 21:58:33 +00:00
Rafael Espindola	ae591be4e9	Set the mtriple in some tests so that they use AAPCS. llvm-svn: 106041	2010-06-15 20:42:00 +00:00
Mon P Wang	7a84689cc5	Fixed vector widening of binary instructions that can trap. Patch by Visa Putkinen! llvm-svn: 106038	2010-06-15 20:29:05 +00:00
Chris Lattner	874c92bd47	fix fastisel to handle GS and FS relative pointers. Patch by Nelson Elhage! llvm-svn: 106031	2010-06-15 19:08:40 +00:00
Rafael Espindola	5a24a56e1e	Remove the arm_aapcscc marker from the tests. It is the default for the linux targets. llvm-svn: 106029	2010-06-15 19:04:29 +00:00
Jakob Stoklund Olesen	246e9a07a2	Avoid processing early clobbers twice in RegAllocFast. Early clobbers defining a virtual register were first alocated to a physreg and then processed as a physreg EC, spilling the virtreg. This fixes PR7382. llvm-svn: 105998	2010-06-15 16:20:57 +00:00
Jakob Stoklund Olesen	82eca35b3e	Add CoalescerPair helper class. Given a copy instruction, CoalescerPair can determine which registers to coalesce in order to eliminate the copy. It deals with all the subreg fun to determine a tuple (DstReg, SrcReg, SubIdx) such that: - SrcReg is a virtual register that will disappear after coalescing. - DstReg is a virtual or physical register whose live range will be extended. - SubIdx is 0 when DstReg is a physical register. - SrcReg can be joined with DstReg:SubIdx. CoalescerPair::isCoalescable() determines if another copy instruction is compatible with the same tuple. This fixes some NEON miscompilations where shuffles are getting coalesced as if they were copies. The CoalescerPair class will replace a lot of the spaghetti logic in JoinCopy later. llvm-svn: 105997	2010-06-15 16:04:21 +00:00
Bob Wilson	a55b8877e6	Generalize the pre-coalescing of extract_subregs feeding reg_sequences, replacing the overly conservative checks that I had introduced recently to deal with correctness issues. This makes a pretty noticable difference in our testcases where reg_sequences are used. I've updated one test to check that we no longer emit the unnecessary subreg moves. llvm-svn: 105991	2010-06-15 05:56:31 +00:00
Chris Lattner	00ab615406	apparently lots of dupes. llvm-svn: 105956	2010-06-14 20:19:03 +00:00
Chris Lattner	faa7bdccbf	fix a nasty bug where we were not treating available_externally symbols as declarations in the X86 backend. This would manifest on darwin x86-32 as errors like this with -fvisibility=hidden: symbol '__ZNSbIcED1Ev' can not be undefined in a subtraction expression This fixes PR7353. llvm-svn: 105954	2010-06-14 20:11:56 +00:00
Chris Lattner	bbb798c7d1	remove old test. llvm-svn: 105953	2010-06-14 20:07:43 +00:00
Chris Lattner	b30f87b74e	rename test llvm-svn: 105952	2010-06-14 20:07:34 +00:00
Bob Wilson	f07d33d8f1	Add a missing bitcast. This code used to only handle conversions between i64 and f64 types, but now it also handle Neon vector types, so the f64 result of VMOVDRR may need to be converted to a Neon type. Radar 8084742. llvm-svn: 105845	2010-06-11 22:45:25 +00:00
Bill Wendling	d53a2cb4ac	Testcase for r105741. llvm-svn: 105750	2010-06-09 20:30:22 +00:00
Jakob Stoklund Olesen	8bc5eca331	Mark physregs defined by inline asm as implicit. This is a bit of a hack to make inline asm look more like call instructions. It would be better to produce correct dead flags during isel. llvm-svn: 105749	2010-06-09 20:05:00 +00:00
Kalle Raiskila	5e0862f7f5	Fix SPU to cope with vector insertelement to an undef position. We default to inserting to lane 0. llvm-svn: 105722	2010-06-09 09:58:17 +00:00
Kalle Raiskila	056113a211	Handle loading from/storing to undef pointers on SPU by inserting a random load/store, rather than crashing llc. llvm-svn: 105710	2010-06-09 08:29:41 +00:00
Dan Gohman	bbfb6aca92	LSR needs to remember inserted instructions even in postinc mode, because there could be multiple subexpressions within a single expansion which require insert point adjustment. This fixes PR7306. llvm-svn: 105510	2010-06-05 00:33:07 +00:00
Evan Cheng	a03e6f85fe	Re-apply 105308 with fix. llvm-svn: 105502	2010-06-04 23:28:13 +00:00
Dale Johannesen	065d6fd537	More tail call removal. llvm-svn: 105485	2010-06-04 21:14:24 +00:00
Dan Gohman	538b413ccb	Fix normalization and de-normalization of non-affine SCEVs. llvm-svn: 105480	2010-06-04 19:16:34 +00:00
Mon P Wang	622cdd2297	Fixed a bug during widening where we would avoid legalizing a node. When we replace an OpA with a widened OpB, it is possible to get new uses of OpA due to CSE when recursively updating nodes. Since OpA has been processed, the new uses are not examined again. The patch checks if this occurred and it it did, updates the new uses of OpA to use OpB. llvm-svn: 105453	2010-06-04 01:20:10 +00:00
Dale Johannesen	b3780b1103	Remove more tail calls. llvm-svn: 105450	2010-06-04 01:01:24 +00:00
Dale Johannesen	e7b392dca9	Remove a tail call, and move some CHECKs to the functions where they belong. llvm-svn: 105449	2010-06-04 01:01:04 +00:00
Dan Gohman	8fdda8a655	This test doesn't need the ssp attribute. llvm-svn: 105440	2010-06-04 00:14:48 +00:00
Dale Johannesen	e288fee959	Remove tail call. A tail call version will follow. llvm-svn: 105438	2010-06-04 00:03:37 +00:00
Dale Johannesen	9f71f7f70c	Remove tail call to preserve this test. A tail call version will follow. llvm-svn: 105422	2010-06-03 21:57:48 +00:00
Dale Johannesen	41528aeb0b	Make this test not use tail calls. A tail call version will follow. llvm-svn: 105419	2010-06-03 21:53:01 +00:00
Dan Gohman	d83e3e7750	Fix SimplifyDemandedBits' AssertZext logic to demand all the bits. It needs to demand the high bits because it's asserting that they're zero. llvm-svn: 105406	2010-06-03 20:21:33 +00:00
Bob Wilson	30093b5d8b	Revert 105308. llvm-svn: 105399	2010-06-03 18:28:31 +00:00
Bill Wendling	f82aea634c	Machine sink could potentially sink instructions into a block where the physical registers it defines then interfere with an existing preg live range. For instance, if we had something like these machine instructions: BB#0 ... = imul ... EFLAGS<imp-def,dead> test ..., EFLAGS<imp-def> jcc BB#2 EFLAGS<imp-use> BB#1 ... ; fallthrough to BB#2 BB#2 ... ; No code that defines EFLAGS jcc ... EFLAGS<imp-use> Machine sink will come along, see that imul implicitly defines EFLAGS, but because it's "dead", it assumes that it can move imul into BB#2. But when it does, imul's "dead" imp-def of EFLAGS is raised from the dead (a zombie) and messes up the condition code for the jump (and pretty much anything else which relies upon it being correct). The solution is to know which pregs are live going into a basic block. However, that information isn't calculated at this point. Nor does the LiveVariables pass take into account non-allocatable physical registers. In lieu of this, we do a very conservative pass through the basic block to determine if a preg is live coming out of it. llvm-svn: 105387	2010-06-03 07:54:20 +00:00
Eric Christopher	f67fe3b1e8	One underscore, not two. llvm-svn: 105379	2010-06-03 04:02:59 +00:00
Eli Friedman	dbbbf73c96	Implement expansion in type legalization for add/sub with overflow. The expansion is the same as that used by LegalizeDAG. The resulting code sucks in terms of performance/codesize on x86-32 for a 64-bit operation; I haven't looked into whether different expansions might be better in general. llvm-svn: 105378	2010-06-03 03:49:50 +00:00
Evan Cheng	a2da22734f	Enable machine cse of instructions which define physical registers. llvm-svn: 105308	2010-06-02 01:08:27 +00:00
Dan Gohman	b782caa393	Fill in missing support for ISD::FEXP, ISD::FPOWI, and friends. llvm-svn: 105283	2010-06-01 18:35:14 +00:00
Kalle Raiskila	8916358f97	Fix handling of 'load' nodes. llvm-svn: 105269	2010-06-01 13:34:47 +00:00
Chris Lattner	14c46517b5	fix PR6623: when optimizing for size, don't inline memcpy/memsets that are too large. This causes the freebsd bootloader to be too large apparently. It's unclear if this should be an -Os or -Oz thing. Thoughts welcome. llvm-svn: 105228	2010-05-31 17:30:14 +00:00
Chris Lattner	291a189cda	upgrade and filecheckize this test. llvm-svn: 105227	2010-05-31 17:27:17 +00:00
Evan Cheng	707b7cc429	Remove schedule-livein-copies. It's not being used. llvm-svn: 105095	2010-05-29 02:23:39 +00:00
Evan Cheng	27c4933e02	Fix PR7193: if sibling call address can take a register, make sure there are enough registers available by counting inreg arguments. llvm-svn: 105092	2010-05-29 01:35:22 +00:00
Evan Cheng	cc2efe11db	Fix some latency computation bugs: if the use is not a machine opcode do not just return zero. llvm-svn: 105061	2010-05-28 23:26:21 +00:00
Jakob Stoklund Olesen	2085089c49	Fix more tests that depended on the default register allocator choice. llvm-svn: 104961	2010-05-28 17:06:30 +00:00
Dan Gohman	2140a74979	Eliminate the restriction that the array size in an alloca must be i32. This will help reduce the amount of casting required on 64-bit targets. llvm-svn: 104911	2010-05-28 01:14:11 +00:00
Jakob Stoklund Olesen	b613ae2c89	Add a -regalloc=default option that chooses a register allocator based on the -O optimization level. This only really affects llc for now because both the llvm-gcc and clang front ends override the default register allocator. I intend to remove that code later. llvm-svn: 104904	2010-05-27 23:57:25 +00:00
Evan Cheng	3d3ee87d4e	llvm can't correctly support 'H', 'Q' and 'R' modifiers. Just mark it an error. llvm-svn: 104891	2010-05-27 22:08:38 +00:00
Devang Patel	6b9a9fe207	Simplify. Eliminate unneeded debug_loc entry. llvm-svn: 104785	2010-05-26 23:55:23 +00:00
Devang Patel	1b08572a66	Update debug info when live-in reg is copied into a vreg. llvm-svn: 104732	2010-05-26 20:18:50 +00:00
Dale Johannesen	053dd21c84	Testcase for 104624/104619/PR7191/8023512. Reduced from one provided by Duncan Sands, thanks! llvm-svn: 104710	2010-05-26 17:55:45 +00:00
Dale Johannesen	cd4ba6caba	Removing test; Chris thinks it's better to have the bug go untested than have a testcase this large. So be it. llvm-svn: 104632	2010-05-25 20:40:10 +00:00
Dale Johannesen	60fe2cdc4f	Fix another variant of PR 7191. Also add a testcase Mon Ping provided; unfortunately bugpoint failed to reduce it, but I think it's important to have a test for this in the suite. 8023512. llvm-svn: 104624	2010-05-25 18:47:23 +00:00
Bob Wilson	3eb7691858	Thumb2 RSBS instructions were being printed without the 'S' suffix. Fix it by changing the T2I_rbin_s_is multiclass to handle the CPSR output and 'S' suffix in the same way as T2I_bin_s_irs. llvm-svn: 104531	2010-05-24 18:44:06 +00:00
Evan Cheng	755d45be43	LR is in GPR, not tGPR even in Thumb1 mode. llvm-svn: 104518	2010-05-24 18:00:18 +00:00
Evan Cheng	168ced94d8	Implement @llvm.returnaddress. rdar://8015977. llvm-svn: 104421	2010-05-22 01:47:14 +00:00
Eric Christopher	64087cd346	This test is darwin only. Make it so(tm). llvm-svn: 104418	2010-05-22 00:55:55 +00:00
Bob Wilson	91fdf68516	Recognize more BUILD_VECTORs and VECTOR_SHUFFLEs that can be implemented by copying VFP subregs. This exposed a bunch of dead code in the *spill-q.ll tests, so I tweaked those tests to keep that code from being optimized away. Radar 7872877. llvm-svn: 104415	2010-05-22 00:23:12 +00:00
Eric Christopher	6fdea1bda8	Add full bss data support for darwin tls variables. llvm-svn: 104414	2010-05-22 00:10:22 +00:00
Bob Wilson	51d9ee3ff6	Change CodeGen/ARM/2009-11-02-NegativeLane.ll to use 16-bit vector elements so that it will continue to test what it was meant to test when I commit a separate change for better support of BUILD_VECTOR and VECTOR_SHUFFLE for Neon. Fix a DAG combiner crash exposed by this test change. llvm-svn: 104380	2010-05-21 21:05:32 +00:00
Chris Lattner	0735ecfe17	now that fp reg kill insertion stuff happens as a separate pass after isel instead of being interlaced with it, we can trust that all the code for a function has been isel'd before it is run. The practical impact of this is that we can scan for machine instr phis instead of doing a fuzzy match on the LLVM BB for phi nodes. Doing the fuzzy match required knowing when isel would produce an fp reg stack phi which was gross. It was also wrong in cases where select got lowered to a branch tree because cmovs aren't available (PR6828). Just do the scan on machine phis which is simpler, faster and more correct. This fixes PR6828. llvm-svn: 104333	2010-05-21 18:17:54 +00:00
Jakob Stoklund Olesen	a648c6a757	Teach VirtRegRewriter to handle spilling in instructions that have multiple definitions of the virtual register. This happens when spilling the registers produced by REG_SEQUENCE: %reg1047:5<def>, %reg1047:6<def>, %reg1047:7<def> = VLD3d8 %reg1033, 0, pred:14, pred:%reg0 The rewriter would spill the register multiple times, dead store elimination tried to keep up, but ended up cutting the branch it was sitting on. llvm-svn: 104321	2010-05-21 16:36:13 +00:00
Dale Johannesen	b3b9c8ac48	Fix i64->f64 conversion, x86-64, -no-sse. A bit tricky since there's a 3rd 64-bit type, MMX vectors. PR 7135. llvm-svn: 104308	2010-05-21 00:52:33 +00:00
Evan Cheng	34c260458a	Change ARM scheduling default to list-hybrid if the target supports floating point instructions (and is not using soft float). llvm-svn: 104307	2010-05-21 00:43:17 +00:00
Dan Gohman	ee2fea3cd7	When canonicalizing icmp operand order to put the loop invariant operand on the left, the interesting operand is on the right. This fixes a bug where LSR was failing to recognize ICmpZero uses, which led it to be unable to reverse the induction variable in the attached testcase. Delete test/CodeGen/X86/stack-color-with-reg-2.ll, because its test is extremely fragile and hard to meaningfully update. llvm-svn: 104262	2010-05-20 19:26:52 +00:00
Bob Wilson	5954994bba	Handle Neon v2f64 and v2i64 vector shuffles as register copies. This fixes the remaining issue with pr7167. llvm-svn: 104257	2010-05-20 18:39:53 +00:00
Dan Gohman	20fab456da	Teach LSR how to cope better with unrolled loops on targets where the addressing modes don't make this trivially easy. This allows it to avoid falling into the less precise heuristics in more cases. llvm-svn: 104186	2010-05-19 23:43:12 +00:00
Jakob Stoklund Olesen	e11cdf8cc8	TwoAddressInstructionPass doesn't really know how to merge live intervals when lowering REG_SEQUENCE instructions. Insert copies for REG_SEQUENCE sources not killed to avoid breaking later passes. llvm-svn: 104146	2010-05-19 20:08:00 +00:00
Bob Wilson	f070b1b571	Testcase to go with 104141. llvm-svn: 104142	2010-05-19 18:58:37 +00:00
Evan Cheng	daeca2d156	t2LEApcrel and tLEApcrel are re-materializable. This makes it possible to hoist more loads during machine LICM. llvm-svn: 104115	2010-05-19 07:28:01 +00:00
Evan Cheng	abd0ad54a4	Intrinsics which do a vector compare (results are all zero or all ones) are modeled as icmp / fcmp + sext. This is turned into a vsetcc by dag combine (yes, not a good long term solution). The targets can then isel the vsetcc to the appropriate instruction. The trouble arises when the result of a vector cmp + sext is then and'ed with all ones. Instcombine will turn it into a vector cmp + zext, dag combiner will miss turning it into a vsetcc and hell breaks loose after that. Teach dag combine to turn a vector cpm + zest into a vsetcc + and 1. This fixes rdar://7923010. llvm-svn: 104094	2010-05-19 01:08:17 +00:00
Jakob Stoklund Olesen	430b6e40ab	Remember to update VirtRegLastUse when spilling without killing before a call. llvm-svn: 104074	2010-05-18 22:20:09 +00:00
Dan Gohman	887dd1cd31	When converting a test to a cmp to fold a load, use the cmp that has an 8-bit immediate field rather than one with a wider immediate field. llvm-svn: 104064	2010-05-18 21:42:03 +00:00
Evan Cheng	f19384d54a	Sink dag combine's post index load / store code that swap base ptr and index into the target hook. Only the target knows whether the swap is safe. In Thumb2 mode, the offset must be an immediate. rdar://7998649 llvm-svn: 104060	2010-05-18 21:31:17 +00:00
Evan Cheng	e7fc64a5c9	Fix PR7162: Use source register classes and sub-indices to determine the correct register class of the definitions of REG_SEQUENCE. llvm-svn: 104050	2010-05-18 20:03:28 +00:00
Daniel Dunbar	a4820fcc78	MC/X86: Implement custom lowering to make sure we match things like X86::ADC32ri $0, %eax to X86::ADC32i32 $0 llvm-svn: 104030	2010-05-18 17:22:24 +00:00
Evan Cheng	48f0de96d6	FIX PR7158. SimplifyVBinOp was asserting when it fails to constant fold (op (build_vector), (build_vector)). llvm-svn: 104004	2010-05-18 00:03:40 +00:00
Evan Cheng	1e4f55200d	Fix PR7175. Insert copies of a REG_SEQUENCE source if it is used by other REG_SEQUENCE instructions. llvm-svn: 103994	2010-05-17 23:24:12 +00:00
Evan Cheng	f2c9a96f3c	Fix PR7156. If the sources of a REG_SEQUENCE are all IMPLICIT_DEF's. Replace it with an IMPLICIT_DEF rather than deleting it or else it would be left without a def. llvm-svn: 103984	2010-05-17 22:09:49 +00:00
Evan Cheng	29c463862e	Careful with reg_sequence coalescing to not to overwrite sub-register indices. llvm-svn: 103971	2010-05-17 20:57:12 +00:00
Evan Cheng	3d98b996ff	Turn on -neon-reg-sequence by default. Using NEON load / store multiple instructions will no longer create gobs of vmov of D registers! llvm-svn: 103960	2010-05-17 19:51:20 +00:00
Jakob Stoklund Olesen	176a9c4272	Avoid allocating the same physreg to multiple virtregs in one instruction. While that approach works wonders for register pressure, it tends to break everything. This should unbreak the arm-linux builder and fix a number of miscompilations. llvm-svn: 103946	2010-05-17 17:18:59 +00:00
Jakob Stoklund Olesen	7d22a81b61	Only use clairvoyance when defining a register, and then only if it has one use. This makes allocation independent on the ordering of use-def chains. llvm-svn: 103935	2010-05-17 04:50:57 +00:00
Dale Johannesen	f92c344167	Removing as part of previous reversion. llvm-svn: 103915	2010-05-16 20:19:40 +00:00
Dale Johannesen	2ef974ee0e	Revert 103911; it broke a test that expects bitconvert <1xi64> -> i64 to work in MMX registers on hosts where -no-sse is the default (not mine). The right thing is to accept this and make i64->f64 conversions go through memory, but I don't have time right now. llvm-svn: 103914	2010-05-16 20:19:04 +00:00
Dale Johannesen	fc1492d71b	Make x86-64 64-bit bitconvert work when SSE is not available. (This worked as of about 6 months ago and I didn't track down exactly what broke it; I think this fix is appropriate.) llvm-svn: 103911	2010-05-16 18:22:38 +00:00
Anton Korobeynikov	8f35fabbc1	Add support for thiscall calling convention. Patch by Charles Davis and Steven Watanabe! llvm-svn: 103902	2010-05-16 09:08:45 +00:00
Anton Korobeynikov	1bf28a128b	Some cheap DAG combine goodness for multiplication with a particular constant. This can be extended later on to handle more "complex" constants. llvm-svn: 103881	2010-05-15 18:16:59 +00:00
Evan Cheng	4cad68eb34	Allow TargetLowering::getRegClassFor() to be called on illegal types. Also allow target to override it in order to map register classes to illegal but synthesizable types. e.g. v4i64, v8i64 for ARM / NEON. llvm-svn: 103854	2010-05-15 02:18:07 +00:00
Bill Wendling	0160e55893	SystemZ really does mean "has calls" and not just "adjusts stack." Go ahead and replace the check with the appropriate predicate. Modify the testcase to reflect the correct code. (It should be saving callee-saved registers on the stack allocated by the calling fuction.) llvm-svn: 103829	2010-05-14 22:17:42 +00:00
Jakob Stoklund Olesen	4d5c1061e3	Simplify the handling of physreg defs and uses in RegAllocFast. This adds extra security against using clobbered physregs, and it adds kill markers to physreg uses. llvm-svn: 103784	2010-05-14 18:03:25 +00:00
Jakob Stoklund Olesen	0ba2e2a568	Take allocation hints from copy instructions to/from physregs. This causes way more identity copies to be generated, ripe for coalescing. llvm-svn: 103686	2010-05-13 00:19:43 +00:00
Jakob Stoklund Olesen	955a0e71e9	Make sure to add kill flags to the last use of a virtreg when it is redefined. The X86 floating point stack pass and others depend on good kill flags. llvm-svn: 103635	2010-05-12 18:46:03 +00:00
Jakob Stoklund Olesen	e6e39dc310	Enable a bunch more -regalloc=fast tests llvm-svn: 103531	2010-05-12 00:11:24 +00:00
Jakob Stoklund Olesen	132668102e	Keep track of the last place a live virtreg was used. This allows us to add accurate kill markers, something the scavenger likes. Add some more tests from ARM that needed this. llvm-svn: 103521	2010-05-11 23:24:45 +00:00
Jakob Stoklund Olesen	84c881e593	One more -regalloc=fast test llvm-svn: 103509	2010-05-11 20:51:07 +00:00
Jakob Stoklund Olesen	3f0241e0f9	Simplify the tracking of used physregs to a bulk bitor followed by a transitive closure after allocating all blocks. Add a few more test cases for -regalloc=fast. llvm-svn: 103500	2010-05-11 20:30:28 +00:00
Jakob Stoklund Olesen	f1b3029a54	Mostly rewrite RegAllocFast. Sorry for the big change. The path leading up to this patch had some TableGen changes that I didn't want to commit before I knew they were useful. They weren't, and this version does not need them. The fast register allocator now does no liveness calculations. Instead it relies on kill flags provided by isel. (Currently those kill flags are also ignored due to isel bugs). The allocation algorithm is supposed to work with any subset of valid kill flags. More kill flags simply means fewer spills inserted. Registers are allocated from a working set that contains no aliases. That means most allocations can be done directly without expensive alias checks. When the working set runs out of registers we do the full alias check to find new free registers. llvm-svn: 103488	2010-05-11 18:54:45 +00:00
Kalle Raiskila	9dd3ef8d01	Make SPU backend not assert on jump tables. llvm-svn: 103466	2010-05-11 11:00:02 +00:00
Evan Cheng	2fa5a7e7e4	Select @llvm.trap to the special B with 1111 condition (i.e. trap) instruction. llvm-svn: 103459	2010-05-11 07:26:32 +00:00
Evan Cheng	02947a4551	Be careful with operand promotion. For a binary operation, the source operands may be the same. PR7018. rdar://7939869. llvm-svn: 103419	2010-05-10 19:03:57 +00:00
Kalle Raiskila	92ea401d8f	Fix encoding of 'sf' and 'sfh' instructions. llvm-svn: 103399	2010-05-10 08:13:49 +00:00
Bill Wendling	cd476b6760	Readd testcase. llvm-svn: 103335	2010-05-08 04:47:54 +00:00
Dan Gohman	d0800241d2	When pruning candidate formulae out of an LSRUse, update the LSRUse's Regs set after all pruning is done, rather than trying to do it on the fly, which can produce an incomplete result. This fixes a case where heuristic pruning was stripping all formulae from a use, which led the solver to enter an infinite loop. Also, add a few asserts to diagnose this kind of situation. llvm-svn: 103328	2010-05-07 23:36:59 +00:00
Bill Wendling	6b5897b4de	Remove. Don't XFAIL. llvm-svn: 103321	2010-05-07 23:09:17 +00:00
Bill Wendling	32d8981ec0	Temorarily revert r101984. llvm-svn: 103314	2010-05-07 22:45:36 +00:00
Dan Gohman	7de01ec2c9	SDDbgValues are apparently not being legalized. Fix a symptom of the problem, and not the real problem itself, by dropping debug info for i128 values. rdar://7958162. llvm-svn: 103310	2010-05-07 22:19:08 +00:00
Dale Johannesen	51c1695a0a	Fix PR 7087, and probably other things, by extending getConstantFP to accept the two supported long double target types. This was not the original intent, but there are other places that assume this works and it's easy enough to do. llvm-svn: 103299	2010-05-07 21:35:53 +00:00
Jim Grosbach	2a41cad900	Clean up the conditional for handling of sign_extend_inreg based on whether the extract instructions are available. rdar://7956878 llvm-svn: 103277	2010-05-07 18:34:55 +00:00
Duncan Sands	ebf838274f	Correct some bogus target triples. llvm-svn: 103265	2010-05-07 17:03:48 +00:00
Nick Lewycky	45f530db39	Revert r103133 and add testcase from PR7066. llvm-svn: 103233	2010-05-07 01:45:38 +00:00
Dan Gohman	7421ae48bf	Disable the new unknown-location code for now. It causes a major increase in the debug line info section, and it's causing regressions in a gdb testsuite. llvm-svn: 103226	2010-05-07 01:08:53 +00:00
Dan Gohman	779c69bbc5	Add a DebugLoc argument to TargetInstrInfo::copyRegToReg, so that it doesn't have to guess. llvm-svn: 103194	2010-05-06 20:33:48 +00:00
Dan Gohman	cb4e3e51a9	Add a testcase for r103135, explicitly representing unknown locations in debug line info. llvm-svn: 103189	2010-05-06 17:49:17 +00:00
Chris Lattner	35096e82c5	Fix PR7054 - Assertion `Symbol->isUndefined() && "Cannot define a symbol twice!"' failed. Users can write broken code that emits the same label twice with asm renaming, detect this and emit a fatal backend error instead of aborting. llvm-svn: 103140	2010-05-06 00:05:37 +00:00
Jim Grosbach	151cd8f159	Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack instructions to subtarget features and update tests to reflect. PR5717. llvm-svn: 103136	2010-05-05 23:44:43 +00:00
Jakob Stoklund Olesen	1b6f698e85	Fix PR6520. An earlyclobber physreg must not be allocated to anything else. llvm-svn: 103133	2010-05-05 23:07:41 +00:00
Jim Grosbach	245b169212	fix copy/paste oops. llvm-svn: 103122	2010-05-05 21:07:46 +00:00
Jim Grosbach	44d7f49887	Add tests for ARMV7M divide instruction use llvm-svn: 103120	2010-05-05 20:47:15 +00:00
Jim Grosbach	e36cd72e38	remove unneeded underscores. llvm-svn: 103114	2010-05-05 19:55:58 +00:00
Jim Grosbach	5ced648ba8	Convert to filecheck llvm-svn: 103113	2010-05-05 19:41:11 +00:00
Chris Lattner	0185047b3f	"on the rare occasion the SPU BE produces illegal assembly - it tries to emit an add instruction of the form 'a reg, reg, imm'." Patch by Kalle Raiskila! llvm-svn: 103021	2010-05-04 17:58:46 +00:00
Dale Johannesen	81bfca7bde	Implement builtin_return_address(x) and builtin_frame_address(x) on PPC for x!=0. 7624113. llvm-svn: 102972	2010-05-03 22:59:34 +00:00
Jakob Stoklund Olesen	f4e4e84115	Check that subregisters don't have independent values in RemoveCopyByCommutingDef(). This fixes PR6941. llvm-svn: 102970	2010-05-03 22:40:32 +00:00
Dan Gohman	0553acff5e	Fix tests to use fadd, fsub, and fmul, instead of add, sub, and mul, when the type is floating-point. llvm-svn: 102969	2010-05-03 22:36:46 +00:00
Dan Gohman	2ad68de4aa	Fix a bug which prevented tail merging of return instructions in beneficial cases. See the changes in test/CodeGen/X86/tail-opts.ll and test/CodeGen/ARM/ifcvt2.ll for details. The fix is to change HashEndOfMBB to hash at most one instruction, instead of trying to apply heuristics about when it will be profitable to consider more than one instruction. The regular tail-merging heuristics are already prepared to handle the same cases, and they're more precise. Also, make test/CodeGen/ARM/ifcvt5.ll and test/CodeGen/Thumb2/thumb2-branch.ll slightly more complex so that they continue to test what they're intended to test. And, this eliminates the problem in test/CodeGen/Thumb2/2009-10-15-ITBlockBranch.ll, the testcase from PR5204. Update it accordingly. llvm-svn: 102907	2010-05-03 14:35:47 +00:00
Duncan Sands	211427bda9	Remove the -enable-sjlj-eh option, which doesn't do anything. Remove the -enable-eh option which is only used by the JIT, and replace it with -jit-enable-eh. llvm-svn: 102865	2010-05-02 15:36:26 +00:00
Anton Korobeynikov	737718d4f4	Insert ANY_EXTEND node instead of invalid truncate during DAG Combining (X & 1), when needed. This fixes PR7001 llvm-svn: 102838	2010-05-01 12:52:34 +00:00
Anton Korobeynikov	319d71f44f	Do folding for indirect branches, where possible llvm-svn: 102836	2010-05-01 12:28:21 +00:00
Anton Korobeynikov	ebbdfef2fc	Implement indirect branches on MSP430 llvm-svn: 102835	2010-05-01 12:04:32 +00:00
Bill Wendling	02bc6787ca	Test failing too much on too many platforms. llvm-svn: 102812	2010-05-01 00:12:33 +00:00
Bill Wendling	06cacb1291	Maybe it needs sse2? llvm-svn: 102802	2010-04-30 23:19:29 +00:00
Bill Wendling	613fb7daa6	Force 64-bit. llvm-svn: 102800	2010-04-30 22:45:20 +00:00
Bill Wendling	de4b225093	EXTRACT_VECTOR_ELT of an INSERT_VECTOR_ELT may have the same index, but the indexes could be of a different value type. Or not even using the same SDNode for the constant (weird, I know). Compare the actual values instead of the pointers. llvm-svn: 102791	2010-04-30 22:19:17 +00:00
Jakob Stoklund Olesen	9afed0f98b	The local register allocator has to spill dirty callee saved registers before a call that might throw. The landing pad assumes that all registers are in stack slots. We used to spill those dirty CSRs after the call, and the stack slots would be wrong when arriving at the landing pad. llvm-svn: 102770	2010-04-30 21:19:29 +00:00
Evan Cheng	5f2314f3a3	Fix test. llvm-svn: 102694	2010-04-30 06:00:56 +00:00
Evan Cheng	5117a555e0	Another sibcall bug. If caller and callee calling conventions differ, then it's only safe to do a tail call if the results are returned in the same way. llvm-svn: 102683	2010-04-30 01:12:32 +00:00
Jakob Stoklund Olesen	8d4214578d	Reject really weird coalescer case when trying to merge identical subregisters of different register classes. e.g. %reg1048:3<def> = EXTRACT_SUBREG %RAX<kill>, 3 Where %reg1048 is a GR32 register. This is not impossible to handle, but it is pretty hard and very rare. This should unbreak the dragonegg builder. llvm-svn: 102672	2010-04-29 23:47:46 +00:00
Evan Cheng	38dfa5cf20	Load folding tail call should not use ebp / rbp after it's popped. PEI should use esp / rsp to reference frame instead. llvm-svn: 102596	2010-04-29 05:08:22 +00:00
Chris Lattner	08e9e72fa9	Rework global alignment computation again. Now we do round up alignment of globals to the preferred alignment, but only when there is no section specified on the global (by far the common case). llvm-svn: 102515	2010-04-28 19:58:07 +00:00
Evan Cheng	050df1b8de	Enable i16 to i32 promotion by default. llvm-svn: 102493	2010-04-28 08:30:49 +00:00
Evan Cheng	fe420adde0	Update tests. llvm-svn: 102487	2010-04-28 01:53:13 +00:00
Devang Patel	50c9431203	Emit debug info for byval parameters. llvm-svn: 102486	2010-04-28 01:39:28 +00:00
Evan Cheng	eb828b6391	Do not count kill, implicit_def instructions as printed instructions. llvm-svn: 102453	2010-04-27 19:38:45 +00:00
Chris Lattner	64d43d80be	round zero-byte .zerofill directives up to 1 byte. This should fix some "g++.dg-struct-layout-1" failures, rdar://7886017 llvm-svn: 102421	2010-04-27 07:41:44 +00:00
Chris Lattner	6a5e706e3c	on darwin empty functions need to codegen into something of non-zero length, otherwise labels get incorrectly merged. We handled this by emitting a ".byte 0", but this isn't correct on thumb/arm targets where the text segment needs to be a multiple of 2/4 bytes. Handle this by emitting a noop. This is more gross than it should be because arm/ppc are not fully mc'ized yet. This fixes rdar://7908505 llvm-svn: 102400	2010-04-26 23:37:21 +00:00
Bob Wilson	25f85947a3	Handle register-to-register copies within the tGPR class. Radar 7896289 llvm-svn: 102396	2010-04-26 23:20:08 +00:00
Dan Gohman	58b0470592	When checking whether the special handling for an addrec increment which doesn't dominate the header is needed, don't check whether the increment expression has computable loop evolution. While the operands of an addrec are required to be loop-invariant, they're not required to dominate any part of the loop. This fixes PR6914. llvm-svn: 102389	2010-04-26 21:46:36 +00:00
Chris Lattner	f740a8ceeb	fix PR6921 a different way. Intead of increasing the alignment of globals with a specified alignment, we fix common variables to obey their alignment. Add a comment explaining why this behavior is important. llvm-svn: 102365	2010-04-26 18:46:46 +00:00
Chris Lattner	e80442aa6d	Revert r102300/102301, which serious broke objc apps. llvm-svn: 102359	2010-04-26 18:30:45 +00:00
Chris Lattner	386a220f70	Fix PR6921: globals were not getting correctly rounded up to their preferred alignment unless they were common or some other special case. llvm-svn: 102300	2010-04-25 05:30:43 +00:00
Dan Gohman	534ba376f6	Generalize LSR's OptimizeMax to handle the new kinds of max expressions that indvars may use, now that indvars is recognizing le and ge loops. llvm-svn: 102235	2010-04-24 03:13:44 +00:00
Stuart Hastings	c8b2fc0909	Per Chris, fuse four trivial tests using grep (r102199) into one that uses FileCheck. llvm-svn: 102216	2010-04-23 22:12:57 +00:00
Dan Gohman	e1931fa676	Change TargetData's algorithm for computing defualt vector type alignment to match what's used in clang and GCC for __alignof, rather than trying to guess what Legalize is going to be doing. llvm-svn: 102206	2010-04-23 19:41:15 +00:00
Stuart Hastings	24b63f1597	Add some missing x86 patterns for movdq2q. Fixes two (LLVM-)GCC DejaGNU testcases. Radar 6881029. llvm-svn: 102199	2010-04-23 19:03:32 +00:00
Dan Gohman	997bbc54d6	Fix LSR to tolerate cases where ScalarEvolution initially misses an opportunity to fold add operands, but folds them after LSR has separated them out. This fixes rdar://7886751. llvm-svn: 102157	2010-04-23 01:55:05 +00:00
Jim Grosbach	825cb299cd	Update ARM DAGtoDAG for matching UBFX instruction for unsigned bitfield extraction. This fixes PR5998. llvm-svn: 102144	2010-04-22 23:24:18 +00:00
Evan Cheng	02e816b317	Do not try to optimize a copy that has already been marked for deletion. llvm-svn: 102027	2010-04-21 20:57:54 +00:00
Evan Cheng	4158a0ff6b	Implement -disable-non-leaf-fp-elim which disable frame pointer elimination optimization for non-leaf functions. This will be hooked up to gcc's -momit-leaf-frame-pointer option. rdar://7886181 llvm-svn: 101984	2010-04-21 03:18:23 +00:00
Evan Cheng	2034d9f2da	- Clean up some crappy code which deals with coalescing of copies which look at extract_subreg / insert_subreg, etc. - Add support for more aggressive insert_subreg coalescing. llvm-svn: 101971	2010-04-21 00:44:22 +00:00
Dan Gohman	ad33d33719	Add another variant of this test which found a place where CodeGen's ComputeMaskedBits was being over-conservative when computing bits for an ADD. llvm-svn: 101963	2010-04-21 00:19:28 +00:00
Chris Lattner	84776786a7	teach the x86 address matching stuff to handle (shl (or x,c), 3) the same as (shl (add x, c), 3) when x doesn't have any bits from c set. This finishes off PR1135. Before we compiled the block to: to: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) leaq 2(%rdx), %r9 movl %esi, (%rdi,%r9,4) leaq 1(%rdx), %r9 movl %esi, (%rdi,%r9,4) addq $3, %rdx movl %esi, (%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 Now we produce: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) movl %esi, 8(%rdi,%rdx,4) movl %esi, 4(%rdi,%rdx,4) movl %esi, 12(%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 llvm-svn: 101958	2010-04-20 23:18:40 +00:00
Bill Wendling	a8ae1783b4	Move CodeGen/X86/2010-04-19-DAGCombineCrash.ll into CodeGen/X86/crash.ll. Also reduce. llvm-svn: 101925	2010-04-20 18:14:47 +00:00
Chris Lattner	5100367ff3	Bill's change in r95336 broke empty aggregates embedded in other types. fix this by only bumping zero-byte globals up to a single byte if the entire global is zero size, fixing PR6340. This also fixes empty arrays etc to be handled correctly, and only does this on subsection-via-symbols targets (aka darwin) which is the only place where this matters. llvm-svn: 101879	2010-04-20 06:20:21 +00:00
Chris Lattner	38c1a1a247	teach cellspu how to return i8 and i16 from calls, patch by Kalle Raiskila! llvm-svn: 101875	2010-04-20 05:36:09 +00:00
Bill Wendling	467e6c2deb	The visitXOR method can return the same SDNode. If so, we don't want to delete it as it's not dead. llvm-svn: 101855	2010-04-20 01:25:01 +00:00
Bob Wilson	92a4685dd2	Fix tests for Neon load/store intrinsics to match the i8* types expected by the intrinsics. The reason for those i8* types is that the intrinsics are overloaded on the vector type and we don't have a way to declare an intrinsic where one argument is an overloaded vector type and another argument is a pointer to the vector element type. The bitcasts added here will match what the frontend will typically generate when these intrinsics are used. llvm-svn: 101840	2010-04-20 00:17:16 +00:00
Nick Lewycky	fbe8d2803d	Fix declarations in a few more tests. llvm-svn: 101676	2010-04-17 21:29:25 +00:00
Chris Lattner	0a8d91a816	fix PR6332, allowing an index of zero into a zero sized array even if the element of the array has no size. llvm-svn: 101662	2010-04-17 19:02:33 +00:00
Dan Gohman	4fee6f3bdd	Start function numbering at 0. llvm-svn: 101638	2010-04-17 16:29:15 +00:00

... 5 6 7 8 9 ...

3757 Commits