llvm-project

Commit Graph

Author	SHA1	Message	Date
Evan Cheng	4fbf459549	- Improved v8i16 shuffle lowering. It now uses pshuflw and pshufhw as much as possible before resorting to pextrw and pinsrw. - Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles. - Improves (i16 extract_vector_element 0) codegen by recognizing (i32 extract_vector_element 0) does not require a pextrw. llvm-svn: 44836	2007-12-11 01:46:18 +00:00
Nate Begeman	a55a67ae91	x86 doesn't actually want to custom lower v3i32 llvm-svn: 44835	2007-12-11 01:41:33 +00:00
Anton Korobeynikov	21ade5880b	Hey, English is not my native language :) llvm-svn: 44820	2007-12-10 23:10:20 +00:00
Anton Korobeynikov	77eb5e649d	Clarify the need of CFI() stuff llvm-svn: 44819	2007-12-10 23:08:35 +00:00
Anton Korobeynikov	a6b0f7e244	Provide convenient way to disable CFI stuff for old/broken assemblers. Use it for Darwin. llvm-svn: 44818	2007-12-10 23:04:38 +00:00
Chris Lattner	8a72a7d586	Disable cfi directives for now, darwin doesn't support them. These should probably be something like: CFI(".cfi_def_cfa_offset 16\n") where CFI is defined to a noop on darwin and other platforms that don't support those directives. llvm-svn: 44803	2007-12-10 19:10:18 +00:00
Anton Korobeynikov	657be86229	And finally annotate X86-64 version of callback. All bad stuff from SSE version is implicitly inherited :) llvm-svn: 44794	2007-12-10 15:27:07 +00:00
Anton Korobeynikov	88e9d082d8	Provide annotation for SSE version of callback. It's even more broken, because it doesn't mark xmm regs properly llvm-svn: 44793	2007-12-10 15:13:55 +00:00
Anton Korobeynikov	81e9dc4af7	Annotate JIT callback function with call frame information. This will allow us (theoretically) to unwind through JITer. The code wasn't verified, so I'm pretty sure offsets are wrong :) llvm-svn: 44792	2007-12-10 14:54:42 +00:00
Bill Wendling	3f19dfe794	Reverting 44702. It wasn't correct to rename them. llvm-svn: 44727	2007-12-08 23:58:46 +00:00
Chris Lattner	ff87f05e43	aesthetic changes, no functionality change. Evan, it's not clear what 'Available' is, please add a comment near it and rename it if appropriate. llvm-svn: 44703	2007-12-08 07:22:58 +00:00
Bill Wendling	2b07d8c5a0	Renaming: isTriviallyReMaterializable -> hasNoSideEffects isReallyTriviallyReMaterializable -> isTriviallyReMaterializable llvm-svn: 44702	2007-12-08 07:17:56 +00:00
Evan Cheng	b41d838d28	Add comment. llvm-svn: 44686	2007-12-07 21:30:01 +00:00
Evan Cheng	bfd373a53e	Much improved v8i16 shuffles. (Step 1). llvm-svn: 44676	2007-12-07 08:07:39 +00:00
Evan Cheng	c829e5cdf0	Remove a bogus optimization. It's not possible to do a move to low element to a <8 x i16> or <16 x i8> vector. llvm-svn: 44669	2007-12-06 22:14:22 +00:00
Chris Lattner	ad05e17491	add a note llvm-svn: 44637	2007-12-05 22:58:19 +00:00
Evan Cheng	bb26301864	Add an argument to storeRegToStackSlot and storeRegToAddr to specify whether the stored register is killed. llvm-svn: 44600	2007-12-05 03:14:33 +00:00
Evan Cheng	f45a1d623c	Remove redundant foldMemoryOperand variants and other code clean up. llvm-svn: 44517	2007-12-02 08:30:39 +00:00
Evan Cheng	69fda0a716	Allow some reloads to be folded in multi-use cases. Specifically testl r, r -> cmpl [mem], 0. llvm-svn: 44479	2007-12-01 02:07:52 +00:00
Nate Begeman	6f026a654c	Support returning non-power-of-2 vectors to unblock some work llvm-svn: 44371	2007-11-27 19:28:48 +00:00
Duncan Sands	ad0ea2d430	Fix PR1146: parameter attributes are no longer part of the function type, instead they belong to functions and function calls. This is an updated and slightly corrected version of Reid Spencer's original patch. The only known problem is that auto-upgrading of bitcode files doesn't seem to work properly (see test/Bitcode/AutoUpgradeIntrinsics.ll). Hopefully a bitcode guru (who might that be? :) ) will fix it. llvm-svn: 44359	2007-11-27 13:23:08 +00:00
Chris Lattner	5728bdd4db	Fix a long standing deficiency in the X86 backend: we would sometimes emit "zero" and "all one" vectors multiple times, for example: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 pcmpeqd %mm0, %mm0 movq %mm0, _M2 ret instead of: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 movq %mm0, _M2 ret This patch fixes this by always arranging for zero/one vectors to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be any random type. This ensures they get trivially CSE'd on the dag. This fix is also important for LegalizeDAGTypes, as it gets unhappy when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when 'i64' isn't legal. This patch makes the following changes: 1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into their canonical types. 2) The now-dead patterns are removed from the SSE/MMX .td files. 3) All the patterns in the .td file that referred to immAllOnesV or immAllZerosV in the wrong form now use *_bc to match them with a bitcast wrapped around them. 4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle bitcast'd zero vectors, which simplifies the code actually. 5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that is legal, instead of generating one that is illegal and expecting a later legalize pass to clean it up. 6) isZeroShuffle is generalized to handle bitcast of zeros. 7) several other minor tweaks. This patch is definite goodness, but has the potential to cause random code quality regressions. Please be on the lookout for these and let me know if they happen. llvm-svn: 44310	2007-11-25 00:24:49 +00:00
Chris Lattner	f72ad16263	remove bogus assertion that broke CodeGen/Generic/cast-fp.ll on x86 among others. llvm-svn: 44302	2007-11-24 18:37:20 +00:00
Chris Lattner	f81d5886c6	Several changes: 1) Change the interface to TargetLowering::ExpandOperationResult to take and return entire NODES that need a result expanded, not just the value. This allows us to handle things like READCYCLECOUNTER, which returns two values. 2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES. 3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new ExpandOperationResult. This makes the result simpler and fully general. 4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes. 5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM i64 shifts, allowing them to work with LegalizeDAGTypes. 6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT, allowing them to work with LegalizeDAGTypes. LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when type legalization in LegalizeDAG is ifdef'd out. llvm-svn: 44300	2007-11-24 07:07:01 +00:00
Chris Lattner	ab98c41337	add a note llvm-svn: 44299	2007-11-24 06:13:33 +00:00
Dale Johannesen	763e110a9f	Fix .eh table linkage issues on Darwin. Some EH support for Darwin PPC, but it's not fully working yet. llvm-svn: 44258	2007-11-20 23:24:42 +00:00
Nate Begeman	d4d45c268c	Add support for vectors to int <-> float casts. llvm-svn: 44204	2007-11-17 03:58:34 +00:00
Anton Korobeynikov	91460e43f1	Implement codegen for flt_rounds on x86 llvm-svn: 44183	2007-11-16 01:31:51 +00:00
Evan Cheng	0cbe920d7c	Oops. Debugging code shouldn't have been checked in. llvm-svn: 44128	2007-11-14 19:08:32 +00:00
Anton Korobeynikov	2c6387803e	Fix PIC jump table codegen on x86-32/linux. In fact, such a thing should be applied to all targets that use GOT-relative offsets for PIC (Alpha?) llvm-svn: 44108	2007-11-14 09:18:41 +00:00
Duncan Sands	e2287ed552	Eliminate the recently introduced CCAssignToStackABISizeAlign in favour of teaching CCAssignToStack that size 0 and/or align 0 means to use the ABI values. This seems a neater solution. It is safe since no legal value type has size 0. llvm-svn: 44107	2007-11-14 08:29:13 +00:00
Evan Cheng	7f02cfa599	Clean up sub-register implementation by moving subReg information back to MachineOperand auxInfo. Previous clunky implementation uses an external map to track sub-register uses. That works because register allocator uses a new virtual register for each spilled use. With interval splitting (coming soon), we may have multiple uses of the same register, some of which use different sub-registers from others. It's too fragile to constantly update the information. llvm-svn: 44104	2007-11-14 07:59:08 +00:00
Dale Johannesen	7904708369	Revert previous; these files aren't ready to go in yet. llvm-svn: 44057	2007-11-13 19:16:02 +00:00
Dale Johannesen	7a7085f6d3	Add parameter to getDwarfRegNum to permit targets to use different mappings for EH and debug info; no functional change yet. Fix warning in X86CodeEmitter. llvm-svn: 44056	2007-11-13 19:13:01 +00:00
Evan Cheng	c891ae92dc	Fix x86-64 jit: remove reliance on Dwarf numbers. llvm-svn: 44048	2007-11-13 17:54:34 +00:00
Bill Wendling	77b13af9a6	Unifacalize the CALLSEQ{START,END} stuff. llvm-svn: 44045	2007-11-13 09:19:02 +00:00
Bill Wendling	f359fed9f9	Unify CALLSEQ_{START,END}. They take 4 parameters: the chain, two stack adjustment fields, and an optional flag. If there is a "dynamic_stackalloc" in the code, make sure that it's bracketed by CALLSEQ_START and CALLSEQ_END. If not, then there is the potential for the stack to be changed while the stack's being used by another instruction (like a call). This can only result in tears... llvm-svn: 44037	2007-11-13 00:44:25 +00:00
Owen Anderson	933b5b7e62	Add a flag for indirect branch instructions. Target maintainers: please check that the instructions for your target are correctly marked. llvm-svn: 44012	2007-11-12 07:39:39 +00:00
Anton Korobeynikov	4edfea438a	Use TableGen to emit information for dwarf register numbers. This makes DwarfRegNum accept a list of numbers instead. Added three different "flavours", but only slightly tested on x86-32/linux. Please check other subtargets if possible. llvm-svn: 43997	2007-11-11 19:50:10 +00:00
Dale Johannesen	b988e7e8cd	Add CCAssignToStackABISizeAlign for convenience in dealing with types whose size & alignment are different on different subtargets. Use it for x86 f80. llvm-svn: 43988	2007-11-10 22:07:15 +00:00
Arnold Schwaighofer	d2c16ff905	Update tailcall code to include inline attribute operand for memcpy. llvm-svn: 43978	2007-11-10 10:48:01 +00:00
Evan Cheng	fb13fd6f93	Unbreak x86-64 jumptable. llvm-svn: 43955	2007-11-09 19:11:23 +00:00
Dale Johannesen	dfb85c7831	Revert previous rewrite per chris's comments. llvm-svn: 43950	2007-11-09 18:07:11 +00:00
Evan Cheng	797d56ff17	Much improved pic jumptable codegen: Then: call "L1$pb" "L1$pb": popl %eax ... LBB1_1: # entry imull $4, %ecx, %ecx leal LJTI1_0-"L1$pb"(%eax), %edx addl LJTI1_0-"L1$pb"(%ecx,%eax), %edx jmpl %edx .align 2 .set L1_0_set_3,LBB1_3-LJTI1_0 .set L1_0_set_2,LBB1_2-LJTI1_0 .set L1_0_set_5,LBB1_5-LJTI1_0 .set L1_0_set_4,LBB1_4-LJTI1_0 LJTI1_0: .long L1_0_set_3 .long L1_0_set_2 Now: call "L1$pb" "L1$pb": popl %eax ... LBB1_1: # entry addl LJTI1_0-"L1$pb"(%eax,%ecx,4), %eax jmpl %eax .align 2 .set L1_0_set_3,LBB1_3-"L1$pb" .set L1_0_set_2,LBB1_2-"L1$pb" .set L1_0_set_5,LBB1_5-"L1$pb" .set L1_0_set_4,LBB1_4-"L1$pb" LJTI1_0: .long L1_0_set_3 .long L1_0_set_2 llvm-svn: 43924	2007-11-09 01:32:10 +00:00
Dale Johannesen	04fd82088e	Rewrite Dwarf number handling per review comments. llvm-svn: 43918	2007-11-09 00:47:10 +00:00
Dale Johannesen	1b9de4dd6f	Complete conditionalization of Dwarf reg numbers. Would somebody not on Darwin please make sure this doesn't break anything. Exception handling failures would be the most likely symptom. llvm-svn: 43844	2007-11-07 21:48:35 +00:00
Dale Johannesen	fbe69d2cd6	Interchange Dwarf numbers of ESP and EBP on x86 Darwin. Much improvement in exception handling. llvm-svn: 43794	2007-11-07 00:25:05 +00:00
Rafael Espindola	fa0df55bdd	Move the LowerMEMCPY and LowerMEMCPYCall to a common place. Thanks for the suggestions Bill :-) llvm-svn: 43742	2007-11-05 23:12:20 +00:00
Evan Cheng	9337929aae	Use movups to spill / restore SSE registers on targets where stack alignment is less than 16. This is a temporary solution until dynamic stack alignment is implemented. llvm-svn: 43703	2007-11-05 07:30:01 +00:00
Duncan Sands	283207a71c	Eliminate the remaining uses of getTypeSize. This should only affect x86 when using long double. Now 12/16 bytes are output for long double globals (the exact amount depends on the alignment). This brings globals in line with the rest of LLVM: the space reserved for an object is now always the ABI size. One tricky point is that only 10 bytes should be output for long double if it is a field in a packed struct, which is the reason for the additional argument to EmitGlobalConstant. llvm-svn: 43688	2007-11-05 00:04:43 +00:00
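
Note on the CFI() entries (llvm-svn 44803 and 44818): the messages describe wrapping each `.cfi_*` directive in a `CFI(...)` macro that expands to nothing on Darwin and other platforms whose assemblers lack those directives. The following is only a rough sketch of that idea, not the code from those commits. The `__APPLE__` test, the names `kCfiEnabled`, `kCallbackPrologue` and `hasCfiDirectives`, and the assembly text itself are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical sketch: every name below is illustrative, not LLVM's.
// Assumption: __APPLE__ stands in for "assembler without .cfi_* support".
#if defined(__APPLE__)
// Drop the directive entirely.
#define CFI(x)
constexpr bool kCfiEnabled = false;
#else
// Pass the directive string through unchanged.
#define CFI(x) x
constexpr bool kCfiEnabled = true;
#endif

// Adjacent string literals are concatenated by the compiler, so an
// empty CFI(...) expansion simply vanishes from the assembly text.
// The instructions and offset are placeholders, not a verified prologue.
static const char kCallbackPrologue[] =
    "pushl %ebp\n"
    CFI(".cfi_def_cfa_offset 8\n")
    "movl %esp, %ebp\n";

// Returns true when the assembly text contains any .cfi_ directive.
inline bool hasCfiDirectives(const char *Asm) {
  return std::strstr(Asm, ".cfi_") != nullptr;
}

// Sanity check: directives are present exactly when CFI is enabled.
inline void checkPrologue() {
  assert(hasCfiDirectives(kCallbackPrologue) == kCfiEnabled);
}
```

Because the switch lives in a single macro, an annotated assembly string can stay identical across platforms, and only the macro definition changes for an old or broken assembler.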


2860 Commits