llvm-project

Commit Graph

Author	SHA1	Message	Date
Bill Wendling	4aa25b79f9	Temporarily revert r68552. This was causing a failure in the self-hosting LLVM builds. --- Reverse-merging (from foreign repository) r68552 into '.': U test/CodeGen/X86/tls8.ll U test/CodeGen/X86/tls10.ll U test/CodeGen/X86/tls2.ll U test/CodeGen/X86/tls6.ll U lib/Target/X86/X86Instr64bit.td U lib/Target/X86/X86InstrSSE.td U lib/Target/X86/X86InstrInfo.td U lib/Target/X86/X86RegisterInfo.cpp U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86CodeEmitter.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86InstrInfo.h U lib/Target/X86/X86ISelDAGToDAG.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h U lib/Target/X86/X86ISelLowering.h U lib/Target/X86/X86InstrInfo.cpp U lib/Target/X86/X86InstrBuilder.h U lib/Target/X86/X86RegisterInfo.td llvm-svn: 68560	2009-04-07 22:35:25 +00:00
Rafael Espindola	1edda06792	Reduce code duplication on the TLS implementation. This introduces a small regression on the generated code quality in the case we are just computing addresses, not loading values. Will work on it and on X86-64 support. llvm-svn: 68552	2009-04-07 21:37:46 +00:00
Evan Cheng	a84a318873	When optimizing a mul by immediate into two, the resulting mul's should get an x86 specific node to keep the dag combiner from hacking on them further. llvm-svn: 68066	2009-03-30 21:36:47 +00:00
Bill Wendling	189d67181c	Doxygen-ify comments. llvm-svn: 67727	2009-03-26 01:46:56 +00:00
Dan Gohman	4a683478d5	Correct some comments. Operand numbers start at 0. llvm-svn: 67518	2009-03-23 15:40:10 +00:00
Chris Lattner	a492d29c23	improve comment. llvm-svn: 66778	2009-03-12 06:46:02 +00:00
Dan Gohman	ff659b5b86	Arithmetic instructions don't set the EFLAGS bits OF and CF the same way the "test" instruction does in overflow cases, so eliminating the test is only safe when those bits aren't needed, as is the case for COND_E and COND_NE, or if it can be proven that no overflow will occur. For now, just restrict the optimization to COND_E and COND_NE and don't do any overflow analysis. llvm-svn: 66318	2009-03-07 01:58:32 +00:00
Dan Gohman	55d7b2ac4f	Re-apply 66008, now that the unfoldMemoryOperand bug is fixed. llvm-svn: 66058	2009-03-04 19:44:21 +00:00
Dan Gohman	6728f892be	Revert r66004 for now; it's causing a variety of test failures. llvm-svn: 66008	2009-03-04 03:54:19 +00:00
Dan Gohman	fe8d71f42a	Teach the x86 backend to eliminate "test" instructions by using the EFLAGS result from add, sub, inc, and dec instructions in simple cases. llvm-svn: 66004	2009-03-04 02:33:24 +00:00
Nate Begeman	e684da3e5d	Generate better code for v8i16 shuffles on SSE2 Generate better code for v16i8 shuffles on SSE2 (avoids stack) Generate pshufb for v8i16 and v16i8 shuffles on SSSE3 where it is fewer uops. Document the shuffle matching logic and add some FIXMEs for later further cleanups. New tests that test the above. Examples: New: _shuf2: pextrw $7, %xmm0, %eax punpcklqdq %xmm1, %xmm0 pshuflw $128, %xmm0, %xmm0 pinsrw $2, %eax, %xmm0 Old: _shuf2: pextrw $2, %xmm0, %eax pextrw $7, %xmm0, %ecx pinsrw $2, %ecx, %xmm0 pinsrw $3, %eax, %xmm0 movd %xmm1, %eax pinsrw $4, %eax, %xmm0 ret ========= New: _shuf4: punpcklqdq %xmm1, %xmm0 pshufb LCPI1_0, %xmm0 Old: _shuf4: pextrw $3, %xmm0, %eax movsd %xmm1, %xmm0 pextrw $3, %xmm1, %ecx pinsrw $4, %ecx, %xmm0 pinsrw $5, %eax, %xmm0 ======== New: _shuf1: pushl %ebx pushl %edi pushl %esi pextrw $1, %xmm0, %eax rolw $8, %ax movd %xmm0, %ecx rolw $8, %cx pextrw $5, %xmm0, %edx pextrw $4, %xmm0, %esi pextrw $3, %xmm0, %edi pextrw $2, %xmm0, %ebx movaps %xmm0, %xmm1 pinsrw $0, %ecx, %xmm1 pinsrw $1, %eax, %xmm1 rolw $8, %bx pinsrw $2, %ebx, %xmm1 rolw $8, %di pinsrw $3, %edi, %xmm1 rolw $8, %si pinsrw $4, %esi, %xmm1 rolw $8, %dx pinsrw $5, %edx, %xmm1 pextrw $7, %xmm0, %eax rolw $8, %ax movaps %xmm1, %xmm0 pinsrw $7, %eax, %xmm0 popl %esi popl %edi popl %ebx ret Old: _shuf1: subl $252, %esp movaps %xmm0, (%esp) movaps %xmm0, 16(%esp) movaps %xmm0, 32(%esp) movaps %xmm0, 48(%esp) movaps %xmm0, 64(%esp) movaps %xmm0, 80(%esp) movaps %xmm0, 96(%esp) movaps %xmm0, 224(%esp) movaps %xmm0, 208(%esp) movaps %xmm0, 192(%esp) movaps %xmm0, 176(%esp) movaps %xmm0, 160(%esp) movaps %xmm0, 144(%esp) movaps %xmm0, 128(%esp) movaps %xmm0, 112(%esp) movzbl 14(%esp), %eax movd %eax, %xmm1 movzbl 22(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm1, %xmm2 movzbl 42(%esp), %eax movd %eax, %xmm1 movzbl 50(%esp), %eax movd %eax, %xmm3 punpcklbw %xmm1, %xmm3 punpcklbw %xmm2, %xmm3 movzbl 77(%esp), %eax movd %eax, %xmm1 movzbl 84(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm1, %xmm2 movzbl 104(%esp), %eax movd %eax, %xmm1 punpcklbw %xmm1, %xmm0 punpcklbw %xmm2, %xmm0 movaps %xmm0, %xmm1 punpcklbw %xmm3, %xmm1 movzbl 127(%esp), %eax movd %eax, %xmm0 movzbl 135(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm0, %xmm2 movzbl 155(%esp), %eax movd %eax, %xmm0 movzbl 163(%esp), %eax movd %eax, %xmm3 punpcklbw %xmm0, %xmm3 punpcklbw %xmm2, %xmm3 movzbl 188(%esp), %eax movd %eax, %xmm0 movzbl 197(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm0, %xmm2 movzbl 217(%esp), %eax movd %eax, %xmm4 movzbl 225(%esp), %eax movd %eax, %xmm0 punpcklbw %xmm4, %xmm0 punpcklbw %xmm2, %xmm0 punpcklbw %xmm3, %xmm0 punpcklbw %xmm1, %xmm0 addl $252, %esp ret llvm-svn: 65311	2009-02-23 08:49:38 +00:00
Dan Gohman	747e55bc9a	Constify TargetInstrInfo::EmitInstrWithCustomInserter, allowing ScheduleDAG's TLI member to use const. llvm-svn: 64018	2009-02-07 16:15:20 +00:00
Dale Johannesen	021052a705	Remove non-DebugLoc versions of getLoad and getStore. Adjust the many callers of those versions. llvm-svn: 63767	2009-02-04 20:06:27 +00:00
Dale Johannesen	0404dc11af	Need this file too. llvm-svn: 63674	2009-02-03 22:26:34 +00:00
Dale Johannesen	66e03e6f7b	DebugLoc propagation. 2/3 through file. llvm-svn: 63650	2009-02-03 19:33:06 +00:00
Nate Begeman	b09b0242ca	Fix an indent and a typo. llvm-svn: 62940	2009-01-24 22:12:48 +00:00
Bill Wendling	4d5275905e	Implement a special algorithm for converting uint_to_fp for i32 values on X86. This code: void f() { uint32_t x; float y = (float)x; } used to be: movl %eax, -8(%ebp) movl [2^52 double], -4(%ebp) movsd -8(%ebp), %xmm0 subsd [2^52 double], %xmm0 cvtsd2ss %xmm0, %xmm0 Is now: movsd [2^52 double], %xmm0 movsd %xmm0, %xmm1 movd %ecx, %xmm2 orps %xmm2, %xmm1 subsd %xmm0, %xmm1 cvtsd2ss %xmm1, %xmm0 This is faster on X86. Note that there's an extra load of %xmm0 into %xmm1. That will be fixed in a later coalescer fix. llvm-svn: 62404	2009-01-17 03:56:04 +00:00
Dan Gohman	0ad43ca6e5	Make getWidenVectorType const. llvm-svn: 62265	2009-01-15 17:34:08 +00:00
Devang Patel	5c6e1e3b7d	Use DebugInfo interface to lower dbg_* intrinsics. llvm-svn: 62127	2009-01-13 00:35:13 +00:00
Duncan Sands	8feb694e8f	Fix PR3274: when promoting the condition of a BRCOND node, promote from i1 all the way up to the canonical SetCC type. In order to discover an appropriate type to use, pass MVT::Other to getSetCCResultType. In order to be able to do this, change getSetCCResultType to take a type as an argument, not a value (this is also more logical). llvm-svn: 61542	2009-01-01 15:52:00 +00:00
Dan Gohman	25a767d7f4	Add instruction patterns and encodings for the x86 bt instructions. llvm-svn: 61400	2008-12-23 22:45:23 +00:00
Mon P Wang	998fd29ce1	Fixed x86 code generation of multiply for v2i64. It was incorrect for SSE4.1. llvm-svn: 61211	2008-12-18 21:42:19 +00:00
Bill Wendling	c4499feb1a	- Use patterns instead of creating completely new instruction matching patterns, which are identical to the original patterns. - Change the multiply with overflow so that we distinguish between signed and unsigned multiplication. Currently, unsigned multiplication with overflow isn't working! llvm-svn: 60963	2008-12-12 21:15:41 +00:00
Bill Wendling	1a317678bc	Redo the arithmetic with overflow architecture. I was changing the semantics of ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace the intrinsic with an ISD::SADDO node. Then custom lower that into an X86ISD::ADD node with an associated SETCC that checks the correct condition code (overflow or carry). Then that gets lowered into the correct X86::ADDOvf instruction. Similar for SUB and MUL instructions. llvm-svn: 60915	2008-12-12 00:56:36 +00:00
Bill Wendling	db8ec2d75a	Add sub/mul overflow intrinsics. This currently doesn't have a target-independent way of determining overflow on multiplication. It's very tricky. Patch by Zoltan Varga! llvm-svn: 60800	2008-12-09 22:08:41 +00:00
Bill Wendling	30e9dc81c8	Second stab at target-dependent lowering of everyone's favorite nodes: [SU]ADDO - LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C, depending on the type of ADDO requested. - LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or COND_C set. llvm-svn: 60388	2008-12-02 01:06:39 +00:00
Duncan Sands	6ed40141f7	Change the interface to the type legalization method ReplaceNodeResults: rather than returning a node which must have the same number of results as the original node (which means mucking around with MERGE_VALUES, and which is also easy to get wrong since SelectionDAG folding may mean you don't get the node you expect), return the results in a vector. llvm-svn: 60348	2008-12-01 11:39:25 +00:00
Bill Wendling	66835479d7	- Make lowering of "add with overflow" customizable by back-ends. - Mark "add with overflow" as having a custom lowering for X86. Give it a null lowering representation for now. llvm-svn: 59971	2008-11-24 19:21:46 +00:00
Mon P Wang	58c3794c27	Add initial support for vector widening. Logic is set to widen for X86. One will only see an effect if legalizetype is not active. Will move support to LegalizeType soon. llvm-svn: 58426	2008-10-30 08:01:45 +00:00
Dale Johannesen	28929589e7	Add an SSE2 algorithm for uint64->f64 conversion. The same one Apple gcc uses, faster. Also gets the extreme case in gcc.c-torture/execute/ieee/rbug.c correct which we weren't before; this is not sufficient to get the test to pass though, there is another bug. llvm-svn: 57926	2008-10-21 20:50:01 +00:00
Dan Gohman	2fe6bee5b6	Teach DAGCombine to fold constant offsets into GlobalAddress nodes, and add a TargetLowering hook for it to use to determine when this is legal (i.e. not in PIC mode, etc.) This allows instruction selection to emit folded constant offsets in more cases, such as the included testcase, eliminating the need for explicit arithmetic instructions. This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp that attempted to achieve the same effect, but wasn't as effective. Also, fix handling of offsets in GlobalAddressSDNodes in several places, including changing GlobalAddressSDNode's offset from int to int64_t. The Mips, Alpha, Sparc, and CellSPU targets appear to be unaware of GlobalAddress offsets currently, so set the hook to false on those targets. llvm-svn: 57748	2008-10-18 02:06:02 +00:00
Dan Gohman	e7ced74558	FastISel support for exception-handling constructs. - Move the EH landing-pad code and adjust it so that it works with FastISel as well as with SDISel. - Add FastISel support for @llvm.eh.exception and @llvm.eh.selector. llvm-svn: 57539	2008-10-14 23:54:11 +00:00
Dale Johannesen	8c36a1c09c	Make atomic Swap work, 64-bit on x86-32. Make it all work in non-pic mode. llvm-svn: 57034	2008-10-03 22:25:52 +00:00
Dale Johannesen	867d549fce	Handle some 64-bit atomics on x86-32, some of the time. llvm-svn: 56963	2008-10-02 18:53:47 +00:00
Bill Wendling	68f12ee567	Implement the -fno-builtin option in the front-end, not in the back-end. llvm-svn: 56900	2008-10-01 00:59:58 +00:00
Bill Wendling	bd09262e97	Add the new `-no-builtin' flag. This flag is meant to mimic the GCC `-fno-builtin' flag. Currently, it's used to replace "memset" with "_bzero" instead of "__bzero" on Darwin10+. This arguably violates the meaning of this flag, but is currently sufficient. The meaning of this flag should become more specific over time. llvm-svn: 56885	2008-09-30 21:22:07 +00:00
Dale Johannesen	f61a84ec43	Remove misuse of ReplaceNodeResults for atomics with valid types. No functional change. llvm-svn: 56808	2008-09-29 22:25:26 +00:00
Evan Cheng	74c9ed91b0	With sse3 and when the source is a load or has multiple uses, favors movddup over shuffp*, pshufd, etc. Without sse3 or when the source is from a register, make use of movlhps llvm-svn: 56620	2008-09-25 20:50:48 +00:00
Evan Cheng	e0add20c1b	Properly handle 'm' inline asm constraints. If a GV is being selected for the addressing mode, it requires the same logic for PIC relative addressing, etc. llvm-svn: 56526	2008-09-24 00:05:32 +00:00
Dan Gohman	918fe08a56	Arrange for FastISel code to have access to the MachineModuleInfo object. This will be needed to support debug info. llvm-svn: 56508	2008-09-23 21:53:34 +00:00
Dan Gohman	ed1cf1a8f1	Fix these enums' starting values to reflect the way that instruction opcodes are now numbered. No functionality change. llvm-svn: 56497	2008-09-23 18:42:32 +00:00
Bill Wendling	24c79f28b1	Reverting r56249. On further investigation, this functionality isn't needed. Apologies for the thrashing. llvm-svn: 56251	2008-09-16 21:48:12 +00:00
Bill Wendling	8bc392fb1d	- Change "ExternalSymbolSDNode" to "SymbolSDNode". - Add linkage to SymbolSDNode (default to external). - Change ISD::ExternalSymbol to ISD::Symbol. - Change ISD::TargetExternalSymbol to ISD::TargetSymbol These changes pave the way to allowing SymbolSDNodes with non-external linkage. llvm-svn: 56249	2008-09-16 21:12:30 +00:00
Dan Gohman	d3fe174c53	Define CallSDNode, an SDNode subclass for use with ISD::CALL. Currently it just holds the calling convention and flags for isVarArgs and isTailCall. And it has several utility methods, which eliminate magic 5+2*i and similar index computations in several places. CallSDNodes are not CSE'd. Teach UpdateNodeOperands to handle nodes that are not CSE'd gracefully. llvm-svn: 56183	2008-09-13 01:54:27 +00:00
Dan Gohman	39d82f902a	Add X86FastISel support for static allocas, and references to static allocas. As part of this change, refactor the address mode code for loads and stores. llvm-svn: 56066	2008-09-10 20:11:02 +00:00
Anton Korobeynikov	6acb2219b6	Replace explicit pointer-size constants to TargetData query. No functionality change. llvm-svn: 55996	2008-09-09 18:22:57 +00:00
Dan Gohman	7bda51f5a4	Create HandlePHINodesInSuccessorBlocksFast, a version of HandlePHINodesInSuccessorBlocks that works FastISel-style. This allows PHI nodes to be updated correctly while using FastISel. This also involves some code reorganization; ValueMap and MBBMap are now members of the FastISel class, so they needn't be passed around explicitly anymore. Also, SelectInstructions is changed to SelectInstruction, and only does one instruction at a time. llvm-svn: 55746	2008-09-03 23:12:08 +00:00
Ted Kremenek	2175b55dc7	Fix capitalization in #include of FastISel.h. This unbreaks the build on case-sensitive filesystems. llvm-svn: 55687	2008-09-03 02:54:11 +00:00
Evan Cheng	24422d4928	Let tblgen only generate fastisel routines, not the class definition. This makes it easier for targets to define its own fastisel class. llvm-svn: 55679	2008-09-03 00:03:49 +00:00
Dan Gohman	02c84b8910	Simplify FastISel's constructor argument list, make the FastISel class hold a MachineRegisterInfo member, and make the MachineBasicBlock be passed in to SelectInstructions rather than the FastISel constructor. llvm-svn: 55076	2008-08-20 21:05:57 +00:00
Dan Gohman	4619e93bd3	The X86 target will soon have an implementation of createFastISel. llvm-svn: 55010	2008-08-19 21:32:53 +00:00
Dale Johannesen	5afbf510aa	Add support for 8 and 16 bit forms of __sync builtins on X86. Change "lock" instructions to be on a separate line. This is needed to work around a bug in the Darwin assembler. llvm-svn: 54999	2008-08-19 18:47:28 +00:00
Dan Gohman	2ce6f2ad5e	Rename SDOperand to SDValue. llvm-svn: 54128	2008-07-27 21:46:04 +00:00
Nate Begeman	55b7becb29	SSE codegen for vsetcc nodes llvm-svn: 53719	2008-07-17 16:51:19 +00:00
Duncan Sands	93e180342a	Rather than having a different custom legalization hook for each way in which a result type can be legalized (promotion, expansion, softening etc), just use one: ReplaceNodeResults, which returns a node with exactly the same result types as the node passed to it, but presumably with a bunch of custom code behind the scenes. No change if the new LegalizeTypes infrastructure is not turned on. llvm-svn: 53137	2008-07-04 11:47:58 +00:00
Mon P Wang	6a490371c9	Added MemOperands to Atomic operations since Atomics touch memory. Added abstract class MemSDNode for any Node that has an associated MemOperand. Changed atomic.lcs => atomic.cmp.swap, atomic.las => atomic.load.add, and atomic.lss => atomic.load.sub llvm-svn: 52706	2008-06-25 08:15:39 +00:00
Andrew Lenharth	f88d50bfcc	add missing atomic intrinsic from gcc llvm-svn: 52270	2008-06-14 05:48:15 +00:00
Duncan Sands	13237ac3b9	Wrap MVT::ValueType in a struct to get type safety and better control the abstraction. Rename the type to MVT. To update out-of-tree patches, the main thing to do is to rename MVT::ValueType to MVT, and rewrite expressions like MVT::getSizeInBits(VT) in the form VT.getSizeInBits(). Use VT.getSimpleVT() to extract a MVT::SimpleValueType for use in switch statements (you will get an assert failure if VT is an extended value type - these shouldn't exist after type legalization). This results in a small speedup of codegen and no new testsuite failures (x86-64 linux). llvm-svn: 52044	2008-06-06 12:08:01 +00:00
Evan Cheng	5e28227dbd	Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq. llvm-svn: 51667	2008-05-29 08:22:04 +00:00
Evan Cheng	29e59ad6c9	Fix typos and comments. llvm-svn: 51165	2008-05-15 22:13:02 +00:00
Evan Cheng	ef377adca0	Make use of vector load and store operations to implement memcpy, memmove, and memset. Currently only X86 target is taking advantage of these. llvm-svn: 51140	2008-05-15 08:39:06 +00:00
Dan Gohman	eabd647cd5	Change target-specific classes to use more precise static types. This eliminates the need for several awkward casts, including the last dynamic_cast under lib/Target. llvm-svn: 51091	2008-05-14 01:58:56 +00:00
Nate Begeman	d875c3e2fd	Initial X86 codegen support for VSETCC. llvm-svn: 51000	2008-05-12 20:34:32 +00:00
Evan Cheng	2609d5e779	Refactor isConsecutiveLoad from X86 to TargetLowering so DAG combiner can make use of it. llvm-svn: 50991	2008-05-12 19:56:52 +00:00
Dan Gohman	3c0e11af64	For now, abort when an ISD::VAARG is encountered on x86-64, rather than silently generate invalid code. llvm-gcc does not currently use VAArgInst; it lowers va_arg in the front-end. llvm-svn: 50930	2008-05-10 01:26:14 +00:00
Evan Cheng	961339bbdb	Handle a few more cases of folding load i64 into xmm and zero top bits. Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch. llvm-svn: 50918	2008-05-09 21:53:03 +00:00
Evan Cheng	78af38c392	Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine. llvm-svn: 50838	2008-05-08 00:57:18 +00:00
Mon P Wang	3e58393c3d	Added additional atomic intrinsics: and, or, xor, min, and max. llvm-svn: 50663	2008-05-05 19:05:59 +00:00
Arnold Schwaighofer	be0de34ede	Tail call optimization improvements: Move platform independent code (lowering of possibly overwritten arguments, check for tail call optimization eligibility) from target X86ISelectionLowering.cpp to TargetLowering.h and SelectionDAGISel.cpp. Initial PowerPC tail call implementation: Support ppc32 implemented and tested (passes my tests and test-suite llvm-test). Support ppc64 implemented and half tested (passes my tests). On ppc tail call optimization is performed if caller and callee are fastcc call is a tail call (in tail call position, call followed by ret) no variable argument lists or byval arguments option -tailcallopt is enabled Supported: * non pic tail calls on linux/darwin * module-local tail calls on linux(PIC/GOT)/darwin(PIC) * inter-module tail calls on darwin(PIC) If constraints are not met a normal call will be emitted. A test checking the argument lowering behaviour on x86-64 was added. llvm-svn: 50477	2008-04-30 09:16:33 +00:00
Dan Gohman	da44054867	Fix the SVOffset values for loads and stores produced by memcpy/memset expansion. It was a bug for the SVOffset value to be used in the actual address calculations. llvm-svn: 50359	2008-04-28 17:15:20 +00:00
Chris Lattner	724539c001	A few inline asm cleanups: - Make targetlowering.h fit in 80 cols. - Make LowerAsmOperandForConstraint const. - Make lowerXConstraint -> LowerXConstraint - Make LowerXConstraint return a const char* instead of taking a string byref. llvm-svn: 50312	2008-04-26 23:02:14 +00:00
Dan Gohman	3dd8ba6235	Remove X86_64SRet; it isn't used anymore. llvm-svn: 49759	2008-04-16 00:24:30 +00:00
Dan Gohman	2505d86783	Fix const-correctness issues with the SrcValue handling in the memory intrinsic expansion code. llvm-svn: 49666	2008-04-14 17:55:48 +00:00
Arnold Schwaighofer	634fc9a33a	This patch corrects the handling of byval arguments for tailcall optimized x86-64 (and x86) calls so that they work (... at least for my test cases). Should fix the following problems: Problem 1: When i introduced the optimized handling of arguments for tail called functions (using a sequence of copyto/copyfrom virtual registers instead of always lowering to top of the stack) i did not handle byval arguments correctly e.g they did not work at all :). Problem 2: On x86-64 after the arguments of the tail called function are moved to their registers (which include ESI/RSI etc), tail call optimization performs byval lowering which causes xSI,xDI, xCX registers to be overwritten. This is handled in this patch by moving the arguments to virtual registers first and after the byval lowering the arguments are moved from those virtual registers back to RSI/RDI/RCX. llvm-svn: 49584	2008-04-12 18:11:06 +00:00
Dan Gohman	544ab2c50b	Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal on any current target and aren't optimized in DAGCombiner. Instead of using intermediate nodes, expand the operations, choosing between simple loads/stores, target-specific code, and library calls, immediately. Previously, the code to emit optimized code for these operations was only used at initial SelectionDAG construction time; now it is used at all times. This fixes some cases where rep;movs was being used for small copies where simple loads/stores would be better. This also cleans up code that checks for alignments less than 4; let the targets make that decision instead of doing it in target-independent code. This allows x86 to use rep;movs in low-alignment cases. Also, this fixes a bug that resulted in the use of rep;stos for memsets of 0 with non-constant memory size when the alignment was at least 4. It's better to use the library in this case, which can be significantly faster when the size is large. This also preserves more SourceValue information when memory intrinsics are lowered into simple loads/stores. llvm-svn: 49572	2008-04-12 04:36:06 +00:00
Dan Gohman	33b3300178	Make isVectorClearMaskLegal's operand list const. llvm-svn: 49446	2008-04-09 20:09:42 +00:00
Chris Lattner	68b11e14bc	remove Evan's "ugly hack" that sorta attempted to get x86-64 return conventions correct, but was never enabled. We can now do the "right thing" with multiple return values. llvm-svn: 48635	2008-03-21 06:50:21 +00:00
Arnold Schwaighofer	7da2bceb3b	Don't lose incoming argument registers. Fix documentation style. llvm-svn: 48545	2008-03-19 16:39:45 +00:00
Chris Lattner	4b3a7fa823	Eliminate the FP_GET_ST0/FP_SET_ST0 target-specific dag nodes, just lower to copyfromreg/copytoreg instead. llvm-svn: 48174	2008-03-10 21:08:41 +00:00
Scott Michel	a6729e8666	Give TargetLowering::getSetCCResultType() a parameter so that ISD::SETCC's return ValueType can depend on its operands' ValueType. This is a cosmetic change, no functionality impacted. llvm-svn: 48145	2008-03-10 15:42:14 +00:00
Chris Lattner	4c869594bc	rename FP_SETRESULT -> FP_SET_ST0 llvm-svn: 48094	2008-03-09 07:08:44 +00:00
Chris Lattner	d587e580a6	rename FpGETRESULT32 -> FpGET_ST0_32 etc. Add support for isel'ing value preserving FP roundings from one fp stack reg to another into a noop, instead of stack traffic. llvm-svn: 48093	2008-03-09 07:05:32 +00:00
Evan Cheng	0a62cb44ce	Add a target lowering hook to control whether it's worthwhile to compress fp constant. For x86, if sse2 is available, it's not a good idea since cvtss2sd is slower than a movsd load and it prevents load folding. On x87, it's important to shrink fp constant since fldt is very expensive. llvm-svn: 47931	2008-03-05 01:30:59 +00:00
Andrew Lenharth	357061a74d	64bit CAS on 32bit x86. llvm-svn: 47929	2008-03-05 01:15:49 +00:00
Andrew Lenharth	d032c33300	all but CAS working on x86 llvm-svn: 47798	2008-03-01 21:52:34 +00:00
Arnold Schwaighofer	3bfca3e942	Refactor according to Evan's and Anton's suggestions. llvm-svn: 47635	2008-02-26 22:21:54 +00:00
Arnold Schwaighofer	b01b99ec78	Change the lowering of arguments for tail call optimized calls. Before arguments that could overwrite each other were explicitly lowered to a stack slot, not giving the register allocator a chance to optimize. Now a sequence of copyto/copyfrom virtual registers ensures that arguments are loaded in (virtual) registers before they are lowered to the stack slot (and might overwrite each other). Also parameter stack slots are marked mutable for (potentially) tail calling functions. llvm-svn: 47593	2008-02-26 09:19:59 +00:00
Evan Cheng	6200c225e0	- When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type. - X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC. llvm-svn: 47290	2008-02-18 23:04:32 +00:00
Dan Gohman	e1d9ee66ed	Simplify some logic in ComputeMaskedBits. And change ComputeMaskedBits to pass the mask APInt by value, not by reference. llvm-svn: 47096	2008-02-13 22:28:48 +00:00
Dan Gohman	f990faf23b	Convert SelectionDAG::ComputeMaskedBits to use APInt instead of uint64_t. Add an overload that supports the uint64_t interface for use by clients that haven't been updated yet. llvm-svn: 47039	2008-02-13 00:35:47 +00:00
Nate Begeman	2d77e8e446	Enable SSE4 codegen and pattern matching. Add some notes to the README. llvm-svn: 46949	2008-02-11 04:19:36 +00:00
Dan Gohman	3a4be0fdef	Rename MRegisterInfo to TargetRegisterInfo. llvm-svn: 46930	2008-02-10 18:45:23 +00:00
Dan Gohman	9ba4d76816	Rename ISD::FLT_ROUNDS to ISD::FLT_ROUNDS_ to avoid conflicting with the real FLT_ROUNDS (defined in <float.h>). llvm-svn: 46587	2008-01-31 00:41:03 +00:00
Evan Cheng	29cfb67e28	Even though InsertAtEndOfBasicBlock is an ugly hack it still deserves a proper name. Rename it to EmitInstrWithCustomInserter since it does not necessarily insert instruction at the end. llvm-svn: 46562	2008-01-30 18:18:23 +00:00
Evan Cheng	084a1cdcdd	Work in progress. This patch fixes x86-64 calls which are modelled as StructRet but really should be returned in registers, e.g. _Complex long double, some 128-bit aggregates. This is a short term solution that is necessary only because llvm, for now, cannot model i128 nor call's with multiple results. Status: This only works for direct calls, and only the caller side is done. Disabled for now. llvm-svn: 46527	2008-01-29 19:34:22 +00:00
Dale Johannesen	2b3bc30420	Handle 'X' constraint in asm's better. llvm-svn: 46485	2008-01-29 02:21:21 +00:00
Evan Cheng	35abd840a6	Let each target decide byval alignment. For X86, it's 4-byte unless the aggregate contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type. llvm-svn: 46286	2008-01-23 23:17:41 +00:00
Chris Lattner	7dc00e8021	make a method public llvm-svn: 46159	2008-01-18 06:52:41 +00:00
Chris Lattner	e8bb9f2190	make it more clear that this predicate only applies to scalar FP types. llvm-svn: 46058	2008-01-16 06:24:21 +00:00
Chris Lattner	14e616ef0b	introduce a isTypeInSSEReg predicate, which allows us to simplify some code. No functionality change. llvm-svn: 46055	2008-01-16 06:19:45 +00:00
Chris Lattner	3c3fefde06	no need to expand ISD::TRAP to X86ISD::TRAP, just match ISD::TRAP. llvm-svn: 46015	2008-01-15 21:58:22 +00:00
Anton Korobeynikov	6bbbc4cbfa	For PR1839: add initial support for __builtin_trap. llvm-gcc part is missed as well as PPC codegen llvm-svn: 46001	2008-01-15 07:02:33 +00:00
Gordon Henriksen	9231958391	Refactoring the x86 and x86-64 calling convention implementations, unifying the copied algorithms and saving over 500 LOC. There should be no functionality change, but please test on your favorite x86 target. llvm-svn: 45627	2008-01-05 16:56:59 +00:00
Chris Lattner	f3ebc3f3d2	Remove attribution from file headers, per discussion on llvmdev. llvm-svn: 45418	2007-12-29 20:36:04 +00:00
Evan Cheng	e9fbc3f014	Implement ctlz and cttz with bsr and bsf. llvm-svn: 45024	2007-12-14 02:13:44 +00:00
Chris Lattner	f81d5886c6	Several changes: 1) Change the interface to TargetLowering::ExpandOperationResult to take and return entire NODES that need a result expanded, not just the value. This allows us to handle things like READCYCLECOUNTER, which returns two values. 2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES. 3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new ExpandOperationResult. This makes the result simpler and fully general. 4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes. 5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM i64 shifts, allowing them to work with LegalizeDAGTypes. 6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT, allowing them to work with LegalizeDAGTypes. LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when type legalization in LegalizeDAG is ifdef'd out. llvm-svn: 44300	2007-11-24 07:07:01 +00:00
Anton Korobeynikov	91460e43f1	Implement codegen for flt_rounds on x86 llvm-svn: 44183	2007-11-16 01:31:51 +00:00
Evan Cheng	797d56ff17	Much improved pic jumptable codegen: Then: call "L1$pb" "L1$pb": popl %eax ... LBB1_1: # entry imull $4, %ecx, %ecx leal LJTI1_0-"L1$pb"(%eax), %edx addl LJTI1_0-"L1$pb"(%ecx,%eax), %edx jmpl %edx .align 2 .set L1_0_set_3,LBB1_3-LJTI1_0 .set L1_0_set_2,LBB1_2-LJTI1_0 .set L1_0_set_5,LBB1_5-LJTI1_0 .set L1_0_set_4,LBB1_4-LJTI1_0 LJTI1_0: .long L1_0_set_3 .long L1_0_set_2 Now: call "L1$pb" "L1$pb": popl %eax ... LBB1_1: # entry addl LJTI1_0-"L1$pb"(%eax,%ecx,4), %eax jmpl %eax .align 2 .set L1_0_set_3,LBB1_3-"L1$pb" .set L1_0_set_2,LBB1_2-"L1$pb" .set L1_0_set_5,LBB1_5-"L1$pb" .set L1_0_set_4,LBB1_4-"L1$pb" LJTI1_0: .long L1_0_set_3 .long L1_0_set_2 llvm-svn: 43924	2007-11-09 01:32:10 +00:00
Rafael Espindola	fa0df55bdd	Move the LowerMEMCPY and LowerMEMCPYCall to a common place. Thanks for the suggestions Bill :-) llvm-svn: 43742	2007-11-05 23:12:20 +00:00
Evan Cheng	e106e2f142	Enable more fold (sext (load x)) -> (sext (truncate (sextload x))) transformation. Previously, it's restricted by ensuring the number of load uses is one. Now the restriction is loosened up by allowing setcc uses to be "extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq). llvm-svn: 43465	2007-10-29 19:58:20 +00:00
Evan Cheng	7f3d02471d	Loosen up iv reuse to allow reuse of the same stride but a larger type when truncating from the larger type to smaller type is free. e.g. Turns this loop: LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx movw %dx, %si LBB1_2: # bb movl L_X$non_lazy_ptr, %edi movw %si, (%edi) movl L_Y$non_lazy_ptr, %edi movw %dx, (%edi) addw $4, %dx incw %si incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb into LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx LBB1_2: # bb movl L_X$non_lazy_ptr, %esi movw %cx, (%esi) movl L_Y$non_lazy_ptr, %esi movw %dx, (%esi) addw $4, %dx incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb llvm-svn: 43375	2007-10-26 01:56:11 +00:00
Arnold Schwaighofer	9ccea99165	Added tail call optimization to the x86 back end. It can be enabled by passing -tailcallopt to llc. The optimization is performed if the following conditions are satisfied: * caller/callee are fastcc * elf/pic is disabled OR elf/pic enabled + callee is in module + callee has visibility protected or hidden llvm-svn: 42870	2007-10-11 19:40:01 +00:00
Dan Gohman	e8c8ef5234	LowerIntegerDivOrRem no longer exists. llvm-svn: 42787	2007-10-09 15:45:13 +00:00
Dan Gohman	a160361c85	Migrate X86 and ARM from using X86ISD::{,I}DIV and ARMISD::MULHILO{U,S} to use ISD::{S,U}DIVREM and ISD::{S,U}MUL_HIO. Move the lowering code associated with these operators into target-independent in LegalizeDAG.cpp and TargetLowering.cpp. llvm-svn: 42762	2007-10-08 18:33:35 +00:00
Evan Cheng	5fb5a1f389	Enabling new condition code modeling scheme. llvm-svn: 42459	2007-09-29 00:00:36 +00:00
Rafael Espindola	6c04ac1db0	Refactor the memcpy lowering for the x86 target. The only generated code difference is that now we call memcpy when the size of the array is unknown. This matches GCC behavior and is better since the run time value can be arbitrarily large. llvm-svn: 42433	2007-09-28 12:53:01 +00:00
Dan Gohman	06919e8ef2	Fix a typo in a comment. llvm-svn: 42313	2007-09-25 19:37:26 +00:00
Dan Gohman	31599685c7	When both x/y and x%y are needed (x and y both scalar integer), compute both results with a single div or idiv instruction. This uses new X86ISD nodes for DIV and IDIV which are introduced during the legalize phase so that the SelectionDAG's CSE can automatically eliminate redundant computations. llvm-svn: 42308	2007-09-25 18:23:27 +00:00
Evan Cheng	e95f391ef1	Added support for new condition code modeling scheme (i.e. physical register dependency). These are a bunch of instructions that are duplicated so the x86 backend can support both the old and new schemes at the same time. They will be deleted after all the kinks are worked out. llvm-svn: 42285	2007-09-25 01:57:46 +00:00
Dale Johannesen	e36c400255	Fix PR 1681. When X86 target uses +sse -sse2, keep f32 in SSE registers and f64 in x87. This is effectively a new codegen mode. Change addLegalFPImmediate to permit float and double variants to do different things. Adjust callers. llvm-svn: 42246	2007-09-23 14:52:20 +00:00
Evan Cheng	8070099fef	X86ISD::TEST is dead. llvm-svn: 42037	2007-09-17 17:42:53 +00:00
Rafael Espindola	272f7304f0	Add support for functions with byval arguments on x86 llvm-svn: 41953	2007-09-14 15:48:13 +00:00
Rafael Espindola	e636fc05d6	Initial support for calling functions with byval arguments on x86-64 llvm-svn: 41643	2007-08-31 15:06:30 +00:00
Chris Lattner	d8c9cb9182	rename isOperandValidForConstraint to LowerAsmOperandForConstraint, changing the interface to allow for future changes. llvm-svn: 41384	2007-08-25 00:47:38 +00:00
Anton Korobeynikov	597c8b77e4	Move ReturnAddrIndex variable to X86MachineFunctionInfo structure. This fixed hard to catch bugs with retaddr lowering llvm-svn: 41104	2007-08-15 17:12:32 +00:00
Dan Gohman	5f6a9da530	More explicit keywords. llvm-svn: 40757	2007-08-02 21:21:54 +00:00
Duncan Sands	ce38853cc6	Trampoline codegen support for X86-32. llvm-svn: 40566	2007-07-27 20:02:49 +00:00
Dan Gohman	4788552deb	Re-apply 40504, but with a fix for the segfault it caused in oggenc: Make the alignedload and alignedstore patterns always require 16-byte alignment. This way when they are used in the "Fs" instructions, in which a vector instruction is used for a scalar purpose, they can still require the full vector alignment. And add a regression test for this. llvm-svn: 40555	2007-07-27 17:16:43 +00:00
Evan Cheng	931de40afa	Reverting 40504 for now. It's breaking oggenc. llvm-svn: 40547	2007-07-27 01:37:47 +00:00
Dan Gohman	8455bd3fae	Remove X86ISD::LOAD_PACK and X86ISD::LOAD_UA and associated code from the x86 target, replacing them with the new alignment attributes on memory references. llvm-svn: 40504	2007-07-26 00:31:09 +00:00
Anton Korobeynikov	383a324735	Long live the exception handling! This patch fills the last necessary bits to enable exceptions handling in LLVM. Currently only on x86-32/linux. In fact, this patch adds necessary intrinsics (and their lowering) which represent really weird target-specific gcc builtins used inside unwinder. After corresponding llvm-gcc patch will land (easy) exceptions should be more or less workable. However, exceptions handling support should not be thought as 'finished': I expect many small and not so small glitches everywhere. llvm-svn: 39855	2007-07-14 14:06:15 +00:00
Dan Gohman	57111e7a60	Define non-intrinsic instructions for vector min, max, sqrt, rsqrt, and rcp, in addition to the intrinsic forms. Add spill-folding entries for these new instructions, and for the scalar min and max intrinsic instructions which were missing. And add some preliminary ISelLowering code for using the new non-intrinsic vector sqrt instruction, and fneg and fabs. llvm-svn: 38478	2007-07-10 00:05:58 +00:00
Dan Gohman	309d3d51b3	Move ComputeMaskedBits, MaskedValueIsZero, and ComputeNumSignBits from TargetLowering to SelectionDAG so that they have more convenient access to the current DAG, in preparation for the ValueType routines being changed from standalone functions to members of SelectionDAG for the pre-legalize vector type changes. llvm-svn: 37704	2007-06-22 14:59:07 +00:00
Bill Wendling	591eab8844	Support for the special case of a vector with the canonical form: vector_shuffle v1, v2, <2, 6, 3, 7> I.e. vector_shuffle v, undef, <2, 2, 3, 3> MMX only has a shuffle for v4i16 vectors. It needs to use the unpackh for this type of operation. llvm-svn: 36403	2007-04-24 21:16:55 +00:00
Lauro Ramos Venancio	2518889872	Implement "general dynamic", "initial exec" and "local exec" TLS models for X86 32 bits. llvm-svn: 36283	2007-04-20 21:38:10 +00:00
Anton Korobeynikov	8b7aab009e	Implemented correct stack probing on mingw/cygwin for dynamic alloca's. Also, fixed static case in presence of eax livin. This fixes PR331 PS: Why don't we still have push/pop instructions? :) llvm-svn: 36195	2007-04-17 09:20:00 +00:00
Chris Lattner	808ac93f68	remove some dead hooks llvm-svn: 35845	2007-04-09 23:31:19 +00:00
Chris Lattner	39f65335d5	remove some dead target hooks, subsumed by isLegalAddressingMode llvm-svn: 35840	2007-04-09 22:27:04 +00:00
Chris Lattner	1eb94d973a	implement the new addressing mode description hook. llvm-svn: 35521	2007-03-30 23:15:24 +00:00
Chris Lattner	d685514e2e	switch TargetLowering::getConstraintType to take the entire constraint, not just the first letter. No functionality change. llvm-svn: 35322	2007-03-25 02:14:49 +00:00
Dale Johannesen	0c6bb5eab7	repair x86 performance, dejagnu problems from previous change llvm-svn: 35245	2007-03-21 21:51:52 +00:00
Evan Cheng	3ab7ea7965	More flexible TargetLowering LSR hooks for testing whether an immediate is a legal target address immediate or scale. llvm-svn: 35073	2007-03-12 23:28:50 +00:00
Evan Cheng	deaea25eb9	X86-64 VACOPY needs custom expansion. va_list is a struct { i32, i32, i8, i8 }. llvm-svn: 34857	2007-03-02 23:16:35 +00:00
Chris Lattner	3ed3be3b4a	remove fastcc (not fastcall) support llvm-svn: 34730	2007-02-28 06:05:16 +00:00
Chris Lattner	74f5bcf8eb	add an accessor. llvm-svn: 34625	2007-02-26 04:01:25 +00:00
Chris Lattner	7802f3e2ea	pass the calling convention into Lower*CallTo, instead of using ad-hoc flags. llvm-svn: 34587	2007-02-25 09:06:15 +00:00
Chris Lattner	0cd9960fe7	factor a bunch of code out of LowerCCCCallTo into a new LowerCallResult function. This function now uses GetRetValueLocs to determine where the result values are located and concerns itself with how to pull the values out. llvm-svn: 34586	2007-02-25 08:59:22 +00:00
Chris Lattner	dfda38f7dc	simplify result value lowering by splitting the selection of where to return registers out from the logic of how to return them. This changes X86-64 to mark EAX live out when returning a 32-bit value, where before it marked RAX liveout. llvm-svn: 34582	2007-02-25 08:15:11 +00:00
Nate Begeman	eda5997cc8	Finish off bug 680, allowing targets to custom lower frame and return address nodes. llvm-svn: 33636	2007-01-29 22:58:52 +00:00
Anton Korobeynikov	037c867b54	Propagate changes from my local tree. This patch includes: 1. New parameter attribute called 'inreg'. It has meaning "place this parameter in registers, if possible". This is some generalization of gcc's regparm(n) attribute. It's currently used only in X86-32 backend. 2. Completely rewritten CC handling/lowering code inside X86 backend. Merged stdcall + c CCs and fastcall + fast CC. 3. Dropped CSRET CC. We cannot add struct return variant for each target-specific CC (e.g. stdcall + csretcc and so on). 4. Instead of CSRET CC introduced 'sret' parameter attribute. Setting in on first attribute has meaning 'This is hidden pointer to structure return. Handle it gently'. 5. Fixed small bug in llvm-extract + add new feature to FunctionExtraction pass, which relinks all internal-linkaged callees from deleted function to external linkage. This will allow further linking everything together. NOTEs: 1. Documentation will be updated soon. 2. llvm-upgrade should be improved to translate csret => sret. Before this, there will be some unexpected test fails. llvm-svn: 33597	2007-01-28 13:31:35 +00:00
Evan Cheng	82241c86e9	- FCOPYSIGN custom lowering bug. Clear the sign bit of operand 0 first before or'ing in the sign bit of operand 1. - Tweaking: rather than left shift the sign bit, fp_extend operand 1 first before taking its sign bit if its type is smaller than that of operand 0. llvm-svn: 32932	2007-01-05 21:37:56 +00:00
Evan Cheng	4363e884c0	With SSE2, expand FCOPYSIGN to a series of SSE bitwise operations. llvm-svn: 32900	2007-01-05 07:55:56 +00:00
Evan Cheng	ae1cd75af7	- Use a different wrapper node for RIP-relative GV, etc. - Proper support for both small static and PIC modes under X86-64 - Some (non-optimal) support for medium modes. llvm-svn: 32046	2006-11-30 21:55:46 +00:00
Evan Cheng	49683ba236	Don't dag combine floating point select to max and min intrinsics. Those take v4f32 / v2f64 operands and may end up causing larger spills / restores. Added X86 specific nodes X86ISD::FMAX, X86ISD::FMIN instead. This fixes PR996. llvm-svn: 31645	2006-11-10 21:43:37 +00:00
Evan Cheng	922e191116	Fixed a bug which causes x86 be to incorrectly match shuffle v, undef, <2, ?, 3, ?> to movhlps It should match to unpckhps instead. Added proper matching code for shuffle v, undef, <2, 3, 2, 3> llvm-svn: 31519	2006-11-07 22:14:24 +00:00
Chris Lattner	44daa50bed	allow the address of a global to be used with the "i" constraint when in -static mode. This implements PR882. llvm-svn: 31326	2006-10-31 20:13:11 +00:00
Evan Cheng	e056dd5928	Fixed a significant bug where unpcklpd is incorrectly used to extract element 1 from a v2f64 value. llvm-svn: 31228	2006-10-27 21:08:32 +00:00
Chris Lattner	c0fb567e23	Implement branch analysis/xform hooks required by the branch folding pass. llvm-svn: 31065	2006-10-20 17:42:20 +00:00
Chris Lattner	f4aeff00c2	fit in 80 cols llvm-svn: 31039	2006-10-18 18:26:48 +00:00
Chris Lattner	d9e4bf5285	update comments llvm-svn: 30663	2006-09-28 23:33:12 +00:00
Anton Korobeynikov	3c5b3df6a0	Adding codegeneration for StdCall & FastCall calling conventions llvm-svn: 30549	2006-09-20 22:03:51 +00:00
Evan Cheng	4259a0f654	X86ISD::CMP now produces a chain as well as a flag. Make that the chain operand of a conditional branch to allow load folding into CMP / TEST instructions. llvm-svn: 30241	2006-09-11 02:19:56 +00:00
Evan Cheng	11b0a5dbd4	Committing X86-64 support. llvm-svn: 30177	2006-09-08 06:48:29 +00:00
Chris Lattner	524129dd64	Fix PR850 and CodeGen/X86/2006-07-31-SingleRegClass.ll. The CFE refers to all single-register constraints (like "A") by their 16-bit name, even though the 8 or 32-bit version of the register may be needed. The X86 backend should realize what is going on and redecode the name back to its proper form. llvm-svn: 29420	2006-07-31 23:26:50 +00:00
Chris Lattner	298ef37e02	Implement the inline asm 'A' constraint. This implements PR825 and CodeGen/X86/2006-07-10-InlineAsmAConstraint.ll llvm-svn: 29101	2006-07-11 02:54:03 +00:00
Evan Cheng	5987cfb7b1	X86 target specific DAG combine: turn build_vector (load x), (load x+4), (load x+8), (load x+12), <0, 1, 2, 3> to a single 128-bit load (aligned and unaligned). e.g. __m128 test(float a, float b, float c, float d) { return _mm_set_ps(d, c, b, a); } _test: movups 4(%esp), %xmm0 ret llvm-svn: 29042	2006-07-07 08:33:52 +00:00
Evan Cheng	38c5aee959	Simplify X86CompilationCallback: always align to 16-byte boundary; don't save EAX/EDX if unnecessary. llvm-svn: 28910	2006-06-24 08:36:10 +00:00
Evan Cheng	2a33094284	Switch X86 over to a call-selection model where the lowering code creates the copyto/fromregs instead of making the X86ISD::CALL selection code create them. llvm-svn: 28463	2006-05-25 00:59:30 +00:00
Chris Lattner	aa2372562e	Patches to make the LLVM sources more -pedantic clean. Patch provided by Anton Korobeynikov! This is a step towards closing PR786. llvm-svn: 28447	2006-05-24 17:04:05 +00:00
Evan Cheng	17e734f0a6	Remove PreprocessCCCArguments and PreprocessFastCCArguments now that FORMAL_ARGUMENTS nodes include a token operand. llvm-svn: 28439	2006-05-23 21:06:34 +00:00
Chris Lattner	8be5be817c	Implement an annoying part of the Darwin/X86 abi: the callee of a struct return argument pops the hidden struct pointer if present, not the caller. For example, in this testcase: struct X { int D, E, F, G; }; struct X bar() { struct X a; a.D = 0; a.E = 1; a.F = 2; a.G = 3; return a; } void foo(struct X P) { P = bar(); } We used to emit: _foo: subl $28, %esp movl 32(%esp), %eax movl %eax, (%esp) call _bar addl $28, %esp ret _bar: movl 4(%esp), %eax movl $0, (%eax) movl $1, 4(%eax) movl $2, 8(%eax) movl $3, 12(%eax) ret This is correct on Linux/X86 but not Darwin/X86. With this patch, we now emit: _foo: subl $28, %esp movl 32(%esp), %eax movl %eax, (%esp) call _bar * addl $24, %esp ret _bar: movl 4(%esp), %eax movl $0, (%eax) movl $1, 4(%eax) movl $2, 8(%eax) movl $3, 12(%eax) * ret $4 For the record, GCC emits (which is functionally equivalent to our new code): _bar: movl 4(%esp), %eax movl $3, 12(%eax) movl $2, 8(%eax) movl $1, 4(%eax) movl $0, (%eax) ret $4 _foo: pushl %esi subl $40, %esp movl 48(%esp), %esi leal 16(%esp), %eax movl %eax, (%esp) call _bar subl $4, %esp movl 16(%esp), %eax movl %eax, (%esi) movl 20(%esp), %eax movl %eax, 4(%esi) movl 24(%esp), %eax movl %eax, 8(%esi) movl 28(%esp), %eax movl %eax, 12(%esi) addl $40, %esp popl %esi ret This fixes SingleSource/Benchmarks/CoyoteBench/fftbench with LLC and the JIT, and fixes the X86-backend portion of PR729. The CBE still needs to be updated. llvm-svn: 28438	2006-05-23 18:50:38 +00:00
Evan Cheng	8c6b234ce8	Should pass by reference. llvm-svn: 28357	2006-05-17 19:07:40 +00:00
Evan Cheng	48940d16b2	- Clean up formal argument lowering code. Prepare for vector pass by value work. - Fixed vararg support. llvm-svn: 27985	2006-04-27 01:32:22 +00:00
Evan Cheng	e0bcfbe811	Switching over FORMAL_ARGUMENTS mechanism to lower call arguments. llvm-svn: 27975	2006-04-26 01:20:17 +00:00
Evan Cheng	a9467aab0a	Separate LowerOperation() into multiple functions, one per opcode. llvm-svn: 27972	2006-04-25 20:13:52 +00:00
Evan Cheng	e8b5180044	Now generating perfect (I think) code for "vector set" with a single non-zero scalar value. e.g. _mm_set_epi32(0, a, 0, 0); ==> movd 4(%esp), %xmm0 pshufd $69, %xmm0, %xmm0 _mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0); ==> movzbw 4(%esp), %ax movzwl %ax, %eax pxor %xmm0, %xmm0 pinsrw $5, %eax, %xmm0 llvm-svn: 27923	2006-04-21 01:05:10 +00:00
Evan Cheng	60f0b8998e	- Added support to turn "vector clear elements", e.g. pand V, <-1, -1, 0, -1> to a vector shuffle. - VECTOR_SHUFFLE lowering change in preparation for more efficient codegen of vector shuffle with zero (or any splat) vector. llvm-svn: 27875	2006-04-20 08:58:49 +00:00
Evan Cheng	7855e4d032	Commute vector_shuffle to match more movlhps, movlp{s|d} cases. llvm-svn: 27840	2006-04-19 20:35:22 +00:00
Evan Cheng	5d247f81c1	Last few SSE3 intrinsics. llvm-svn: 27711	2006-04-14 21:59:03 +00:00
Evan Cheng	12ba3e23d0	Added support for _mm_move_ss and _mm_move_sd. llvm-svn: 27575	2006-04-11 00:19:04 +00:00
Evan Cheng	c995b45f67	- movlp{s|d} and movhp{s|d} support. - Normalize shuffle nodes so result vector lower half elements come from the first vector, the rest come from the second vector. (Except for the exceptions :-). - Other minor fixes. llvm-svn: 27474	2006-04-06 23:23:56 +00:00
Evan Cheng	780382946e	Support for comi / ucomi intrinsics. llvm-svn: 27444	2006-04-05 23:38:46 +00:00
Evan Cheng	f3b52c84ea	Handle canonical form of e.g. vector_shuffle v1, v1, <0, 4, 1, 5, 2, 6, 3, 7> This is turned into vector_shuffle v1, <undef>, <0, 0, 1, 1, 2, 2, 3, 3> by dag combiner. It would match a {p}unpckl on x86. llvm-svn: 27437	2006-04-05 07:20:06 +00:00
Evan Cheng	5fd7c69473	Use a X86 target specific node X86ISD::PINSRW instead of a mal-formed INSERT_VECTOR_ELT to insert a 16-bit value in a 128-bit vector. llvm-svn: 27314	2006-03-31 21:55:24 +00:00
Evan Cheng	cbffa4656b	Add support to use pextrw and pinsrw to extract and insert a word element from a 128-bit vector. llvm-svn: 27304	2006-03-31 19:22:53 +00:00
Evan Cheng	b7fedffc78	- Added some SSE2 128-bit packed integer ops. - Added SSE2 128-bit integer pack with signed saturation ops. - Added pshufhw and pshuflw ops. llvm-svn: 27252	2006-03-29 23:07:14 +00:00
Evan Cheng	1a194a5264	* Prefer using operation of matching types. e.g unpcklpd rather than movlhps. * Bug fixes. llvm-svn: 27218	2006-03-28 06:50:32 +00:00
Evan Cheng	2bc3280659	- Clean up / consolidate various shuffle masks. - Some misc. bug fixes. - Use MOVHPDrm to load from m64 to upper half of a XMM register. llvm-svn: 27210	2006-03-28 02:43:26 +00:00
Evan Cheng	5df75889db	Model unpack lower and interleave as vector_shuffle so we can lower the intrinsics as such. llvm-svn: 27200	2006-03-28 00:39:58 +00:00
Evan Cheng	ed6184aef2	Remove X86:isZeroVector, use ISD::isBuildVectorAllZeros instead; some fixes / cleanups llvm-svn: 27150	2006-03-26 09:53:12 +00:00
Evan Cheng	2bc0941e2a	Build arbitrary vector with more than 2 distinct scalar elements with a series of unpack and interleave ops. llvm-svn: 27119	2006-03-25 09:37:23 +00:00
Evan Cheng	e7ee6a5e32	Support for scalar to vector with zero extension. llvm-svn: 27091	2006-03-24 23:15:12 +00:00
Evan Cheng	082c8785ef	Handle BUILD_VECTOR with all zero elements. llvm-svn: 27056	2006-03-24 07:29:27 +00:00
Evan Cheng	2595a687da	More efficient v2f64 shuffle using movlhps, movhlps, unpckhpd, and unpcklpd. llvm-svn: 27040	2006-03-24 02:58:06 +00:00
Evan Cheng	d27fb3e85e	Handle more shuffle cases with SHUFP* instructions. llvm-svn: 27024	2006-03-24 01:18:28 +00:00
Evan Cheng	021bb7c956	Added a ValueType operand to isShuffleMaskLegal(). For now, x86 will not do 64-bit vector shuffle. llvm-svn: 26964	2006-03-22 22:07:06 +00:00
Evan Cheng	68ad48bd1a	- Implement X86ISelLowering::isShuffleMaskLegal(). We currently only support splat and PSHUFD cases. - Clean up shuffle / splat matching code. llvm-svn: 26954	2006-03-22 18:59:22 +00:00
Evan Cheng	8fdbdf20cd	- VECTOR_SHUFFLE of v4i32 / v4f32 with undef second vector always matches PSHUFD. We can make permutes entries which point to the undef pointing anything we want. - Change some names to appease Chris. llvm-svn: 26951	2006-03-22 08:01:21 +00:00
Evan Cheng	d097e67544	Some splat and shuffle support. llvm-svn: 26940	2006-03-22 02:53:00 +00:00
Evan Cheng	d5e905d762	- Use movaps to store 128-bit vector integers. - Each scalar to vector v8i16 and v16i8 is a any_extend followed by a movd. llvm-svn: 26932	2006-03-21 23:01:21 +00:00
Evan Cheng	2dd2c652b2	Added getTargetLowering() to TargetMachine. Refactored targets to support this. llvm-svn: 26742	2006-03-13 23:20:37 +00:00
Evan Cheng	e0ed6ec13f	- Clean up the lowering and selection code of ConstantPool, GlobalAddress, and ExternalSymbol. - Use C++ code (rather than tblgen'd selection code) to match the above mentioned leaf nodes. Do not mutate and nodes and do not record the selection in CodeGenMap. These nodes should be safe to duplicate. This is a performance win. llvm-svn: 26335	2006-02-23 20:41:18 +00:00
Evan Cheng	1f342c2884	PIC related bug fixes. 1. Various asm printer bug. 2. Lowering bug. Now TargetGlobalAddress is wrapped in X86ISD::TGAWrapper. llvm-svn: 26324	2006-02-23 02:43:52 +00:00
Chris Lattner	7ad77dfc2a	split register class handling from explicit physreg handling. llvm-svn: 26308	2006-02-22 00:56:39 +00:00
Chris Lattner	7bb4696dc3	Updates to match change of getRegForInlineAsmConstraint prototype llvm-svn: 26305	2006-02-21 23:11:00 +00:00
Evan Cheng	5588de9415	x86 / Darwin PIC support. llvm-svn: 26273	2006-02-18 00:15:05 +00:00
Nate Begeman	5965bd19f8	kill ADD_PARTS & SUB_PARTS and replace them with fancy new ADDC, ADDE, SUBC and SUBE nodes that actually expose what's going on and allow for significant simplifications in the targets. llvm-svn: 26255	2006-02-17 05:43:56 +00:00
Nate Begeman	8a77efe4f7	Rework the SelectionDAG-based implementations of SimplifyDemandedBits and ComputeMaskedBits to match the new improved versions in instcombine. Tested against all of multisource/benchmarks on ppc. llvm-svn: 26238	2006-02-16 21:11:51 +00:00
Evan Cheng	11613a5219	Separate FILD and FILD_FLAG; the latter is only used for SSE2. It produces a flag so it can be flagged to a FST. llvm-svn: 25953	2006-02-04 02:20:30 +00:00
Evan Cheng	72d5c256c9	- Allow XMM load (for scalar use) to be folded into ANDP* and XORP. - Use XORP to implement fneg. llvm-svn: 25857	2006-01-31 22:28:30 +00:00
Chris Lattner	c642aa5e1c	* Fix 80-column violations * Rename hasSSE -> hasSSE1 to avoid my continual confusion with 'has any SSE'. * Add inline asm constraint specification. llvm-svn: 25854	2006-01-31 19:43:35 +00:00
Evan Cheng	2dd217b88f	Added custom lowering of fabs llvm-svn: 25831	2006-01-31 03:14:29 +00:00
Evan Cheng	5b97fcf0f5	Always use FP stack instructions to perform i64 to f64 as well as f64 to i64 conversions. SSE does not have instructions to handle these tasks. llvm-svn: 25817	2006-01-30 08:02:57 +00:00
Chris Lattner	f0b24d2dc0	Move MaskedValueIsZero from the DAGCombiner to the TargetLowering interface, making isMaskedValueZeroForTargetNode simpler, and usable from other parts of the compiler. llvm-svn: 25803	2006-01-30 04:09:27 +00:00
Chris Lattner	c6fa0282d2	adjust prototype llvm-svn: 25798	2006-01-30 03:49:07 +00:00
Nate Begeman	8c47c3a3b1	Remove TLI.LowerReturnTo, and just let targets custom lower ISD::RET for the same functionality. This addresses another piece of bug 680. Next, on to fixing Alpha VAARG, which I broke last time. llvm-svn: 25696	2006-01-27 21:09:22 +00:00
Evan Cheng	cde9e30bc6	x86 CPU detection and proper subtarget support llvm-svn: 25679	2006-01-27 08:10:46 +00:00
Nate Begeman	e74795cd70	First part of bug 680: Remove TLI.LowerVA* and replace it with SDNodes that are lowered the same way as everything else. llvm-svn: 25606	2006-01-25 18:21:52 +00:00
Evan Cheng	6305e50ee1	Fix sint_to_fp (fild*) support. llvm-svn: 25257	2006-01-12 22:54:21 +00:00
Evan Cheng	ae986f1f1e	Support for MEMCPY and MEMSET. llvm-svn: 25226	2006-01-11 22:15:48 +00:00
Evan Cheng	339edad775	SSE cmov support. llvm-svn: 25190	2006-01-11 00:33:36 +00:00
Evan Cheng	9c249c37f8	Support for ADD_PARTS, SUB_PARTS, SHL_PARTS, SHR_PARTS, and SRA_PARTS. llvm-svn: 25158	2006-01-09 18:33:28 +00:00
Evan Cheng	172fce7050	* Fast call support. * FP cmp, setcc, etc. llvm-svn: 25117	2006-01-06 00:43:03 +00:00
Evan Cheng	45e19098a6	DAG based isel call support. llvm-svn: 25103	2006-01-05 00:27:02 +00:00
Evan Cheng	5c59d49630	More X86 floating point patterns. llvm-svn: 24990	2005-12-23 07:31:11 +00:00
Evan Cheng	9cdc16c6d3	* Fix a GlobalAddress lowering bug. * Teach DAG combiner about X86ISD::SETCC by adding a TargetLowering hook. llvm-svn: 24921	2005-12-21 23:05:39 +00:00
Evan Cheng	c1583dbd63	* Added support for X86 RET with an additional operand to specify number of bytes to pop off stack. * Added support for X86 SETCC. llvm-svn: 24917	2005-12-21 20:21:51 +00:00
Evan Cheng	a74ce62746	* Added lowering hook for external weak global address. It inserts a load for Darwin. * Added lowering hook for ISD::RET. It inserts CopyToRegs for the return value (or store / fld / copy to ST(0) for floating point value). This eliminate the need to write C++ code to handle RET with variable number of operands. llvm-svn: 24888	2005-12-21 02:39:21 +00:00
Evan Cheng	6af02635a7	Added a hook to print out names of target specific DAG nodes. llvm-svn: 24877	2005-12-20 06:22:03 +00:00
Evan Cheng	6fc31046aa	X86 conditional branch support. llvm-svn: 24870	2005-12-19 23:12:38 +00:00
Evan Cheng	225a4d0d6d	X86 lowers SELECT to a cmp / test followed by a conditional move. llvm-svn: 24754	2005-12-17 01:21:05 +00:00
Andrew Lenharth	0bf68ae434	The second patch of X86 support for read cycle counter. llvm-svn: 24430	2005-11-20 21:41:10 +00:00
Chris Lattner	76ac068568	Separate X86ISelLowering stuff out from the X86ISelPattern.cpp file. Patch contributed by Evan Cheng. llvm-svn: 24358	2005-11-15 00:40:23 +00:00

783 Commits