llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	83263b8cfb	Make X86TargetLowering::LowerSINT_TO_FP return without creating a dead stack slot and store if the SINT_TO_FP is actually legal. This allows us to compile: double a(double b) {return (unsigned)b;} to: _a: cvttsd2siq %xmm0, %rax movl %eax, %eax cvtsi2sdq %rax, %xmm0 ret instead of: _a: subq $8, %rsp cvttsd2siq %xmm0, %rax movl %eax, %eax cvtsi2sdq %rax, %xmm0 addq $8, %rsp ret crazy. llvm-svn: 47660	2008-02-27 05:57:41 +00:00
Arnold Schwaighofer	3bfca3e942	Refactor according to Evan's and Anton's suggestions. llvm-svn: 47635	2008-02-26 22:21:54 +00:00
Arnold Schwaighofer	1f17bf6171	Correct function comments. llvm-svn: 47606	2008-02-26 17:50:59 +00:00
Arnold Schwaighofer	69a10f4112	Add support for intermodule tail calls on x86/32bit with GOT-style position independent code. Before only tail calls to protected/hidden functions within the same module were optimized. Now all function calls are tail call optimized. llvm-svn: 47594	2008-02-26 10:21:54 +00:00
Arnold Schwaighofer	b01b99ec78	Change the lowering of arguments for tail call optimized calls. Before arguments that could overwrite each other were explicitly lowered to a stack slot, not giving the register allocator a chance to optimize. Now a sequence of copyto/copyfrom virtual registers ensures that arguments are loaded in (virtual) registers before they are lowered to the stack slot (and might overwrite each other). Also parameter stack slots are marked mutable for (potentially) tail calling functions. llvm-svn: 47593	2008-02-26 09:19:59 +00:00
Dale Johannesen	65b404d61c	Revise previous patch per review. llvm-svn: 47573	2008-02-25 22:29:22 +00:00
Dale Johannesen	32d84b1772	Expand removal of MMX memory copies to allow 1 level of TokenFactor underneath chain (seems to be enough) llvm-svn: 47554	2008-02-25 19:20:14 +00:00
Dale Johannesen	09f410b6d7	Split ParameterAttributes.h, putting the complicated stuff into ParamAttrsList.h. Per feedback from ParamAttrs changes. llvm-svn: 47504	2008-02-22 22:17:59 +00:00
Chris Lattner	ab8bfc28c8	copy mmx values from/to memory with GPRs on x86-32 instead of with mmx registers. This horribleness is apparently done by gcc to avoid having to insert emms in places that really should have it. This is the second half of rdar://5741668. llvm-svn: 47474	2008-02-22 05:18:04 +00:00
Chris Lattner	997b3a65ca	Start using GPR's to copy around mmx value instead of mmx regs. GCC apparently does this, and code depends on not having to do emms when this happens. This is x86-64 only so far, second half should handle x86-32. rdar://5741668 llvm-svn: 47470	2008-02-22 02:09:43 +00:00
Anton Korobeynikov	40d67c59d5	Remove bunch of gcc 4.3-related warnings from Target llvm-svn: 47369	2008-02-20 11:22:39 +00:00
Evan Cheng	6200c225e0	- When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type. - X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC. llvm-svn: 47290	2008-02-18 23:04:32 +00:00
Dan Gohman	c589243107	Chris pointed out that it's not necessary to set i64 MUL to Expand on x86-32 since i64 itself is not a Legal type. And, update some comments. llvm-svn: 47282	2008-02-18 19:34:53 +00:00
Dan Gohman	a589ee11bb	Don't mark scalar integer multiplication as Expand on x86, since x86 has plain one-result scalar integer multiplication instructions. This avoids expanding such instructions into MUL_LOHI sequences that must be special-cased at isel time, and avoids the problem with that code that provented memory operands from being folded. This fixes PR1874, addressesing the most common case. The uncommon cases of optimizing multiply-high operations will require work in DAGCombiner. llvm-svn: 47277	2008-02-18 17:55:26 +00:00
Andrew Lenharth	fedcf477b5	I cannot find a libgcc function for this builtin. Therefor expanding it to a noop (which is how it use to be treated). If someone who knows the x86 backend better than me could tell me how to get a lock prefix on an instruction, that would be nice to complete x86 support. llvm-svn: 47213	2008-02-16 14:46:26 +00:00
Duncan Sands	4c95dbd69f	In TargetLowering::LowerCallTo, don't assert that the return value is zero-extended if it isn't sign-extended. It may also be any-extended. Also, if a floating point value was returned in a larger floating point type, pass 1 as the second operand to FP_ROUND, which tells it that all the precision is in the original type. I think this is right but I could be wrong. Finally, when doing libcalls, set isZExt on a parameter if it is "unsigned". Currently isSExt is set when signed, and nothing is set otherwise. This should be right for all calls to standard library routines. llvm-svn: 47122	2008-02-14 17:28:50 +00:00
Nate Begeman	53e1b3f9d5	Change how FP immediates are handled. 1) ConstantFP is now expand by default 2) ConstantFP is not turned into TargetConstantFP during Legalize if it is legal. This allows ConstantFP to be handled like Constant, allowing for targets that can encode FP immediates as MachineOperands. As a bonus, fix up Itanium FP constants, which now correctly match, and match more constants! Hooray. llvm-svn: 47121	2008-02-14 08:57:00 +00:00
Dan Gohman	9ca025f1dc	Assigning an APInt to 0 with plain assignment gives it a one-bit size. Initialize these APInts to properly-sized zero values. llvm-svn: 47099	2008-02-13 23:07:24 +00:00
Dan Gohman	e1d9ee66ed	Simplify some logic in ComputeMaskedBits. And change ComputeMaskedBits to pass the mask APInt by value, not by reference. llvm-svn: 47096	2008-02-13 22:28:48 +00:00
Dan Gohman	f990faf23b	Convert SelectionDAG::ComputeMaskedBits to use APInt instead of uint64_t. Add an overload that supports the uint64_t interface for use by clients that haven't been updated yet. llvm-svn: 47039	2008-02-13 00:35:47 +00:00
Nate Begeman	8ef50214f0	SSE4.1 64b integer insert/extract pattern support Move formats into the formats file llvm-svn: 47035	2008-02-12 22:51:28 +00:00
Evan Cheng	4d8c98b8f9	Unbreak various insert_vector_elt and extract_vector_elt tests in presence of SSE4. llvm-svn: 47001	2008-02-12 07:59:45 +00:00
Nate Begeman	2d77e8e446	Enable SSE4 codegen and pattern matching. Add some notes to the README. llvm-svn: 46949	2008-02-11 04:19:36 +00:00
Dan Gohman	3a4be0fdef	Rename MRegisterInfo to TargetRegisterInfo. llvm-svn: 46930	2008-02-10 18:45:23 +00:00
Dale Johannesen	36c2967d89	64-bit (MMX) vectors do not need restrictive alignment. 128-bit vectors need it only when SSE is on. llvm-svn: 46890	2008-02-08 19:48:20 +00:00
Dan Gohman	7a55a94ba1	Avoid needlessly casting away const qualifiers. llvm-svn: 46877	2008-02-08 03:29:40 +00:00
Dan Gohman	16d4bc3dc0	Follow Chris' suggestion; change the PseudoSourceValue accessors to return pointers instead of references, since this is always what is needed. llvm-svn: 46857	2008-02-07 18:41:25 +00:00
Dan Gohman	63a8452e9c	Add SourceValue information for outgoing argument stores on x86. llvm-svn: 46854	2008-02-07 16:28:05 +00:00
Dan Gohman	2d489b5081	Re-apply the memory operand changes, with a fix for the static initializer problem, a minor tweak to the way the DAGISelEmitter finds load/store nodes, and a renaming of the new PseudoSourceValue objects. llvm-svn: 46827	2008-02-06 22:27:42 +00:00
Dale Johannesen	d88f1d060e	Implement sseregparm. llvm-svn: 46764	2008-02-05 20:46:33 +00:00
Nick Lewycky	f5b9938ef6	Don't use uninitialized values. Fixes vec_align.ll on X86 Linux. llvm-svn: 46666	2008-02-02 08:29:58 +00:00
Evan Cheng	efd142a920	SDIsel processes llvm.dbg.declare by recording the variable debug information descriptor and its corresponding stack frame index in MachineModuleInfo. This only works if the local variable is "homed" in the stack frame. It does not work for byval parameter, etc. Added ISD::DECLARE node type to represent llvm.dbg.declare intrinsic. Now the intrinsic calls are lowered into a SDNode and lives on through out the codegen passes. For now, since all the debugging information recording is done at isel time, when a ISD::DECLARE node is selected, it has the side effect of also recording the variable. This is a short term solution that should be fixed in time. llvm-svn: 46659	2008-02-02 04:07:54 +00:00
Evan Cheng	27b32b87ed	Revert 46556 and 46585. Dan please fix the PseudoSourceValue problem and re-commit. llvm-svn: 46623	2008-01-31 21:00:00 +00:00
Dan Gohman	ed346f2ed5	Avoid unnecessarily casting away const. llvm-svn: 46590	2008-01-31 01:01:48 +00:00
Dan Gohman	9ba4d76816	Rename ISD::FLT_ROUNDS to ISD::FLT_ROUNDS_ to avoid conflicting with the real FLT_ROUNDS (defined in <float.h>). llvm-svn: 46587	2008-01-31 00:41:03 +00:00
Dan Gohman	3646fdda67	Create a new class, MemOperand, for describing memory references in the backend. Introduce a new SDNode type, MemOperandSDNode, for holding a MemOperand in the SelectionDAG IR, and add a MemOperand list to MachineInstr, and code to manage them. Remove the offset field from SrcValueSDNode; uses of SrcValueSDNode that were using it are all all using MemOperandSDNode now. Also, begin updating some getLoad and getStore calls to use the PseudoSourceValue objects. Most of this was written by Florian Brander, some reorganization and updating to TOT by me. llvm-svn: 46585	2008-01-31 00:25:39 +00:00
Evan Cheng	29cfb67e28	Even though InsertAtEndOfBasicBlock is an ugly hack it still deserves a proper name. Rename it to EmitInstrWithCustomInserter since it does not necessarily insert instruction at the end. llvm-svn: 46562	2008-01-30 18:18:23 +00:00
Evan Cheng	084a1cdcdd	Work in progress. This patch fixes x86-64 calls which are modelled as StructRet but really should be return in registers, e.g. _Complex long double, some 128-bit aggregates. This is a short term solution that is necessary only because llvm, for now, cannot model i128 nor call's with multiple results. Status: This only works for direct calls, and only the caller side is done. Disabled for now. llvm-svn: 46527	2008-01-29 19:34:22 +00:00
Dale Johannesen	2b3bc30420	Handle 'X' constraint in asm's better. llvm-svn: 46485	2008-01-29 02:21:21 +00:00
Chris Lattner	d05d2011d0	Use fldz and fld1 for long double constants instead of a constant pool load. llvm-svn: 46411	2008-01-27 06:19:31 +00:00
Chris Lattner	250789f1bd	Remove some code for inferring alignment info from the x86 backend now that the dag combiner does it. llvm-svn: 46404	2008-01-26 20:07:42 +00:00
Chris Lattner	f4523c35cb	optimize fxor like for llvm-svn: 46345	2008-01-25 06:14:17 +00:00
Chris Lattner	84ab724e06	Add target-specific dag combines for FAND(x,0) and FOR(x,0). This allows us to compile: double test(double X) { return copysign(0.0, X); } into: _test: andpd LCPI1_0(%rip), %xmm0 ret instead of: _test: pxor %xmm1, %xmm1 andpd LCPI1_0(%rip), %xmm1 movapd %xmm0, %xmm2 andpd LCPI1_1(%rip), %xmm2 movapd %xmm1, %xmm0 orpd %xmm2, %xmm0 ret llvm-svn: 46344	2008-01-25 05:46:26 +00:00
Chris Lattner	a91f77eaac	Significantly simplify and improve handling of FP function results on x86-32. This case returns the value in ST(0) and then has to convert it to an SSE register. This causes significant codegen ugliness in some cases. For example in the trivial fp-stack-direct-ret.ll testcase we used to generate: _bar: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret because we move the result of foo() into an XMM register, then have to move it back for the return of bar. Instead of hacking ever-more special cases into the call result lowering code we take a much simpler approach: on x86-32, fp return is modeled as always returning into an f80 register which is then truncated to f32 or f64 as needed. Similarly for a result, we model it as an extension to f80 + return. This exposes the truncate and extensions to the dag combiner, allowing target independent code to hack on them, eliminating them in this case. This gives us this code for the example above: _bar: subl $12, %esp call L_foo$stub addl $12, %esp ret The nasty aspect of this is that these conversions are not legal, but we want the second pass of dag combiner (post-legalize) to be able to hack on them. To handle this, we lie to legalize and say they are legal, then custom expand them on entry to the isel pass (PreprocessForFPConvert). This is gross, but less gross than the code it is replacing :) This also allows us to generate better code in several other cases. For example on fp-stack-ret-conv.ll, we now generate: _test: subl $12, %esp call L_foo$stub fstps 8(%esp) movl 16(%esp), %eax cvtss2sd 8(%esp), %xmm0 movsd %xmm0, (%eax) addl $12, %esp ret where before we produced (incidentally, the old bad code is identical to what gcc produces): _test: subl $12, %esp call L_foo$stub fstpl (%esp) cvtsd2ss (%esp), %xmm0 cvtss2sd %xmm0, %xmm0 movl 16(%esp), %eax movsd %xmm0, (%eax) addl $12, %esp ret Note that we generate slightly worse code on pr1505b.ll due to a scheduling deficiency that is unrelated to this patch. llvm-svn: 46307	2008-01-24 08:07:48 +00:00
Evan Cheng	35abd840a6	Let each target decide byval alignment. For X86, it's 4-byte unless the aggregare contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type. llvm-svn: 46286	2008-01-23 23:17:41 +00:00
Duncan Sands	95d46ef887	The last pieces needed for loading arbitrary precision integers. This won't actually work (and most of the code is dead) unless the new legalization machinery is turned on. While there, I rationalized the handling of i1, and removed some bogus (and unused) sextload patterns. For i1, this could result in microscopically better code for some architectures (not X86). It might also result in worse code if annotating with AssertZExt nodes turns out to be more harmful than helpful. llvm-svn: 46280	2008-01-23 20:39:46 +00:00
Chris Lattner	1ea55cf816	This commit changes: 1. Legalize now always promotes truncstore of i1 to i8. 2. Remove patterns and gunk related to truncstore i1 from targets. 3. Rename the StoreXAction stuff to TruncStoreAction in TLI. 4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions. 5. Mark a wide variety of invalid truncstores as such in various targets, e.g. X86 currently doesn't support truncstore of any of its integer types. 6. Add legalize support for truncstores with invalid value input types. 7. Add a dag combine transform to turn store(truncate) into truncstore when safe. The later allows us to compile CodeGen/X86/storetrunc-fp.ll to: _foo: fldt 20(%esp) fldt 4(%esp) faddp %st(1) movl 36(%esp), %eax fstps (%eax) ret instead of: _foo: subl $4, %esp fldt 24(%esp) fldt 8(%esp) faddp %st(1) fstps (%esp) movl 40(%esp), %eax movss (%esp), %xmm0 movss %xmm0, (%eax) addl $4, %esp ret llvm-svn: 46140	2008-01-17 19:59:44 +00:00
Chris Lattner	72733e573b	* Introduce a new SelectionDAG::getIntPtrConstant method and switch various codegen pieces and the X86 backend over to using it. * Add some comments to SelectionDAGNodes.h * Introduce a second argument to FP_ROUND, which indicates whether the FP_ROUND changes the value of its input. If not it is safe to xform things like fp_extend(fp_round(x)) -> x. llvm-svn: 46125	2008-01-17 07:00:52 +00:00
Duncan Sands	32b0ff6814	Trampoline support for x86-64. This looks like it should work, but I have no machine to test it on. Committed because it will at least cause no harm, and maybe someone can test it for me! llvm-svn: 46098	2008-01-16 22:55:25 +00:00
Chris Lattner	e8bb9f2190	make it more clear that this predicate only applies to scalar FP types. llvm-svn: 46058	2008-01-16 06:24:21 +00:00

1 2 3 4 5 ...

588 Commits