llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	a91f77eaac	Significantly simplify and improve handling of FP function results on x86-32. This case returns the value in ST(0) and then has to convert it to an SSE register. This causes significant codegen ugliness in some cases. For example in the trivial fp-stack-direct-ret.ll testcase we used to generate: _bar: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret because we move the result of foo() into an XMM register, then have to move it back for the return of bar. Instead of hacking ever-more special cases into the call result lowering code we take a much simpler approach: on x86-32, fp return is modeled as always returning into an f80 register which is then truncated to f32 or f64 as needed. Similarly for a result, we model it as an extension to f80 + return. This exposes the truncate and extensions to the dag combiner, allowing target independent code to hack on them, eliminating them in this case. This gives us this code for the example above: _bar: subl $12, %esp call L_foo$stub addl $12, %esp ret The nasty aspect of this is that these conversions are not legal, but we want the second pass of dag combiner (post-legalize) to be able to hack on them. To handle this, we lie to legalize and say they are legal, then custom expand them on entry to the isel pass (PreprocessForFPConvert). This is gross, but less gross than the code it is replacing :) This also allows us to generate better code in several other cases. For example on fp-stack-ret-conv.ll, we now generate: _test: subl $12, %esp call L_foo$stub fstps 8(%esp) movl 16(%esp), %eax cvtss2sd 8(%esp), %xmm0 movsd %xmm0, (%eax) addl $12, %esp ret where before we produced (incidentally, the old bad code is identical to what gcc produces): _test: subl $12, %esp call L_foo$stub fstpl (%esp) cvtsd2ss (%esp), %xmm0 cvtss2sd %xmm0, %xmm0 movl 16(%esp), %eax movsd %xmm0, (%eax) addl $12, %esp ret Note that we generate slightly worse code on pr1505b.ll due to a scheduling deficiency that is unrelated to this patch. llvm-svn: 46307	2008-01-24 08:07:48 +00:00
Evan Cheng	4951da49aa	Fix an x86-64 static codegen bug. This fixes a lot of x86-64 jit failures. llvm-svn: 45733	2008-01-08 02:06:11 +00:00
Evan Cheng	f55b7381af	Combine MovePCtoStack + POP32r into one instruction MOVPC32r so it can be moved if needed. llvm-svn: 45605	2008-01-05 00:41:47 +00:00
Chris Lattner	a10fff51d9	Rename SSARegMap -> MachineRegisterInfo in keeping with the idea that "machine" classes are used to represent the current state of the code being compiled. Given this expanded name, we can start moving other stuff into it. For now, move the UsedPhysRegs and LiveIn/LiveOuts vectors from MachineFunction into it. Update all the clients to match. This also reduces some needless #includes, such as MachineModuleInfo from MachineFunction. llvm-svn: 45467	2007-12-31 04:13:23 +00:00
Chris Lattner	f3ebc3f3d2	Remove attribution from file headers, per discussion on llvmdev. llvm-svn: 45418	2007-12-29 20:36:04 +00:00
Evan Cheng	f4f52dbc8c	Fix JIT code emission of X86::MovePCtoStack. llvm-svn: 45307	2007-12-22 02:26:46 +00:00
Evan Cheng	827d30db19	Fold some and + shift in x86 addressing mode. llvm-svn: 44970	2007-12-13 00:43:27 +00:00
Chris Lattner	ff87f05e43	aesthetic changes, no functionality change. Evan, it's not clear what 'Available' is, please add a comment near it and rename it if appropriate. llvm-svn: 44703	2007-12-08 07:22:58 +00:00
Chris Lattner	5728bdd4db	Fix a long standing deficiency in the X86 backend: we would sometimes emit "zero" and "all one" vectors multiple times, for example: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 pcmpeqd %mm0, %mm0 movq %mm0, _M2 ret instead of: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 movq %mm0, _M2 ret This patch fixes this by always arranging for zero/one vectors to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be any random type. This ensures they get trivially CSE'd on the dag. This fix is also important for LegalizeDAGTypes, as it gets unhappy when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when 'i64' isn't legal. This patch makes the following changes: 1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into their canonical types. 2) The now-dead patterns are removed from the SSE/MMX .td files. 3) All the patterns in the .td file that referred to immAllOnesV or immAllZerosV in the wrong form now use *_bc to match them with a bitcast wrapped around them. 4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle bitcast'd zero vectors, which simplifies the code actually. 5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that is legal, instead of generating one that is illegal and expecting a later legalize pass to clean it up. 6) isZeroShuffle is generalized to handle bitcast of zeros. 7) several other minor tweaks. This patch is definite goodness, but has the potential to cause random code quality regressions. Please be on the lookout for these and let me know if they happen. llvm-svn: 44310	2007-11-25 00:24:49 +00:00
Bill Wendling	b7cabbe295	Silence, accursed warning. llvm-svn: 43609	2007-11-01 08:51:44 +00:00
Dan Gohman	bf474959a3	Fix the folding of multiplication into addresses on x86, which was broken by the recent {U,S}MUL_LOHI changes. llvm-svn: 43230	2007-10-22 20:22:24 +00:00
Evan Cheng	f8c23f074b	Flag MOV32to32_ with EXTRACT_SUBREG. They should not be scheduled apart. llvm-svn: 42894	2007-10-12 07:55:53 +00:00
Dan Gohman	51554bf30e	Fix grammar in a comment. llvm-svn: 42786	2007-10-09 15:44:37 +00:00
Dan Gohman	a160361c85	Migrate X86 and ARM from using X86ISD::{,I}DIV and ARMISD::MULHILO{U,S} to use ISD::{S,U}DIVREM and ISD::{S,U}MUL_LOHI. Move the lowering code associated with these operators into target-independent code in LegalizeDAG.cpp and TargetLowering.cpp. llvm-svn: 42762	2007-10-08 18:33:35 +00:00
Anton Korobeynikov	90910745bb	Partly revert invalid r41774 llvm-svn: 42322	2007-09-25 21:52:30 +00:00
Dan Gohman	31599685c7	When both x/y and x%y are needed (x and y both scalar integer), compute both results with a single div or idiv instruction. This uses new X86ISD nodes for DIV and IDIV which are introduced during the legalize phase so that the SelectionDAG's CSE can automatically eliminate redundant computations. llvm-svn: 42308	2007-09-25 18:23:27 +00:00
Dale Johannesen	0241bb57b2	When mixing SSE and x87 codegen, it's possible to have situations where an SSE instruction turns into multiple blocks, with the live range of an x87 register crossing them. To do this correctly make sure we examine all blocks when inserting FP_REG_KILL. PR 1697. (This was exposed by my fix for PR 1681, but the same thing could happen mixing x87 long double with SSE.) llvm-svn: 42281	2007-09-24 22:52:39 +00:00
Evan Cheng	cef2c0efcc	TableGen no longer emits CopyFromReg nodes for implicit results in physical registers. The scheduler is now responsible for emitting them. llvm-svn: 41781	2007-09-07 23:59:02 +00:00
Dale Johannesen	9e70086c8f	Apply feedback from previous patch. llvm-svn: 41774	2007-09-07 21:07:57 +00:00
Dale Johannesen	3cf889f75e	Enhance APFloat to retain bits of NaNs (fixes oggenc). Use APFloat interfaces for more references, mostly of ConstantFPSDNode. llvm-svn: 41632	2007-08-31 04:03:46 +00:00
Dan Gohman	ccb3611881	When x86 address matching exceeds its recursion limit, check to see if the base register is already occupied before assuming it can be used. This fixes bogus code generation in the accompanying testcase. llvm-svn: 41049	2007-08-13 20:03:06 +00:00
Christopher Lamb	44e79f8aba	Use subregs to improve any_extend code generation when feasible. llvm-svn: 41013	2007-08-10 22:22:41 +00:00
Christopher Lamb	b372abab14	Increase efficiency of sign_extend_inreg by using subregisters for truncation. As the README suggests sign_extend_subreg is selected to (sext(trunc)). llvm-svn: 41010	2007-08-10 21:48:46 +00:00
Evan Cheng	e32e923a6a	divb / mulb outputs to ah. Under x86-64 it's not legal to read ah if the instruction requires a rex prefix (i.e. outputs to r8b, etc.). So issue shift right by 8 on AX and then truncate it to 8 bits instead. llvm-svn: 40972	2007-08-09 21:59:35 +00:00
Dale Johannesen	a47f7d7cfd	Long double patch 8 of N: make it partially work in SSE mode (all but conversions <-> other FP types, I think): >>Do not mark all-80-bit operations as "Requires[FPStack]" (which really means "not SSE"). >>Refactor load-and-extend to facilitate this. >>Update comments. >>Handle long double in SSE when computing FP_REG_KILL. llvm-svn: 40906	2007-08-07 20:29:26 +00:00
Dale Johannesen	75169a82d6	Get X86 long double calling convention to work (on Darwin, anyway). Fix some table omissions for LD arithmetic. llvm-svn: 40877	2007-08-06 21:31:06 +00:00
Evan Cheng	473c5111c3	Switch some multiplication instructions over to the new scheme for testing. llvm-svn: 40723	2007-08-02 05:48:35 +00:00
Evan Cheng	763cdfd371	Mac OS X X86-64 low 4G address not available. llvm-svn: 40701	2007-08-01 23:45:51 +00:00
Christopher Lamb	5fecb80efa	Change the x86 backend to use extract_subreg for truncation operations. Passes DejaGnu, SingleSource and MultiSource. llvm-svn: 40578	2007-07-29 01:24:57 +00:00
Evan Cheng	ca6e041903	Minor bug. llvm-svn: 40535	2007-07-26 17:02:45 +00:00
Evan Cheng	ce5185b181	Same goes for constantpool, etc. llvm-svn: 40517	2007-07-26 07:35:15 +00:00
Evan Cheng	630c1f75b8	Mac OS X x86-64 lower 4G address is not available. llvm-svn: 40502	2007-07-25 23:41:36 +00:00
Dan Gohman	f0bb12848f	Add const to CanBeFoldedBy, CheckAndMask, and CheckOrMask. llvm-svn: 40480	2007-07-24 23:00:27 +00:00
Dale Johannesen	a2b3c175db	Fix for PR 1505 (and 1489). Rewrite X87 register model to include f32 variants. Some factoring improvements forthcoming. llvm-svn: 37847	2007-07-03 00:53:03 +00:00
Dan Gohman	309d3d51b3	Move ComputeMaskedBits, MaskedValueIsZero, and ComputeNumSignBits from TargetLowering to SelectionDAG so that they have more convenient access to the current DAG, in preparation for the ValueType routines being changed from standalone functions to members of SelectionDAG for the pre-legalize vector type changes. llvm-svn: 37704	2007-06-22 14:59:07 +00:00
Chris Lattner	a5fcd24746	Fix CodeGen/X86/2007-03-24-InlineAsmPModifier.ll llvm-svn: 35926	2007-04-11 22:29:46 +00:00
Anton Korobeynikov	0ad22563b8	Oops :) llvm-svn: 35438	2007-03-28 18:38:33 +00:00
Anton Korobeynikov	7522c9d8e1	Don't allow MatchAddress to recurse too much. This trims exponential behaviour in some cases. llvm-svn: 35437	2007-03-28 18:36:33 +00:00
Chris Lattner	3e1d917e80	Two changes: 1) codegen a shift of a register as a shift, not an LEA. 2) teach the RA to convert a shift to an LEA instruction if it wants something in three-address form. This gives us asm diffs like: - leal (,%eax,4), %eax + shll $2, %eax which is faster on some processors and smaller on all of them. and, more interestingly: - movl 24(%esi), %eax - leal (,%eax,4), %edi + movl 24(%esi), %edi + shll $2, %edi Without #2, #1 was a significant pessimization in some cases. This implements CodeGen/X86/shift-codegen.ll llvm-svn: 35204	2007-03-20 06:08:29 +00:00
Chris Lattner	fe8c530d79	Fix a miscompilation in the addr mode code trying to implement X | C and X + C to promote LEA formation. We would incorrectly apply it in some cases (test) and miss it in others. This fixes CodeGen/X86/2007-02-04-OrAddrMode.ll llvm-svn: 33884	2007-02-04 20:18:17 +00:00
Evan Cheng	1281dc32ef	Linux GOT indirect reference is only necessary in PIC mode. llvm-svn: 33441	2007-01-22 21:34:25 +00:00
Reid Spencer	015b432b54	Adjust #includes to compensate for loss of DerivedTypes.h in TargetLowering.h llvm-svn: 33154	2007-01-12 23:22:14 +00:00
Anton Korobeynikov	a0554d90e8	* PIC codegen for X86/Linux has been implemented * PIC-aware internal structures in X86 Codegen have been refactored * Visibility (default/weak) has been added * Docs fixes (external weak linkage, visibility, formatting) llvm-svn: 33136	2007-01-12 19:20:47 +00:00
Anton Korobeynikov	4efbbc963f	Really big cleanup. - New target type "mingw" was introduced - Same things for both mingw & cygwin are marked as "cygming" (as in gcc) - .lcomm is supported here, so allow LLVM to use it - Correctly use underscored versions of setjmp & _longjmp for both mingw & cygwin llvm-svn: 32833	2007-01-03 11:43:14 +00:00
Chris Lattner	1ef9cd400d	eliminate static ctors for Statistic objects. llvm-svn: 32703	2006-12-19 22:59:26 +00:00
Evan Cheng	582ac4bed7	Fix for PR1062 by Dan Gohman. llvm-svn: 32688	2006-12-19 21:31:42 +00:00
Bill Wendling	9bfb1e1f29	What should be the last unnecessary <iostream>s in the library. llvm-svn: 32333	2006-12-07 22:21:48 +00:00
Chris Lattner	700b873130	Detemplatize the Statistic class. The only type it is instantiated with is 'unsigned'. llvm-svn: 32279	2006-12-06 17:46:33 +00:00
Evan Cheng	47e181cc4d	Revert an unintended change. llvm-svn: 32239	2006-12-05 22:03:40 +00:00
Evan Cheng	dd60ca029c	- Switch X86-64 JIT to large code size model. - Re-enable some codegen niceties for X86-64 static relocation model codegen. - Clean ups, etc. llvm-svn: 32238	2006-12-05 19:50:18 +00:00
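
As a rough illustration of the source pattern that commit 31599685c7 (2007-09-25) in the list above targets, the sketch below requests both x/y and x%y on the same scalar integer operands. The `DivRem` struct and `div_rem` function are hypothetical names made up for this example and are not part of LLVM; whether a given compiler actually emits a single div or idiv for it depends on the target and optimization level.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type (not from LLVM): holds both results of one division.
struct DivRem {
    int32_t quot;  // x / y, truncated toward zero
    int32_t rem;   // x % y, takes the sign of x
};

// Both the quotient and the remainder of the same operands are needed here.
// This is the shape of code the commit message describes: a backend that
// shares the computation can produce both values from one divide instruction.
// The caller must ensure y != 0 and avoid INT32_MIN / -1.
DivRem div_rem(int32_t x, int32_t y) {
    return DivRem{x / y, x % y};
}
```

A call such as `div_rem(17, 5)` yields a quotient of 3 and a remainder of 2; the x86 idiv instruction leaves these in two different registers, which is why one instruction can serve both expressions.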

184 Commits