llvm-project

Commit Graph

Author	SHA1	Message	Date
Manman Ren	a09820414a	X86: add GATHER intrinsics (AVX2) in LLVM Support the following intrinsics: llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256 llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256 Modified Disassembler to handle VSIB addressing mode. llvm-svn: 159221	2012-06-26 19:47:59 +00:00
Craig Topper	a4fd6d655a	Tidy up spacing. llvm-svn: 157313	2012-05-23 05:44:51 +00:00
Evan Cheng	58a95f0c8a	Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474 llvm-svn: 156896	2012-05-16 01:54:27 +00:00
Evan Cheng	3e869f002c	Generalize r153635 to deal with TokenFactor chains; also clean up the logic and fix the tests. rdar://11069732, rdar://11236106 llvm-svn: 154604	2012-04-12 19:14:21 +00:00
Chandler Carruth	3779ac10b4	Cleanup and relax a restriction on the matching of global offsets into x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is not using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304	2012-04-09 02:13:06 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Benjamin Kramer	8619c37b5b	Replace assert(0) with llvm_unreachable to avoid warnings about dropping off the end of a non-void function in Release builds. llvm-svn: 153643	2012-03-29 12:37:26 +00:00
Joel Jones	68d59e8a90	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153635	2012-03-29 05:45:48 +00:00
Joel Jones	b474099e63	Reverted to revision 153616 to unblock build llvm-svn: 153623	2012-03-29 01:20:56 +00:00
Joel Jones	b88c81fe0f	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153617	2012-03-29 00:37:47 +00:00
Craig Topper	1fcf5bcae1	Prune some includes llvm-svn: 153502	2012-03-27 07:54:11 +00:00
Craig Topper	f6e7e12f75	Remove unnecessary llvm:: qualifications llvm-svn: 153500	2012-03-27 07:21:54 +00:00
Craig Topper	b25fda95f6	Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations. llvm-svn: 152997	2012-03-17 18:46:09 +00:00
Craig Topper	2dac962864	Use uint16_t to store opcodes in static tables in X86 backend. llvm-svn: 152391	2012-03-09 07:45:21 +00:00
Craig Topper	cc830f8cda	Declare register classes as const. Fix a couple pointers to register classes that weren't already const. llvm-svn: 151138	2012-02-22 07:28:11 +00:00
Jakob Stoklund Olesen	97e3115dc2	Use the same CALL instructions for Windows as for everything else. The different calling conventions and call-preserved registers are represented with regmask operands that are added dynamically. llvm-svn: 150708	2012-02-16 17:56:02 +00:00
Pete Cooper	c21ebf5c41	Stop custom lowering forr x86 DEC64m from happening if the load in the lowered sequence has more than 1 user llvm-svn: 150537	2012-02-15 00:33:37 +00:00
Pete Cooper	71be57bb32	Fixed bug when custom lowering DEC64m on x86. If the DEC node had more than one user, it was doing this lowering but leaving the original DEC node around and so decrementing twice. Fixes PR11964. llvm-svn: 150356	2012-02-13 00:10:03 +00:00
David Blaikie	46a9f016c5	More dead code removal (using -Wunreachable-code) llvm-svn: 148578	2012-01-20 21:51:11 +00:00
Chandler Carruth	eb21da060b	Switch all of the uses of my InsertDAGNode helper to follow the exact same pattern. We already had this pattern is a few places, but others tried to make a rough approximation of an actual DAG structure. As not everywhere went to this trouble, nothing could rely on this being done. In fact, I've checked all references to these node Ids, and the ones that are using the topo-sort properties are actually satisfied with a strict-weak-ordering. The requirement appears to be that Use >= Def. I've added a big blurb of comments to this bit of the transform to clarify why the order is so important for the next reader of the code. I'm starting with this change as it is very small, and trivially reverted if something breaks or the >= above really does need to be >. If that proves the case, we can hide the problem by reverting this patch, but the problem exists elsewhere as well, and so a more comprehensive solution will be needed. llvm-svn: 148001	2012-01-12 01:34:44 +00:00
Chandler Carruth	3212a34269	Revert r147945 which disabled an addressing mode transformation. I had hoped this would revive one of the llvm-gcc selfhost build bots, but it didn't so it doesn't appear that my transform is the culprit. If anyone else is seeing failures, please let me know! llvm-svn: 147957	2012-01-11 18:36:12 +00:00
Chandler Carruth	9bc48e5215	Disable the transformation I added in r147936 to see if it fixes some strange build bot failures that look like a miscompile into an infloop. I'll investigate this tomorrow, but I'd both like to know whether my patch is the culprit, and get the bots back to green. llvm-svn: 147945	2012-01-11 12:17:47 +00:00
Chandler Carruth	3eacfb83fa	Hoist a really redundant code pattern into a helper function, and delete lots of lines of code. No functionality changed. llvm-svn: 147942	2012-01-11 11:04:36 +00:00
Chandler Carruth	b0049f4a43	Simplify the AND-rooted mask+shift checking code to match that of the SRL-rooted code. llvm-svn: 147941	2012-01-11 09:35:04 +00:00
Chandler Carruth	3dbcda8478	Unify the interface of the three mask+shift transform helpers, and factor the differences that were hiding in one of them into its other caller, the SRL handling code. No change in behavior. llvm-svn: 147940	2012-01-11 09:35:02 +00:00
Chandler Carruth	aa01e6661a	Clarify and make explicit some of the requirements for transforming mask+shift pairs at the beginning of the ISD::AND case block, and then hoist the final pattern into a helper function, simplifying and reflowing it appropriately. This should have no observable behavior change, but several simplifications fell out of this such as directly computing the new mask constant, etc. llvm-svn: 147939	2012-01-11 09:35:00 +00:00
Chandler Carruth	51d3076bbf	Hoist the logic to transform shift+mask combinations into sub-register extracts and scaled addressing modes into its own helper function. No functionality changed here, just hoisting and layout fixes falling out of that hoisting. llvm-svn: 147937	2012-01-11 08:48:20 +00:00
Chandler Carruth	55b2cdee26	Teach the X86 instruction selection to do some heroic transforms to detect a pattern which can be implemented with a small 'shl' embedded in the addressing mode scale. This happens in real code as follows: unsigned x = my_accelerator_table[input >> 11]; Here we have some lookup table that we look into using the high bits of 'input'. Each entity in the table is 4-bytes, which means this implicitly gets turned into (once lowered out of a GEP): (unsigned)((char)my_accelerator_table + ((input >> 11) << 2)); The shift right followed by a shift left is canonicalized to a smaller shift right and masking off the low bits. That hides the shift right which x86 has an addressing mode designed to support. We now detect masks of this form, and produce the longer shift right followed by the proper addressing mode. In addition to saving a (rather large) instruction, this also reduces stalls in Intel chips on benchmarks I've measured. In order for all of this to work, one part of the DAG needs to be canonicalized still further* than it currently is. This involves removing pointless 'trunc' nodes between a zextload and a zext. Without that, we end up generating spurious masks and hiding the pattern. llvm-svn: 147936	2012-01-11 08:41:08 +00:00
Chandler Carruth	c16622daff	Don't rely on the fact that shift values are never very large, and thus this substraction will result in small negative numbers at worst which become very large positive numbers on assignment and are thus caught by the <=4 check on the next line. The >0 check clearly intended to catch these as negative numbers. Spotted by inspection, and impossible to trigger given the shift widths that can be used. llvm-svn: 147773	2012-01-09 09:47:25 +00:00
Pete Cooper	48784ed5b7	Added missing comment about new custom lowering of DEC64 llvm-svn: 144811	2011-11-16 19:03:23 +00:00
Pete Cooper	7c7ba1baa1	Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used by later instructions. Only done for DEC64m right now. Fixes <rdar://problem/6172640> llvm-svn: 144705	2011-11-15 21:57:53 +00:00
Dan Gohman	198b7ffc11	Reapply r143206, with fixes. Disallow physical register lifetimes across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660	2011-11-03 21:49:52 +00:00
Dan Gohman	9b9c970148	Revert r143206, as there are still some failing tests. llvm-svn: 143262	2011-10-29 00:41:52 +00:00
Dan Gohman	73057ad24f	Reapply r143177 and r143179 (reverting r143188), with scheduler fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206	2011-10-28 17:55:38 +00:00
Duncan Sands	225a7037d6	Speculatively disable Dan's commits 143177 and 143179 to see if it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188	2011-10-28 09:55:57 +00:00
Dan Gohman	4db3f7dd83	Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177	2011-10-28 01:29:32 +00:00
Jakob Stoklund Olesen	729abd360e	Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies. In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot target all GR8 registers, only those in GR8_NOREX. TO enforce this, we ensure that all instructions using the EXTRACT_SUBREG are GR8_NOREX constrained. This fixes PR11088. llvm-svn: 141499	2011-10-08 18:28:28 +00:00
Bruno Cardoso Lopes	616fe60548	Teach PreprocessISelDAG to be aware of vector types and to not process them. llvm-svn: 136653	2011-08-01 21:54:05 +00:00
Eli Friedman	344ec79715	Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile. <rdar://problem/9763308> llvm-svn: 135084	2011-07-13 21:29:53 +00:00
Eli Friedman	ef67e7d623	Refactor out checking for displacements on x86-64 addressing modes. No functionality change. Refactoring in preparation for an additional safety check in FoldOffsetIntoAddress. Part of <rdar://problem/9763308>. llvm-svn: 135079	2011-07-13 20:44:23 +00:00
Eric Christopher	a8a56f7e5c	TargetConstant immediates won't be placed into registers so tighten up the valid constant check earlier. rdar://9692967 llvm-svn: 134286	2011-07-01 23:04:38 +00:00
Eric Christopher	c932173773	Fix a small thinko for constant i64 lock/orq optimization where we we didn't have an opcode for 64-bit constant or expressions. Fixes rdar://9692967 llvm-svn: 134121	2011-06-30 00:48:30 +00:00
Stuart Hastings	91f1d24736	Re-commit 131641 with fixes; de-pseudoize MOVSX16rr8 and friends. rdar://problem/8614450 llvm-svn: 131746	2011-05-20 19:04:40 +00:00
Eric Christopher	56a42ebf15	Update comment. llvm-svn: 131459	2011-05-17 08:16:14 +00:00
Eric Christopher	a1d9e29552	Support XOR and AND optimization with no return value. Finishes off rdar://8470697 llvm-svn: 131458	2011-05-17 08:10:18 +00:00
Eric Christopher	abfe3131e3	Couple less magic numbers. llvm-svn: 131457	2011-05-17 07:50:41 +00:00
Eric Christopher	eb47a2a1e5	Make this code a little less magic number laden. llvm-svn: 131456	2011-05-17 07:47:55 +00:00
Eric Christopher	2a9dbbbb12	Turn this into a table, this will make more sense shortly. Part of rdar://8470697 llvm-svn: 131200	2011-05-11 21:44:58 +00:00
Eric Christopher	4a34e61e53	Optimize atomic lock or that doesn't use the result value. Next up: xor and and. Part of rdar://8470697 llvm-svn: 131171	2011-05-10 23:57:45 +00:00
Benjamin Kramer	3db054650b	Silence an overzealous uninitialized variable warning from GCC. llvm-svn: 130053	2011-04-23 08:21:06 +00:00
Benjamin Kramer	4c81624735	X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) & C1. (Part of PR5039) This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is uint64_t foo(uint64_t x) { return (x&1) << 42; } which used to compile into bloated code: shlq $42, %rdi ## encoding: [0x48,0xc1,0xe7,0x2a] movabsq $4398046511104, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00] andq %rdi, %rax ## encoding: [0x48,0x21,0xf8] ret ## encoding: [0xc3] with this patch we can fold the immediate into the and: andq $1, %rdi ## encoding: [0x48,0x83,0xe7,0x01] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] shlq $42, %rax ## encoding: [0x48,0xc1,0xe0,0x2a] ret ## encoding: [0xc3] It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing that without making this code even more complicated. See the TODOs in the code. llvm-svn: 129990	2011-04-22 15:30:40 +00:00
Stuart Hastings	81c4306005	Swap VT and DebugLoc operands of getExtLoad() for consistency with other getNode() methods. Radar 9002173. llvm-svn: 125665	2011-02-16 16:23:55 +00:00
Chris Lattner	46c01a30f4	Enhance ComputeMaskedBits to know that aligned frameindexes have their low bits set to zero. This allows us to optimize out explicit stack alignment code like in stack-align.ll:test4 when it is redundant. Doing this causes the code generator to start turning FI+cst into FI\|cst all over the place, which is general goodness (that is the canonical form) except that various pieces of the code generator don't handle OR aggressively. Fix this by introducing a new SelectionDAG::isBaseWithConstantOffset predicate, and using it in places that are looking for ADD(X,CST). The ARM backend in particular was missing a lot of addressing mode folding opportunities around OR. llvm-svn: 125470	2011-02-13 22:25:43 +00:00
NAKAMURA Takumi	f3e20b9f0f	lib/Target/X86/X86ISelDAGToDAG.cpp: __main should be WINCALL64 on Win64. CALL64 marks %xmm* as dead. llvm-svn: 124354	2011-01-27 03:20:19 +00:00
Chris Lattner	35a2e65bcb	fix PR8514, a bug where the "heroic" transformation of shift/and into and/shift would cause nodes to move around and a dangling pointer to happen. The code tried to avoid this with a HandleSDNode, but got the details wrong. llvm-svn: 123578	2011-01-16 08:48:11 +00:00
Ted Kremenek	b5241b2b59	'HiReg' is written but never read. Nuke its declaration and its assignments. Found by clang static analyzer. llvm-svn: 123486	2011-01-14 22:34:13 +00:00
Bill Wendling	81d40711f3	PR8918 - When used with MinGW64, LLVM generates a "calll __main" at the beginning of the "main" function. The assembler complains about the invalid suffix for the 'call' instruction. The right instruction is "callq __main". Patch by KS Sreeram! llvm-svn: 122933	2011-01-06 00:47:10 +00:00
Chris Lattner	3e5fbd74ed	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Chris Lattner	364bb0a081	it turns out that when ".with.overflow" intrinsics were added to the X86 backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935	2010-12-05 07:30:36 +00:00
Dale Johannesen	e660f4d072	Use a MemIntrinsicSDNode for ISD::PREFETCH, which touches memory, so a MachineMemOperand is useful (not propagated into the MachineInstr yet). No functional change except for dump output. llvm-svn: 117413	2010-10-26 23:11:10 +00:00
Chris Lattner	1a1c600110	Use #NAME# to have the CMOV multiclass define things with the same names as before (e.g. CMOVBE16rr instead of CMOVBErr16). llvm-svn: 115705	2010-10-05 23:00:14 +00:00
Chris Lattner	0067ee02f9	switch CMOVBE to the multipattern: 21 insertions(+), 53 deletions(-) Moar change coming before I switch the rest. llvm-svn: 115697	2010-10-05 22:23:58 +00:00
Eric Christopher	c1b3e072f4	Temporarily work around new address lowering while I figure out what needs to happen for darwin. llvm-svn: 114577	2010-09-22 20:42:08 +00:00
Chris Lattner	8a236b63d8	reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress. llvm-svn: 114529	2010-09-22 04:39:11 +00:00
Chris Lattner	a5156c30ed	convert the last 4 X86ISD nodes that should have memoperands to have them. llvm-svn: 114523	2010-09-22 01:28:21 +00:00
Chris Lattner	ed85da5600	give X86ISD::FNSTCW16m a memoperand, since it touches memory. It only can access the stack due to how it is generated though. llvm-svn: 114522	2010-09-22 01:11:26 +00:00
Chris Lattner	78f518b79b	give FP_TO_INT16_IN_MEM and friends a memoperand. They are only used with stack slots, but hey, lets be safe. llvm-svn: 114521	2010-09-22 01:05:16 +00:00
Chris Lattner	54e5329545	give VZEXT_LOAD a memory operand, it now works with segment registers. llvm-svn: 114515	2010-09-22 00:34:38 +00:00
Chris Lattner	07827ba978	revert r114386 now that address modes work correctly, we get a nice call through gs-relative memory now. llvm-svn: 114510	2010-09-22 00:11:31 +00:00
Chris Lattner	e479e9643b	give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257 llvm-svn: 114508	2010-09-21 23:59:42 +00:00
Chris Lattner	d58d7c1907	reimplement support for GS and FS relative address space matching by having X86DAGToDAGISel::SelectAddr get passed in the parent node of the operand match (the load/store/atomic op) and having it get the address space from that, instead of having special FS/GS addr mode operations that require duplicating the entire instruction set to support. This makes FS and GS relative accesses far more predictable and work much better. It also simplifies the X86 backend a bit, more to come. There is still a pending issue with nodes like ISD::PREFETCH and X86ISD::FLD, which really should be MemSDNode's but aren't. llvm-svn: 114491	2010-09-21 22:07:31 +00:00
Chris Lattner	0e023ea02a	fix a long standing wart: all the ComplexPattern's were being passed the root of the match, even though only a few patterns actually needed this (one in X86, several in ARM [which should be refactored anyway], and some in CellSPU that I don't feel like detangling). Instead of requiring all ComplexPatterns to take the dead root, have targets opt into getting the root by putting SDNPWantRoot on the ComplexPattern. llvm-svn: 114471	2010-09-21 20:31:19 +00:00
Chris Lattner	c6d8839a2b	even though I'm about to rip it out, simplify the address mode stuff llvm-svn: 114468	2010-09-21 19:41:58 +00:00
Chris Lattner	3d178ed4d4	propagate MachinePointerInfo through various uses of the old SelectionDAG::getExtLoad overload, and eliminate it. llvm-svn: 114446	2010-09-21 17:04:51 +00:00
Chris Lattner	bb0a1c44bf	fix rdar://8453210, a crash handling a call through a GS relative load. For now, just disable folding the load into the call. llvm-svn: 114386	2010-09-21 03:37:00 +00:00
Chris Lattner	65b48b5dfc	zap dead code. llvm-svn: 113073	2010-09-04 18:12:00 +00:00
Jakob Stoklund Olesen	08aede2538	Don't call Predicate_* from X86 target. llvm-svn: 112921	2010-09-03 00:35:18 +00:00
Benjamin Kramer	f1f2133ac0	Remove dead recursive function. Yay for clang -Wunused-function. llvm-svn: 112060	2010-08-25 17:27:58 +00:00
Eli Friedman	39d0f57cab	PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268	2010-08-04 22:40:58 +00:00
Chris Lattner	f469307c77	Change LEA to have 5 operands for its memory operand, just like all other instructions, even though a segment is not allowed. This resolves a bunch of gross hacks in the encoder and makes LEA more consistent with the rest of the instruction set. No functionality change. llvm-svn: 107934	2010-07-08 23:46:44 +00:00
Evan Cheng	1c349f18f8	Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency sake. llvm-svn: 107820	2010-07-07 22:15:37 +00:00
Devang Patel	a3ca21b228	Propagate debug loc. llvm-svn: 107710	2010-07-06 22:08:15 +00:00
Jakob Stoklund Olesen	d7d0d4e882	When creating X86 MUL8 and DIV8 instructions, make sure we don't produce CopyFromReg nodes for aliasing registers (AX and AL). This confuses the fast register allocator. Instead of CopyFromReg(AL), use ExtractSubReg(CopyFromReg(AX), sub_8bit). This fixes PR7312. llvm-svn: 106934	2010-06-26 00:39:23 +00:00
Dan Gohman	92c11acdb8	Change UpdateNodeOperands' operand and return value from SDValue to SDNode *, since it doesn't care about the ResNo value. llvm-svn: 106282	2010-06-18 15:30:29 +00:00
Dan Gohman	99ba4dac59	Don't maintain a set of deleted nodes; instead, use a HandleSDNode to track a node over CSE events. This fixes PR7368. llvm-svn: 106266	2010-06-18 01:24:29 +00:00
Eric Christopher	b0e1a458ce	Add first pass at darwin tls compiler support. llvm-svn: 105381	2010-06-03 04:07:48 +00:00
Jakob Stoklund Olesen	9340ea59e1	Rename X86 subregister indices to something shorter. Use the tablegen-produced enums. llvm-svn: 104493	2010-05-24 14:48:17 +00:00
Dan Gohman	0fd54fbbcf	Don't leave Base.FrameIndex uninitialized, so that it doesn't print randomly in debug output. llvm-svn: 102668	2010-04-29 23:30:41 +00:00
Evan Cheng	050df1b8de	Enable i16 to i32 promotion by default. llvm-svn: 102493	2010-04-28 08:30:49 +00:00
Chris Lattner	84776786a7	teach the x86 address matching stuff to handle (shl (or x,c), 3) the same as (shl (add x, c), 3) when x doesn't have any bits from c set. This finishes off PR1135. Before we compiled the block to: to: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) leaq 2(%rdx), %r9 movl %esi, (%rdi,%r9,4) leaq 1(%rdx), %r9 movl %esi, (%rdi,%r9,4) addq $3, %rdx movl %esi, (%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 Now we produce: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) movl %esi, 8(%rdi,%rdx,4) movl %esi, 4(%rdi,%rdx,4) movl %esi, 12(%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 llvm-svn: 101958	2010-04-20 23:18:40 +00:00
Dan Gohman	21cea8ac2e	Use const qualifiers with TargetLowering. This eliminates several const_casts, and it reinforces the design of the Target classes being immutable. SelectionDAGISel::IsLegalToFold is now a static member function, because PIC16 uses it in an unconventional way. There is more room for API cleanup here. And PIC16's AsmPrinter no longer uses TargetLowering. llvm-svn: 101635	2010-04-17 15:26:15 +00:00
Dan Gohman	bcaf681cde	Add const qualifiers to CodeGen's use of LLVM IR constructs. llvm-svn: 101334	2010-04-15 01:51:59 +00:00
Dan Gohman	c87b74d913	Delete unneeeded arguments. llvm-svn: 101276	2010-04-14 20:17:22 +00:00
Chris Lattner	6f306d7d30	use DebugLoc default ctor instead of DebugLoc::getUnknownLoc() llvm-svn: 100214	2010-04-02 20:16:16 +00:00
Evan Cheng	68333f5c6e	X86 address mode matching code MatchAddressRecursively does some aggressive hack which require doing a RAUW. It may end up deleting some SDNode up stream. It should avoid referencing deleted nodes. llvm-svn: 98780	2010-03-17 23:58:35 +00:00
Evan Cheng	d703df67ce	Do not force indirect tailcall through fixed registers: eax, r11. Add support to allow loads to be folded to tail call instructions. llvm-svn: 98465	2010-03-14 03:48:46 +00:00
Chris Lattner	82cc53388e	add a comment. llvm-svn: 97709	2010-03-04 01:43:43 +00:00
Chris Lattner	3fcbbd8673	factor the 'sign extended from 8 bit' patterns better so that they are not destination type specific. This allows tblgen to factor them and the type check is redundant with what the isel does anyway. llvm-svn: 97629	2010-03-03 01:45:01 +00:00
Chris Lattner	8d63704021	merge two loops over all nodes in the graph into one. llvm-svn: 97606	2010-03-02 23:12:51 +00:00
Chris Lattner	1eb6eb059c	eliminate PreprocessForRMW now that isel handles it. We still preprocess calls and fp return stuff. llvm-svn: 97598	2010-03-02 22:33:56 +00:00
Chris Lattner	dd030701bd	Fix some issues in WalkChainUsers dealing with CopyToReg/CopyFromReg/INLINEASM. These are annoying because they have the same opcode before an after isel. Fix this by setting their NodeID to -1 to indicate that they are selected, just like what automatically happens when selecting things that end up being machine nodes. With that done, give IsLegalToFold a new flag that causes it to ignore chains. This lets the HandleMergeInputChains routine be the one place that validates chains after a match is successful, enabling the new hotness in chain processing. This smarter chain processing eliminates the need for "PreprocessRMW" in the X86 and MSP430 backends and enables MSP to start matching it's multiple mem operand instructions more aggressively. I currently #if out the dead code in the X86 backend and MSP backend, I'll remove it for real in a follow-on patch. The testcase changes are: test/CodeGen/X86/sse3.ll: we generate better code test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was miscompiling this before, we now generate correct code Convert it to filecheck while I'm at it. test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem folding to make anton happy. :) llvm-svn: 97596	2010-03-02 22:20:06 +00:00
Chris Lattner	f98f124a73	Sink InstructionSelect() out of each target into SDISel, and rename it DoInstructionSelection. Inline "SelectRoot" into it from DAGISelHeader. Sink some other stuff out of DAGISelHeader into SDISel. Eliminate the various 'Indent' stuff from various targets, which dates to when isel was recursive. 17 files changed, 114 insertions(+), 430 deletions(-) llvm-svn: 97555	2010-03-02 06:34:30 +00:00
Chris Lattner	bd6e193f54	remove a little hack I did for the old isel, not needed now that it is gone. llvm-svn: 97516	2010-03-01 22:51:11 +00:00
Chris Lattner	55ef1ebe52	remove a terrible hack that disabled assertions from this file because of build time problems. rdar://7697850. llvm-svn: 97500	2010-03-01 21:20:46 +00:00
Chris Lattner	8d7b4393d2	no need to override IsLegalToFold, the base implementation disables load folding at -O0. llvm-svn: 96973	2010-02-23 19:33:11 +00:00
Chris Lattner	3c29aff9ff	fix and un-xfail X86/vec_ss_load_fold.ll llvm-svn: 96720	2010-02-21 04:53:34 +00:00
Chris Lattner	18a32ce0f3	rename SelectScalarSSELoad -> SelectScalarSSELoadXXX and rewrite it to follow the mode needed by the new isel. Instead of returning the input and output chains, it just returns the (currently only one, which is a silly limitation) node that has input and output chains. Since we want the old thing to still work, add a new SelectScalarSSELoad to emulate the old interface. The XXX suffix and the wrapper will eventually go away. llvm-svn: 96715	2010-02-21 03:17:59 +00:00
Chris Lattner	3f48215480	rename and document some arguments so I don't have to keep reverse engineering what they are. llvm-svn: 96456	2010-02-17 06:07:47 +00:00
Chris Lattner	afac7dad21	fix rdar://7653908, a crash on a case where we would fold a load into a roundss intrinsic, producing a cyclic dag. The root cause of this is badness handling ComplexPattern nodes in the old dagisel that I noticed through inspection. Eliminate a copy of the of the code that handled ComplexPatterns by making EmitChildMatchCode call into EmitMatchCode. llvm-svn: 96408	2010-02-16 22:35:06 +00:00
Evan Cheng	5e73ff2e3a	Split SelectionDAGISel::IsLegalAndProfitableToFold to IsLegalToFold and IsProfitableToFold. The generic version of the later simply checks whether the folding candidate has a single use. This allows the target isel routines more flexibility in deciding whether folding makes sense. The specific case we are interested in is folding constant pool loads with multiple uses. llvm-svn: 96255	2010-02-15 19:41:07 +00:00
David Greene	cbd39c5def	Remove an assumption of default arguments. This is in anticipation of a change to SelectionDAG build APIs. llvm-svn: 96239	2010-02-15 16:57:43 +00:00
Chris Lattner	2b0a7a2592	refactor the conditional jump instructions in the .td file to use a multipattern that generates both the 1-byte and 4-byte versions from the same defm llvm-svn: 95901	2010-02-11 19:25:55 +00:00
Chris Lattner	b06015aa69	move target-independent opcodes out of TargetInstrInfo into TargetOpcodes.h. #include the new TargetOpcodes.h into MachineInstr. Add new inline accessors (like isPHI()) to MachineInstr, and start using them throughout the codebase. llvm-svn: 95687	2010-02-09 19:54:29 +00:00
Dan Gohman	51ad99d2c5	Re-implement the main strength-reduction portion of LoopStrengthReduction. This new version is much more aggressive about doing "full" reduction in cases where it reduces register pressure, and also more aggressive about rewriting induction variables to count down (or up) to zero when doing so reduces register pressure. It currently uses fairly simplistic algorithms for finding reuse opportunities, but it introduces a new framework allows it to combine multiple strategies at once to form hybrid solutions, instead of doing all full-reduction or all base+index. llvm-svn: 94061	2010-01-21 02:09:26 +00:00
David Greene	0985160c54	When XDEBUG is enabled, check for SelectionDAG cycles at some key points. This will help us find future problems like the one described in PR6019. llvm-svn: 94019	2010-01-20 20:13:31 +00:00
David Greene	b0c0e6433f	Fix PR6019. A load has more than one use if it feeds a bitconvert that has more than one use. llvm-svn: 93576	2010-01-15 23:23:41 +00:00
Dan Gohman	c119580307	Reapply the MOV64r0 patch, with a fix: MOV64r0 clobbers EFLAGS. llvm-svn: 93229	2010-01-12 04:42:54 +00:00
Evan Cheng	7bdf339602	Revert 93158. It's breaking quite a few x86_64 tests. llvm-svn: 93185	2010-01-11 21:13:41 +00:00
Dan Gohman	3a55686345	Re-instate MOV64r0 and MOV16r0, with adjustments to work with the new AsmPrinter. This is perhaps less elegant than describing them in terms of MOV32r0 and subreg operations, but it allows the current register to rematerialize them. llvm-svn: 93158	2010-01-11 17:37:57 +00:00
David Greene	dbdb1b28b8	Change errs() to dbgs(). llvm-svn: 92647	2010-01-05 01:29:08 +00:00
Dan Gohman	ea6f91ff64	Change SelectCode's argument from SDValue to SDNode , to make it more clear what information these functions are actually using. This is also a micro-optimization, as passing a SDNode around is simpler than passing a { SDNode *, int } by value or reference. llvm-svn: 92564	2010-01-05 01:24:18 +00:00
Dan Gohman	85d4fdfe37	Flags-producing add, and, or, etc. have the same profibility rules as normal add, and, or, etc. llvm-svn: 92507	2010-01-04 20:51:50 +00:00
Chris Lattner	518b037620	completely eliminate the MOV16r0 'instruction'. The only interesting part of this is the divrem changes, which are already tested by CodeGen/X86/divrem.ll. llvm-svn: 91975	2009-12-23 01:45:04 +00:00
Evan Cheng	3dfd04e2b7	Re-apply 91623 now that I actually know what I was trying to do. llvm-svn: 91655	2009-12-18 01:59:21 +00:00
Jeffrey Yasskin	0de0ce11d8	Revert r91623 to unbreak the buildbots. llvm-svn: 91632	2009-12-17 22:44:34 +00:00
Evan Cheng	e43b403c87	Remove an unused option. llvm-svn: 91623	2009-12-17 21:23:58 +00:00
Dan Gohman	7a6611793f	Target-independent support for TargetFlags on BlockAddress operands, and support for blockaddresses in x86-32 PIC mode. llvm-svn: 89506	2009-11-20 23:18:13 +00:00
Daniel Dunbar	bc299f0092	llvm-gcc/clang don't (won't?) need this hack. llvm-svn: 86769	2009-11-11 00:28:38 +00:00
Daniel Dunbar	b9415c7d9a	Add a monstrous hack to improve X86ISelDAGToDAG compile time. - Force NDEBUG on in any Release build. This drops the compile time to ~100s from ~600s, in Release mode. - This may just be a temporary workaround, I don't know the true nature of the gcc-4.2 compile time performance problem. llvm-svn: 86695	2009-11-10 18:24:37 +00:00
Dan Gohman	006f9353e1	Use SUBREG_TO_REG instead of INSERT_SUBREG to model x86-64's implicit zero-extend. llvm-svn: 86196	2009-11-05 23:53:08 +00:00
Dan Gohman	b15f4a1cbd	Remove uninteresting and confusing debug output. llvm-svn: 86149	2009-11-05 18:47:09 +00:00
Chris Lattner	50ba5c3dc2	improve x86 codegen support for blockaddress. We now compile the testcase into: _test1: ## @test1 ## BB#0: ## %entry leaq L_test1_bb6(%rip), %rax jmpq *%rax L_test1_bb: ## Address Taken LBB1_1: ## %bb movb $1, %al ret L_test1_bb6: ## Address Taken LBB1_2: ## %bb6 movb $2, %al ret Note, it is very very strange that BlockAddressSDNode doesn't carry around TargetFlags. Dan, please fix this. llvm-svn: 85703	2009-11-01 03:25:03 +00:00
Nick Lewycky	974e12b2d3	Remove includes of Support/Compiler.h that are no longer needed after the VISIBILITY_HIDDEN removal. llvm-svn: 85043	2009-10-25 06:57:41 +00:00
Nick Lewycky	02d5f77d26	Remove VISIBILITY_HIDDEN from class/struct found inside anonymous namespaces. Chris claims we should never have visibility_hidden inside any .cpp file but that's still not true even after this commit. llvm-svn: 85042	2009-10-25 06:33:48 +00:00
Dan Gohman	7d9dffb413	Fix the x86 test-shrink optimization so that it doesn't shrink comparisons when one of the bits being tested would end up being the sign bit in the narrower type, and a signed comparison is being performed, since this would change the result of the signed comparison. This fixes PR5132. llvm-svn: 83670	2009-10-09 20:35:19 +00:00
Dan Gohman	48b185d6f7	Improve MachineMemOperand handling. - Allocate MachineMemOperands and MachineMemOperand lists in MachineFunctions. This eliminates MachineInstr's std::list member and allows the data to be created by isel and live for the remainder of codegen, avoiding a lot of copying and unnecessary translation. This also shrinks MemSDNode. - Delete MemOperandSDNode. Introduce MachineSDNode which has dedicated fields for MachineMemOperands. - Change MemSDNode to have a MachineMemOperand member instead of its own fields with the same information. This introduces some redundancy, but it's more consistent with what MachineInstr will eventually want. - Ignore alignment when searching for redundant loads for CSE, but remember the greatest alignment. Target-specific code which previously used MemOperandSDNodes with generic SDNodes now use MemIntrinsicSDNodes, with opcodes in a designated range so that the SelectionDAG framework knows that MachineMemOperand information is available. llvm-svn: 82794	2009-09-25 20:36:54 +00:00
Dan Gohman	32f71d714b	Rename getTargetNode to getMachineNode, for consistency with the naming scheme used in SelectionDAG, where there are multiple kinds of "target" nodes, but "machine" nodes are nodes which represent a MachineInstr. llvm-svn: 82790	2009-09-25 18:54:59 +00:00
Nate Begeman	1ae49ee7ca	Do not try and sink a load whose chain result has more than one use, when trying to create RMW opportunities in the x86 backend. This can cause a cycle to appear in the graph, since the other uses may eventually feed into the TokenFactor we are sinking the load below. llvm-svn: 81996	2009-09-16 03:20:46 +00:00
Dan Gohman	520a6856ba	Don't pull a load through a callseq_start if the load's chain has multiple uses, as one of the other uses may be on a path to a different node above the callseq_start, because that leads to a cyclic graph. This problem is exposed when -combiner-global-alias-analysis is used. This fixes PR4880. llvm-svn: 81821	2009-09-15 01:22:01 +00:00
Dan Gohman	0f6bf2dbb8	Use X86II::MO_NO_FLAG. llvm-svn: 80012	2009-08-25 17:47:44 +00:00
Benjamin Kramer	940fbb0e3c	Remove Streams.h from the targets. llvm-svn: 79853	2009-08-23 11:52:17 +00:00
Devang Patel	0939595711	Record variable debug info at ISel time directly. llvm-svn: 79742	2009-08-22 17:12:53 +00:00
Anton Korobeynikov	7950510b29	Fix a typo llvm-svn: 79634	2009-08-21 15:41:56 +00:00
Dan Gohman	05046085b6	Fix an x86 code size regression: prefer RIP-relative addressing over absolute addressing even in non-PIC mode (unless the address has an index or something else incompatible), because it has a smaller encoding. llvm-svn: 79553	2009-08-20 18:23:44 +00:00
Dan Gohman	de255fc8f6	Remove temporary testing code. llvm-svn: 79443	2009-08-19 18:27:08 +00:00
Dan Gohman	ac33a9061d	Add an x86 peep that narrows TEST instructions to forms that use a smaller encoding. These kinds of patterns are very frequent in sqlite3, for example. llvm-svn: 79439	2009-08-19 18:16:17 +00:00
Owen Anderson	9f94459d24	Split EVT into MVT and EVT, the former representing _just_ a primitive type, while the latter is capable of representing either a primitive or an extended type. llvm-svn: 78713	2009-08-11 20:47:22 +00:00
Owen Anderson	53aa7a960c	Rename MVT to EVT, in preparation for splitting SimpleValueType out into its own struct type. llvm-svn: 78610	2009-08-10 22:56:29 +00:00
Bill Wendling	fe3bdb4b6f	Reformatting of lines. Put multiple DEBUG statements under one DEBUG statement. llvm-svn: 78411	2009-08-07 21:33:25 +00:00
Dan Gohman	130e2c7aed	Fix a bug in x86's PreprocessForRMW logic that was exposed by aggressive chain operand optimization. UpdateNodeOperands does not modify the node in place if it would result in a node identical to an existing node. llvm-svn: 78297	2009-08-06 09:22:57 +00:00
Anton Korobeynikov	741ea0d7fd	Better handle kernel code model. Also, generalize the things and fix one subtle bug with small code model. llvm-svn: 78255	2009-08-05 23:01:26 +00:00
Bill Wendling	6eecd56efc	- s/DOUT/DEBUG(errs()/g - Tidy up some headers. llvm-svn: 77929	2009-08-03 00:11:34 +00:00
Dan Gohman	757eee8a27	Fix indentation. llvm-svn: 77895	2009-08-02 16:10:52 +00:00
Dan Gohman	edfad17d9b	Minor code simplifications. llvm-svn: 77768	2009-08-01 03:42:59 +00:00
Evan Cheng	e62288fdd4	Optimize some common usage patterns of atomic built-ins __sync_add_and_fetch() and __sync_sub_and_fetch. When the return value is not used (i.e. only care about the value in the memory), x86 does not have to use add to implement these. Instead, it can use add, sub, inc, dec instructions with the "lock" prefix. This is currently implemented using a bit of instruction selection trick. The issue is the target independent pattern produces one output and a chain and we want to map it into one that just output a chain. The current trick is to select it into a merge_values with the first definition being an implicit_def. The proper solution is to add new ISD opcodes for the no-output variant. DAG combiner can then transform the node before it gets to target node selection. Problem #2 is we are adding a whole bunch of x86 atomic instructions when in fact these instructions are identical to the non-lock versions. We need a way to add target specific information to target nodes and have this information carried over to machine instructions. Asm printer (or JIT) can use this information to add the "lock" prefix. llvm-svn: 77582	2009-07-30 08:33:02 +00:00
Dan Gohman	824ab40381	x86 isel tweak: use lea (%reg,%reg) instead of lea (,%reg,2). llvm-svn: 76817	2009-07-22 23:26:55 +00:00
Chris Lattner	79c136d473	reapply r75408, which eliminates MOV64r0 in favor of using MOV32r0 + subregs to do the same thing. This should work now that PR4544 is fixed. Thanks Evan! llvm-svn: 75671	2009-07-14 20:19:57 +00:00
Torok Edwin	fbcc663cbf	llvm_unreachable->llvm_unreachable(0), LLVM_UNREACHABLE->llvm_unreachable. This adds location info for all llvm_unreachable calls (which is a macro now) in !NDEBUG builds. In NDEBUG builds location info and the message is off (it only prints "UREACHABLE executed"). llvm-svn: 75640	2009-07-14 16:55:14 +00:00
Bill Wendling	5b76fc03ae	Temporarily revert r75408. It appears to break the Apple-style builds: x86_64-apple-darwin10-gcc -c -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wmissing-format-attribute -mdynamic-no-pic -DHAVE_CONFIG_H -I. -I. -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/. -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/../include -I./../intl -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/../libcpp/include -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmCore.roots/llvmCore~dst/Developer/usr/local/include -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmCore.roots/llvmCore~obj/src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmCore.roots/llvmCore~dst/Developer/usr/local/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -DLLVM_VERSION_INFO='"9999"' -DBUILD_LLVM_APPLE_STYLE /Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/tree-ssa-alias.c -o tree-ssa-alias.o /var/tmp//ccJQ2JBT.s:4134:Incorrect register `%rcx' used with `l' suffix make[2]: * [tree-ssa-live.o] Error 1 make[2]: * Waiting for unfinished jobs.... llvm-svn: 75412	2009-07-12 02:49:22 +00:00
Chris Lattner	02c4339bde	eliminate MOV64r0 in favor of a Pat<> pattern. This is only nontrivial because the div lowering code explicitly references it. llvm-svn: 75408	2009-07-12 00:47:55 +00:00
Chris Lattner	48cee9b4c1	fix a bug in my cleanup patch llvm-svn: 75402	2009-07-11 23:07:30 +00:00
Chris Lattner	4d10f1a6c9	comment cleanup, reduce nesting. llvm-svn: 75398	2009-07-11 22:50:33 +00:00
Torok Edwin	56d0659726	assert(0) -> LLVM_UNREACHABLE. Make llvm_unreachable take an optional string, thus moving the cerr<< out of line. LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for NDEBUG builds. llvm-svn: 75379	2009-07-11 20:10:48 +00:00
Torok Edwin	fb8d6d5b58	Implement changes from Chris's feedback. Finish converting lib/Target. llvm-svn: 75043	2009-07-08 20:53:28 +00:00
Chris Lattner	fea81da433	Reimplement rip-relative addressing in the X86-64 backend. The new implementation primarily differs from the former in that the asmprinter doesn't make a zillion decisions about whether or not something will be RIP relative or not. Instead, those decisions are made by isel lowering and propagated through to the asm printer. To achieve this, we: 1. Represent RIP relative addresses by setting the base of the X86 addr mode to X86::RIP. 2. When ISel Lowering decides that it is safe to use RIP, it lowers to X86ISD::WrapperRIP. When it is unsafe to use RIP, it lowers to X86ISD::Wrapper as before. 3. This removes isRIPRel from X86ISelAddressMode, representing it with a basereg of RIP instead. 4. The addressing mode matching logic in isel is greatly simplified. 5. The asmprinter is greatly simplified, notably the "NotRIPRel" predicate passed through various printoperand routines is gone now. 6. The various symbol printing routines in asmprinter now no longer infer when to emit (%rip), they just print the symbol. I think this is a big improvement over the previous situation. It does have two small caveats though: 1. I implemented a horrible "no-rip" modifier for the inline asm "P" constraint modifier. This is a short term hack, there is a much better, but more involved, solution. 2. I had to xfail an -aggressive-remat testcase because it isn't handling the use of RIP in the constant-pool reading instruction. This specific test is easy to fix without -aggressive-remat, which I intend to do next. llvm-svn: 74372	2009-06-27 04:16:01 +00:00
Chris Lattner	899abc4655	make sure to propagate operand flags in SelectTLSADDRAddr properly. llvm-svn: 74326	2009-06-26 21:18:37 +00:00
Chris Lattner	1d3b65a6ae	fix a pasto. llvm-svn: 74275	2009-06-26 05:56:49 +00:00
Chris Lattner	bd7e26db16	propagate target operand flags through addressing mode selection. llvm-svn: 74272	2009-06-26 05:51:45 +00:00
Chris Lattner	7d2b049404	change TLS_ADDR lowering to lower to a real mem operand, instead of matching as a global with that gets printed with the :mem modifier. All operands to lea's should be handled with the lea32mem operand kind, and this allows the TLS stuff to do this. There are several better ways to do this, but I went for the minimal change since I can't really test this (beyond make check). This also makes the use of EBX explicit in the operand list in the 32-bit, instead of implicit in the instruction. llvm-svn: 73834	2009-06-20 20:38:48 +00:00
Dan Gohman	4751bb9edb	Remove the redundant TM member from X86DAGToDAGISel; replace it with an accessor method which simply casts the parent class SelectionDAGISel's TM to the target-specific type. llvm-svn: 72801	2009-06-03 20:20:00 +00:00
Dan Gohman	faf75c8c9a	Convert a subtract into a negate and an add when it helps x86 address folding. llvm-svn: 71446	2009-05-11 18:02:53 +00:00
Anton Korobeynikov	65a58168cc	Factor out cycle-finder code and make it generic. llvm-svn: 71241	2009-05-08 18:51:58 +00:00
Bill Wendling	026e5d7667	Instead of passing in an unsigned value for the optimization level, use an enum, which better identifies what the optimization is doing. And is more flexible for future uses. llvm-svn: 70440	2009-04-29 23:29:43 +00:00
Bill Wendling	084669a1c9	Second attempt: Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to use the old behavior, the flag is -O0. This change allows for finer-grained control over which optimizations are run at different -O levels. Most of this work was pretty mechanical. The majority of the fixes came from verifying that a "fast" variable wasn't used anymore. The JIT still uses a "Fast" flag. I'll change the JIT with a follow-up patch. llvm-svn: 70343	2009-04-29 00:15:41 +00:00
Bill Wendling	56f2987a87	r70270 isn't ready yet. Back this out. Sorry for the noise. llvm-svn: 70275	2009-04-28 01:04:53 +00:00
Bill Wendling	d0ae15946c	Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to use the old behavior, the flag is -O0. This change allows for finer-grained control over which optimizations are run at different -O levels. Most of this work was pretty mechanical. The majority of the fixes came from verifying that a "fast" variable wasn't used anymore. The JIT still uses a "Fast" flag. I'm not 100% sure if it's necessary to change it there... llvm-svn: 70270	2009-04-28 00:21:31 +00:00
Rafael Espindola	5e42177a0f	fix PR3995. A scale must be 1, 2, 4 or 8. llvm-svn: 69284	2009-04-16 12:34:53 +00:00
Dan Gohman	62f4498646	For the h-register addressing-mode trick, use the correct value for any non-address uses of the address value. This fixes 186.crafty. llvm-svn: 69094	2009-04-14 22:45:05 +00:00
Dan Gohman	57d6bd36b2	Implement x86 h-register extract support. - Add patterns for h-register extract, which avoids a shift and mask, and in some cases a temporary register. - Add address-mode matching for turning (X>>(8-n))&(255<<n), where n is a valid address-mode scale value, into an h-register extract and a scaled-offset address. - Replace X86's MOV32to32_ and related instructions with the new target-independent COPY_TO_SUBREG instruction. On x86-64 there are complicated constraints on h registers, and CodeGen doesn't currently provide a high-level way to express all of them, so they are handled with a bunch of special code. This code currently only supports extracts where the result is used by a zero-extend or a store, though these are fairly common. These transformations are not always beneficial; since there are only 4 h registers, they sometimes require extra move instructions, and this sometimes increases register pressure because it can force out values that would otherwise be in one of those registers. However, this appears to be relatively uncommon. llvm-svn: 68962	2009-04-13 16:09:41 +00:00
Dan Gohman	f20462c217	Remove x86's special-case handling for ISD::TRUNCATE and ISD::SIGN_EXTEND_INREG. Tablegen-generated code can handle these cases, and the scheduling issues observed earlier appear to be resolved now. llvm-svn: 68959	2009-04-13 15:29:31 +00:00
Dan Gohman	092b8b6fdb	Use X86::SUBREG_8BIT instead of hard-coding the equivalent constant. llvm-svn: 68951	2009-04-13 15:14:03 +00:00
Rafael Espindola	6d6c6043ea	X86-64 TLS support for local exec and initial exec. llvm-svn: 68947	2009-04-13 13:02:49 +00:00
Rafael Espindola	7186f20a1b	In X86DAGToDAGISel::MatchWrapper, if base or index are set, avoid matching only if symbolic addresses are RIP relatives. llvm-svn: 68924	2009-04-12 23:00:38 +00:00
Rafael Espindola	6688b0a5da	refactor some code into X86DAGToDAGISel::MatchWrapper llvm-svn: 68915	2009-04-12 21:55:03 +00:00
Rafael Espindola	bb834f0929	Don't fold a load if the other operand is a TLS address. With this we generate movl %gs:0, %eax leal i@NTPOFF(%eax), %eax instead of movl $i@NTPOFF, %eax addl %gs:0, %eax llvm-svn: 68778	2009-04-10 10:09:34 +00:00
Rafael Espindola	3b2df10c9e	Re-apply 68552. Tested by bootstrapping llvm-gcc and using that to build llvm. llvm-svn: 68645	2009-04-08 21:14:34 +00:00
Bill Wendling	4aa25b79f9	Temporarily revert r68552. This was causing a failure in the self-hosting LLVM builds. --- Reverse-merging (from foreign repository) r68552 into '.': U test/CodeGen/X86/tls8.ll U test/CodeGen/X86/tls10.ll U test/CodeGen/X86/tls2.ll U test/CodeGen/X86/tls6.ll U lib/Target/X86/X86Instr64bit.td U lib/Target/X86/X86InstrSSE.td U lib/Target/X86/X86InstrInfo.td U lib/Target/X86/X86RegisterInfo.cpp U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86CodeEmitter.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86InstrInfo.h U lib/Target/X86/X86ISelDAGToDAG.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h U lib/Target/X86/X86ISelLowering.h U lib/Target/X86/X86InstrInfo.cpp U lib/Target/X86/X86InstrBuilder.h U lib/Target/X86/X86RegisterInfo.td llvm-svn: 68560	2009-04-07 22:35:25 +00:00
Rafael Espindola	1edda06792	Reduce code duplication on the TLS implementation. This introduces a small regression on the generated code quality in the case we are just computing addresses, not loading values. Will work on it and on X86-64 support. llvm-svn: 68552	2009-04-07 21:37:46 +00:00
Rafael Espindola	9277379fc0	remove unused arguments. llvm-svn: 68109	2009-03-31 16:16:57 +00:00
Evan Cheng	885bc6de52	X86 address mode isel tweak. If the base of the address is also used by a CopyToReg (i.e. it's likely live-out), do not fold the sub-expressions into the addressing mode to avoid computing the address twice. The CopyToReg use will be isel'ed to a LEA, re-use it for address instead. This is not yet enabled. llvm-svn: 68082	2009-03-31 01:13:53 +00:00
Evan Cheng	a84a318873	When optimzing a mul by immediate into two, the resulting mul's should get a x86 specific node to avoid dag combiner from hacking on them further. llvm-svn: 68066	2009-03-30 21:36:47 +00:00
Rafael Espindola	1f11c3c36f	Use array_lengthof llvm-svn: 67950	2009-03-28 19:02:18 +00:00
Rafael Espindola	227815437a	Use less hard coded constants to make the code less brittle. llvm-svn: 67846	2009-03-27 15:45:05 +00:00
Dan Gohman	2293eb6037	Don't forego folding of loads into 64-bit adds when the other operand is a signed 32-bit immediate. Unlike with the 8-bit signed immediate case, it isn't actually smaller to fold a 32-bit signed immediate instead of a load. In fact, it's larger in the case of 32-bit unsigned immediates, because they can be materialized with movl instead of movq. llvm-svn: 67001	2009-03-14 02:07:16 +00:00
Dan Gohman	a1d92423cf	Enhance address-mode folding of ISD::ADD to handle cases where the operands can't both be fully folded at the same time. For example, in the included testcase, a global variable is being added with an add of two values. The global variable wants RIP-relative addressing, so it can't share the address with another base register, but it's still possible to fold the initial add. llvm-svn: 66865	2009-03-13 02:25:09 +00:00
Dale Johannesen	9bba902c83	Remove non-DebugLoc versions of BuildMI from X86. There were some that might even matter in X86FastISel. llvm-svn: 64437	2009-02-13 02:33:27 +00:00
Chris Lattner	aed3a4215b	fix the X86 backend to just drop llvm.declare nodes for VLAs instead of leaving them in the DAG and then getting selection errors. This is a fix for PR3538. llvm-svn: 64382	2009-02-12 17:33:11 +00:00
Dale Johannesen	9c310711bb	Use getDebugLoc forwarder instead of getNode()->getDebugLoc. No functional change. llvm-svn: 64026	2009-02-07 19:59:05 +00:00
Dan Gohman	4e3e3deed3	Refactor some repeated logic into a separate function. llvm-svn: 63989	2009-02-07 00:43:41 +00:00
Dale Johannesen	9f3f72f144	Get rid of one more non-DebugLoc getNode and its corresponding getTargetNode. Lots of caller changes. llvm-svn: 63904	2009-02-06 01:31:28 +00:00
Dale Johannesen	bbf13f54e0	Patch up omissions in DebugLoc propagation. llvm-svn: 63693	2009-02-04 00:33:20 +00:00
Dale Johannesen	14f2d9dcbd	DebugLoc propgation llvm-svn: 63664	2009-02-03 21:48:12 +00:00
Dan Gohman	f77f0ce21a	Simplify findNonImmUse; return the result using the return value instead of via a by-reference argument. No functionality change. llvm-svn: 63118	2009-01-27 19:04:30 +00:00
Dan Gohman	7740523a89	Eliminate unnecessary operands-list traversals. llvm-svn: 63088	2009-01-27 02:37:43 +00:00
Evan Cheng	6c7e85142b	Enhance logic in X86DAGToDAGISel::PreprocessForRMW which move load inside callseq_start to allow it to be folded into a call. It was not considering the cases where a token factor is between the load and the callseq_start. llvm-svn: 63022	2009-01-26 18:43:34 +00:00
Dan Gohman	b43c8996f2	Fix a recent regression. ClrOpcode is not set for i8; for i8, if we want to clear %ah to zero before a division, just use a zero-extending mov to %al. This fixes PR3366. llvm-svn: 62691	2009-01-21 14:50:16 +00:00
Evan Cheng	44cc554311	DIVREM isel deficiency: If sign bit is known zero, zero out DX/EDX/RDX instead of sign extending the low part (in AX/EAX/RAX) into it. llvm-svn: 62519	2009-01-19 19:06:11 +00:00
Evan Cheng	bf38a5e540	Fix MatchAddress bug that's preventing negative displacement from being folded in 64-bit mode. llvm-svn: 62413	2009-01-17 07:09:27 +00:00
Dan Gohman	619ef48a52	Move a few containers out of ScheduleDAGInstrs::BuildSchedGraph and into the ScheduleDAGInstrs class, so that they don't get destructed and re-constructed for each block. This fixes a compile-time hot spot in the post-pass scheduler. To help facilitate this, tidy and do some minor reorganization in the scheduler constructor functions. llvm-svn: 62275	2009-01-15 19:20:50 +00:00
Evan Cheng	5a272e79e5	80 col violation. llvm-svn: 62024	2009-01-10 03:33:22 +00:00
Evan Cheng	01fa50ca4f	Some code clean up. llvm-svn: 60850	2008-12-10 21:49:05 +00:00
Evan Cheng	83bdb38965	On x86 favors folding short immediate into some arithmetic operations (e.g. add, and, xor, etc.) because materializing an immediate in a register is expensive in turns of code size. e.g. movl 4(%esp), %eax addl $4, %eax is 2 bytes shorter than movl $4, %eax addl 4(%esp), %eax llvm-svn: 60139	2008-11-27 00:49:46 +00:00
Dan Gohman	88ba5f0b96	Move the code that inserts X87 FP_REG_KILL instructions from a special-purpose hook to a new pass. Also, add check to see if any x87 virtual registers are used, to avoid doing any work in the common case that no x87 code is needed. llvm-svn: 59190	2008-11-12 22:55:05 +00:00
Dan Gohman	059c4fa8d8	The 32-bit displacement field in an x86 address is signed. Arrange for it to be sign-extended when it is promoted to 64 bits for intermediate offset calculations. The offset calculations are done as uint64_t so that overflow conditions are well defined. This fixes a problem which is currently hidden by the x86 AsmPrinter but which was exposed by r58917 (which is temporarily reverted). See PR3027 for details. llvm-svn: 59044	2008-11-11 15:52:29 +00:00
Dan Gohman	f14b77ebf1	Eliminate the ISel priority queue, which used the topological order for a priority function. Instead, just iterate over the AllNodes list, which is already in topological order. This eliminates a fair amount of bookkeeping, and speeds up the isel phase by about 15% on many testcases. The impact on most targets is that AddToISelQueue calls can be simply removed. In the x86 target, there are two additional notable changes. The rule-bending AND+SHIFT optimization in MatchAddress that creates new pre-isel nodes during isel is now a little more verbose, but more robust. Instead of either creating an invalid DAG or creating an invalid topological sort, as it has historically done, it can now just insert the new nodes into the node list at a position where they will be consistent with the topological ordering. Also, the address-matching code has logic that checked to see if a node was "already selected". However, when a node is selected, it has all its uses taken away via ReplaceAllUsesWith or equivalent, so it won't recieve any further visits from MatchAddress. This code is now removed. llvm-svn: 58748	2008-11-05 04:14:16 +00:00
Dan Gohman	b9110e7fbb	The ANDMask node folds to a constant, and isn't the node that needs to have its node id set. The new and and shift nodes are the nodes that need the IDs. This fixes PR2982. llvm-svn: 58655	2008-11-03 23:43:55 +00:00
David Greene	ce2a938186	Have TableGen emit setSubgraphColor calls under control of a -gen-debug flag. Then in a debugger developers can set breakpoints at these calls to see waht is about to be selected and what the resulting subgraph looks like. This really helps when debugging instruction selection. llvm-svn: 58278	2008-10-27 21:56:29 +00:00
Dan Gohman	2fe6bee5b6	Teach DAGCombine to fold constant offsets into GlobalAddress nodes, and add a TargetLowering hook for it to use to determine when this is legal (i.e. not in PIC mode, etc.) This allows instruction selection to emit folded constant offsets in more cases, such as the included testcase, eliminating the need for explicit arithmetic instructions. This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp that attempted to achieve the same effect, but wasn't as effective. Also, fix handling of offsets in GlobalAddressSDNodes in several places, including changing GlobalAddressSDNode's offset from int to int64_t. The Mips, Alpha, Sparc, and CellSPU targets appear to be unaware of GlobalAddress offsets currently, so set the hook to false on those targets. llvm-svn: 57748	2008-10-18 02:06:02 +00:00
Dan Gohman	e33afda4fa	Trim #includes. llvm-svn: 57649	2008-10-16 20:18:31 +00:00
Evan Cheng	c36231b95e	Fix indentation. llvm-svn: 57508	2008-10-14 17:15:39 +00:00
Dan Gohman	56b6885104	When doing the very-late shift-and address-mode optimization, create a new DAG node to represent the new shift to keep the DAG consistent, even though it'll almost always be folded into the address. If a user of the resulting address has multiple uses, the nodes may get revisited by a later MatchAddress call, in which case DAG inconsistencies do matter. This fixes PR2849. llvm-svn: 57465	2008-10-13 20:52:04 +00:00
Devang Patel	c0f3b52e65	It is possible that all functions in one module are not being optimized for size. Set OptForSize for each function separately. llvm-svn: 57182	2008-10-06 18:03:39 +00:00
Dale Johannesen	8c36a1c09c	Make atomic Swap work, 64-bit on x86-32. Make it all work in non-pic mode. llvm-svn: 57034	2008-10-03 22:25:52 +00:00
Dale Johannesen	5d60c1ebb1	Pass MemOperand through for 64-bit atomics on 32-bit, incidentally making the case where the memop is a pointer deref work. Fix cmp-and-swap regression. llvm-svn: 57027	2008-10-03 19:41:08 +00:00
Dan Gohman	2c836cf187	Avoid creating two TargetLowering objects for each target. Instead, just create one, and make sure everything that needs it can access it. Previously most of the SelectionDAGISel subclasses all had their own TargetLowering object, which was redundant with the TargetLowering object in the TargetMachine subclasses, except on Sparc, where SparcTargetMachine didn't have a TargetLowering object. Change Sparc to work more like the other targets here. llvm-svn: 57016	2008-10-03 16:55:19 +00:00
Dan Gohman	eae96ce3ec	Remove an unused field. llvm-svn: 57014	2008-10-03 16:17:33 +00:00
Dan Gohman	0d1e9a8e04	Switch the MachineOperand accessors back to the short names like isReg, etc., from isRegister, etc. llvm-svn: 57006	2008-10-03 15:45:36 +00:00
Dale Johannesen	867d549fce	Handle some 64-bit atomics on x86-32, some of the time. llvm-svn: 56963	2008-10-02 18:53:47 +00:00
Devang Patel	1b76f2c40b	Remove OptimizeForSize global. Use function attribute optsize. llvm-svn: 56937	2008-10-01 23:18:38 +00:00
Dan Gohman	86aa16a69a	Optimize SelectionDAG's AssignTopologicalOrder even further. Completely eliminate the TopOrder std::vector. Instead, sort the AllNodes list in place. This also eliminates the need to call AllNodes.size(), a linear-time operation, before performing the sort. Also, eliminate the Sources temporary std::vector, since it essentially duplicates the sorted result as it is being built. This also changes the direction of the topological sort from bottom-up to top-down. The AllNodes list starts out in roughly top-down order, so this reduces the amount of reordering needed. Top-down is also more convenient for Legalize, and ISel needed only minor adjustments. llvm-svn: 56867	2008-09-30 18:30:35 +00:00
Dan Gohman	6ebe734ca6	Move the GlobalBaseReg field out of X86ISelDAGToDAG.cpp and X86FastISel.cpp into X86MachineFunction.h, so that it can be shared, instead of having each selector keep track of its own. llvm-svn: 56825	2008-09-30 00:58:23 +00:00
Daniel Dunbar	1d5e766016	Unbreak build. llvm-svn: 56727	2008-09-27 00:22:09 +00:00
Evan Cheng	7d6fa97567	Implement "punpckldq %xmm0, $xmm0" as "pshufd $0x50, %xmm0, %xmm" unless optimizing for code size. llvm-svn: 56711	2008-09-26 23:41:32 +00:00
Dan Gohman	6e0548336a	Rename ConstantSDNode's getSignExtended to getSExtValue, for consistancy with ConstantInt, and re-implement it in terms of ConstantInt's getSExtValue. llvm-svn: 56700	2008-09-26 21:54:37 +00:00
Dan Gohman	007a6bb9b9	Factor out the code for determining when symblic addresses require RIP-relative addressing and use it to fix a bug in X86FastISel in x86-64 PIC mode, where it was trying to use base/index registers with RIP-relative addresses. This fixes a bunch of x86-64 testsuite failures. llvm-svn: 56676	2008-09-26 19:15:30 +00:00
Evan Cheng	e0add20c1b	Properly handle 'm' inline asm constraints. If a GV is being selected for the addressing mode, it requires the same logic for PIC relative addressing, etc. llvm-svn: 56526	2008-09-24 00:05:32 +00:00
Dan Gohman	e64c9944f6	Delete an unused function. llvm-svn: 56495	2008-09-23 18:26:47 +00:00
Dan Gohman	2430073657	Move the code for initializing the global base reg out of X86ISelDAGToDAG.cpp and into X86InstrInfo.cpp. This will allow it to be reused by FastISel. llvm-svn: 56494	2008-09-23 18:22:58 +00:00
Dan Gohman	173aa8602d	Simplify and generalize X86DAGToDAGISel::CanBeFoldedBy, and draw up some new ascii art to illustrate what it does. This change currently has no effect on generated code. llvm-svn: 56270	2008-09-17 01:39:10 +00:00
Bill Wendling	24c79f28b1	Reverting r56249. On further investigation, this functionality isn't needed. Apologies for the thrashing. llvm-svn: 56251	2008-09-16 21:48:12 +00:00
Bill Wendling	8bc392fb1d	- Change "ExternalSymbolSDNode" to "SymbolSDNode". - Add linkage to SymbolSDNode (default to external). - Change ISD::ExternalSymbol to ISD::Symbol. - Change ISD::TargetExternalSymbol to ISD::TargetSymbol These changes pave the way to allowing SymbolSDNodes with non-external linkage. llvm-svn: 56249	2008-09-16 21:12:30 +00:00
Dan Gohman	effb894453	Rename ConstantSDNode::getValue to getZExtValue, for consistency with ConstantInt. This led to fixing a bug in TargetLowering.cpp using getValue instead of getAPIntValue. llvm-svn: 56159	2008-09-12 16:56:44 +00:00
Gabor Greif	81d6a38434	fix a bunch of 80-col violations llvm-svn: 55588	2008-08-31 15:37:04 +00:00
Gabor Greif	f304a7aa4d	erect abstraction boundaries for accessing SDValue members, rename Val -> Node to reflect semantics llvm-svn: 55504	2008-08-28 21:40:38 +00:00
Gabor Greif	abfdf928d8	disallow direct access to SDValue::ResNo, provide a getter instead llvm-svn: 55394	2008-08-26 22:36:50 +00:00
Evan Cheng	f00f1e50b5	Try approach to moving call address load inside of callseq_start. Now it's done during the preprocess of x86 isel. callseq_start's chain is changed to load's chain node; while load's chain is the last of callseq_start or the loads or copytoreg nodes inserted to move arguments to the right spot. llvm-svn: 55338	2008-08-25 21:27:18 +00:00
Dan Gohman	eb0cee91f6	Move the point at which FastISel taps into the SelectionDAGISel process up to a higher level. This allows FastISel to leverage more of SelectionDAGISel's infastructure, such as updating Machine PHI nodes. Also, implement transitioning from SDISel back to FastISel in the middle of a block, so it's now possible to go back and forth. This allows FastISel to hand individual CallInsts and other complicated things off to SDISel to handle, while handling the rest of the block itself. To help support this, reorganize the SelectionDAG class so that it is allocated once and reused throughout a function, instead of being completely reallocated for each block. llvm-svn: 55219	2008-08-23 02:25:05 +00:00
Dan Gohman	d3582c9bda	Simplify SelectRoot's interface, and factor out some common code from all targets. llvm-svn: 55124	2008-08-21 16:36:34 +00:00
Dan Gohman	814f291664	Move the handling of ANY_EXTEND, SIGN_EXTEND_INREG, and TRUNCATE out of X86ISelDAGToDAG.cpp C++ code and into tablegen code. Among other things, using tablegen for these things makes them friendlier to FastISel. Tablegen can handle the case of i8 subregs on x86-32, but currently the C++ code for that case uses MVT::Flag in a tricky way, and it happens to schedule better in some cases. So for now, leave the C++ code in place to handle the i8 case on x86-32. llvm-svn: 55078	2008-08-20 21:27:32 +00:00
Evan Cheng	ab35bfdf18	Fix a (u)comiss intrinsic lowering bug. It was using anyext which can return junk in higher bits. Patch by Nate Begeman. llvm-svn: 54903	2008-08-17 19:22:34 +00:00
Dan Gohman	e81ac0b66f	Oops, check in these files too, for the FastISel -> Fast rename. llvm-svn: 54750	2008-08-13 19:55:00 +00:00
Dale Johannesen	dafdbf77b3	Some fixes for x86-64 JIT. Make it use small code model, except for external calls; this makes addressing modes PC-relative. Incomplete. The assertion at the top of Emitter::runOnMachineFunction was obviously bogus (always true) so I removed it. If someone knows what the correct test should be to cover all the various targets, please fix. llvm-svn: 54656	2008-08-11 23:46:25 +00:00
Dan Gohman	2ce6f2ad5e	Rename SDOperand to SDValue. llvm-svn: 54128	2008-07-27 21:46:04 +00:00
Dan Gohman	91e5dcb680	Tidy SDNode::use_iterator, and complete the transition to have it parallel its analogue, Value::value_use_iterator. The operator* method now returns the user, rather than the use. llvm-svn: 54127	2008-07-27 20:43:25 +00:00
Dan Gohman	581cc87f57	Add titles to the various SelectionDAG viewGraph calls that include useful information like the name of the block being viewed and the current phase of compilation. llvm-svn: 53872	2008-07-21 20:00:07 +00:00
Dan Gohman	1705968102	Add a new function, ReplaceAllUsesOfValuesWith, which handles bulk replacement of multiple values. This is slightly more efficient than doing multiple ReplaceAllUsesOfValueWith calls, and theoretically could be optimized even further. However, an important property of this new function is that it handles the case where the source value set and destination value set overlap. This makes it feasible for isel to use SelectNodeTo in many very common cases, which is advantageous because SelectNodeTo avoids a temporary node and it doesn't require CSEMap updates for users of values that don't change position. Revamp MorphNodeTo, which is what does all the work of SelectNodeTo, to handle operand lists more efficiently, and to correctly handle a number of corner cases to which its new wider use exposes it. This commit also includes a change to the encoding of post-isel opcodes in SDNodes; now instead of being sandwiched between the target-independent pre-isel opcodes and the target-dependent pre-isel opcodes, post-isel opcodes are now represented as negative values. This makes it possible to test if an opcode is pre-isel or post-isel without having to know the size of the current target's post-isel instruction set. These changes speed up llc overall by 3% and reduce memory usage by 10% on the InstructionCombining.cpp testcase with -fast and -regalloc=local. llvm-svn: 53728	2008-07-17 19:10:17 +00:00
Dan Gohman	f169f81036	Fix the result type of X86's truncate to i8. llvm-svn: 53688	2008-07-16 16:20:48 +00:00
Evan Cheng	2c9773155a	Do not use computationally expensive scheduling heuristics with -fast. llvm-svn: 52971	2008-07-01 18:05:03 +00:00
Evan Cheng	0711d68fa7	Split scheduling from instruction selection. llvm-svn: 52923	2008-06-30 20:45:06 +00:00
Evan Cheng	f6a1466829	Unbreak DECLARE isel in pic mode. llvm-svn: 52439	2008-06-18 02:48:27 +00:00
Evan Cheng	e47ca0940f	Rather than avoiding to wrap ISD::DECLARE GV operand in X86ISD::Wrapper, simply handle it at dagisel time with x86 specific isel code. llvm-svn: 52377	2008-06-17 02:01:22 +00:00
Duncan Sands	13237ac3b9	Wrap MVT::ValueType in a struct to get type safety and better control the abstraction. Rename the type to MVT. To update out-of-tree patches, the main thing to do is to rename MVT::ValueType to MVT, and rewrite expressions like MVT::getSizeInBits(VT) in the form VT.getSizeInBits(). Use VT.getSimpleVT() to extract a MVT::SimpleValueType for use in switch statements (you will get an assert failure if VT is an extended value type - these shouldn't exist after type legalization). This results in a small speedup of codegen and no new testsuite failures (x86-64 linux). llvm-svn: 52044	2008-06-06 12:08:01 +00:00
Dan Gohman	6e582c449f	Fix a tblgen problem handling variable_ops in tblgen instruction definitions. This adds a new construct, "discard", for indicating that a named node in the input matching pattern is to be discarded, instead of corresponding to a node in the output pattern. This allows tblgen to know where the arguments for the varaible_ops are supposed to begin. This fixes "rdar://5791600", whatever that is ;-). llvm-svn: 51699	2008-05-29 19:57:41 +00:00
Evan Cheng	04d24edcbb	Use movlps / movhps to modify low / high half of 16-byet memory location. llvm-svn: 51501	2008-05-23 21:23:16 +00:00
Evan Cheng	961339bbdb	Handle a few more cases of folding load i64 into xmm and zero top bits. Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch. llvm-svn: 50918	2008-05-09 21:53:03 +00:00
Evan Cheng	78af38c392	Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine. llvm-svn: 50838	2008-05-08 00:57:18 +00:00
Evan Cheng	59834d1c7a	Not checking for intrinsics which do not have a chain operand. llvm-svn: 50260	2008-04-25 08:55:28 +00:00
Evan Cheng	051da5deaa	- Switch from std::set to SmallPtrSet. - Add comments. llvm-svn: 50259	2008-04-25 08:22:20 +00:00
Chris Lattner	741c7a3b49	Loosen up an assertion to allow intrinsics. I really have no idea what this code (findNonImmUse) does, so I'm only guessing that this is the right thing. It would be really really nice if this had comments and perhaps switched to SmallPtrSet (hint hint) :) This fixes rdar://5886601, a crash on gcc.target/i386/sse4_1-pblendw.c llvm-svn: 50252	2008-04-25 05:13:01 +00:00
Roman Levenstein	51f532f92d	Re-commit of the r48822, where the infinite looping problem discovered by Dan Gohman is fixed. llvm-svn: 49330	2008-04-07 10:06:32 +00:00
Evan Cheng	d9129d1de3	Cosmetic llvm-svn: 49156	2008-04-03 07:45:18 +00:00
Evan Cheng	025cea1126	Backing out 48222 temporarily. llvm-svn: 49124	2008-04-03 03:13:16 +00:00
Roman Levenstein	358e04a185	Use a linked data structure for the uses lists of an SDNode, just like LLVM Value/Use does and MachineRegisterInfo/MachineOperand does. This allows constant time for all uses list maintenance operations. The idea was suggested by Chris. Reviewed by Evan and Dan. Patch is tested and approved by Dan. On normal use-cases compilation speed is not affected. On very big basic blocks there are compilation speedups in the range of 15-20% or even better. llvm-svn: 48822	2008-03-26 12:39:26 +00:00
Chris Lattner	68b11e14bc	remove Evan's "ugly hack" that sorta attempted to get x86-64 return conventions correct, but was never enabled. We can now do the "right thing" with multiple return values. llvm-svn: 48635	2008-03-21 06:50:21 +00:00
Christopher Lamb	d3d0ad3f58	Make insert_subreg a two-address instruction, vastly simplifying LowerSubregs pass. Add a new TII, subreg_to_reg, which is like insert_subreg except that it takes an immediate implicit value to insert into rather than a register. llvm-svn: 48412	2008-03-16 03:12:01 +00:00
Christopher Lamb	dd55d3f1b2	Get rid of a pseudo instruction and replace it with subreg based operation on real instructions, ridding the asm printers of the hack used to do this previously. In the process, update LowerSubregs to be careful about eliminating copies that have side affects. Note: the coalescer will have to be careful about this too, when it starts coalescing insert_subreg nodes. llvm-svn: 48329	2008-03-13 05:47:01 +00:00
Christopher Lamb	aa7c2105de	Recommitting parts of r48130. These do not appear to cause the observed failures. llvm-svn: 48223	2008-03-11 10:09:17 +00:00
Chris Lattner	1bd44363f2	Change the model for FP Stack return to use fp operands on the RET instruction instead of using FpSET_ST0_32. This also generalizes the code to handling returning of multiple FP results. llvm-svn: 48209	2008-03-11 03:23:40 +00:00
Chris Lattner	7362d38391	Don't emit FP_REG_KILL into a block that just returns. Nothing can be live out of the block anyway, so it isn't needed. llvm-svn: 48192	2008-03-10 23:34:12 +00:00
Evan Cheng	d4e1d9eeb2	Revert 48125, 48126, and 48130 for now to unbreak some x86-64 tests. llvm-svn: 48167	2008-03-10 19:31:26 +00:00
Christopher Lamb	4ba3f0430b	Allow insert_subreg into implicit, target-specific values. Change insert/extract subreg instructions to be able to be used in TableGen patterns. Use the above features to reimplement an x86-64 pseudo instruction as a pattern. llvm-svn: 48130	2008-03-10 06:12:08 +00:00
Chris Lattner	d587e580a6	rename FpGETRESULT32 -> FpGET_ST0_32 etc. Add support for isel'ing value preserving FP roundings from one fp stack reg to another into a noop, instead of stack traffic. llvm-svn: 48093	2008-03-09 07:05:32 +00:00
Evan Cheng	33ff36321e	Remove -always-fold-and-in-test. llvm-svn: 47871	2008-03-04 00:40:35 +00:00
Evan Cheng	507713de08	Set to default: x86 no longer fold and into test if it has more than one use. llvm-svn: 47711	2008-02-28 07:46:38 +00:00
Dan Gohman	a790af3a88	Revert the assert for MUL_LOHI with an unused high result; Chris pointed out that this isn't correct at -O0. llvm-svn: 47575	2008-02-25 22:43:48 +00:00
Dan Gohman	0be2f3b941	Add an assert to verify that we don't see an {S,U}MUL_LOHI with an unused high value. llvm-svn: 47569	2008-02-25 22:15:55 +00:00
Dan Gohman	2ff975e749	Remove the hack that turned an {S,U}MUL_LOHI with an unused high result into a MUL late in the X86 codegen process. ISD::MUL is once again Legal on X86, so this is no longer needed. And, the hack was suboptimal; see PR1874 for details. llvm-svn: 47567	2008-02-25 21:57:04 +00:00
Dan Gohman	1f372edd97	Convert MaskedValueIsZero and all its users to use APInt. Also add a SignBitIsZero function to simplify a common use case. llvm-svn: 47561	2008-02-25 21:11:39 +00:00
Evan Cheng	b6b69208ba	Poorly named option. llvm-svn: 47400	2008-02-20 20:57:32 +00:00
Evan Cheng	7626ab33d8	Disable for now. This is pessimizing code. llvm-svn: 47354	2008-02-20 02:29:17 +00:00
Evan Cheng	5ce8dd93ef	Add hidden option -x86-fold-and-in-test to test the effect the test / and folding change. llvm-svn: 47351	2008-02-19 23:36:51 +00:00
Evan Cheng	8a25d6ac53	Only using x86-64 rip relative addressing in non-staic mode? llvm-svn: 47019	2008-02-12 19:20:46 +00:00
Dan Gohman	3a4be0fdef	Rename MRegisterInfo to TargetRegisterInfo. llvm-svn: 46930	2008-02-10 18:45:23 +00:00
Evan Cheng	a20a773654	Fix a x86-64 codegen deficiency. Allow gv + offset when using rip addressing mode. Before: _main: subq $8, %rsp leaq _X(%rip), %rax movsd 8(%rax), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Now: _main: subq $8, %rsp movsd _X+8(%rip), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Notice there is another idiotic codegen issue that needs to be fixed asap: xorl %ecx, %ecx movl %ecx, %eax llvm-svn: 46850	2008-02-07 08:53:49 +00:00
Evan Cheng	2cb9068c78	Dwarf requires variable entries to be in the source order. Right now, since we are recording variable information at isel time this means parameters would appear in the reverse order. The short term fix is to issue recordVariable() at asm printing time instead. llvm-svn: 46724	2008-02-04 23:06:48 +00:00
Evan Cheng	efd142a920	SDIsel processes llvm.dbg.declare by recording the variable debug information descriptor and its corresponding stack frame index in MachineModuleInfo. This only works if the local variable is "homed" in the stack frame. It does not work for byval parameter, etc. Added ISD::DECLARE node type to represent llvm.dbg.declare intrinsic. Now the intrinsic calls are lowered into a SDNode and lives on through out the codegen passes. For now, since all the debugging information recording is done at isel time, when a ISD::DECLARE node is selected, it has the side effect of also recording the variable. This is a short term solution that should be fixed in time. llvm-svn: 46659	2008-02-02 04:07:54 +00:00
Evan Cheng	084a1cdcdd	Work in progress. This patch fixes x86-64 calls which are modelled as StructRet but really should be return in registers, e.g. _Complex long double, some 128-bit aggregates. This is a short term solution that is necessary only because llvm, for now, cannot model i128 nor call's with multiple results. Status: This only works for direct calls, and only the caller side is done. Disabled for now. llvm-svn: 46527	2008-01-29 19:34:22 +00:00
Chris Lattner	a91f77eaac	Significantly simplify and improve handling of FP function results on x86-32. This case returns the value in ST(0) and then has to convert it to an SSE register. This causes significant codegen ugliness in some cases. For example in the trivial fp-stack-direct-ret.ll testcase we used to generate: _bar: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret because we move the result of foo() into an XMM register, then have to move it back for the return of bar. Instead of hacking ever-more special cases into the call result lowering code we take a much simpler approach: on x86-32, fp return is modeled as always returning into an f80 register which is then truncated to f32 or f64 as needed. Similarly for a result, we model it as an extension to f80 + return. This exposes the truncate and extensions to the dag combiner, allowing target independent code to hack on them, eliminating them in this case. This gives us this code for the example above: _bar: subl $12, %esp call L_foo$stub addl $12, %esp ret The nasty aspect of this is that these conversions are not legal, but we want the second pass of dag combiner (post-legalize) to be able to hack on them. To handle this, we lie to legalize and say they are legal, then custom expand them on entry to the isel pass (PreprocessForFPConvert). This is gross, but less gross than the code it is replacing :) This also allows us to generate better code in several other cases. For example on fp-stack-ret-conv.ll, we now generate: _test: subl $12, %esp call L_foo$stub fstps 8(%esp) movl 16(%esp), %eax cvtss2sd 8(%esp), %xmm0 movsd %xmm0, (%eax) addl $12, %esp ret where before we produced (incidentally, the old bad code is identical to what gcc produces): _test: subl $12, %esp call L_foo$stub fstpl (%esp) cvtsd2ss (%esp), %xmm0 cvtss2sd %xmm0, %xmm0 movl 16(%esp), %eax movsd %xmm0, (%eax) addl $12, %esp ret Note that we generate slightly worse code on pr1505b.ll due to a scheduling deficiency that is unrelated to this patch. llvm-svn: 46307	2008-01-24 08:07:48 +00:00
Evan Cheng	4951da49aa	Fix a x86-64 static codegen bug. This fixes a lot of x86-64 jit failures. llvm-svn: 45733	2008-01-08 02:06:11 +00:00
Evan Cheng	f55b7381af	Combine MovePCtoStack + POP32r into one instruction MOVPC32r so it can be moved if needed. llvm-svn: 45605	2008-01-05 00:41:47 +00:00

... 4 5 6 7 8 ...

731 Commits