llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	ab47fe4e16	Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318	2012-08-06 06:22:36 +00:00
Chad Rosier	24c19d20c0	Whitespace. llvm-svn: 161122	2012-08-01 18:39:17 +00:00
David Chisnall	5b8c1680de	ELF does not imply GNU/Linux. Do not assume GNU conventions just because we are targeting an ELF platform. Only fold gs-relative (and fs-relative) loads if it is actually sensible to do so for the target platform. This fixes PR13438. llvm-svn: 160687	2012-07-24 20:04:16 +00:00
Craig Topper	f7755df776	Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren. llvm-svn: 160110	2012-07-12 06:52:41 +00:00
Craig Topper	3af251dbf1	Reduce code size by using a second switch statement to avoid extra calls to SelectAtomic64. Also catch cases where SelectAtomic64 fails. llvm-svn: 159503	2012-07-01 02:55:34 +00:00
Craig Topper	e15e5f7c5c	Add a break to the end of case statement missed in r159501. llvm-svn: 159502	2012-07-01 02:18:18 +00:00
Craig Topper	fbb954f727	Fix a crash on release builds if gather intrinsics are passed a non-constant value for the last argument. llvm-svn: 159501	2012-07-01 02:17:08 +00:00
Craig Topper	def044b974	Use a second switch statement to reduce number of calls to SelectGather in code. Reduces code size a bit. llvm-svn: 159500	2012-07-01 02:05:52 +00:00
Manman Ren	98a5bf24a9	X86: add more GATHER intrinsics in LLVM Corrected type for index of llvm.x86.avx2.gather.d.pd.256 from 256-bit to 128-bit. Corrected types for src\|dst\|mask of llvm.x86.avx2.gather.q.ps.256 from 256-bit to 128-bit. Support the following intrinsics: llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256 llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256 llvm-svn: 159402	2012-06-29 00:54:20 +00:00
Manman Ren	a09820414a	X86: add GATHER intrinsics (AVX2) in LLVM Support the following intrinsics: llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256 llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256 Modified Disassembler to handle VSIB addressing mode. llvm-svn: 159221	2012-06-26 19:47:59 +00:00
Craig Topper	a4fd6d655a	Tidy up spacing. llvm-svn: 157313	2012-05-23 05:44:51 +00:00
Evan Cheng	58a95f0c8a	Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474 llvm-svn: 156896	2012-05-16 01:54:27 +00:00
Evan Cheng	3e869f002c	Generalize r153635 to deal with TokenFactor chains; also clean up the logic and fix the tests. rdar://11069732, rdar://11236106 llvm-svn: 154604	2012-04-12 19:14:21 +00:00
Chandler Carruth	3779ac10b4	Cleanup and relax a restriction on the matching of global offsets into x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is not using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304	2012-04-09 02:13:06 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Benjamin Kramer	8619c37b5b	Replace assert(0) with llvm_unreachable to avoid warnings about dropping off the end of a non-void function in Release builds. llvm-svn: 153643	2012-03-29 12:37:26 +00:00
Joel Jones	68d59e8a90	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153635	2012-03-29 05:45:48 +00:00
Joel Jones	b474099e63	Reverted to revision 153616 to unblock build llvm-svn: 153623	2012-03-29 01:20:56 +00:00
Joel Jones	b88c81fe0f	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153617	2012-03-29 00:37:47 +00:00
Craig Topper	1fcf5bcae1	Prune some includes llvm-svn: 153502	2012-03-27 07:54:11 +00:00
Craig Topper	f6e7e12f75	Remove unnecessary llvm:: qualifications llvm-svn: 153500	2012-03-27 07:21:54 +00:00
Craig Topper	b25fda95f6	Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations. llvm-svn: 152997	2012-03-17 18:46:09 +00:00
Craig Topper	2dac962864	Use uint16_t to store opcodes in static tables in X86 backend. llvm-svn: 152391	2012-03-09 07:45:21 +00:00
Craig Topper	cc830f8cda	Declare register classes as const. Fix a couple pointers to register classes that weren't already const. llvm-svn: 151138	2012-02-22 07:28:11 +00:00
Jakob Stoklund Olesen	97e3115dc2	Use the same CALL instructions for Windows as for everything else. The different calling conventions and call-preserved registers are represented with regmask operands that are added dynamically. llvm-svn: 150708	2012-02-16 17:56:02 +00:00
Pete Cooper	c21ebf5c41	Stop custom lowering forr x86 DEC64m from happening if the load in the lowered sequence has more than 1 user llvm-svn: 150537	2012-02-15 00:33:37 +00:00
Pete Cooper	71be57bb32	Fixed bug when custom lowering DEC64m on x86. If the DEC node had more than one user, it was doing this lowering but leaving the original DEC node around and so decrementing twice. Fixes PR11964. llvm-svn: 150356	2012-02-13 00:10:03 +00:00
David Blaikie	46a9f016c5	More dead code removal (using -Wunreachable-code) llvm-svn: 148578	2012-01-20 21:51:11 +00:00
Chandler Carruth	eb21da060b	Switch all of the uses of my InsertDAGNode helper to follow the exact same pattern. We already had this pattern is a few places, but others tried to make a rough approximation of an actual DAG structure. As not everywhere went to this trouble, nothing could rely on this being done. In fact, I've checked all references to these node Ids, and the ones that are using the topo-sort properties are actually satisfied with a strict-weak-ordering. The requirement appears to be that Use >= Def. I've added a big blurb of comments to this bit of the transform to clarify why the order is so important for the next reader of the code. I'm starting with this change as it is very small, and trivially reverted if something breaks or the >= above really does need to be >. If that proves the case, we can hide the problem by reverting this patch, but the problem exists elsewhere as well, and so a more comprehensive solution will be needed. llvm-svn: 148001	2012-01-12 01:34:44 +00:00
Chandler Carruth	3212a34269	Revert r147945 which disabled an addressing mode transformation. I had hoped this would revive one of the llvm-gcc selfhost build bots, but it didn't so it doesn't appear that my transform is the culprit. If anyone else is seeing failures, please let me know! llvm-svn: 147957	2012-01-11 18:36:12 +00:00
Chandler Carruth	9bc48e5215	Disable the transformation I added in r147936 to see if it fixes some strange build bot failures that look like a miscompile into an infloop. I'll investigate this tomorrow, but I'd both like to know whether my patch is the culprit, and get the bots back to green. llvm-svn: 147945	2012-01-11 12:17:47 +00:00
Chandler Carruth	3eacfb83fa	Hoist a really redundant code pattern into a helper function, and delete lots of lines of code. No functionality changed. llvm-svn: 147942	2012-01-11 11:04:36 +00:00
Chandler Carruth	b0049f4a43	Simplify the AND-rooted mask+shift checking code to match that of the SRL-rooted code. llvm-svn: 147941	2012-01-11 09:35:04 +00:00
Chandler Carruth	3dbcda8478	Unify the interface of the three mask+shift transform helpers, and factor the differences that were hiding in one of them into its other caller, the SRL handling code. No change in behavior. llvm-svn: 147940	2012-01-11 09:35:02 +00:00
Chandler Carruth	aa01e6661a	Clarify and make explicit some of the requirements for transforming mask+shift pairs at the beginning of the ISD::AND case block, and then hoist the final pattern into a helper function, simplifying and reflowing it appropriately. This should have no observable behavior change, but several simplifications fell out of this such as directly computing the new mask constant, etc. llvm-svn: 147939	2012-01-11 09:35:00 +00:00
Chandler Carruth	51d3076bbf	Hoist the logic to transform shift+mask combinations into sub-register extracts and scaled addressing modes into its own helper function. No functionality changed here, just hoisting and layout fixes falling out of that hoisting. llvm-svn: 147937	2012-01-11 08:48:20 +00:00
Chandler Carruth	55b2cdee26	Teach the X86 instruction selection to do some heroic transforms to detect a pattern which can be implemented with a small 'shl' embedded in the addressing mode scale. This happens in real code as follows: unsigned x = my_accelerator_table[input >> 11]; Here we have some lookup table that we look into using the high bits of 'input'. Each entity in the table is 4-bytes, which means this implicitly gets turned into (once lowered out of a GEP): (unsigned)((char)my_accelerator_table + ((input >> 11) << 2)); The shift right followed by a shift left is canonicalized to a smaller shift right and masking off the low bits. That hides the shift right which x86 has an addressing mode designed to support. We now detect masks of this form, and produce the longer shift right followed by the proper addressing mode. In addition to saving a (rather large) instruction, this also reduces stalls in Intel chips on benchmarks I've measured. In order for all of this to work, one part of the DAG needs to be canonicalized still further* than it currently is. This involves removing pointless 'trunc' nodes between a zextload and a zext. Without that, we end up generating spurious masks and hiding the pattern. llvm-svn: 147936	2012-01-11 08:41:08 +00:00
Chandler Carruth	c16622daff	Don't rely on the fact that shift values are never very large, and thus this substraction will result in small negative numbers at worst which become very large positive numbers on assignment and are thus caught by the <=4 check on the next line. The >0 check clearly intended to catch these as negative numbers. Spotted by inspection, and impossible to trigger given the shift widths that can be used. llvm-svn: 147773	2012-01-09 09:47:25 +00:00
Pete Cooper	48784ed5b7	Added missing comment about new custom lowering of DEC64 llvm-svn: 144811	2011-11-16 19:03:23 +00:00
Pete Cooper	7c7ba1baa1	Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used by later instructions. Only done for DEC64m right now. Fixes <rdar://problem/6172640> llvm-svn: 144705	2011-11-15 21:57:53 +00:00
Dan Gohman	198b7ffc11	Reapply r143206, with fixes. Disallow physical register lifetimes across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660	2011-11-03 21:49:52 +00:00
Dan Gohman	9b9c970148	Revert r143206, as there are still some failing tests. llvm-svn: 143262	2011-10-29 00:41:52 +00:00
Dan Gohman	73057ad24f	Reapply r143177 and r143179 (reverting r143188), with scheduler fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206	2011-10-28 17:55:38 +00:00
Duncan Sands	225a7037d6	Speculatively disable Dan's commits 143177 and 143179 to see if it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188	2011-10-28 09:55:57 +00:00
Dan Gohman	4db3f7dd83	Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177	2011-10-28 01:29:32 +00:00
Jakob Stoklund Olesen	729abd360e	Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies. In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot target all GR8 registers, only those in GR8_NOREX. TO enforce this, we ensure that all instructions using the EXTRACT_SUBREG are GR8_NOREX constrained. This fixes PR11088. llvm-svn: 141499	2011-10-08 18:28:28 +00:00
Bruno Cardoso Lopes	616fe60548	Teach PreprocessISelDAG to be aware of vector types and to not process them. llvm-svn: 136653	2011-08-01 21:54:05 +00:00
Eli Friedman	344ec79715	Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile. <rdar://problem/9763308> llvm-svn: 135084	2011-07-13 21:29:53 +00:00
Eli Friedman	ef67e7d623	Refactor out checking for displacements on x86-64 addressing modes. No functionality change. Refactoring in preparation for an additional safety check in FoldOffsetIntoAddress. Part of <rdar://problem/9763308>. llvm-svn: 135079	2011-07-13 20:44:23 +00:00
Eric Christopher	a8a56f7e5c	TargetConstant immediates won't be placed into registers so tighten up the valid constant check earlier. rdar://9692967 llvm-svn: 134286	2011-07-01 23:04:38 +00:00
Eric Christopher	c932173773	Fix a small thinko for constant i64 lock/orq optimization where we we didn't have an opcode for 64-bit constant or expressions. Fixes rdar://9692967 llvm-svn: 134121	2011-06-30 00:48:30 +00:00
Stuart Hastings	91f1d24736	Re-commit 131641 with fixes; de-pseudoize MOVSX16rr8 and friends. rdar://problem/8614450 llvm-svn: 131746	2011-05-20 19:04:40 +00:00
Eric Christopher	56a42ebf15	Update comment. llvm-svn: 131459	2011-05-17 08:16:14 +00:00
Eric Christopher	a1d9e29552	Support XOR and AND optimization with no return value. Finishes off rdar://8470697 llvm-svn: 131458	2011-05-17 08:10:18 +00:00
Eric Christopher	abfe3131e3	Couple less magic numbers. llvm-svn: 131457	2011-05-17 07:50:41 +00:00
Eric Christopher	eb47a2a1e5	Make this code a little less magic number laden. llvm-svn: 131456	2011-05-17 07:47:55 +00:00
Eric Christopher	2a9dbbbb12	Turn this into a table, this will make more sense shortly. Part of rdar://8470697 llvm-svn: 131200	2011-05-11 21:44:58 +00:00
Eric Christopher	4a34e61e53	Optimize atomic lock or that doesn't use the result value. Next up: xor and and. Part of rdar://8470697 llvm-svn: 131171	2011-05-10 23:57:45 +00:00
Benjamin Kramer	3db054650b	Silence an overzealous uninitialized variable warning from GCC. llvm-svn: 130053	2011-04-23 08:21:06 +00:00
Benjamin Kramer	4c81624735	X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) & C1. (Part of PR5039) This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is uint64_t foo(uint64_t x) { return (x&1) << 42; } which used to compile into bloated code: shlq $42, %rdi ## encoding: [0x48,0xc1,0xe7,0x2a] movabsq $4398046511104, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00] andq %rdi, %rax ## encoding: [0x48,0x21,0xf8] ret ## encoding: [0xc3] with this patch we can fold the immediate into the and: andq $1, %rdi ## encoding: [0x48,0x83,0xe7,0x01] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] shlq $42, %rax ## encoding: [0x48,0xc1,0xe0,0x2a] ret ## encoding: [0xc3] It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing that without making this code even more complicated. See the TODOs in the code. llvm-svn: 129990	2011-04-22 15:30:40 +00:00
Stuart Hastings	81c4306005	Swap VT and DebugLoc operands of getExtLoad() for consistency with other getNode() methods. Radar 9002173. llvm-svn: 125665	2011-02-16 16:23:55 +00:00
Chris Lattner	46c01a30f4	Enhance ComputeMaskedBits to know that aligned frameindexes have their low bits set to zero. This allows us to optimize out explicit stack alignment code like in stack-align.ll:test4 when it is redundant. Doing this causes the code generator to start turning FI+cst into FI\|cst all over the place, which is general goodness (that is the canonical form) except that various pieces of the code generator don't handle OR aggressively. Fix this by introducing a new SelectionDAG::isBaseWithConstantOffset predicate, and using it in places that are looking for ADD(X,CST). The ARM backend in particular was missing a lot of addressing mode folding opportunities around OR. llvm-svn: 125470	2011-02-13 22:25:43 +00:00
NAKAMURA Takumi	f3e20b9f0f	lib/Target/X86/X86ISelDAGToDAG.cpp: __main should be WINCALL64 on Win64. CALL64 marks %xmm* as dead. llvm-svn: 124354	2011-01-27 03:20:19 +00:00
Chris Lattner	35a2e65bcb	fix PR8514, a bug where the "heroic" transformation of shift/and into and/shift would cause nodes to move around and a dangling pointer to happen. The code tried to avoid this with a HandleSDNode, but got the details wrong. llvm-svn: 123578	2011-01-16 08:48:11 +00:00
Ted Kremenek	b5241b2b59	'HiReg' is written but never read. Nuke its declaration and its assignments. Found by clang static analyzer. llvm-svn: 123486	2011-01-14 22:34:13 +00:00
Bill Wendling	81d40711f3	PR8918 - When used with MinGW64, LLVM generates a "calll __main" at the beginning of the "main" function. The assembler complains about the invalid suffix for the 'call' instruction. The right instruction is "callq __main". Patch by KS Sreeram! llvm-svn: 122933	2011-01-06 00:47:10 +00:00
Chris Lattner	3e5fbd74ed	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Chris Lattner	364bb0a081	it turns out that when ".with.overflow" intrinsics were added to the X86 backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935	2010-12-05 07:30:36 +00:00
Dale Johannesen	e660f4d072	Use a MemIntrinsicSDNode for ISD::PREFETCH, which touches memory, so a MachineMemOperand is useful (not propagated into the MachineInstr yet). No functional change except for dump output. llvm-svn: 117413	2010-10-26 23:11:10 +00:00
Chris Lattner	1a1c600110	Use #NAME# to have the CMOV multiclass define things with the same names as before (e.g. CMOVBE16rr instead of CMOVBErr16). llvm-svn: 115705	2010-10-05 23:00:14 +00:00
Chris Lattner	0067ee02f9	switch CMOVBE to the multipattern: 21 insertions(+), 53 deletions(-) Moar change coming before I switch the rest. llvm-svn: 115697	2010-10-05 22:23:58 +00:00
Eric Christopher	c1b3e072f4	Temporarily work around new address lowering while I figure out what needs to happen for darwin. llvm-svn: 114577	2010-09-22 20:42:08 +00:00
Chris Lattner	8a236b63d8	reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress. llvm-svn: 114529	2010-09-22 04:39:11 +00:00
Chris Lattner	a5156c30ed	convert the last 4 X86ISD nodes that should have memoperands to have them. llvm-svn: 114523	2010-09-22 01:28:21 +00:00
Chris Lattner	ed85da5600	give X86ISD::FNSTCW16m a memoperand, since it touches memory. It only can access the stack due to how it is generated though. llvm-svn: 114522	2010-09-22 01:11:26 +00:00
Chris Lattner	78f518b79b	give FP_TO_INT16_IN_MEM and friends a memoperand. They are only used with stack slots, but hey, lets be safe. llvm-svn: 114521	2010-09-22 01:05:16 +00:00
Chris Lattner	54e5329545	give VZEXT_LOAD a memory operand, it now works with segment registers. llvm-svn: 114515	2010-09-22 00:34:38 +00:00
Chris Lattner	07827ba978	revert r114386 now that address modes work correctly, we get a nice call through gs-relative memory now. llvm-svn: 114510	2010-09-22 00:11:31 +00:00
Chris Lattner	e479e9643b	give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257 llvm-svn: 114508	2010-09-21 23:59:42 +00:00
Chris Lattner	d58d7c1907	reimplement support for GS and FS relative address space matching by having X86DAGToDAGISel::SelectAddr get passed in the parent node of the operand match (the load/store/atomic op) and having it get the address space from that, instead of having special FS/GS addr mode operations that require duplicating the entire instruction set to support. This makes FS and GS relative accesses far more predictable and work much better. It also simplifies the X86 backend a bit, more to come. There is still a pending issue with nodes like ISD::PREFETCH and X86ISD::FLD, which really should be MemSDNode's but aren't. llvm-svn: 114491	2010-09-21 22:07:31 +00:00
Chris Lattner	0e023ea02a	fix a long standing wart: all the ComplexPattern's were being passed the root of the match, even though only a few patterns actually needed this (one in X86, several in ARM [which should be refactored anyway], and some in CellSPU that I don't feel like detangling). Instead of requiring all ComplexPatterns to take the dead root, have targets opt into getting the root by putting SDNPWantRoot on the ComplexPattern. llvm-svn: 114471	2010-09-21 20:31:19 +00:00
Chris Lattner	c6d8839a2b	even though I'm about to rip it out, simplify the address mode stuff llvm-svn: 114468	2010-09-21 19:41:58 +00:00
Chris Lattner	3d178ed4d4	propagate MachinePointerInfo through various uses of the old SelectionDAG::getExtLoad overload, and eliminate it. llvm-svn: 114446	2010-09-21 17:04:51 +00:00
Chris Lattner	bb0a1c44bf	fix rdar://8453210, a crash handling a call through a GS relative load. For now, just disable folding the load into the call. llvm-svn: 114386	2010-09-21 03:37:00 +00:00
Chris Lattner	65b48b5dfc	zap dead code. llvm-svn: 113073	2010-09-04 18:12:00 +00:00
Jakob Stoklund Olesen	08aede2538	Don't call Predicate_* from X86 target. llvm-svn: 112921	2010-09-03 00:35:18 +00:00
Benjamin Kramer	f1f2133ac0	Remove dead recursive function. Yay for clang -Wunused-function. llvm-svn: 112060	2010-08-25 17:27:58 +00:00
Eli Friedman	39d0f57cab	PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268	2010-08-04 22:40:58 +00:00
Chris Lattner	f469307c77	Change LEA to have 5 operands for its memory operand, just like all other instructions, even though a segment is not allowed. This resolves a bunch of gross hacks in the encoder and makes LEA more consistent with the rest of the instruction set. No functionality change. llvm-svn: 107934	2010-07-08 23:46:44 +00:00
Evan Cheng	1c349f18f8	Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency sake. llvm-svn: 107820	2010-07-07 22:15:37 +00:00
Devang Patel	a3ca21b228	Propagate debug loc. llvm-svn: 107710	2010-07-06 22:08:15 +00:00
Jakob Stoklund Olesen	d7d0d4e882	When creating X86 MUL8 and DIV8 instructions, make sure we don't produce CopyFromReg nodes for aliasing registers (AX and AL). This confuses the fast register allocator. Instead of CopyFromReg(AL), use ExtractSubReg(CopyFromReg(AX), sub_8bit). This fixes PR7312. llvm-svn: 106934	2010-06-26 00:39:23 +00:00
Dan Gohman	92c11acdb8	Change UpdateNodeOperands' operand and return value from SDValue to SDNode *, since it doesn't care about the ResNo value. llvm-svn: 106282	2010-06-18 15:30:29 +00:00
Dan Gohman	99ba4dac59	Don't maintain a set of deleted nodes; instead, use a HandleSDNode to track a node over CSE events. This fixes PR7368. llvm-svn: 106266	2010-06-18 01:24:29 +00:00
Eric Christopher	b0e1a458ce	Add first pass at darwin tls compiler support. llvm-svn: 105381	2010-06-03 04:07:48 +00:00
Jakob Stoklund Olesen	9340ea59e1	Rename X86 subregister indices to something shorter. Use the tablegen-produced enums. llvm-svn: 104493	2010-05-24 14:48:17 +00:00
Dan Gohman	0fd54fbbcf	Don't leave Base.FrameIndex uninitialized, so that it doesn't print randomly in debug output. llvm-svn: 102668	2010-04-29 23:30:41 +00:00
Evan Cheng	050df1b8de	Enable i16 to i32 promotion by default. llvm-svn: 102493	2010-04-28 08:30:49 +00:00
Chris Lattner	84776786a7	teach the x86 address matching stuff to handle (shl (or x,c), 3) the same as (shl (add x, c), 3) when x doesn't have any bits from c set. This finishes off PR1135. Before we compiled the block to: to: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) leaq 2(%rdx), %r9 movl %esi, (%rdi,%r9,4) leaq 1(%rdx), %r9 movl %esi, (%rdi,%r9,4) addq $3, %rdx movl %esi, (%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 Now we produce: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) movl %esi, 8(%rdi,%rdx,4) movl %esi, 4(%rdi,%rdx,4) movl %esi, 12(%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 llvm-svn: 101958	2010-04-20 23:18:40 +00:00
Dan Gohman	21cea8ac2e	Use const qualifiers with TargetLowering. This eliminates several const_casts, and it reinforces the design of the Target classes being immutable. SelectionDAGISel::IsLegalToFold is now a static member function, because PIC16 uses it in an unconventional way. There is more room for API cleanup here. And PIC16's AsmPrinter no longer uses TargetLowering. llvm-svn: 101635	2010-04-17 15:26:15 +00:00
Dan Gohman	bcaf681cde	Add const qualifiers to CodeGen's use of LLVM IR constructs. llvm-svn: 101334	2010-04-15 01:51:59 +00:00
Dan Gohman	c87b74d913	Delete unneeeded arguments. llvm-svn: 101276	2010-04-14 20:17:22 +00:00
Chris Lattner	6f306d7d30	use DebugLoc default ctor instead of DebugLoc::getUnknownLoc() llvm-svn: 100214	2010-04-02 20:16:16 +00:00
Evan Cheng	68333f5c6e	X86 address mode matching code MatchAddressRecursively does some aggressive hack which require doing a RAUW. It may end up deleting some SDNode up stream. It should avoid referencing deleted nodes. llvm-svn: 98780	2010-03-17 23:58:35 +00:00
Evan Cheng	d703df67ce	Do not force indirect tailcall through fixed registers: eax, r11. Add support to allow loads to be folded to tail call instructions. llvm-svn: 98465	2010-03-14 03:48:46 +00:00
Chris Lattner	82cc53388e	add a comment. llvm-svn: 97709	2010-03-04 01:43:43 +00:00
Chris Lattner	3fcbbd8673	factor the 'sign extended from 8 bit' patterns better so that they are not destination type specific. This allows tblgen to factor them and the type check is redundant with what the isel does anyway. llvm-svn: 97629	2010-03-03 01:45:01 +00:00
Chris Lattner	8d63704021	merge two loops over all nodes in the graph into one. llvm-svn: 97606	2010-03-02 23:12:51 +00:00
Chris Lattner	1eb6eb059c	eliminate PreprocessForRMW now that isel handles it. We still preprocess calls and fp return stuff. llvm-svn: 97598	2010-03-02 22:33:56 +00:00
Chris Lattner	dd030701bd	Fix some issues in WalkChainUsers dealing with CopyToReg/CopyFromReg/INLINEASM. These are annoying because they have the same opcode before an after isel. Fix this by setting their NodeID to -1 to indicate that they are selected, just like what automatically happens when selecting things that end up being machine nodes. With that done, give IsLegalToFold a new flag that causes it to ignore chains. This lets the HandleMergeInputChains routine be the one place that validates chains after a match is successful, enabling the new hotness in chain processing. This smarter chain processing eliminates the need for "PreprocessRMW" in the X86 and MSP430 backends and enables MSP to start matching it's multiple mem operand instructions more aggressively. I currently #if out the dead code in the X86 backend and MSP backend, I'll remove it for real in a follow-on patch. The testcase changes are: test/CodeGen/X86/sse3.ll: we generate better code test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was miscompiling this before, we now generate correct code Convert it to filecheck while I'm at it. test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem folding to make anton happy. :) llvm-svn: 97596	2010-03-02 22:20:06 +00:00
Chris Lattner	f98f124a73	Sink InstructionSelect() out of each target into SDISel, and rename it DoInstructionSelection. Inline "SelectRoot" into it from DAGISelHeader. Sink some other stuff out of DAGISelHeader into SDISel. Eliminate the various 'Indent' stuff from various targets, which dates to when isel was recursive. 17 files changed, 114 insertions(+), 430 deletions(-) llvm-svn: 97555	2010-03-02 06:34:30 +00:00
Chris Lattner	bd6e193f54	remove a little hack I did for the old isel, not needed now that it is gone. llvm-svn: 97516	2010-03-01 22:51:11 +00:00
Chris Lattner	55ef1ebe52	remove a terrible hack that disabled assertions from this file because of build time problems. rdar://7697850. llvm-svn: 97500	2010-03-01 21:20:46 +00:00
Chris Lattner	8d7b4393d2	no need to override IsLegalToFold, the base implementation disables load folding at -O0. llvm-svn: 96973	2010-02-23 19:33:11 +00:00
Chris Lattner	3c29aff9ff	fix and un-xfail X86/vec_ss_load_fold.ll llvm-svn: 96720	2010-02-21 04:53:34 +00:00
Chris Lattner	18a32ce0f3	rename SelectScalarSSELoad -> SelectScalarSSELoadXXX and rewrite it to follow the mode needed by the new isel. Instead of returning the input and output chains, it just returns the (currently only one, which is a silly limitation) node that has input and output chains. Since we want the old thing to still work, add a new SelectScalarSSELoad to emulate the old interface. The XXX suffix and the wrapper will eventually go away. llvm-svn: 96715	2010-02-21 03:17:59 +00:00
Chris Lattner	3f48215480	rename and document some arguments so I don't have to keep reverse engineering what they are. llvm-svn: 96456	2010-02-17 06:07:47 +00:00
Chris Lattner	afac7dad21	fix rdar://7653908, a crash on a case where we would fold a load into a roundss intrinsic, producing a cyclic dag. The root cause of this is badness handling ComplexPattern nodes in the old dagisel that I noticed through inspection. Eliminate a copy of the of the code that handled ComplexPatterns by making EmitChildMatchCode call into EmitMatchCode. llvm-svn: 96408	2010-02-16 22:35:06 +00:00
Evan Cheng	5e73ff2e3a	Split SelectionDAGISel::IsLegalAndProfitableToFold to IsLegalToFold and IsProfitableToFold. The generic version of the later simply checks whether the folding candidate has a single use. This allows the target isel routines more flexibility in deciding whether folding makes sense. The specific case we are interested in is folding constant pool loads with multiple uses. llvm-svn: 96255	2010-02-15 19:41:07 +00:00
David Greene	cbd39c5def	Remove an assumption of default arguments. This is in anticipation of a change to SelectionDAG build APIs. llvm-svn: 96239	2010-02-15 16:57:43 +00:00
Chris Lattner	2b0a7a2592	refactor the conditional jump instructions in the .td file to use a multipattern that generates both the 1-byte and 4-byte versions from the same defm llvm-svn: 95901	2010-02-11 19:25:55 +00:00
Chris Lattner	b06015aa69	move target-independent opcodes out of TargetInstrInfo into TargetOpcodes.h. #include the new TargetOpcodes.h into MachineInstr. Add new inline accessors (like isPHI()) to MachineInstr, and start using them throughout the codebase. llvm-svn: 95687	2010-02-09 19:54:29 +00:00
Dan Gohman	51ad99d2c5	Re-implement the main strength-reduction portion of LoopStrengthReduction. This new version is much more aggressive about doing "full" reduction in cases where it reduces register pressure, and also more aggressive about rewriting induction variables to count down (or up) to zero when doing so reduces register pressure. It currently uses fairly simplistic algorithms for finding reuse opportunities, but it introduces a new framework allows it to combine multiple strategies at once to form hybrid solutions, instead of doing all full-reduction or all base+index. llvm-svn: 94061	2010-01-21 02:09:26 +00:00
David Greene	0985160c54	When XDEBUG is enabled, check for SelectionDAG cycles at some key points. This will help us find future problems like the one described in PR6019. llvm-svn: 94019	2010-01-20 20:13:31 +00:00
David Greene	b0c0e6433f	Fix PR6019. A load has more than one use if it feeds a bitconvert that has more than one use. llvm-svn: 93576	2010-01-15 23:23:41 +00:00
Dan Gohman	c119580307	Reapply the MOV64r0 patch, with a fix: MOV64r0 clobbers EFLAGS. llvm-svn: 93229	2010-01-12 04:42:54 +00:00
Evan Cheng	7bdf339602	Revert 93158. It's breaking quite a few x86_64 tests. llvm-svn: 93185	2010-01-11 21:13:41 +00:00
Dan Gohman	3a55686345	Re-instate MOV64r0 and MOV16r0, with adjustments to work with the new AsmPrinter. This is perhaps less elegant than describing them in terms of MOV32r0 and subreg operations, but it allows the current register to rematerialize them. llvm-svn: 93158	2010-01-11 17:37:57 +00:00
David Greene	dbdb1b28b8	Change errs() to dbgs(). llvm-svn: 92647	2010-01-05 01:29:08 +00:00
Dan Gohman	ea6f91ff64	Change SelectCode's argument from SDValue to SDNode , to make it more clear what information these functions are actually using. This is also a micro-optimization, as passing a SDNode around is simpler than passing a { SDNode *, int } by value or reference. llvm-svn: 92564	2010-01-05 01:24:18 +00:00
Dan Gohman	85d4fdfe37	Flags-producing add, and, or, etc. have the same profibility rules as normal add, and, or, etc. llvm-svn: 92507	2010-01-04 20:51:50 +00:00
Chris Lattner	518b037620	completely eliminate the MOV16r0 'instruction'. The only interesting part of this is the divrem changes, which are already tested by CodeGen/X86/divrem.ll. llvm-svn: 91975	2009-12-23 01:45:04 +00:00
Evan Cheng	3dfd04e2b7	Re-apply 91623 now that I actually know what I was trying to do. llvm-svn: 91655	2009-12-18 01:59:21 +00:00
Jeffrey Yasskin	0de0ce11d8	Revert r91623 to unbreak the buildbots. llvm-svn: 91632	2009-12-17 22:44:34 +00:00
Evan Cheng	e43b403c87	Remove an unused option. llvm-svn: 91623	2009-12-17 21:23:58 +00:00
Dan Gohman	7a6611793f	Target-independent support for TargetFlags on BlockAddress operands, and support for blockaddresses in x86-32 PIC mode. llvm-svn: 89506	2009-11-20 23:18:13 +00:00
Daniel Dunbar	bc299f0092	llvm-gcc/clang don't (won't?) need this hack. llvm-svn: 86769	2009-11-11 00:28:38 +00:00
Daniel Dunbar	b9415c7d9a	Add a monstrous hack to improve X86ISelDAGToDAG compile time. - Force NDEBUG on in any Release build. This drops the compile time to ~100s from ~600s, in Release mode. - This may just be a temporary workaround, I don't know the true nature of the gcc-4.2 compile time performance problem. llvm-svn: 86695	2009-11-10 18:24:37 +00:00
Dan Gohman	006f9353e1	Use SUBREG_TO_REG instead of INSERT_SUBREG to model x86-64's implicit zero-extend. llvm-svn: 86196	2009-11-05 23:53:08 +00:00
Dan Gohman	b15f4a1cbd	Remove uninteresting and confusing debug output. llvm-svn: 86149	2009-11-05 18:47:09 +00:00
Chris Lattner	50ba5c3dc2	improve x86 codegen support for blockaddress. We now compile the testcase into: _test1: ## @test1 ## BB#0: ## %entry leaq L_test1_bb6(%rip), %rax jmpq *%rax L_test1_bb: ## Address Taken LBB1_1: ## %bb movb $1, %al ret L_test1_bb6: ## Address Taken LBB1_2: ## %bb6 movb $2, %al ret Note, it is very very strange that BlockAddressSDNode doesn't carry around TargetFlags. Dan, please fix this. llvm-svn: 85703	2009-11-01 03:25:03 +00:00
Nick Lewycky	974e12b2d3	Remove includes of Support/Compiler.h that are no longer needed after the VISIBILITY_HIDDEN removal. llvm-svn: 85043	2009-10-25 06:57:41 +00:00
Nick Lewycky	02d5f77d26	Remove VISIBILITY_HIDDEN from class/struct found inside anonymous namespaces. Chris claims we should never have visibility_hidden inside any .cpp file but that's still not true even after this commit. llvm-svn: 85042	2009-10-25 06:33:48 +00:00
Dan Gohman	7d9dffb413	Fix the x86 test-shrink optimization so that it doesn't shrink comparisons when one of the bits being tested would end up being the sign bit in the narrower type, and a signed comparison is being performed, since this would change the result of the signed comparison. This fixes PR5132. llvm-svn: 83670	2009-10-09 20:35:19 +00:00
Dan Gohman	48b185d6f7	Improve MachineMemOperand handling. - Allocate MachineMemOperands and MachineMemOperand lists in MachineFunctions. This eliminates MachineInstr's std::list member and allows the data to be created by isel and live for the remainder of codegen, avoiding a lot of copying and unnecessary translation. This also shrinks MemSDNode. - Delete MemOperandSDNode. Introduce MachineSDNode which has dedicated fields for MachineMemOperands. - Change MemSDNode to have a MachineMemOperand member instead of its own fields with the same information. This introduces some redundancy, but it's more consistent with what MachineInstr will eventually want. - Ignore alignment when searching for redundant loads for CSE, but remember the greatest alignment. Target-specific code which previously used MemOperandSDNodes with generic SDNodes now use MemIntrinsicSDNodes, with opcodes in a designated range so that the SelectionDAG framework knows that MachineMemOperand information is available. llvm-svn: 82794	2009-09-25 20:36:54 +00:00
Dan Gohman	32f71d714b	Rename getTargetNode to getMachineNode, for consistency with the naming scheme used in SelectionDAG, where there are multiple kinds of "target" nodes, but "machine" nodes are nodes which represent a MachineInstr. llvm-svn: 82790	2009-09-25 18:54:59 +00:00
Nate Begeman	1ae49ee7ca	Do not try and sink a load whose chain result has more than one use, when trying to create RMW opportunities in the x86 backend. This can cause a cycle to appear in the graph, since the other uses may eventually feed into the TokenFactor we are sinking the load below. llvm-svn: 81996	2009-09-16 03:20:46 +00:00
Dan Gohman	520a6856ba	Don't pull a load through a callseq_start if the load's chain has multiple uses, as one of the other uses may be on a path to a different node above the callseq_start, because that leads to a cyclic graph. This problem is exposed when -combiner-global-alias-analysis is used. This fixes PR4880. llvm-svn: 81821	2009-09-15 01:22:01 +00:00
Dan Gohman	0f6bf2dbb8	Use X86II::MO_NO_FLAG. llvm-svn: 80012	2009-08-25 17:47:44 +00:00
Benjamin Kramer	940fbb0e3c	Remove Streams.h from the targets. llvm-svn: 79853	2009-08-23 11:52:17 +00:00

1 2 3 4 5 ...

590 Commits