llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Liao	de51caf2a0	Add missing i64 max/min/umax/umin on 32-bit target - Turn on atomic6432.ll and add specific test case as well llvm-svn: 164616	2012-09-25 18:08:13 +00:00
Michael Liao	8372539543	Unify the logic in SelectAtomicLoadAdd and SelectAtomicLoadArith - Merge the processing of LOAD_ADD with other atomic load-arith operations - Separate the logic getting target constant for atomic-load-op and add an optimization for atomic-load-add on i16 with negative value - Optimize a minor case for atomic-fetch-add i16 with negative operand. Test case is revised. llvm-svn: 164243	2012-09-19 19:36:58 +00:00
Jakob Stoklund Olesen	78b9f8fc67	Revert r163761 "Don't fold indexed loads into TCRETURNmi64." The patch caused "Wrong topological sorting" assertions. llvm-svn: 163810	2012-09-13 16:52:17 +00:00
Jakob Stoklund Olesen	bfacef45eb	Don't fold indexed loads into TCRETURNmi64. We don't have enough GR64_TC registers when calling a varargs function with 6 arguments. Since %al holds the number of vector registers used, only %r11 is available as a scratch register. This means that addressing modes using both base and index registers can't be folded into TCRETURNmi64. <rdar://problem/12282281> llvm-svn: 163761	2012-09-13 00:25:00 +00:00
Michael Liao	abb87d4857	Fix PR11985 - BlockAddress has no support of BA + offset form and there is no way to propagate that offset into machine operand; - Add BA + offset support and a new interface 'getTargetBlockAddress' to simplify target block address forming; - All targets are modified to use new interface and X86 backend is enhanced to support BA + offset addressing. llvm-svn: 163743	2012-09-12 21:43:09 +00:00
Manman Ren	19f49ac624	Release build: guard dump functions with "#if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP)" No functional change. Update r163339. llvm-svn: 163653	2012-09-11 22:23:19 +00:00
Manman Ren	742534c4dc	Release build: guard dump functions with "ifndef NDEBUG" No functional change. llvm-svn: 163339	2012-09-06 19:06:06 +00:00
Richard Smith	228e6d4cf3	Fix integer undefined behavior due to signed left shift overflow in LLVM. Reviewed offline by chandlerc. llvm-svn: 162623	2012-08-24 23:29:28 +00:00
Craig Topper	22cb0c572b	Add a couple default: llvm_unreachable() to some switch statements. Fix a bad message in an existing llvm_unreachable. llvm-svn: 161725	2012-08-11 17:44:14 +00:00
Manman Ren	1be131ba27	X86: enable CSE between CMP and SUB We perform the following: 1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering. 2> Modify MachineCSE to correctly handle implicit defs. 3> Convert SUB back to CMP if possible at peephole. Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled by peephole now. rdar://11873276 llvm-svn: 161462	2012-08-08 00:51:41 +00:00
Craig Topper	ab47fe4e16	Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318	2012-08-06 06:22:36 +00:00
Chad Rosier	24c19d20c0	Whitespace. llvm-svn: 161122	2012-08-01 18:39:17 +00:00
David Chisnall	5b8c1680de	ELF does not imply GNU/Linux. Do not assume GNU conventions just because we are targeting an ELF platform. Only fold gs-relative (and fs-relative) loads if it is actually sensible to do so for the target platform. This fixes PR13438. llvm-svn: 160687	2012-07-24 20:04:16 +00:00
Craig Topper	f7755df776	Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren. llvm-svn: 160110	2012-07-12 06:52:41 +00:00
Craig Topper	3af251dbf1	Reduce code size by using a second switch statement to avoid extra calls to SelectAtomic64. Also catch cases where SelectAtomic64 fails. llvm-svn: 159503	2012-07-01 02:55:34 +00:00
Craig Topper	e15e5f7c5c	Add a break to the end of case statement missed in r159501. llvm-svn: 159502	2012-07-01 02:18:18 +00:00
Craig Topper	fbb954f727	Fix a crash on release builds if gather intrinsics are passed a non-constant value for the last argument. llvm-svn: 159501	2012-07-01 02:17:08 +00:00
Craig Topper	def044b974	Use a second switch statement to reduce number of calls to SelectGather in code. Reduces code size a bit. llvm-svn: 159500	2012-07-01 02:05:52 +00:00
Manman Ren	98a5bf24a9	X86: add more GATHER intrinsics in LLVM Corrected type for index of llvm.x86.avx2.gather.d.pd.256 from 256-bit to 128-bit. Corrected types for src\|dst\|mask of llvm.x86.avx2.gather.q.ps.256 from 256-bit to 128-bit. Support the following intrinsics: llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256 llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256 llvm-svn: 159402	2012-06-29 00:54:20 +00:00
Manman Ren	a09820414a	X86: add GATHER intrinsics (AVX2) in LLVM Support the following intrinsics: llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256 llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256 Modified Disassembler to handle VSIB addressing mode. llvm-svn: 159221	2012-06-26 19:47:59 +00:00
Craig Topper	a4fd6d655a	Tidy up spacing. llvm-svn: 157313	2012-05-23 05:44:51 +00:00
Evan Cheng	58a95f0c8a	Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474 llvm-svn: 156896	2012-05-16 01:54:27 +00:00
Evan Cheng	3e869f002c	Generalize r153635 to deal with TokenFactor chains; also clean up the logic and fix the tests. rdar://11069732, rdar://11236106 llvm-svn: 154604	2012-04-12 19:14:21 +00:00
Chandler Carruth	3779ac10b4	Cleanup and relax a restriction on the matching of global offsets into x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is not using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304	2012-04-09 02:13:06 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Benjamin Kramer	8619c37b5b	Replace assert(0) with llvm_unreachable to avoid warnings about dropping off the end of a non-void function in Release builds. llvm-svn: 153643	2012-03-29 12:37:26 +00:00
Joel Jones	68d59e8a90	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153635	2012-03-29 05:45:48 +00:00
Joel Jones	b474099e63	Reverted to revision 153616 to unblock build llvm-svn: 153623	2012-03-29 01:20:56 +00:00
Joel Jones	b88c81fe0f	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153617	2012-03-29 00:37:47 +00:00
Craig Topper	1fcf5bcae1	Prune some includes llvm-svn: 153502	2012-03-27 07:54:11 +00:00
Craig Topper	f6e7e12f75	Remove unnecessary llvm:: qualifications llvm-svn: 153500	2012-03-27 07:21:54 +00:00
Craig Topper	b25fda95f6	Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations. llvm-svn: 152997	2012-03-17 18:46:09 +00:00
Craig Topper	2dac962864	Use uint16_t to store opcodes in static tables in X86 backend. llvm-svn: 152391	2012-03-09 07:45:21 +00:00
Craig Topper	cc830f8cda	Declare register classes as const. Fix a couple pointers to register classes that weren't already const. llvm-svn: 151138	2012-02-22 07:28:11 +00:00
Jakob Stoklund Olesen	97e3115dc2	Use the same CALL instructions for Windows as for everything else. The different calling conventions and call-preserved registers are represented with regmask operands that are added dynamically. llvm-svn: 150708	2012-02-16 17:56:02 +00:00
Pete Cooper	c21ebf5c41	Stop custom lowering forr x86 DEC64m from happening if the load in the lowered sequence has more than 1 user llvm-svn: 150537	2012-02-15 00:33:37 +00:00
Pete Cooper	71be57bb32	Fixed bug when custom lowering DEC64m on x86. If the DEC node had more than one user, it was doing this lowering but leaving the original DEC node around and so decrementing twice. Fixes PR11964. llvm-svn: 150356	2012-02-13 00:10:03 +00:00
David Blaikie	46a9f016c5	More dead code removal (using -Wunreachable-code) llvm-svn: 148578	2012-01-20 21:51:11 +00:00
Chandler Carruth	eb21da060b	Switch all of the uses of my InsertDAGNode helper to follow the exact same pattern. We already had this pattern is a few places, but others tried to make a rough approximation of an actual DAG structure. As not everywhere went to this trouble, nothing could rely on this being done. In fact, I've checked all references to these node Ids, and the ones that are using the topo-sort properties are actually satisfied with a strict-weak-ordering. The requirement appears to be that Use >= Def. I've added a big blurb of comments to this bit of the transform to clarify why the order is so important for the next reader of the code. I'm starting with this change as it is very small, and trivially reverted if something breaks or the >= above really does need to be >. If that proves the case, we can hide the problem by reverting this patch, but the problem exists elsewhere as well, and so a more comprehensive solution will be needed. llvm-svn: 148001	2012-01-12 01:34:44 +00:00
Chandler Carruth	3212a34269	Revert r147945 which disabled an addressing mode transformation. I had hoped this would revive one of the llvm-gcc selfhost build bots, but it didn't so it doesn't appear that my transform is the culprit. If anyone else is seeing failures, please let me know! llvm-svn: 147957	2012-01-11 18:36:12 +00:00
Chandler Carruth	9bc48e5215	Disable the transformation I added in r147936 to see if it fixes some strange build bot failures that look like a miscompile into an infloop. I'll investigate this tomorrow, but I'd both like to know whether my patch is the culprit, and get the bots back to green. llvm-svn: 147945	2012-01-11 12:17:47 +00:00
Chandler Carruth	3eacfb83fa	Hoist a really redundant code pattern into a helper function, and delete lots of lines of code. No functionality changed. llvm-svn: 147942	2012-01-11 11:04:36 +00:00
Chandler Carruth	b0049f4a43	Simplify the AND-rooted mask+shift checking code to match that of the SRL-rooted code. llvm-svn: 147941	2012-01-11 09:35:04 +00:00
Chandler Carruth	3dbcda8478	Unify the interface of the three mask+shift transform helpers, and factor the differences that were hiding in one of them into its other caller, the SRL handling code. No change in behavior. llvm-svn: 147940	2012-01-11 09:35:02 +00:00
Chandler Carruth	aa01e6661a	Clarify and make explicit some of the requirements for transforming mask+shift pairs at the beginning of the ISD::AND case block, and then hoist the final pattern into a helper function, simplifying and reflowing it appropriately. This should have no observable behavior change, but several simplifications fell out of this such as directly computing the new mask constant, etc. llvm-svn: 147939	2012-01-11 09:35:00 +00:00
Chandler Carruth	51d3076bbf	Hoist the logic to transform shift+mask combinations into sub-register extracts and scaled addressing modes into its own helper function. No functionality changed here, just hoisting and layout fixes falling out of that hoisting. llvm-svn: 147937	2012-01-11 08:48:20 +00:00
Chandler Carruth	55b2cdee26	Teach the X86 instruction selection to do some heroic transforms to detect a pattern which can be implemented with a small 'shl' embedded in the addressing mode scale. This happens in real code as follows: unsigned x = my_accelerator_table[input >> 11]; Here we have some lookup table that we look into using the high bits of 'input'. Each entity in the table is 4-bytes, which means this implicitly gets turned into (once lowered out of a GEP): (unsigned)((char)my_accelerator_table + ((input >> 11) << 2)); The shift right followed by a shift left is canonicalized to a smaller shift right and masking off the low bits. That hides the shift right which x86 has an addressing mode designed to support. We now detect masks of this form, and produce the longer shift right followed by the proper addressing mode. In addition to saving a (rather large) instruction, this also reduces stalls in Intel chips on benchmarks I've measured. In order for all of this to work, one part of the DAG needs to be canonicalized still further* than it currently is. This involves removing pointless 'trunc' nodes between a zextload and a zext. Without that, we end up generating spurious masks and hiding the pattern. llvm-svn: 147936	2012-01-11 08:41:08 +00:00
Chandler Carruth	c16622daff	Don't rely on the fact that shift values are never very large, and thus this substraction will result in small negative numbers at worst which become very large positive numbers on assignment and are thus caught by the <=4 check on the next line. The >0 check clearly intended to catch these as negative numbers. Spotted by inspection, and impossible to trigger given the shift widths that can be used. llvm-svn: 147773	2012-01-09 09:47:25 +00:00
Pete Cooper	48784ed5b7	Added missing comment about new custom lowering of DEC64 llvm-svn: 144811	2011-11-16 19:03:23 +00:00
Pete Cooper	7c7ba1baa1	Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used by later instructions. Only done for DEC64m right now. Fixes <rdar://problem/6172640> llvm-svn: 144705	2011-11-15 21:57:53 +00:00
Dan Gohman	198b7ffc11	Reapply r143206, with fixes. Disallow physical register lifetimes across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660	2011-11-03 21:49:52 +00:00
Dan Gohman	9b9c970148	Revert r143206, as there are still some failing tests. llvm-svn: 143262	2011-10-29 00:41:52 +00:00
Dan Gohman	73057ad24f	Reapply r143177 and r143179 (reverting r143188), with scheduler fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206	2011-10-28 17:55:38 +00:00
Duncan Sands	225a7037d6	Speculatively disable Dan's commits 143177 and 143179 to see if it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188	2011-10-28 09:55:57 +00:00
Dan Gohman	4db3f7dd83	Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177	2011-10-28 01:29:32 +00:00
Jakob Stoklund Olesen	729abd360e	Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies. In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot target all GR8 registers, only those in GR8_NOREX. TO enforce this, we ensure that all instructions using the EXTRACT_SUBREG are GR8_NOREX constrained. This fixes PR11088. llvm-svn: 141499	2011-10-08 18:28:28 +00:00
Bruno Cardoso Lopes	616fe60548	Teach PreprocessISelDAG to be aware of vector types and to not process them. llvm-svn: 136653	2011-08-01 21:54:05 +00:00
Eli Friedman	344ec79715	Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile. <rdar://problem/9763308> llvm-svn: 135084	2011-07-13 21:29:53 +00:00
Eli Friedman	ef67e7d623	Refactor out checking for displacements on x86-64 addressing modes. No functionality change. Refactoring in preparation for an additional safety check in FoldOffsetIntoAddress. Part of <rdar://problem/9763308>. llvm-svn: 135079	2011-07-13 20:44:23 +00:00
Eric Christopher	a8a56f7e5c	TargetConstant immediates won't be placed into registers so tighten up the valid constant check earlier. rdar://9692967 llvm-svn: 134286	2011-07-01 23:04:38 +00:00
Eric Christopher	c932173773	Fix a small thinko for constant i64 lock/orq optimization where we we didn't have an opcode for 64-bit constant or expressions. Fixes rdar://9692967 llvm-svn: 134121	2011-06-30 00:48:30 +00:00
Stuart Hastings	91f1d24736	Re-commit 131641 with fixes; de-pseudoize MOVSX16rr8 and friends. rdar://problem/8614450 llvm-svn: 131746	2011-05-20 19:04:40 +00:00
Eric Christopher	56a42ebf15	Update comment. llvm-svn: 131459	2011-05-17 08:16:14 +00:00
Eric Christopher	a1d9e29552	Support XOR and AND optimization with no return value. Finishes off rdar://8470697 llvm-svn: 131458	2011-05-17 08:10:18 +00:00
Eric Christopher	abfe3131e3	Couple less magic numbers. llvm-svn: 131457	2011-05-17 07:50:41 +00:00
Eric Christopher	eb47a2a1e5	Make this code a little less magic number laden. llvm-svn: 131456	2011-05-17 07:47:55 +00:00
Eric Christopher	2a9dbbbb12	Turn this into a table, this will make more sense shortly. Part of rdar://8470697 llvm-svn: 131200	2011-05-11 21:44:58 +00:00
Eric Christopher	4a34e61e53	Optimize atomic lock or that doesn't use the result value. Next up: xor and and. Part of rdar://8470697 llvm-svn: 131171	2011-05-10 23:57:45 +00:00
Benjamin Kramer	3db054650b	Silence an overzealous uninitialized variable warning from GCC. llvm-svn: 130053	2011-04-23 08:21:06 +00:00
Benjamin Kramer	4c81624735	X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) & C1. (Part of PR5039) This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is uint64_t foo(uint64_t x) { return (x&1) << 42; } which used to compile into bloated code: shlq $42, %rdi ## encoding: [0x48,0xc1,0xe7,0x2a] movabsq $4398046511104, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00] andq %rdi, %rax ## encoding: [0x48,0x21,0xf8] ret ## encoding: [0xc3] with this patch we can fold the immediate into the and: andq $1, %rdi ## encoding: [0x48,0x83,0xe7,0x01] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] shlq $42, %rax ## encoding: [0x48,0xc1,0xe0,0x2a] ret ## encoding: [0xc3] It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing that without making this code even more complicated. See the TODOs in the code. llvm-svn: 129990	2011-04-22 15:30:40 +00:00
Stuart Hastings	81c4306005	Swap VT and DebugLoc operands of getExtLoad() for consistency with other getNode() methods. Radar 9002173. llvm-svn: 125665	2011-02-16 16:23:55 +00:00
Chris Lattner	46c01a30f4	Enhance ComputeMaskedBits to know that aligned frameindexes have their low bits set to zero. This allows us to optimize out explicit stack alignment code like in stack-align.ll:test4 when it is redundant. Doing this causes the code generator to start turning FI+cst into FI\|cst all over the place, which is general goodness (that is the canonical form) except that various pieces of the code generator don't handle OR aggressively. Fix this by introducing a new SelectionDAG::isBaseWithConstantOffset predicate, and using it in places that are looking for ADD(X,CST). The ARM backend in particular was missing a lot of addressing mode folding opportunities around OR. llvm-svn: 125470	2011-02-13 22:25:43 +00:00
NAKAMURA Takumi	f3e20b9f0f	lib/Target/X86/X86ISelDAGToDAG.cpp: __main should be WINCALL64 on Win64. CALL64 marks %xmm* as dead. llvm-svn: 124354	2011-01-27 03:20:19 +00:00
Chris Lattner	35a2e65bcb	fix PR8514, a bug where the "heroic" transformation of shift/and into and/shift would cause nodes to move around and a dangling pointer to happen. The code tried to avoid this with a HandleSDNode, but got the details wrong. llvm-svn: 123578	2011-01-16 08:48:11 +00:00
Ted Kremenek	b5241b2b59	'HiReg' is written but never read. Nuke its declaration and its assignments. Found by clang static analyzer. llvm-svn: 123486	2011-01-14 22:34:13 +00:00
Bill Wendling	81d40711f3	PR8918 - When used with MinGW64, LLVM generates a "calll __main" at the beginning of the "main" function. The assembler complains about the invalid suffix for the 'call' instruction. The right instruction is "callq __main". Patch by KS Sreeram! llvm-svn: 122933	2011-01-06 00:47:10 +00:00
Chris Lattner	3e5fbd74ed	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Chris Lattner	364bb0a081	it turns out that when ".with.overflow" intrinsics were added to the X86 backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935	2010-12-05 07:30:36 +00:00
Dale Johannesen	e660f4d072	Use a MemIntrinsicSDNode for ISD::PREFETCH, which touches memory, so a MachineMemOperand is useful (not propagated into the MachineInstr yet). No functional change except for dump output. llvm-svn: 117413	2010-10-26 23:11:10 +00:00
Chris Lattner	1a1c600110	Use #NAME# to have the CMOV multiclass define things with the same names as before (e.g. CMOVBE16rr instead of CMOVBErr16). llvm-svn: 115705	2010-10-05 23:00:14 +00:00
Chris Lattner	0067ee02f9	switch CMOVBE to the multipattern: 21 insertions(+), 53 deletions(-) Moar change coming before I switch the rest. llvm-svn: 115697	2010-10-05 22:23:58 +00:00
Eric Christopher	c1b3e072f4	Temporarily work around new address lowering while I figure out what needs to happen for darwin. llvm-svn: 114577	2010-09-22 20:42:08 +00:00
Chris Lattner	8a236b63d8	reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress. llvm-svn: 114529	2010-09-22 04:39:11 +00:00
Chris Lattner	a5156c30ed	convert the last 4 X86ISD nodes that should have memoperands to have them. llvm-svn: 114523	2010-09-22 01:28:21 +00:00
Chris Lattner	ed85da5600	give X86ISD::FNSTCW16m a memoperand, since it touches memory. It only can access the stack due to how it is generated though. llvm-svn: 114522	2010-09-22 01:11:26 +00:00
Chris Lattner	78f518b79b	give FP_TO_INT16_IN_MEM and friends a memoperand. They are only used with stack slots, but hey, lets be safe. llvm-svn: 114521	2010-09-22 01:05:16 +00:00
Chris Lattner	54e5329545	give VZEXT_LOAD a memory operand, it now works with segment registers. llvm-svn: 114515	2010-09-22 00:34:38 +00:00
Chris Lattner	07827ba978	revert r114386 now that address modes work correctly, we get a nice call through gs-relative memory now. llvm-svn: 114510	2010-09-22 00:11:31 +00:00
Chris Lattner	e479e9643b	give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257 llvm-svn: 114508	2010-09-21 23:59:42 +00:00
Chris Lattner	d58d7c1907	reimplement support for GS and FS relative address space matching by having X86DAGToDAGISel::SelectAddr get passed in the parent node of the operand match (the load/store/atomic op) and having it get the address space from that, instead of having special FS/GS addr mode operations that require duplicating the entire instruction set to support. This makes FS and GS relative accesses far more predictable and work much better. It also simplifies the X86 backend a bit, more to come. There is still a pending issue with nodes like ISD::PREFETCH and X86ISD::FLD, which really should be MemSDNode's but aren't. llvm-svn: 114491	2010-09-21 22:07:31 +00:00
Chris Lattner	0e023ea02a	fix a long standing wart: all the ComplexPattern's were being passed the root of the match, even though only a few patterns actually needed this (one in X86, several in ARM [which should be refactored anyway], and some in CellSPU that I don't feel like detangling). Instead of requiring all ComplexPatterns to take the dead root, have targets opt into getting the root by putting SDNPWantRoot on the ComplexPattern. llvm-svn: 114471	2010-09-21 20:31:19 +00:00
Chris Lattner	c6d8839a2b	even though I'm about to rip it out, simplify the address mode stuff llvm-svn: 114468	2010-09-21 19:41:58 +00:00
Chris Lattner	3d178ed4d4	propagate MachinePointerInfo through various uses of the old SelectionDAG::getExtLoad overload, and eliminate it. llvm-svn: 114446	2010-09-21 17:04:51 +00:00
Chris Lattner	bb0a1c44bf	fix rdar://8453210, a crash handling a call through a GS relative load. For now, just disable folding the load into the call. llvm-svn: 114386	2010-09-21 03:37:00 +00:00
Chris Lattner	65b48b5dfc	zap dead code. llvm-svn: 113073	2010-09-04 18:12:00 +00:00
Jakob Stoklund Olesen	08aede2538	Don't call Predicate_* from X86 target. llvm-svn: 112921	2010-09-03 00:35:18 +00:00
Benjamin Kramer	f1f2133ac0	Remove dead recursive function. Yay for clang -Wunused-function. llvm-svn: 112060	2010-08-25 17:27:58 +00:00
Eli Friedman	39d0f57cab	PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268	2010-08-04 22:40:58 +00:00
Chris Lattner	f469307c77	Change LEA to have 5 operands for its memory operand, just like all other instructions, even though a segment is not allowed. This resolves a bunch of gross hacks in the encoder and makes LEA more consistent with the rest of the instruction set. No functionality change. llvm-svn: 107934	2010-07-08 23:46:44 +00:00
Evan Cheng	1c349f18f8	Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency sake. llvm-svn: 107820	2010-07-07 22:15:37 +00:00

1 2 3 4 5 ...

550 Commits