llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	4ca8ca916e	Use constants for all return values in switch. Allows clang to optimize it into a lookup table. llvm-svn: 164926	2012-10-01 07:33:27 +00:00
Craig Topper	4f1c8caf2f	Change getX86SubSuperRegister to take an MVT::SimpleValueType rather than an EVT and add llvm_unreachable to the switches. Helps it compile to dramatically better code. llvm-svn: 164919	2012-09-30 19:49:56 +00:00
Manman Ren	511c6d0369	X86: when replacing SUB with TEST in ISelDAGToDAG, only replace uses of the second output of SUB with first output of TEST. PR13966 llvm-svn: 164835	2012-09-28 18:53:24 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abbreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Bill Wendling	863bab689a	Remove the `hasFnAttr' method from Function. The hasFnAttr method has been replaced by querying the Attributes explicitly. No intended functionality change. llvm-svn: 164725	2012-09-26 21:48:26 +00:00
Jim Grosbach	c03a0c241e	X86_32: Large Symbol+Offset relocations. If the offset is more than 24-bits, it won't fit in a scattered relocation offset field, so we fall back to using a non-scattered relocation. rdar://12358909 llvm-svn: 164724	2012-09-26 21:27:45 +00:00
Michael Liao	2b425e1e24	Add SARX/SHRX/SHLX code generation support llvm-svn: 164675	2012-09-26 08:26:25 +00:00
Michael Liao	2de86af22d	Add RORX code generation support llvm-svn: 164674	2012-09-26 08:24:51 +00:00
Michael Liao	f9f7b5518a	Add MULX code generation support llvm-svn: 164673	2012-09-26 08:22:37 +00:00
Craig Topper	0a928fa32e	Remove hasNoAVX method. Can just invert hasAVX instead. llvm-svn: 164664	2012-09-26 06:29:37 +00:00
Michael Liao	425c0dbc81	Add 'lock' prefix output support in assembly printer - Instead of embedding 'lock' into each mnemonic of atomic instructions except 'xchg', we teach X86 assembly printer to output 'lock' prefix similar to or consistent with code emitter. llvm-svn: 164659	2012-09-26 05:13:44 +00:00
Michael Liao	de51caf2a0	Add missing i64 max/min/umax/umin on 32-bit target - Turn on atomic6432.ll and add specific test case as well llvm-svn: 164616	2012-09-25 18:08:13 +00:00
Bob Wilson	165f0a24c6	Consistently specify the assembly variant to MatchInstructionImpl. llvm-svn: 164611	2012-09-25 17:19:29 +00:00
Evan Cheng	446ff28df1	Fix an illegal tailcall opt where the callee returns a double via xmm while caller returns x86_fp80 via st0. rdar://12229511 llvm-svn: 164588	2012-09-25 05:32:34 +00:00
Jim Grosbach	361ca34270	Mark jump tables in code sections with DataRegion directives. Even out-of-line jump tables can be in the code section, so mark them as data-regions for those targets which support the directives. rdar://12362871&12362974 llvm-svn: 164571	2012-09-24 23:06:27 +00:00
Chad Rosier	c4734c8950	Rather than have a wrapper function, have tblgen instantiate the implementation. Also remove an unused argument. llvm-svn: 164567	2012-09-24 22:57:55 +00:00
Chad Rosier	3cb355d11f	Rather than have a wrapper function, have tblgen instantiate the implementation. llvm-svn: 164548	2012-09-24 19:32:29 +00:00
Michael Liao	2718b20030	Fix 16-bit atomic inst encoding and keep pseudo-inst starting with '#' llvm-svn: 164453	2012-09-22 05:41:15 +00:00
Michael Liao	2456b3ae8c	Fix typo in r164357 llvm-svn: 164452	2012-09-22 03:39:42 +00:00
Chad Rosier	17ede627f0	[ms-inline asm] Expose the mnemonicIsValid() function in the AsmParser. llvm-svn: 164420	2012-09-21 22:21:26 +00:00
Chad Rosier	3d325cf3f1	Add comment. llvm-svn: 164415	2012-09-21 21:08:46 +00:00
Michael Liao	7325a9d08e	Fix a typo in r164357 llvm-svn: 164372	2012-09-21 16:03:03 +00:00
Michael Liao	a880186030	Add missing i8 max/min/umax/umin support - Fix PR5145 and turn on test 8-bit atomic ops llvm-svn: 164358	2012-09-21 03:18:52 +00:00
Michael Liao	c33bebff52	Revise td of X86 atomic instructions - Rewrite most atomic instructions in templates for both better maintenance and future extensions, such as HLE in TSX. llvm-svn: 164357	2012-09-21 03:00:17 +00:00
Michael Liao	3237662b65	Re-work X86 code generation of atomic ops with spin-loop - Rewrite/merge pseudo-atomic instruction emitters to address the following issue: * Reduce one unnecessary load in spin-loop previously the spin-loop looks like thisMBB: newMBB: ld t1 = [bitinstr.addr] op t2 = t1, [bitinstr.val] not t3 = t2 (if Invert) mov EAX = t1 lcs dest = [bitinstr.addr], t3 [EAX is implicit] bz newMBB fallthrough -->nextMBB the 'ld' at the beginning of newMBB should be lifted out of the loop as lcs (or CMPXCHG on x86) will load the current memory value into EAX. This loop is refined as: thisMBB: EAX = LOAD [MI.addr] mainMBB: t1 = OP [MI.val], EAX LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined] JNE mainMBB sinkMBB: * Remove immopc as, so far, all pseudo-atomic instructions have all-register form only, there is no immediate operand. * Remove unnecessary attributes/modifiers in pseudo-atomic instruction td * Fix issues in PR13458 - Add comprehensive tests on atomic ops on various data types. NOTE: Some of them are turned off due to missing functionality. - Revise tests due to the new spin-loop generated. llvm-svn: 164281	2012-09-20 03:06:15 +00:00
Michael Liao	8372539543	Unify the logic in SelectAtomicLoadAdd and SelectAtomicLoadArith - Merge the processing of LOAD_ADD with other atomic load-arith operations - Separate the logic getting target constant for atomic-load-op and add an optimization for atomic-load-add on i16 with negative value - Optimize a minor case for atomic-fetch-add i16 with negative operand. Test case is revised. llvm-svn: 164243	2012-09-19 19:36:58 +00:00
Craig Topper	3f23c1a8b9	Remove code for setting the VEX L-bit as a function of operand size from the code emitters and the disassembler table builder. Fix a couple instructions that were still missing VEX_L. llvm-svn: 164204	2012-09-19 06:37:45 +00:00
Craig Topper	a73be890a1	Add explicit VEX_L tags to all 256-bit instructions. This will allow us to remove code from the code emitters that examined operands to set the L-bit. llvm-svn: 164202	2012-09-19 06:06:34 +00:00
Roman Divacky	5dd4ccb402	When creating MCAsmBackend pass the CPU string as well. In X86AsmBackend store this and use it to not emit long nops when the CPU is geode which doesn't support them. Fixes PR11212. llvm-svn: 164132	2012-09-18 16:08:49 +00:00
Jan Wen Voung	4ce1d7b4f1	Add some cases to x86 OptimizeCompare to handle DEC and INC, too. While we are setting the earlier def to true, also make it live. llvm-svn: 164056	2012-09-17 22:04:23 +00:00
Benjamin Kramer	0d874f775a	LLVM_ATTRIBUTE_USED forces emission of a function. To silence unused function warnings use LLVM_ATTRIBUTE_UNUSED. llvm-svn: 164036	2012-09-17 16:46:22 +00:00
Nadav Rotem	37521aa89c	The PMOVZXWD family of functions extends narrow vector types to wide vector types. It had patterns for zext-loading and extending. This commit adds patterns for loading a wide type, performing a bitcast, and extending. This is an odd pattern, but it is commonly used when writing code with intrinsics. rdar://11897677 llvm-svn: 163995	2012-09-16 07:39:07 +00:00
Craig Topper	a60c0f1163	Use LLVM_DELETED_FUNCTION in place of 'DO NOT IMPLEMENT' comments. llvm-svn: 163974	2012-09-15 17:09:36 +00:00
Benjamin Kramer	ece434252c	X86: Emitting x87 fsin/fcos for sinf/cosf is not safe without unsafe fp math. This was only an issue if sse is disabled. llvm-svn: 163967	2012-09-15 12:44:27 +00:00
Dmitri Gribenko	5485acd440	Fix Doxygen issues: * wrap code blocks in \code ... \endcode; * refer to parameter names in paragraphs correctly (\arg is not what most people want -- it starts a new paragraph); * use \param instead of \arg to document parameters in order to be consistent with the rest of the codebase. llvm-svn: 163902	2012-09-14 14:57:36 +00:00
Michael Liao	8b48bf27b0	Fix comment llvm-svn: 163835	2012-09-13 20:30:16 +00:00
Michael Liao	137f8aedea	Add wider vector/integer support for PR12312 - Enhance the fix to PR12312 to support wider integer, such as 256-bit integer. If more than 1 fully evaluated vectors are found, POR them first followed by the final PTEST. llvm-svn: 163832	2012-09-13 20:24:54 +00:00
Jakob Stoklund Olesen	3cf3ffce24	Fix the TCRETURNmi64 bug differently. Add a PatFrag to match X86tcret using 6 fixed registers or less. This avoids folding loads into TCRETURNmi64 using 7 or more volatile registers. <rdar://problem/12282281> llvm-svn: 163819	2012-09-13 18:31:27 +00:00
Jakob Stoklund Olesen	78b9f8fc67	Revert r163761 "Don't fold indexed loads into TCRETURNmi64." The patch caused "Wrong topological sorting" assertions. llvm-svn: 163810	2012-09-13 16:52:17 +00:00
Craig Topper	963305b450	Add a new compression type to ModRM table that detects when the memory modRM byte represents 8 instructions and the reg modRM byte represents up to 64 instructions. Reduces modRM table from 43k entries to 25k entries. Based on a patch from Manman Ren. llvm-svn: 163774	2012-09-13 05:45:42 +00:00
Jakob Stoklund Olesen	bfacef45eb	Don't fold indexed loads into TCRETURNmi64. We don't have enough GR64_TC registers when calling a varargs function with 6 arguments. Since %al holds the number of vector registers used, only %r11 is available as a scratch register. This means that addressing modes using both base and index registers can't be folded into TCRETURNmi64. <rdar://problem/12282281> llvm-svn: 163761	2012-09-13 00:25:00 +00:00
Michael Liao	abb87d4857	Fix PR11985 - BlockAddress has no support of BA + offset form and there is no way to propagate that offset into machine operand; - Add BA + offset support and a new interface 'getTargetBlockAddress' to simplify target block address forming; - All targets are modified to use new interface and X86 backend is enhanced to support BA + offset addressing. llvm-svn: 163743	2012-09-12 21:43:09 +00:00
Chad Rosier	ab53b4f6d0	[ms-inline asm] Make the operand size directives case insensitive. llvm-svn: 163729	2012-09-12 18:24:26 +00:00
Roman Divacky	fd69009419	Add support for AMD Geode. llvm-svn: 163710	2012-09-12 14:36:02 +00:00
Craig Topper	ad495964f1	Indentation fixes. No functional change. llvm-svn: 163682	2012-09-12 06:20:41 +00:00
Manman Ren	19f49ac624	Release build: guard dump functions with "#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)" No functional change. Update r163339. llvm-svn: 163653	2012-09-11 22:23:19 +00:00
Chad Rosier	b6b8e966d6	StringSwitchify. llvm-svn: 163649	2012-09-11 21:10:25 +00:00
Chad Rosier	30888b176a	Simplify logic. No functional change intended. llvm-svn: 163648	2012-09-11 20:57:04 +00:00
Craig Topper	a29ed865d0	Make a bunch of lowering helper functions static instead of member functions. No functional change. llvm-svn: 163596	2012-09-11 06:15:32 +00:00
Craig Topper	8702c5b7c0	Change unsigned to a uint16_t in static disassembler tables to reduce the table size. llvm-svn: 163594	2012-09-11 04:19:21 +00:00
Chad Rosier	38e05a9eb2	Update function names to conform to guidelines. No functional change intended. llvm-svn: 163561	2012-09-10 22:50:57 +00:00
Chad Rosier	41ff85d754	Revert r163556. Missed updates to tablegen files. llvm-svn: 163557	2012-09-10 22:30:35 +00:00
Chad Rosier	2089c49db7	Update function names to conform to guidelines. No functional change intended. llvm-svn: 163556	2012-09-10 22:23:45 +00:00
Dmitri Gribenko	ca1e27be0d	Remove redundant semicolons which are null statements. llvm-svn: 163547	2012-09-10 21:26:47 +00:00
Chad Rosier	db20a41d99	[ms-inline asm] Pass the correct AsmVariant to the PrintAsmOperand() function and update the printOperand() function accordingly. llvm-svn: 163544	2012-09-10 21:10:49 +00:00
Chad Rosier	6f8d8b2406	[ms-inline asm] Add support for .att_syntax directive. llvm-svn: 163542	2012-09-10 20:54:39 +00:00
Michael Liao	400f7ef871	Enhance PR11334 fix to support extload from v2f32/v4f32 - Fix a remaining issue of PR11674 as well llvm-svn: 163528	2012-09-10 18:33:51 +00:00
Michael Liao	c3d5b21c39	Add boolean simplification support from CMOV - If a boolean value is generated from CMOV and tested as boolean value, simplify the use of test result by referencing the original condition. RDRAND intrinsic is one of such cases. llvm-svn: 163516	2012-09-10 16:36:16 +00:00
Elena Demikhovsky	264fb0217e	The VPSHUFB 256-bit instruction may be generated when one of input vector is undefined or zeroinitializer. I've added the "zeroinitializer" case in this patch. llvm-svn: 163506	2012-09-10 12:13:11 +00:00
Nick Lewycky	74bf42c9a1	Add missing space before {. No functionality change. llvm-svn: 163484	2012-09-09 23:40:55 +00:00
Craig Topper	4ed79bd7d7	Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled. llvm-svn: 163473	2012-09-08 17:42:27 +00:00
Craig Topper	0955a9f4e1	Use 256-bit alignment for constant pool value for 256-bit vector FNEG lowering. llvm-svn: 163463	2012-09-08 07:46:05 +00:00
Craig Topper	98f2e861a0	Add support for lowering FABS of vector types. llvm-svn: 163461	2012-09-08 07:31:51 +00:00
Craig Topper	3e41a5bb31	Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. llvm-svn: 163458	2012-09-08 04:58:43 +00:00
Benjamin Kramer	e3d658bb6c	PR13754: llvm-mc/x86 crashes on .cfi directives without the % prefix for registers. gas accepts this and it seems to be common enough to be worth supporting. This doesn't affect the parsing of reg operands outside of .cfi directives. llvm-svn: 163390	2012-09-07 14:51:35 +00:00
Manman Ren	742534c4dc	Release build: guard dump functions with "ifndef NDEBUG" No functional change. llvm-svn: 163339	2012-09-06 19:06:06 +00:00
Elena Demikhovsky	42777877c2	AVX2 optimization. Added generation of VPSHUFB instruction for <32 x i8> vector shuffle when possible. llvm-svn: 163312	2012-09-06 12:42:01 +00:00
Michael Liao	2d95a2b5c4	Remove duplicated helper function llvm-svn: 163295	2012-09-06 07:11:22 +00:00
Craig Topper	f3e4aa8cdd	Use iPTR instead of i32 for extract_subvector/insert_subvector index in lowering and patterns. This makes it consistent with the incoming DAG nodes from the DAG builder. llvm-svn: 163293	2012-09-06 06:09:01 +00:00
Craig Topper	daa5ed1e0a	Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr. llvm-svn: 163292	2012-09-06 05:15:01 +00:00
Roman Divacky	ad06cee239	Stop casting away const qualifier needlessly. llvm-svn: 163258	2012-09-05 22:26:57 +00:00
Roman Divacky	6792380e7b	Use const properly so that we don't remove const qualifier from region and MII by casting. Found with gcc48. llvm-svn: 163247	2012-09-05 21:17:34 +00:00
Craig Topper	81f06df699	Remove some of the patterns added in r163196. Increasing the complexity on insert_subvector into undef accomplishes the same thing. llvm-svn: 163198	2012-09-05 07:26:35 +00:00
Craig Topper	f7c87d6eea	Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads. Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS. llvm-svn: 163196	2012-09-05 06:58:39 +00:00
Craig Topper	2db2353b21	Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores. llvm-svn: 163192	2012-09-05 05:48:09 +00:00
Chad Rosier	a05ea0f3e3	Fix function name per coding standard. llvm-svn: 163187	2012-09-05 01:15:43 +00:00
Preston Gurd	cdf540d5d6	Generic Bypass Slow Div - CodeGenPrepare pass for identifying div/rem ops - Backend specifies the type mapping using addBypassSlowDivType - Enabled only for Intel Atom with O2 32-bit -> 8-bit - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256. - In the case when the quotient and remainder of a divide are used a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands. - Test cases check for the presence of the optimization when calculating either the quotient, remainder, or both. Patch by Tyler Nowicki! llvm-svn: 163150	2012-09-04 18:22:17 +00:00
Elena Demikhovsky	cbe99bbb36	This patch optimizes shuffle instruction - generates 2 instructions instead of 4. Since this specific shuffle is widely used in many workloads we have ~10% performance on them. shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14> vmovaps (%rdx), %ymm0 vshufps $8, %ymm0, %ymm0, %ymm0 vmovaps (%rcx), %ymm1 vshufps $8, %ymm0, %ymm1, %ymm1 vunpcklps %ymm0, %ymm1, %ymm0 vmovaps (%rcx), %ymm0 vmovsldup (%rdx), %ymm1 vblendps $85, %ymm0, %ymm1, %ymm0 llvm-svn: 163134	2012-09-04 12:49:02 +00:00
Chad Rosier	9e2aff8b6d	[ms-inline asm] Asm operands can map to one or more MCOperands. Therefore, add the NumMCOperands argument to the GetMCInstOperandNum() function that is set to the number of MCOperands this asm operand mapped to. llvm-svn: 163124	2012-09-03 20:31:23 +00:00
Chad Rosier	391d299737	[ms-inline asm] Add an interface to the GetMCInstOperandNum() function in the MCTargetAsmParser class. llvm-svn: 163122	2012-09-03 18:47:45 +00:00
Chad Rosier	a353dba17d	Removed unused argument. llvm-svn: 163104	2012-09-03 03:16:09 +00:00
Chris Lattner	ba3ba8fa1f	some peepholes that should match horizontal add/sub operations. llvm-svn: 163103	2012-09-03 02:58:21 +00:00
Chad Rosier	e38bb6a34e	[ms-inline asm] Expose the Kind and Opcode variables from the MatchInstructionImpl() function. These values are used by the ConvertToMCInst() function to index into the ConversionTable. The values are also needed to call the GetMCInstOperandNum() function. llvm-svn: 163101	2012-09-03 02:06:46 +00:00
Craig Topper	d6cc4062be	Typos llvm-svn: 163053	2012-09-01 06:33:50 +00:00
Manman Ren	26c5d0f607	SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure its output chain is correctly setup. As an example, if the original load must happen before later stores, we need to make sure the constructed VZEXT_LOAD is constrained to be before the stores. rdar://11457792 llvm-svn: 163036	2012-08-31 23:16:57 +00:00
Craig Topper	908e685102	Mark FMA4 instructions as commutable and add them to the folding tables. llvm-svn: 163035	2012-08-31 23:10:34 +00:00
Craig Topper	7573c8f081	Add selection of RegOp2MemOpTable3 to canFoldMemoryOperand llvm-svn: 163029	2012-08-31 22:12:16 +00:00
Michael Liao	3224543bf9	Fix PR12359 - In addition to undefined, if V2 is zero vector, skip 2nd PSHUFB and POR as well as PSHUFB will zero elements with negative indices. Patch by Sriram Murali <sriram.murali@intel.com> llvm-svn: 163018	2012-08-31 20:12:31 +00:00
Chad Rosier	a8f3c4fe35	The ConvertToMCInst() function can't fail, so remove the now dead Match_ConversionFail enum. llvm-svn: 163002	2012-08-31 16:41:07 +00:00
Craig Topper	c0387f6b23	Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted. llvm-svn: 163001	2012-08-31 16:31:13 +00:00
Craig Topper	c30fdbc46c	Add support for converting llvm.fma to fma4 instructions. llvm-svn: 162999	2012-08-31 15:40:30 +00:00
Michael Liao	969f3913dd	Clean up AddedComplexity further after adding UseSSEx llvm-svn: 162973	2012-08-31 03:01:35 +00:00
Jim Grosbach	e423e865fe	X86: Fix encoding of 'movd %xmm0, %rax' The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v' prefix, resulting in mis-assembly of the vanilla movd instruction. llvm-svn: 162963	2012-08-31 00:30:30 +00:00
Michael Liao	bbd10792c2	Introduce 'UseSSEx' to force SSE legacy encoding - Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is enabled. As the penalty of inter-mixing SSE and AVX instructions, we need prevent SSE legacy insn from being generated except explicitly specified through some intrinsics. For patterns supported by both SSE and AVX, so far, we force AVX insn will be tried first relying on AddedComplexity or position in td file. It's error-prone and introduces bugs accidentally. 'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited by AVX, we need this predicate to force VEX encoding or SSE legacy encoding only. For insns not inherited by AVX, we still use the previous predicates, i.e. 'HasSSEx'. So far, these insns fall into the following categories: * SSE insns with MMX operands * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH, CRC, and etc.) * SSE4A insns. * MMX insns. * x87 insns added by SSE. 2 test cases are modified: - test/CodeGen/X86/fast-isel-x86-64.ll AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be selected by fast-isel due to complicated pattern and fast-isel fallback to materialize it from constant pool. - test/CodeGen/X86/widen_load-1.ll AVX code generation is different from SSE one after fixing SSE/AVX inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of 'vmovaps'. llvm-svn: 162919	2012-08-30 16:54:46 +00:00
Craig Topper	e39ad7b549	Only perform DAG combine on FMAs of legal types. llvm-svn: 162892	2012-08-30 06:56:15 +00:00
Michael Liao	3c8980646b	Fix PR13727 - The root cause is that target constant materialization in X86 fast-isel creates a PC-rel addressing which may overflow 32-bit range in non-Small code model if .rodata section is allocated too far away from code segment in MCJIT, which uses Large code model so far. - Follow the similar logic to fix non-Small code model in fast-isel by skipping non-Small code model. llvm-svn: 162881	2012-08-30 00:30:16 +00:00
Benjamin Kramer	8f5c5ded4e	Make helper function static. llvm-svn: 162843	2012-08-29 16:17:01 +00:00
Craig Topper	a999c66292	Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3. llvm-svn: 162829	2012-08-29 07:18:25 +00:00
Chad Rosier	3b1336ceb9	Typo. llvm-svn: 162807	2012-08-28 23:57:47 +00:00
Michael Liao	407d659fa5	Add comments on the literal value used. llvm-svn: 162805	2012-08-28 23:42:17 +00:00
Michael Liao	710e1a594b	Explicitly update the number of nodes to be traversed llvm-svn: 162780	2012-08-28 19:20:29 +00:00
Bill Wendling	cc56718038	The commutative flag is already correctly set within the multiclass. If we set it here, then a 'register-memory' version would wrongly get the commutative flag. <rdar://problem/12180135> llvm-svn: 162741	2012-08-28 07:36:46 +00:00
Craig Topper	72f51c3986	Convert V_SETALLONES/AVX_SETALLONES/AVX2_SETALLONES to Post-RA pseudos. llvm-svn: 162740	2012-08-28 07:30:47 +00:00
Craig Topper	bd509eea4a	Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo. llvm-svn: 162738	2012-08-28 07:05:28 +00:00
Michael Liao	b7d85b6328	Fix PR12312 - Add a target-specific DAG optimization to recognize a pattern PTEST-able. Such a pattern is a OR'd tree with X86ISD::OR as the root node. When X86ISD::OR node has only its flag result being used as a boolean value and all its leaves are extracted from the same vector, it could be folded into an X86ISD::PTEST node. llvm-svn: 162735	2012-08-28 03:34:40 +00:00
Jakob Stoklund Olesen	89d6b29d16	More missing mayLoad flags on AVX multiclasses. llvm-svn: 162714	2012-08-28 00:02:01 +00:00
Craig Topper	a737ef8964	Remove MMX shift intrinsic handling code that also exists in SelectionDAGBuilder. llvm-svn: 162661	2012-08-27 08:08:30 +00:00
Craig Topper	5af2fed5f2	Don't allow vextractf128 to be folded with unaligned stores. We don't fold unaligned loads so shouldn't fold unaligned stores as it can cause an alignment fault to occur. llvm-svn: 162658	2012-08-27 07:19:59 +00:00
Craig Topper	6d44554cd4	Fold some patterns into instruction definitions so tablegen can infer flags removing the need for an explicit 'neverHasSideEffects = 1' llvm-svn: 162656	2012-08-27 07:04:50 +00:00
Craig Topper	f7828f91ee	Add HasAVX1Only predicate and use it for patterns that have an AVX1 instruction and an AVX2 instruction rather than relying on AddedComplexity. llvm-svn: 162654	2012-08-27 06:08:57 +00:00
Richard Smith	228e6d4cf3	Fix integer undefined behavior due to signed left shift overflow in LLVM. Reviewed offline by chandlerc. llvm-svn: 162623	2012-08-24 23:29:28 +00:00
Jakob Stoklund Olesen	3d91b43ad2	Add missing mayLoad flags to a large class of AVX *_Int instructions. llvm-svn: 162622	2012-08-24 23:29:07 +00:00
Jakob Stoklund Olesen	b50cf8b30f	Mark X86::RET and RETI instructions as variadic. There is special magic happening when returning floating point values on the x87 stack. The RET instructions get extra f80 operands. llvm-svn: 162592	2012-08-24 20:52:44 +00:00
Jakob Stoklund Olesen	8ff666fcb6	Remove more mayLoad workarounds. llvm-svn: 162556	2012-08-24 14:43:22 +00:00
Craig Topper	663d160adb	Custom lower FMA intrinsics to target specific nodes and remove the patterns. llvm-svn: 162534	2012-08-24 04:03:22 +00:00
Jakob Stoklund Olesen	d3511235d1	Remove some spurious mayLoad = 0 flags. They were inserted to silence TableGen's warning about redundant properties. That warning is now gone. llvm-svn: 162517	2012-08-24 00:31:20 +00:00
Jakob Stoklund Olesen	df1faa0503	X86MemBarrier has unmodeled side effects. llvm-svn: 162514	2012-08-24 00:31:10 +00:00
Jakob Stoklund Olesen	7030427623	Preserve operand flags in convertToThreeAddress() by copying operands. No test case, this is a generalization of r160260. llvm-svn: 162485	2012-08-23 22:36:31 +00:00
Craig Topper	4a4634d6de	Favor FMA3 over FMA4 if both are enabled. llvm-svn: 162454	2012-08-23 18:14:30 +00:00
Craig Topper	f911597494	Use a switch statement instead of a bunch of if-else checks and pull out the common function call. llvm-svn: 162428	2012-08-23 04:57:36 +00:00
Chad Rosier	cf172e5e28	[ms-inline asm] Avoid a false positive assertion Assertion failed: (Start.isValid() == End.isValid() && "Start and end should either both be valid or both be invalid!") when parsing inline asm. SMLoc assumes that the first char * in the source is invalid. However, when parsing an inline asm the mnemonic is at this location. I don't want to change SMLoc, so use a trivial workaround. llvm-svn: 162381	2012-08-22 19:14:29 +00:00
Craig Topper	a538d831e6	Add a getName function to MachineFunction. Use it in places that previously did getFunction()->getName(). Remove includes of Function.h that are no longer needed. llvm-svn: 162347	2012-08-22 06:07:19 +00:00
Craig Topper	056dfcccb7	Don't cache the MBB in the class. It's only used by one function. Change a for loop over operands to use unsigned instead of int. llvm-svn: 162344	2012-08-22 05:59:59 +00:00
Craig Topper	455bcafa3b	Mark a function as static since it doesn't use anything in the class. llvm-svn: 162342	2012-08-22 05:36:44 +00:00
Richard Smith	13473857a7	Fix unaligned memory accesses when performing relocations in X86 JIT. There's no cost to using memcpy here: the fixed code is optimized by LLVM to perfect machine code. llvm-svn: 162311	2012-08-21 20:48:36 +00:00
Chad Rosier	3d4bc62a5c	[ms-inline asm] Do not report a Parser error when matching inline assembly. llvm-svn: 162306	2012-08-21 19:36:59 +00:00
Chad Rosier	79e766c38e	[ms-inline asm] Expose the ErrorInfo from the MatchInstructionImpl. In general, this is the index of the operand that failed to match. Note: This may cause a buildbot failure due to an API mismatch in clang. Should recover with my next commit to clang. llvm-svn: 162295	2012-08-21 18:14:59 +00:00
Craig Topper	bab0c76674	Fix up indentation and remove a couple else's after returns. llvm-svn: 162270	2012-08-21 08:29:51 +00:00
Craig Topper	bfcfdeb563	Use uint16_t for tables of opcodes. llvm-svn: 162267	2012-08-21 08:23:21 +00:00
Craig Topper	a0cabf19f8	Fix up indentation. No functional change. llvm-svn: 162264	2012-08-21 08:17:07 +00:00
Craig Topper	4bc3e5a1bf	Add a couple llvm_unreachables. Add a message to several others. llvm-svn: 162263	2012-08-21 08:16:16 +00:00
Craig Topper	653e759046	Replace a break with llvm_unreachable in the default case of a nested switch. Condense code a bit. No functional change. llvm-svn: 162261	2012-08-21 07:32:16 +00:00
Craig Topper	384fae2f0d	Cleanup the scalar FMA3 definitions. Add patterns to fold loads with scalar forms. llvm-svn: 162260	2012-08-21 07:11:11 +00:00
Craig Topper	4f3879dfa7	Merge FMA3 instructions with and without patterns into single classes using null_frag. llvm-svn: 162257	2012-08-21 05:56:45 +00:00
Michael Liao	10ff96ce8c	fix a case where all operands of BUILD_VECTOR are undefined llvm-svn: 162214	2012-08-20 17:59:18 +00:00
Craig Topper	b58eec4eaf	Remove FMA3 intrinsic instructions in favor of patterns. llvm-svn: 162194	2012-08-20 06:21:25 +00:00
Craig Topper	37eca54912	Use correct intrinsic for 256-bit VFMSUBADDPS. llvm-svn: 162193	2012-08-20 06:03:04 +00:00
Craig Topper	5122e9f194	Remove trailing white space and tab characters. No functional change. llvm-svn: 162192	2012-08-19 23:37:46 +00:00
Nadav Rotem	178250ad87	When unsafe math is used, we can use commutative FMAX and FMIN. In some cases this allows for better code generation. Added a new DAGCombine transformation to convert FMAX and FMIN to FMAXC and FMINC, which are commutative. For example: movaps %xmm0, %xmm1 movsd LC(%rip), %xmm0 minsd %xmm1, %xmm0 becomes: minsd LC(%rip), %xmm0 llvm-svn: 162187	2012-08-19 13:06:16 +00:00
Nadav Rotem	a136939fa9	Reapply r162160 with a fix: Optimize Arith->Trunc->SETCC sequence to allow better compare/branch code. llvm-svn: 162172	2012-08-18 17:53:03 +00:00
Craig Topper	0128f9bad7	Refactor code a bit to reduce number of calls in the final compiled code. No functional change intended. llvm-svn: 162166	2012-08-18 06:39:34 +00:00
Nadav Rotem	c324af609e	Revert r162160 because it made a few buildbots fail. llvm-svn: 162164	2012-08-18 05:02:36 +00:00
Nadav Rotem	2cb14a5c4b	The X86 backend has a number of optimizations for SETCC nodes which use arithmetic instructions. However, when small data types are used, a truncate node appears between the SETCC node and the arithmetic operation. This patch adds support for this pattern. Before: xorl %esi, %edi testb %dil, %dil setne %al ret After: xorb %dil, %sil setne %al ret rdar://12081007 llvm-svn: 162160	2012-08-18 02:43:28 +00:00
Craig Topper	31625574db	Use nested switch to select arguments to reduce calls to EmitPCMP. llvm-svn: 162089	2012-08-17 07:15:56 +00:00
Craig Topper	602e1abe0d	Make ReplaceATOMIC_BINARY_64 a static function. Use a nested switch to reduce to only a single call to it thus allowing it to be inlined by the compiler. llvm-svn: 162088	2012-08-17 06:55:11 +00:00
Anitha Boyapati	af3e98347f	Patch to enable FMA on bdver2 target. Make XOP feature enable FMA4 as well. llvm-svn: 162012	2012-08-16 04:04:02 +00:00
Anitha Boyapati	426feb61b9	(no commit message) llvm-svn: 162010	2012-08-16 03:50:04 +00:00
Michael Liao	06f6fe875a	minor fix of X86ISD::VSEXT_MOVL dump llvm-svn: 161902	2012-08-14 22:53:17 +00:00
Michael Liao	34107b9177	fix PR11334 - FP_EXTEND only support extending from vectors with matching elements. This results in the scalarization of extending to v2f64 from v2f32, which will be legalized to v4f32 not matching with v2f64. - add X86-specific VFPEXT supporting extending from v4f32 to v2f64. - add BUILD_VECTOR lowering helper to recover back the original extending from v4f32 to v2f64. - test case is enhanced to include different vector width. llvm-svn: 161894	2012-08-14 21:24:47 +00:00
Craig Topper	925a281b00	Factor duplicate calls to getUNDEF in several functions. llvm-svn: 161860	2012-08-14 08:18:43 +00:00
Craig Topper	d0d4b11f66	Re-factor intrinsic lowering to combine common parts of similar intrinsics. Reduces compiled code size a little bit. llvm-svn: 161859	2012-08-14 07:43:25 +00:00
Manman Ren	959acb106b	X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed to a memory operand. PR13576 llvm-svn: 161769	2012-08-13 18:29:41 +00:00
Manman Ren	e90e94f117	X86: when auto-detecting the subtarget features, make sure use IsIntel to detect Nehalem, Westmere and Sandy Bridge. AMD also has processor family 6. llvm-svn: 161763	2012-08-13 17:26:46 +00:00
Craig Topper	4e5eb72735	Tidy up VSETCC lowering code a bit more by adding an llvm_unreachable and putting a couple of if conditions in a better order. llvm-svn: 161746	2012-08-13 03:42:38 +00:00
Craig Topper	5145a0d967	Refactor code a bit to share commonalities. No functional change intended. llvm-svn: 161745	2012-08-13 02:34:03 +00:00
Craig Topper	ff6e4d1928	Fix an unused variable warning from r161742. llvm-svn: 161743	2012-08-13 01:26:45 +00:00
Craig Topper	a7aaa62d54	Remove the LowerMMXCONCAT_VECTORS function. It could never execute because there are no legal 64-bit vector types that could be used as inputs to a 128-bit concat_vectors. Remove a target specific SDNode and its patterns that become unused as a result. llvm-svn: 161742	2012-08-13 01:23:55 +00:00
Craig Topper	3d2b271362	Remove call to setOperationAction for SETCC of v4f32. SETCC returns an integer type not an FP type. llvm-svn: 161738	2012-08-12 05:31:32 +00:00
Craig Topper	498228d089	Remove unnecessary call to setOperationAction for SETCC of v2i64 under SSE42. It was already called for the same under SSE2. llvm-svn: 161737	2012-08-12 05:15:16 +00:00
Craig Topper	10a8bf3b8c	Replace many calls to getSizeInBits() with is128BitVector/is256BitVector llvm-svn: 161734	2012-08-12 02:23:29 +00:00
Craig Topper	03d2787275	Use MVT.isXBitVector instead of EVT.isXBitVector when setting up operation actions. Compiles to smaller code. llvm-svn: 161733	2012-08-12 00:34:56 +00:00
Michael Liao	e7e828fd64	fix PR13577, an issue introduced by r161687 - FCMOV only supports a subset of X86 conditions. Skip boolean simplification if X86 condition is not valid for FCMOV. - add a minimal test case for PR13577. llvm-svn: 161732	2012-08-11 23:47:06 +00:00
Craig Topper	b5bcf58ba1	Move setOperationAction for CONCAT_VECTORS for 256-bit vectors into loop since all 256-bit types are supported. llvm-svn: 161730	2012-08-11 22:34:26 +00:00
Craig Topper	490c45c06c	Tidy up indentation. No functional change. llvm-svn: 161727	2012-08-11 17:53:00 +00:00
Craig Topper	55406d9f78	Fix a cast that was casting away 'const' unnecessarily llvm-svn: 161726	2012-08-11 17:46:16 +00:00
Craig Topper	22cb0c572b	Add a couple default: llvm_unreachable() to some switch statements. Fix a bad message in an existing llvm_unreachable. llvm-svn: 161725	2012-08-11 17:44:14 +00:00
Manman Ren	1acb6707cd	X86: when we are auto-detecting the subtarget features, make sure we turn on FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge. FeatureFastUAMem is already on if we pass in nehalem or westmere as a command argument. rdar: 7252306 llvm-svn: 161717	2012-08-10 23:43:32 +00:00
Michael Liao	5248e9913f	add X86-specific DAG optimization to simplify boolean test - if a boolean test (X86ISD::CMP or X86ISD::SUB) checks a boolean value generated from X86ISD::SETCC, try to simplify the boolean value generation and checking by reusing the original EFLAGS with proper condition code - add hooks to X86 specific SETCC/BRCOND/CMOV, the major 3 places consuming EFLAGS part of patches fixing PR12312 llvm-svn: 161687	2012-08-10 19:58:13 +00:00
Michael Liao	ea7d906b0f	remove trailing whitespace and test commit llvm-svn: 161664	2012-08-10 14:39:24 +00:00
Joerg Sonnenberger	aa2f801ca3	Add some missing includes for the build against stdcxx. llvm-svn: 161657	2012-08-10 10:53:56 +00:00
Chad Rosier	9cb988f3aa	[ms-inline asm] Extend the MC AsmParser API to match MCInsts (but not emit). This new API will be used by clang to parse ms-style inline asms. One goal of this project is to use this style of inline asm for targets other than x86. Therefore, this API needs to be implemented for non-x86 targets at some point in the future. llvm-svn: 161624	2012-08-09 22:04:55 +00:00
Manman Ren	1be131ba27	X86: enable CSE between CMP and SUB We perform the following: 1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering. 2> Modify MachineCSE to correctly handle implicit defs. 3> Convert SUB back to CMP if possible at peephole. Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled by peephole now. rdar://11873276 llvm-svn: 161462	2012-08-08 00:51:41 +00:00
Jakob Stoklund Olesen	3b9a442841	Don't scan physreg use-def chains looking for a PIC base. We can't rematerialize a PIC base after register allocation anyway, and scanning physreg use-def chains is very expensive in a function with many calls. <rdar://problem/12047515> llvm-svn: 161461	2012-08-08 00:40:47 +00:00
Evan Cheng	fbdd25c135	X86 cmp lowering is looking past truncate on the condition node. It should only do so when the high bits are known zero. This caused a subtle miscompilation. rdar://12027825 llvm-svn: 161451	2012-08-07 22:21:00 +00:00
Andrew Trick	e0c83b1f3b	Allow x86 subtargets to use the GenericModel defined in X86Schedule.td. This allows codegen passes to query properties like InstrItins->SchedModel->IssueWidth. It also ensures that computeOperandLatency returns the X86 defaults for loads and "high latency ops". This should have no significant impact on existing schedulers because X86 defaults happen to be the same as global defaults. llvm-svn: 161370	2012-08-07 00:25:30 +00:00
Eric Christopher	22738d00a3	Add support for the OpenBSD for Bitrig. Patch by David Hill. llvm-svn: 161344	2012-08-06 20:52:18 +00:00
Craig Topper	ab47fe4e16	Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in ISelDAGToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318	2012-08-06 06:22:36 +00:00
Craig Topper	6d0408d3a5	Remove custom inserter for MWAIT. It doesn't do anything that couldn't be represented in a pattern. llvm-svn: 161306	2012-08-05 00:36:57 +00:00
Craig Topper	43ee9fae92	Use a COPY node instead of an explicit MOVA opcode in the custom inserter for pcmpestrm/pcmpistrm. Allows the register allocator to handle it better and prevent wasted identity moves. llvm-svn: 161305	2012-08-05 00:17:48 +00:00
Bob Wilson	3e6fa462f3	Fall back to selection DAG isel for calls to builtin functions. Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232	2012-08-03 04:06:28 +00:00
Manman Ren	ba8122cc25	X86 Peephole: fold loads to the source register operand if possible. Add more comments and use early returns to reduce nesting in isLoadFoldable. Also disable folding for V_SET0 to avoid introducing a const pool entry and a const pool load. rdar://10554090 and rdar://11873276 llvm-svn: 161207	2012-08-02 19:37:32 +00:00
Manman Ren	5759d01230	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. This patch is a rework of r160919 and was tested on clang self-host on my local machine. rdar://10554090 and rdar://11873276 llvm-svn: 161152	2012-08-02 00:56:42 +00:00
Manman Ren	4059145396	X86: mark GATHER instructions as mayLoad llvm-svn: 161143	2012-08-01 23:28:59 +00:00
Chad Rosier	24c19d20c0	Whitespace. llvm-svn: 161122	2012-08-01 18:39:17 +00:00
Elena Demikhovsky	3cb3b0045c	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Craig Topper	b8aec08819	Add more indirection to the disassembler tables to reduce amount of space used to store the operand types and encodings. Store only the unique combinations in a separate table and store indices in the instruction table. Saves about 32K of static data. llvm-svn: 161101	2012-08-01 07:39:18 +00:00
Chad Rosier	710be7df71	[x86 frame lowering] In 32-bit mode, use ESI as the base pointer. Previously, we were using EBX, but PIC requires the GOT to be in EBX before function calls via PLT GOT pointer. llvm-svn: 161066	2012-07-31 18:29:21 +00:00
Craig Topper	c2efce404e	Make INSTRUCTION_SPECIFIER_FIELDS match X86DisassemblerCommon.h. Also remove trailing whitespace. llvm-svn: 161029	2012-07-31 05:18:26 +00:00
Craig Topper	fb39f97d4c	Tidy up trailing whitespace llvm-svn: 161027	2012-07-31 04:58:05 +00:00
Craig Topper	5f33d90214	Tidy up trailing whitespace llvm-svn: 161026	2012-07-31 04:38:27 +00:00
Craig Topper	efd97044a3	Mark MOVZX16/MOVSX16 as neverHasSideEffects/mayLoad llvm-svn: 160953	2012-07-30 07:14:07 +00:00
Craig Topper	c6b7ef61f4	Mark MOVZX32_NOREX as isCodeGenOnly and neverHasSideEffects. The isCodeGenOnly change allows special detection of _NOREX instructions to be removed from tablegen disassembler code. llvm-svn: 160951	2012-07-30 06:48:11 +00:00
Craig Topper	14eac5dda8	Give VCVTTPD2DQ priority over CVTTPD2DQ. llvm-svn: 160942	2012-07-30 02:20:32 +00:00
Craig Topper	f881d385da	Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1. llvm-svn: 160941	2012-07-30 02:14:02 +00:00
Craig Topper	415b3586d0	Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form. Add an OptForSpeed to explicitly pair up with an OptForSize that was already on another pattern. llvm-svn: 160939	2012-07-30 01:38:57 +00:00
Craig Topper	28402efcb6	Fix load types on intrinsic forms of SS2SD and SD2SS AVX/SSE convert instruction patterns. llvm-svn: 160938	2012-07-29 23:26:34 +00:00
Craig Topper	b6767f3acd	Move more SSE/AVX convert instruction patterns into their definitions. llvm-svn: 160937	2012-07-29 22:30:06 +00:00
Manman Ren	f87dd7c01b	Revert r160920 and r160919 due to dragonegg and clang selfhost failure llvm-svn: 160927	2012-07-29 02:44:09 +00:00
Craig Topper	fc93281c07	Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions. llvm-svn: 160922	2012-07-28 18:59:19 +00:00
Craig Topper	024797b9a2	Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects. llvm-svn: 160921	2012-07-28 18:36:39 +00:00
Manman Ren	0fa3ab88ba	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. rdar://10554090 and rdar://11873276 llvm-svn: 160919	2012-07-28 16:48:01 +00:00
Craig Topper	44f9b5343d	Make CVTSS2SI instruction definition consistent with CVTSD2SI. llvm-svn: 160914	2012-07-28 08:28:23 +00:00
Craig Topper	1c1aef07b8	Fix up memory load types for SSE scalar convert intrinsic patterns. llvm-svn: 160913	2012-07-28 07:59:59 +00:00
Manman Ren	32367c063b	X86 Peephole: fix PR13475 in optimizeCompare. It is possible that an instruction can use and update EFLAGS. When checking the safety, we should check the usage of EFLAGS first before declaring it is safe to optimize due to the update. llvm-svn: 160912	2012-07-28 03:15:46 +00:00
Jakob Stoklund Olesen	7cd08536c2	Remove the X86 sub_ss and sub_sd sub-register indexes completely. llvm-svn: 160833	2012-07-26 23:07:20 +00:00
Jakob Stoklund Olesen	77cd55b4ee	Remove the last mentions of sub_ss and sub_sd from patterns. I'll remove these two sub-register indexes shortly. llvm-svn: 160831	2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen	b96d0b4e08	Eliminate sub_ss, sub_sd from broadcast patterns. The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but copyPhysReg does the right thing with it. (The old pattern would eventually produce the same cross-class copy). llvm-svn: 160830	2012-07-26 22:59:06 +00:00
Jakob Stoklund Olesen	206b825f5c	Eliminate more sub_ss / sub_sd patterns. This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns, simplifying the emitted code a bit. llvm-svn: 160820	2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen	75d17b0577	Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd. The SUBREG_TO_REG instruction has magic semantics asserting that the source value was defined by an instruction that cleared the high half of the register. Those semantics are never actually exploited for xmm registers. llvm-svn: 160818	2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen	ceee4a9d0c	Eliminate a batch of uses of sub_ss and sub_sd in the X86 target. These idempotent sub-register indices don't do anything --- They simply map XMM registers to themselves. They no longer affect register classes either since the SubRegClasses field has been removed from Target.td. This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns with COPY_TO_REGCLASS patterns which simply become COPY instructions. The number of IMPLICIT_DEF instructions before register allocation is reduced, and that is the cause of the test case changes. llvm-svn: 160816	2012-07-26 21:40:42 +00:00
Craig Topper	c7690ac7ac	Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms. llvm-svn: 160775	2012-07-26 07:48:28 +00:00
Rafael Espindola	73173c55c2	Fix typos. Thanks to Matt Beaumont-Gay for noticing it. llvm-svn: 160731	2012-07-25 15:42:45 +00:00
Rafael Espindola	11c38b9657	When a return struct pointer is passed in registers, the callee has nothing to pop. llvm-svn: 160725	2012-07-25 13:41:10 +00:00
Rafael Espindola	2caee7f4d2	Factor a long list of conditions into a predicate function. No functionality change. llvm-svn: 160724	2012-07-25 13:35:45 +00:00
Kevin Enderby	216ac31971	Fix a bug in the x86 disassembler's symbolic disassembly support for Jcc-Jump if Condition Is Met instructions that was not correctly determining the target instruction. So for a jne rel32 instruction: % cat x.s .byte 0x0f, 0x85, 0x09, 0x00, 0x00, 0x00 % as x.s it was incorrectly determining the target: % otool -q -tv a.out a.out: (__TEXT,__text) section 0000000000000000 jne 0xd and with the fix it gets this correct as: % otool -q -tv a.out a.out: (__TEXT,__text) section 0000000000000000 jne 0xf rdar://11505997 llvm-svn: 160694	2012-07-24 21:40:01 +00:00
David Chisnall	5b8c1680de	ELF does not imply GNU/Linux. Do not assume GNU conventions just because we are targeting an ELF platform. Only fold gs-relative (and fs-relative) loads if it is actually sensible to do so for the target platform. This fixes PR13438. llvm-svn: 160687	2012-07-24 20:04:16 +00:00
Sylvestre Ledru	35521e2310	Fix a typo (the the => the) llvm-svn: 160621	2012-07-23 08:51:15 +00:00
Craig Topper	0b94e46ce3	Don't use implicit register operands to calculate L-bit for AVX instructions. Needed because super reg defs and kills are added as implicit operands on 128-bit instructions. Fixes PR13349. Patch by Jose Fonseca. llvm-svn: 160543	2012-07-20 07:03:46 +00:00
Preston Gurd	8e082688a1	Adds the family codes for the Midview Atom processors so that the Atom buildbot will auto-detect Atom. llvm-svn: 160521	2012-07-19 19:05:37 +00:00
Bill Wendling	318f03f56f	Remove tabs. llvm-svn: 160479	2012-07-19 00:15:11 +00:00
Bill Wendling	ea6397f67b	Remove tabs. llvm-svn: 160477	2012-07-19 00:11:40 +00:00
Manman Ren	d0a4ee8427	X86: remove redundant cmp against zero. Updated OptimizeCompare in peephole to remove redundant cmp against zero. We only remove Compare if CF and OF are not used. rdar://11855129 llvm-svn: 160454	2012-07-18 21:40:01 +00:00
Preston Gurd	f0a48ec8f1	This patch fixes 8 out of 20 unexpected failures in "make check" when run on an Intel Atom processor. The failures have arisen due to changes elsewhere in the trunk over the past 8 weeks or so. These failures were not detected by the Atom buildbot because the CPU on the Atom buildbot was not being detected as an Atom CPU. The fix for this problem is in Host.cpp and X86Subtarget.cpp, but shall remain commented out until the current set of Atom test failures are fixed. Patch by Andy Zhang and Tyler Nowicki! llvm-svn: 160451	2012-07-18 20:49:17 +00:00
Nadav Rotem	4c12245b3a	The vbroadcast family of instructions has 'fallback patterns' in case where the load source operand is used by multiple nodes. The v2i64 broadcast was emulated by shuffling the two lower i32 elements to the upper two. We had a bug in the immediate used for the broadcast. Replacing 0 with 0x44. 0x44 means [01|00|01|00] which corresponds to the correct lane. Patch by Michael Kuperstein. llvm-svn: 160430	2012-07-18 08:14:48 +00:00
Craig Topper	6bf3ed454a	Remove tab characters. llvm-svn: 160425	2012-07-18 04:59:16 +00:00
Craig Topper	8532423268	Fix typo in error message and remove some tab characters. llvm-svn: 160423	2012-07-18 04:36:35 +00:00
Craig Topper	01deb5f2df	Make x86 asm parser check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas. llvm-svn: 160420	2012-07-18 04:11:12 +00:00
Evan Cheng	e6a3b03ee0	Back out r160101 and instead implement a dag combine to recover from instcombine transformation. llvm-svn: 160387	2012-07-17 18:54:11 +00:00
Evan Cheng	780f9b5f92	Implement r160312 as target independent dag combine. llvm-svn: 160354	2012-07-17 08:31:11 +00:00
Evan Cheng	f579beca6d	This is another case where instcombine demanded bits optimization created large immediates. Add dag combine logic to recover in case the large immediates don't fit in cmp immediate operand field. int foo(unsigned long l) { return (l>> 47) == 1; } we produce %shr.mask = and i64 %l, -140737488355328 %cmp = icmp eq i64 %shr.mask, 140737488355328 %conv = zext i1 %cmp to i32 ret i32 %conv which codegens to movq $0xffff800000000000,%rax andq %rdi,%rax movq $0x0000800000000000,%rcx cmpq %rcx,%rax sete %al movzbl %al,%eax ret TargetLowering::SimplifySetCC would transform (X & -256) == 256 -> (X >> 8) == 1 if the immediate fails the isLegalICmpImmediate() test. For x86, that's immediates which are not a signed 32-bit immediate. Based on a patch by Eli Friedman. PR10328 rdar://9758774 llvm-svn: 160346	2012-07-17 06:53:39 +00:00
Evan Cheng	75315b877c	For something like uint32_t hi(uint64_t res) { uint32_t hi = res >> 32; return !hi; } llvm IR looks like this: define i32 @hi(i64 %res) nounwind uwtable ssp { entry: %lnot = icmp ult i64 %res, 4294967296 %lnot.ext = zext i1 %lnot to i32 ret i32 %lnot.ext } The optimizer has optimized away the right shift and truncate but the resulting constant is too large to fit in the 32-bit immediate field. The resulting x86 code is worse as a result: movabsq $4294967296, %rax ## imm = 0x100000000 cmpq %rax, %rdi sbbl %eax, %eax andl $1, %eax This patch teaches the x86 lowering code to handle ult against a large immediate with trailing zeros. It will issue a right shift and a truncate followed by a comparison against a shifted immediate. shrq $32, %rdi testl %edi, %edi sete %al movzbl %al, %eax It also handles a ugt comparison against a large immediate with trailing bits set. i.e. X > 0x0ffffffff -> (X >> 32) >= 1 rdar://11866926 llvm-svn: 160312	2012-07-16 19:35:43 +00:00
Chad Rosier	10e8207c9e	With r160248 in place this code is no longer needed. llvm-svn: 160293	2012-07-16 17:42:13 +00:00
Nadav Rotem	4968e45b9f	Fix a bug in the 3-address conversion of LEA when one of the operands is an undef virtual register. The problem is that ProcessImplicitDefs removes the definition of the register and marks all uses as undef. If we lose the undef marker then we get a register which has no def and is not marked as undef. The live interval analysis does not collect information for these virtual registers and we crash in later passes. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160260	2012-07-16 10:52:25 +00:00
Alexey Samsonov	dcc1291d17	This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment. It is intended to fix PR11468. Old prologue and epilogue looked like this: push %rbp mov %rsp, %rbp and $alignment, %rsp push %r14 push %r15 ... pop %r15 pop %r14 mov %rbp, %rsp pop %rbp The problem was to reference the locations of callee-saved registers in exception handling: locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would take some effort to implement this in LLVM, as currently MachineLocation can only have the form "Register + Offset". Function prologue and epilogue are now changed to: push %rbp mov %rsp, %rbp push %r14 push %r15 and $alignment, %rsp ... lea -$size_of_saved_registers(%rbp), %rsp pop %r15 pop %r14 pop %rbp Reviewed by Chad Rosier. llvm-svn: 160248	2012-07-16 06:54:09 +00:00
Nadav Rotem	eec74c7279	Teach getTargetVShiftNode about TargetConstant nodes. llvm-svn: 160234	2012-07-15 20:27:43 +00:00
Nadav Rotem	ee3552f88d	Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention. Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot. PR12782. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160230	2012-07-15 12:26:30 +00:00
Nadav Rotem	9466e81df6	AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector. This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions. llvm-svn: 160222	2012-07-14 22:26:05 +00:00
Benjamin Kramer	abbfe69356	Make helper functions static. llvm-svn: 160173	2012-07-13 13:25:15 +00:00
Craig Topper	b3bac4908e	Mark VINSERTI128rm as MayLoad=1. Fixes PR13348. llvm-svn: 160162	2012-07-13 05:46:28 +00:00
Benjamin Kramer	4d0916788d	Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it. I already had the necessary things in place for IR-level passes but missed the machine passes. llvm-svn: 160137	2012-07-12 18:14:57 +00:00
Benjamin Kramer	0ab2794eda	Add intrinsics for Ivy Bridge's rdrand instruction. The rdrand/cmov sequence is the same that is emitted by both GCC and ICC. Fixes PR13284. llvm-svn: 160117	2012-07-12 09:31:43 +00:00
Craig Topper	f7755df776	Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren. llvm-svn: 160110	2012-07-12 06:52:41 +00:00
Chad Rosier	8446ede023	[x86 fast-isel] Per discussion with Eric, add all cases to switch with verbose comments. llvm-svn: 160069	2012-07-11 19:58:38 +00:00
Manman Ren	1553ce0e81	X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair. When Movr0 is between sub and cmp, we move Movr0 before sub if it enables removal of Cmp. llvm-svn: 160066	2012-07-11 19:35:12 +00:00
Chad Rosier	43218c59c3	[x86 fast-isel] Rather than call llvm_unreachable() have fast-isel fall back to Selection DAG isel. Patch by Andrew Kaylor <andrew.kaylor@intel.com>. llvm-svn: 160055	2012-07-11 17:23:17 +00:00
Nadav Rotem	d2bdcebb14	When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers. llvm-svn: 160044	2012-07-11 13:27:05 +00:00
Chad Rosier	97c2214277	Move [get|set]BasePtrStackAdjustment() from MachineFrameInfo to X86MachineFunctionInfo as this is currently only used by X86. If this ever becomes an issue on another arch (e.g., ARM) then we can hoist it back out. llvm-svn: 160009	2012-07-10 18:27:15 +00:00
Chad Rosier	bdb08ac50a	Add support for dynamic stack realignment in the presence of dynamic allocas on X86. Basically, this is a reapplication of r158087 with a few fixes. Specifically, (1) the stack pointer is restored from the base pointer before popping callee-saved registers and (2) in obscure cases (see comments in patch) we must cache the value of the original stack adjustment in the prologue and apply it in the epilogue. rdar://11496434 llvm-svn: 160002	2012-07-10 17:45:53 +00:00
Nadav Rotem	d908ddc186	Improve the loading of load-anyext vectors by allowing the codegen to load multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the migration of bitcasts happened too late in the SelectionDAG process. llvm-svn: 159991	2012-07-10 13:25:08 +00:00

8834 Commits
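
Several of the commit messages listed above rest on small integer identities: the shifted unsigned compare in r160312, the masked equality in r160346, the 0x44 shuffle immediate in r160430, and the Jcc rel32 target in r160694. The following is a minimal Python sketch for checking that arithmetic, not LLVM code. All function names are hypothetical, and the shuffle helper assumes the usual two-bits-per-element immediate layout that the `[01|00|01|00]` notation in the commit message suggests.

```python
MASK64 = (1 << 64) - 1  # model 64-bit unsigned registers


def trailing_zeros(n):
    """Number of trailing zero bits in a positive integer."""
    return (n & -n).bit_length() - 1


def ult_shifted(x, imm):
    """x <u imm, rewritten as a compare of shifted values (cf. r160312).

    If imm has k trailing zero bits, both sides can be shifted right by k,
    which may bring the constant back into a 32-bit immediate range.
    """
    k = trailing_zeros(imm)
    return (x >> k) < (imm >> k)


def masked_eq_shifted(x):
    """(x & -256) == 256, rewritten as (x >> 8) == 1 (cf. r160346)."""
    return (x >> 8) == 1


def shuffle_dwords(src, imm):
    """Four-element shuffle: result[i] = src[two-bit field i of imm]."""
    return [src[(imm >> (2 * i)) & 3] for i in range(4)]


def jcc_rel32_target(addr, rel32, length=6):
    """A rel32 branch target is relative to the end of the instruction."""
    return (addr + length + rel32) & MASK64


if __name__ == "__main__":
    samples = [0, 1, 255, 256, 511, 512, (1 << 32) - 1, 1 << 32, 1 << 47, MASK64]
    for x in samples:
        assert ult_shifted(x, 1 << 32) == (x < (1 << 32))
        assert masked_eq_shifted(x) == ((x & (-256 & MASK64)) == 256)
    # 0x44 copies the low two dwords into the high two (a 64-bit broadcast).
    print(shuffle_dwords(["d0", "d1", "d2", "d3"], 0x44))  # ['d0', 'd1', 'd0', 'd1']
    # Bytes 0f 85 09 00 00 00 at address 0: six-byte jne with rel32 = 9.
    print(hex(jcc_rel32_target(0, 9)))  # 0xf
```

With an immediate of 0 the same shuffle helper returns the lowest dword four times, which is why 0 was the wrong constant for emulating a v2i64 broadcast.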