llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Liao	126680ffa1	Rewrite X86 codegen regression test with FileCheck llvm-svn: 180910	2013-05-02 06:20:42 +00:00
David Majnemer	a18dfe6b96	Add a test for the foldSelectICmpAndOr fix committed in r180779. This tests a case where C1 and C2 were the same but X and Y were different widths. llvm-svn: 180907	2013-05-02 02:44:23 +00:00
Michael Liao	d2d42f1b2d	Avoid generating tempfile(s) never used. As DejaGNU is deprecated, it seems the pipe-jam issue doesn't exist any more. llvm-svn: 180892	2013-05-01 22:46:50 +00:00
Bill Wendling	8f2e6feb8e	Revert r180737. The companion patch was reverted, and this is not relevant right now. llvm-svn: 180889	2013-05-01 22:32:08 +00:00
Nadav Rotem	1e211913b5	SROA: Generate selects instead of shuffles when blending values because this is the canonical form. Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often. llvm-svn: 180875	2013-05-01 19:53:30 +00:00
Nadav Rotem	e5a2dda372	Optimize away nop CONCAT_VECTOR nodes. Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector. rdar://13402653 PR15866 llvm-svn: 180871	2013-05-01 19:18:51 +00:00
Rafael Espindola	817c1d92b4	Put VMOVPQIto64rr in the VRPDI class. Patch by Joshua Magee. llvm-svn: 180842	2013-05-01 13:00:16 +00:00
Michael Liao	f7f33ed31e	Forgot to remove the tempfile argument llvm-svn: 180838	2013-05-01 05:45:57 +00:00
Michael Liao	bc793a775e	More rewrites of x86 codegen regression tests with FileCheck llvm-svn: 180837	2013-05-01 05:34:30 +00:00
Jim Grosbach	d11584a7f7	Revert "InstCombine: Fold more shuffles of shuffles." This reverts commit r180802 There's ongoing discussion about whether this is the right place to make this transformation. Reverting for now while we figure it out. llvm-svn: 180834	2013-05-01 00:25:27 +00:00
Akira Hatanaka	4254319ef9	[mips] Fix handling of instructions which copy to/from accumulator registers. Expand copy instructions between two accumulator registers before callee-saved scan is done. Handle copies between integer GPR and hi/lo registers in MipsSEInstrInfo::copyPhysReg. Delete pseudo-copy instructions that are not needed. llvm-svn: 180827	2013-04-30 23:22:09 +00:00
Stephen Lin	699808ceb2	Only pass 'returned' to target-specific lowering code when the value of entire register is guaranteed to be preserved. llvm-svn: 180825	2013-04-30 22:49:28 +00:00
Akira Hatanaka	68741cc38d	[mips] Instruction selection patterns for DSP-ASE vector select and compare instructions. llvm-svn: 180820	2013-04-30 22:37:26 +00:00
Adrian Prantl	a2888e71eb	Temporarily revert "Change the informal convention of DBG_VALUE so that we can express a" because it breaks some buildbots. This reverts commit 180816. llvm-svn: 180819	2013-04-30 22:35:14 +00:00
Adrian Prantl	9a576644e4	Change the informal convention of DBG_VALUE so that we can express a register-indirect address with an offset of 0. It used to be that a DBG_VALUE is a register-indirect value if the offset (operand 1) is nonzero. The new convention is that a DBG_VALUE is register-indirect if the first operand is a register and the second operand is an immediate. For plain registers use the combination reg, reg. rdar://problem/13658587 llvm-svn: 180816	2013-04-30 22:16:46 +00:00
Akira Hatanaka	433de170ee	[mips] Test for r179873. Patch by Zoran Jovanovic. llvm-svn: 180804	2013-04-30 20:48:49 +00:00
Jim Grosbach	0b914fe839	InstCombine: Fold more shuffles of shuffles. Always fold a shuffle-of-shuffle into a single shuffle when there's only one input vector in the first place. Continue to be more conservative when there's multiple inputs. rdar://13402653 PR15866 llvm-svn: 180802	2013-04-30 20:43:52 +00:00
Hal Finkel	7153251ab5	LocalStackSlotAllocation improvements First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created. Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway. Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll Jim has okayed this off-list. llvm-svn: 180799	2013-04-30 20:04:37 +00:00
Manman Ren	1a5ff287fd	TBAA: remove !tbaa from testing cases if not used. This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 180796	2013-04-30 17:52:57 +00:00
Adrian Prantl	0941638a1b	Set debug locations for branch instructions created during inlining, even if the inlined function has multiple returns. rdar://problem/12415623 llvm-svn: 180793	2013-04-30 17:08:16 +00:00
Rafael Espindola	52501033d0	Fix Addend computation for non external relocations on Macho. llvm-svn: 180790	2013-04-30 15:40:54 +00:00
Vincent Lejeune	e69e26025e	R600: fix loop-address.ll test Texture cache is now used when shader type is not specified llvm-svn: 180785	2013-04-30 12:47:56 +00:00
Mihai Popa	af22d91af0	This tightens up the encoding description for ARM post-indexed ldr instructions. All instructions in this class have bit 4 cleared. It turns out that there is a test case for this, but it was marked XFAIL. llvm-svn: 180778	2013-04-30 09:00:12 +00:00
David Majnemer	8d048d0482	Fix "Combine bit test + conditional or into simple math" This fixes the optimization introduced in r179748 and reverted in r179750. While the optimization was sound, it did not properly respect differences in bit-width. llvm-svn: 180777	2013-04-30 08:57:58 +00:00
Michael Liao	c9ae780b5b	Rewrite X86 codegen regression test with FileCheck llvm-svn: 180776	2013-04-30 07:51:08 +00:00
Rafael Espindola	d00c2765aa	Collect the Addend for external relocs. This fixes 2013-04-04-RelocAddend.ll. We don't have a testcase for non external relocs with an Addend. I will try to write one. llvm-svn: 180767	2013-04-30 01:29:57 +00:00
Vincent Lejeune	3abdbf1cad	R600: use native for alu llvm-svn: 180761	2013-04-30 00:14:38 +00:00
Vincent Lejeune	c299164284	R600: Add FetchInst bit to instruction defs to denote vertex/tex instructions v2[Vincent Lejeune]: Split FetchInst into usesTextureCache/usesVertexCache llvm-svn: 180755	2013-04-30 00:13:39 +00:00
Michael Liao	db6c6ea21c	Rewrite test in FileCheck instead of grep in X86 codegen llvm-svn: 180754	2013-04-30 00:13:38 +00:00
Manman Ren	f0499ba991	TBAA: remove !tbaa from testing cases if not used. This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 180745	2013-04-29 22:58:55 +00:00
Bill Wendling	39033855c3	Duplicate a testcase. llvm-svn: 180744	2013-04-29 22:42:47 +00:00
Manman Ren	662ece49de	TBAA: remove !tbaa from testing cases if not used. This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 180743	2013-04-29 22:42:01 +00:00
Michael Liao	c83b3e79fc	Rewrite some tests with FileCheck in X86 codegen - Revise previous patches of the same purpose by fixing *) grep <PA> | not grep <PB> semantically is not the same as CHECK: <PA>{{^<PB>.$}} as the former will check all occurrences of <PA> while the latter only checks the first match. As a result, CHECK needs to be put in every place where <PA> occurs. *) grep <PA> | count <N> needs a final CHECK-NOT of the same pattern. (As 'CHECK-<N>' is proposed for discussion, converting 'grep | count <N>' where N > 1 is postponed.) llvm-svn: 180742	2013-04-29 22:41:29 +00:00
Adrian Prantl	3e1758c045	Improve documentation. llvm-svn: 180738	2013-04-29 22:25:52 +00:00
Rafael Espindola	e4dd2e0132	Add getSymbolAlignment to the ObjectFile interface. For regular object files this is only meaningful for common symbols. An object file format with direct support for atoms should be able to provide alignment information for all symbols. This replaces getCommonSymbolAlignment and fixes test-common-symbols-alignment.ll on darwin. This also includes a fix to MachOObjectFile::getSymbolFlags. It was marking undefined symbols as common (already tested by existing mcjit tests now that it is used). llvm-svn: 180736	2013-04-29 22:24:22 +00:00
Tom Stellard	119ad03c67	R600: Use correct CF_END instruction on Northern Island GPUs llvm-svn: 180735	2013-04-29 22:23:58 +00:00
Tom Stellard	8367067e02	R600: Fix encoding of CF_END_{EG, R600} instructions The EOP bit was not being encoded. llvm-svn: 180734	2013-04-29 22:23:54 +00:00
Arnold Schwaighofer	474df6d3ed	SimplifyCFG: If convert single conditional stores This resurrects r179957, but adds code that makes sure we don't touch atomic/volatile stores: This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X; may-alias with a[i] load; if (cond) a[i] = Y; into an unconditional store: a[i] = X; may-alias with a[i] load; tmp = cond ? Y : X; a[i] = tmp. We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on an ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. llvm-svn: 180731	2013-04-29 21:28:24 +00:00
Rafael Espindola	29cb481ba0	Disable the MCJIT tests on 32 bit darwin. I recently enabled them on 32 and 64 bit darwin, but it looks like 32 bit is still fairly broken. llvm-svn: 180730	2013-04-29 21:09:32 +00:00
Rafael Espindola	f1f1c626e7	Propagate relocation info to resolveRelocation. This gets most of the MCJITs tests passing with MachO. llvm-svn: 180716	2013-04-29 17:24:34 +00:00
Michael Gottesman	214ca90f8e	[objc-arc] Apply the RV optimization to retains next to calls in ObjCARCContract instead of ObjCARCOpts. Turning retains into retainRV calls disrupts the data flow analysis in ObjCARCOpts. Thus we move it as late as we can by moving it into ObjCARCContract. We leave in the conversion from retainRV -> retain in ObjCARCOpt since it enables the dataflow analysis. rdar://10813093 llvm-svn: 180698	2013-04-29 06:53:53 +00:00
Shuxin Yang	04a4fd43aa	Fix an XOR reassociation bug. When the Reassociator optimizes "(x | C1)" ^ "(X & C2)", it may swap the two subexpressions; however, it forgot to swap the cached constants (of C1 and C2) accordingly. rdar://13739160 llvm-svn: 180676	2013-04-27 18:02:12 +00:00
Tim Northover	72e122607f	AArch64: convert MC-layer test to .s file The CodeGen aspects of this test are already covered by cfi-frame.ll; making it an assembly file reduces the risk of incidental changes affecting the test. llvm-svn: 180671	2013-04-27 11:56:14 +00:00
Michael Gottesman	b33b6cb84a	[objc-arc] Test cleanups. Mainly adding paranoid checks for the closing brace of a function to help with FileCheck error readability. Also some other minor changes. No actual CHECK changes. llvm-svn: 180668	2013-04-27 05:25:54 +00:00
Eric Christopher	203e12bf9e	Use the target triple from the target machine rather than the module to determine whether or not we're on a darwin platform for debug code emitting. Solves the problem of a module with no triple on the command line and no triple in the module using non-gdb ok features on darwin. Fix up the member-pointers test to check the correct things for cross platform (DW_FORM_flag is a good prefix). Unfortunately no testcase because I have no idea how to test something without a triple and without a triple in the module yet check precisely on two platforms. Ideas welcome. llvm-svn: 180660	2013-04-27 01:07:52 +00:00
Eric Christopher	b2a602d730	Move the XFAIL out of the middle of a comment. llvm-svn: 180659	2013-04-27 01:07:22 +00:00
Rafael Espindola	1357ab74e5	Make all darwin ppc stubs local. This fixes pr15763. Patch by David Fang. llvm-svn: 180657	2013-04-27 00:43:16 +00:00
Manman Ren	5c37106d65	Struct-path aware TBAA: change the format of TBAAStructType node. We switch the order of offset and field type to make TBAAStructType node (name, parent node, offset) similar to scalar TBAA node (name, parent node). TypeIsImmutable is added to TBAAStructTag node. llvm-svn: 180654	2013-04-27 00:26:11 +00:00
Benjamin Kramer	5259bbde82	Make CHECK lines a bit less strict so they also match code generated for win64. Hopefully brings the windows buildbots back to life. llvm-svn: 180630	2013-04-26 21:04:21 +00:00
Nadav Rotem	be0e89d9e8	Teach the interpreter to handle vector compares and additional vector arithmetic operations. Patch by Yuri Veselov. llvm-svn: 180626	2013-04-26 20:19:41 +00:00
Tom Stellard	456adc6c4e	R600: Initialize AMDGPUMachineFunction::ShaderType to ShaderType::COMPUTE We need to initialize this to something and since clang does not set the shader type attribute and clang is used only for compute shaders, initializing it to COMPUTE seems like the best choice. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 180620	2013-04-26 18:32:24 +00:00
Adrian Prantl	a8aa97d310	cleanup testcase some more rdar://problem/13056109 llvm-svn: 180619	2013-04-26 18:10:54 +00:00
Quentin Colombet	a83d5e9f91	ARM: Fix encoding of hint instruction for Thumb. "hint" space for Thumb actually overlaps the encoding space of the CPS instruction. In actuality, hints can be defined as CPS instructions where imod and M bits are all nil. Handle decoding of permitted nop-compatible hints (i.e. nop, yield, wfi, wfe, sev) in DecodeT2CPSInstruction. This commit adds a proper diagnostic message for Imm0_4 and updates all tests. Patch by Mihail Popa <Mihail.Popa@arm.com>. llvm-svn: 180617	2013-04-26 17:54:54 +00:00
Rafael Espindola	37212578ee	Add missing ':'. llvm-svn: 180616	2013-04-26 17:54:46 +00:00
Adrian Prantl	29b9de7bf1	Bugfix for the debug intrinsic handling in InstCombiner: Since we can't guarantee that the original dbg.declare intrinsic is removed by LowerDbgDeclare(), we need to make sure that we are not inserting the same dbg.value intrinsic over and over. This removes tons of redundant DIEs when compiling optimized code. rdar://problem/13056109 llvm-svn: 180615	2013-04-26 17:48:33 +00:00
Benjamin Kramer	ae81474a38	ARM/NEON: Pattern match vector integer abs to vabs. llvm-svn: 180604	2013-04-26 15:00:57 +00:00
Benjamin Kramer	aec90531f9	X86: Now that we have a canonical form for vector integer abs, match it into pabs. llvm-svn: 180600	2013-04-26 12:05:21 +00:00
Benjamin Kramer	d56ffc709d	DAGCombiner: Canonicalize vector integer abs in the same way we do it for scalars. This already helps SSE2 x86 a lot because it lacks an efficient way to represent a vector select. The long term goal is to enable the backend to match a canonicalized pattern into a single instruction (e.g. vabs or pabs). llvm-svn: 180597	2013-04-26 09:19:19 +00:00
Nadav Rotem	13306816fc	LoopVectorizer: Calculate the number of pointers to disambiguate at runtime based on the numbers of reads and writes. llvm-svn: 180593	2013-04-26 05:08:59 +00:00
Jack Carter	c15c1d245b	Mips assembler: .set reorder support Mips has delay slots for certain instructions like jumps and branches. The instruction in a delay slot follows the branch or jump and is executed before the jump or branch is completed. Early Mips compilers could not cope with delay slots and left them up to the assembler. The assembler would fill the delay slots with the appropriate instruction, usually just a nop, to allow correct runtime behavior. The default behavior for this is set with .set reorder. To tell the assembler that you don't want it to mess with the delay slot, one uses .set noreorder. For backwards compatibility we need to support .set reorder and have it be the default behavior in the assembler. Our support for it is to insert a NOP directly after an instruction with a delay slot when in .set reorder mode. Contributor: Vladimir Medic llvm-svn: 180584	2013-04-25 23:31:35 +00:00
Michael Liao	0b707eb85e	Remove SMLoc paired with CHECK-NOT patterns. No functionality change. Pattern has source location by itself. After adding a trivial method to retrieve it, it's unnecessary to pair a source location for CHECK-NOT patterns. One thing revised after this is that the diagnostic info is more accurate by pointing to the start of the CHECK-NOT pattern instead of the end of the CHECK-NOT pattern. E.g. diagnostic message previously looks like <stdin>:1:1: error: CHECK-NOT: string occurred! test ^ test.txt:1:16: note: CHECK-NOT: pattern specified here CHECK-NOT: test ^ is changed to <stdin>:1:1: error: CHECK-NOT: string occurred! test ^ test.txt:1:12: note: CHECK-NOT: pattern specified here CHECK-NOT: test ^ llvm-svn: 180578	2013-04-25 21:31:34 +00:00
Arnold Schwaighofer	9881dcf2f2	ARM cost model: Integer div and rem is lowered to a function call Reflect this in the cost model. I observed this in MiBench/consumer-lame. radar://13354716 llvm-svn: 180576	2013-04-25 21:16:18 +00:00
Preston Gurd	8b7ab4ba2b	This patch adds the X86FixupLEAs pass, which will reduce instruction latency for certain models of the Intel Atom family, by converting instructions into their equivalent LEA instructions, when it is both useful and possible to do so. llvm-svn: 180573	2013-04-25 20:29:37 +00:00
Nadav Rotem	f43cbeee15	LoopVectorizer: No need to generate pointer disambiguation checks between readonly pointers. llvm-svn: 180570	2013-04-25 19:55:03 +00:00
Reid Kleckner	d973ca3c51	[mc-coff] Forward Linker Option flags into the .drectve section Summary: This is modelled on the Mach-O linker options implementation and should support a Clang implementation of #pragma comment(lib/linker). Reviewers: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D724 llvm-svn: 180569	2013-04-25 19:34:41 +00:00
Rafael Espindola	b770f897ee	Fix section relocation for SECTIONREL32 with immediate offset. Patch by Kai Nacke. This matches the gnu as output. llvm-svn: 180568	2013-04-25 19:27:05 +00:00
Chad Rosier	8180db1f03	[inline asm] Add a test case for r180226. The specific issue is that the inline assembly is requesting a 64-bit register, which is invalid for i386. rdar://13731657 llvm-svn: 180445	2013-04-25 17:10:21 +00:00
Rafael Espindola	1e48387962	Clarify getRelocationAddress x getRelocationOffset a bit. getRelocationAddress is for dynamic libraries and executables, getRelocationOffset for relocatable objects. Mark the getRelocationAddress of COFF and MachO as not implemented yet. Add a test of ELF's. llvm-readobj -r now prints the same values as readelf -r. llvm-svn: 180259	2013-04-25 12:28:45 +00:00
Silviu Baranga	4ad2bc5963	Fix constant folding for one lane vector types. Constant folding one lane vector types now returns a vector instead of a scalar. llvm-svn: 180254	2013-04-25 09:32:33 +00:00
Akira Hatanaka	8aba50fd39	Test case for r180241. llvm-svn: 180246	2013-04-25 02:22:07 +00:00
Akira Hatanaka	714a8f62db	Test case for r180238. llvm-svn: 180245	2013-04-25 02:21:09 +00:00
Tom Stellard	34e4068d05	R600: Use SHT_PROGBITS for the .AMDGPU.config section The libelf implementation that is distributed here: http://www.mr511.de/software/english.html will not parse sections that are marked SHT_NULL. llvm-svn: 180230	2013-04-24 23:56:14 +00:00
Jack Carter	a2015328e8	Mips assembler: Add 64 bit testing for JAL Contributor: Vladimir Medic llvm-svn: 180220	2013-04-24 21:52:42 +00:00
Rafael Espindola	75c3036d4b	Use pointers to iterate over symbols. While here, don't report a dummy symbol for relocations that don't have symbols. We used to say such relocations were for the first defined symbol, but now we return end_symbols(). The llvm-readobj output change agrees with otool. llvm-svn: 180214	2013-04-24 19:47:55 +00:00
Arnold Schwaighofer	a6578f7056	LoopVectorize: Scalarize padded types This patch disables memory-instruction vectorization for types that need padding bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on x86_64. Because the load/store vectorization is performed by the bit casting to a packed vector, which has incompatible memory layout due to the lack of padding bytes, the present vectorizer produces inconsistent result for memory instructions of those types. This patch checks an equality of the AllocSize of a scalar type and allocated size for each vector element, to ensure that there is no padding bytes and the array can be read/written using vector operations. Patch by Daisuke Takahashi! Fixes PR15758. llvm-svn: 180196	2013-04-24 16:16:01 +00:00
Arnold Schwaighofer	23a0589bce	LoopVectorizer: Bail out if we don't have datalayout; we need it llvm-svn: 180195	2013-04-24 16:15:58 +00:00
Andrew Trick	85a1d4cbc0	MI Sched: eliminate local vreg copies. For now, we just reschedule instructions that use the copied vregs and let regalloc eliminate them. I would really like to eliminate the copies on-the-fly during scheduling, but we need a complete implementation of repairIntervalsInRange() first. The general strategy is for the register coalescer to eliminate as many global copies as possible and shrink live ranges to be extended-basic-block local. The coalescer should not have to worry about resolving local copies (e.g. it shouldn't attempt to reorder instructions). The scheduler is a much better place to deal with local interference. The coalescer side of this equation needs work. llvm-svn: 180193	2013-04-24 15:54:43 +00:00
Adrian Prantl	1d5f8f93f8	Cleanup testcase and ensure we actually exercise the inliner. rdar://problem/12415623 llvm-svn: 180168	2013-04-24 01:44:15 +00:00
Jyotsna Verma	af2359b98c	Hexagon: Use multiclass for combine and STri[bhwd]_shl_V4 instructions. llvm-svn: 180145	2013-04-23 21:17:40 +00:00
Adrian Prantl	15db52bf6d	Make sure the instruction right after an inlined function has a debug location. This solves a problem where range of an inlined subroutine is emitted wrongly. Patch by Manman Ren. Fixes rdar://problem/12415623 llvm-svn: 180140	2013-04-23 19:56:03 +00:00
Stephen Lin	8118e0b588	Add more tests for r179925 to verify correct handling of signext/zeroext; strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change) llvm-svn: 180138	2013-04-23 19:42:25 +00:00
Rafael Espindola	7f08d1b9a8	Fix typo. llvm-svn: 180137	2013-04-23 19:39:34 +00:00
Jyotsna Verma	89c84821ea	Hexagon: Remove assembler mapped instruction definitions. llvm-svn: 180133	2013-04-23 19:15:55 +00:00
Vincent Lejeune	117f075f6e	R600: Use .AMDGPU.config section to emit stacksize llvm-svn: 180124	2013-04-23 17:34:12 +00:00
Vincent Lejeune	b6bfe85a07	R600: Add CF_END llvm-svn: 180123	2013-04-23 17:34:00 +00:00
Nadav Rotem	71c9d6d333	LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make sure that the order in which the elements are scalarized is the same as the original order. This fixes a miscompilation in FreeBSD's regex library. llvm-svn: 180121	2013-04-23 17:12:42 +00:00
Jyotsna Verma	a696239bec	Hexagon: Remove duplicate instructions to handle global/immediate values for absolute/absolute-set addressing modes. llvm-svn: 180120	2013-04-23 17:11:46 +00:00
Pekka Jaaskelainen	d3c90e132a	Call the potentially costly isAnnotatedParallel() only once. Made the uniform write test's checks a bit stricter. llvm-svn: 180119	2013-04-23 16:44:43 +00:00
Rafael Espindola	b716e622ae	Write relocations in yaml2obj. llvm-svn: 180115	2013-04-23 15:53:02 +00:00
Rafael Espindola	70e94800e5	Move test from grep to FileCheck. llvm-svn: 180092	2013-04-23 12:03:27 +00:00
Alexey Samsonov	068fc8ae6e	Use zlib to uncompress debug sections in DWARF parser. This makes llvm-dwarfdump and llvm-symbolizer understand debug info sections compressed by ld.gold linker. llvm-svn: 180088	2013-04-23 10:17:34 +00:00
Pekka Jaaskelainen	6f2f66b63f	Refuse to (even try to) vectorize loops which have uniform writes, even if erroneously annotated with the parallel loop metadata. Fixes Bug 15794: "Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata" llvm-svn: 180081	2013-04-23 08:08:51 +00:00
Chad Rosier	53e5768351	Add test case for PR15779, which has previously been fixed. llvm-svn: 180058	2013-04-22 22:30:01 +00:00
Anat Shemer	10260a75e3	Changed back (relative to commit 179786) the operations executed when extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users. llvm-svn: 180045	2013-04-22 20:51:10 +00:00
Akira Hatanaka	0d6964cf4a	[mips] In performDSPShiftCombine, check that all elements in the vector are shifted by the same amount and the shift amount is smaller than the element size. llvm-svn: 180039	2013-04-22 19:58:23 +00:00
Peter Collingbourne	8988687d6b	COFF: Fix weak external aliases. Differential Revision: http://llvm-reviews.chandlerc.com/D700 llvm-svn: 180034	2013-04-22 18:48:56 +00:00
Stephen Lin	2ec1b100a4	Extra paranoid test for r179925 (verify that tail calls are not generated to 'this'-returning constructors of objects with different 'this' pointers than the caller) llvm-svn: 180032	2013-04-22 17:23:49 +00:00
Rafael Espindola	8bd2c228f8	Also verify llvm.compiler_used. llvm-svn: 180020	2013-04-22 15:16:51 +00:00
Rafael Espindola	74f2e46eef	Clarify that llvm.used can contain aliases. Also add a check for llvm.used in the verifier and simplify clients now that they can assume they have a ConstantArray. llvm-svn: 180019	2013-04-22 14:58:02 +00:00
Stepan Dyatkovskiy	f80f9513ce	Fix for 5.5 Parameter Passing --> Stage C: -- C.4 and C.5 statements, when NSAA is not equal to SP. -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a variadic procedure. Before this patch "NSAA != 0" means "don't use GPRs anymore". But there are some exceptions in AAPCS. 1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated CPRCs would be sent to stack, while non CPRCs may be still allocated in GPRs. 2. Check that for VA functions all params use GPRs and then stack. No exceptions, no CPRCs here. llvm-svn: 180011	2013-04-22 13:06:52 +00:00
Eric Christopher	f565498668	Add .ll as a valid test suffix for Object, this allows .ll -> object and then dumping as tests. llvm-svn: 180010	2013-04-22 10:45:06 +00:00
Arnaud A. de Grandmaison	e206e6e80a	Cleanup: test source files do not need to be executable llvm-svn: 180003	2013-04-22 08:02:43 +00:00
David Blaikie	f55abeaf4c	Revert "Revert "PR14606: debug info imported_module support"" This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even though the debug info was clearly invalid on all of them, but this ought to fix it. llvm-svn: 179996	2013-04-22 06:12:31 +00:00
Jim Grosbach	563983c8a3	Legalize vector truncates by parts rather than just splitting. Rather than just splitting the input type and hoping for the best, apply a bit more cleverness. Just splitting the types until the source is legal often leads to an illegal result type, which is then widened and a scalarization step is introduced which leads to truly horrible code generation. With the loop vectorizer, these sorts of operations are much more common, and so it's worth extra effort to do them well. Add a legalization hook for the operands of a TRUNCATE node, which will be encountered after the result type has been legalized, but while the operand type is still illegal. If simple splitting of both types ends up with the result type of each half still being legal, just do that (v16i16 -> v16i8 on ARM, for example). If, however, that would result in an illegal result type (v8i32 -> v8i8 on ARM, for example), we can get more clever with power-two vectors. Specifically, split the input type, but also widen the result element size, then concatenate the halves and truncate again. For example, on ARM, to perform a "%res = v8i8 trunc v8i32 %in" we transform to: %inlo = v4i32 extract_subvector %in, 0 %inhi = v4i32 extract_subvector %in, 4 %lo16 = v4i16 trunc v4i32 %inlo %hi16 = v4i16 trunc v4i32 %inhi %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16 %res = v8i8 trunc v8i16 %in16 This allows instruction selection to generate three VMOVN instructions instead of a sequence of moves, stores and loads. Update the ARMTargetTransformInfo to take this improved legalization into account. 
Consider the simplified IR: define <16 x i8> @test1(<16 x i32>* %ap) { %a = load <16 x i32>* %ap %tmp = trunc <16 x i32> %a to <16 x i8> ret <16 x i8> %tmp } define <8 x i8> @test2(<8 x i32>* %ap) { %a = load <8 x i32>* %ap %tmp = trunc <8 x i32> %a to <8 x i8> ret <8 x i8> %tmp } Previously, we would generate the truly hideous: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: push {r7} mov r7, sp sub sp, sp, #20 bic sp, sp, #7 add r1, r0, #48 add r2, r0, #32 vld1.64 {d24, d25}, [r0:128] vld1.64 {d16, d17}, [r1:128] vld1.64 {d18, d19}, [r2:128] add r1, r0, #16 vmovn.i32 d22, q8 vld1.64 {d16, d17}, [r1:128] vmovn.i32 d20, q9 vmovn.i32 d18, q12 vmov.u16 r0, d22[3] strb r0, [sp, #15] vmov.u16 r0, d22[2] strb r0, [sp, #14] vmov.u16 r0, d22[1] strb r0, [sp, #13] vmov.u16 r0, d22[0] vmovn.i32 d16, q8 strb r0, [sp, #12] vmov.u16 r0, d20[3] strb r0, [sp, #11] vmov.u16 r0, d20[2] strb r0, [sp, #10] vmov.u16 r0, d20[1] strb r0, [sp, #9] vmov.u16 r0, d20[0] strb r0, [sp, #8] vmov.u16 r0, d18[3] strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] vldmia sp, {d16, d17} vmov r0, r1, d16 vmov r2, r3, d17 mov sp, r7 pop {r7} bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: push {r7} mov r7, sp sub sp, sp, #12 bic sp, sp, #7 vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d20, d21}, [r0:128] vmovn.i32 d18, q8 vmov.u16 r0, d18[3] vmovn.i32 d16, q10 strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] ldm sp, {r0, r1} mov sp, r7 pop {r7} bx lr Now, however, we 
generate the much more straightforward: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: add r1, r0, #48 add r2, r0, #32 vld1.64 {d20, d21}, [r0:128] vld1.64 {d16, d17}, [r1:128] add r1, r0, #16 vld1.64 {d18, d19}, [r2:128] vld1.64 {d22, d23}, [r1:128] vmovn.i32 d17, q8 vmovn.i32 d16, q9 vmovn.i32 d18, q10 vmovn.i32 d19, q11 vmovn.i16 d17, q8 vmovn.i16 d16, q9 vmov r0, r1, d16 vmov r2, r3, d17 bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d18, d19}, [r0:128] vmovn.i32 d16, q8 vmovn.i32 d17, q9 vmovn.i16 d16, q8 vmov r0, r1, d16 bx lr llvm-svn: 179989	2013-04-21 23:47:41 +00:00
Jim Grosbach	fb08e55cc1	ARM: Split out cost model vcvt testcases. They had a separate RUN line already, so may as well be in a separate file. llvm-svn: 179988	2013-04-21 23:47:37 +00:00
Jakob Stoklund Olesen	84ebe25db7	Passing arguments to varargs functions under the SPARC v9 ABI. Arguments after the fixed arguments never use the floating point registers. llvm-svn: 179987	2013-04-21 21:36:49 +00:00
Jakob Stoklund Olesen	65d3287282	Fix the SETHIimm pattern for 64-bit code. Don't ignore the high 32 bits of the immediate. llvm-svn: 179985	2013-04-21 21:18:03 +00:00
Benjamin Kramer	0212dc27ed	SROA: Don't crash on a select with two identical operands. This is an edge case that can happen if we modify a chain of multiple selects. Update all operands in that case and remove the assert. PR15805. llvm-svn: 179982	2013-04-21 17:48:39 +00:00
Arnold Schwaighofer	6eb32b31bd	Revert "SimplifyCFG: If convert single conditional stores" There is the temptation to make this transform dependent on target information as it is not going to be beneficial on all (sub)targets. Therefore, we should probably do this in MI Early-Ifconversion. This reverts commit r179957. Original commit message: "SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2% performance improvement on an ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the buildbots and will revert this if anything unexpected comes up." llvm-svn: 179980	2013-04-21 13:09:04 +00:00
Tim Northover	4a58db65a5	ARM: fix part of test which actually needed an asserts build This should fix a buildbot failure that occurred after r179977. llvm-svn: 179978	2013-04-21 12:20:19 +00:00
Tim Northover	798697d662	ARM: Use ldrd/strd to spill 64-bit pairs when available. This allows common sp-offsets to be part of the instruction and is probably faster on modern CPUs too. llvm-svn: 179977	2013-04-21 11:57:07 +00:00
Nadav Rotem	c57af326a4	SLPVectorize: Add support for vectorization of casts. llvm-svn: 179975	2013-04-21 08:05:59 +00:00
Michael Gottesman	d5b701faf1	[objc-arc] Cleaned up tail-call-invariant-enforcement.ll. Specifically: 1. Added checks that unwind is being properly added to various instructions. 2. Fixed the declaration/calling of objc_release to have a return type of void. 3. Moved all checks to precede the functions and added checks to ensure that the checks would only match inside the specific function that we are attempting to check. llvm-svn: 179973	2013-04-21 02:59:44 +00:00
Michael Gottesman	77aa946321	[objc-arc] Check that objc-arc-expand properly handles all strictly forwarding calls and does not touch calls which are not strictly forwarding (i.e. objc_retainBlock). llvm-svn: 179972	2013-04-21 01:57:46 +00:00
Michael Gottesman	524052fec1	[objc-arc] Renamed the test file clang-arc-used-intrinsic-removed-if-isolated.ll -> intrinsic-use-isolated.ll to match the other test file intrinsic-use.ll. llvm-svn: 179971	2013-04-21 01:42:24 +00:00
Bill Wendling	eaff0ce3a4	Remove tbaa metadata. llvm-svn: 179970	2013-04-21 01:38:25 +00:00
Jakob Stoklund Olesen	a41f91ea8e	Compile varargs functions for SPARCv9. With a little help from the frontend, it looks like the standard va_* intrinsics can do the job. Also clean up an old bitcast hack in LowerVAARG that dealt with unaligned double loads. Load SDNodes can specify an alignment now. Still missing: Calling varargs functions with float arguments. llvm-svn: 179961	2013-04-20 22:49:16 +00:00
Nadav Rotem	8aca44a623	Fix PR15800. Do not try to vectorize vectors and structs. llvm-svn: 179960	2013-04-20 22:29:43 +00:00
Arnold Schwaighofer	3546ccf465	SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2% performance improvement on an ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the buildbots and will revert this if anything unexpected comes up. llvm-svn: 179957	2013-04-20 21:42:09 +00:00
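The if-conversion described in the commit above can be sketched outside LLVM. This is a minimal Python illustration, assuming a plain list stands in for memory; the function names are hypothetical and are not part of SimplifyCFG. It only shows that both shapes leave the same value in `a[i]`, not anything about their cost.

```python
def store_with_branch(a, i, x, y, cond):
    """Original shape: an unconditional store followed by a conditional store."""
    a = list(a)      # copy so the caller's list is left untouched
    a[i] = x
    if cond:
        a[i] = y
    return a


def store_with_select(a, i, x, y, cond):
    """If-converted shape: the branch becomes a select feeding a single store."""
    a = list(a)
    a[i] = x
    tmp = y if cond else x   # tmp = cond ? Y : X
    a[i] = tmp               # this store is now unconditional
    return a
```

Both functions return the same list for either value of `cond`; the trade-off discussed in the commit message is purely about branch misprediction versus the extra store.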
Tim Northover	d9d4211fe2	ARM: don't add FrameIndex offset for LDMIA (has no immediate) Previously, when spilling 64-bit paired registers, an LDMIA with both a FrameIndex and an offset was produced. This kind of instruction shouldn't exist, and the extra operand was being confused with the predicate, causing aborts later on. This removes the invalid 0-offset from the instruction being produced. llvm-svn: 179956	2013-04-20 19:31:00 +00:00
Nuno Lopes	36e827602a	recommit tests llvm-svn: 179955	2013-04-20 17:39:52 +00:00
Stephen Lin	8fccb8a772	Minor renaming of tests (for consistency with an in-development patch) llvm-svn: 179954	2013-04-20 16:21:26 +00:00
Benjamin Kramer	5bd25f3786	Don't litter .s files in test directory. llvm-svn: 179937	2013-04-20 10:43:40 +00:00
Nadav Rotem	83c7c41bc2	SLPVectorizer: Improve the cost model for loop invariant broadcast values. llvm-svn: 179930	2013-04-20 06:13:47 +00:00
Stephen Lin	b8bd232a3d	Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter). llvm-svn: 179925	2013-04-20 05:14:40 +00:00
Stephen Lin	ffc445492c	Allow tail call opportunity detection through nested and/or multiple iterations of extractelement/insertelement indirection llvm-svn: 179924	2013-04-20 04:27:51 +00:00
Akira Hatanaka	1ebb2a1c56	[mips] Instruction selection patterns for DSP-ASE vector shifts. llvm-svn: 179906	2013-04-19 23:21:32 +00:00
Benjamin Kramer	630e6e1422	MergeFunc: Make pointer and integer types generate the same hash. The logic that actually compares the types considers pointers and integers the same if they are of the same size. This created a strange mismatch between hash and reality and made the test case for this fail on some platforms (yay, test cases). llvm-svn: 179905	2013-04-19 23:06:44 +00:00
Bill Wendling	24e8a0d5f0	Make variable match any name. llvm-svn: 179903	2013-04-19 22:30:43 +00:00
Hal Finkel	e632239d7b	Fix PPC optimizeCompareInstr swapped-sub argument handling When matching a compare with a subtract where the arguments of the compare are swapped w.r.t. the arguments of the subtract, we need to negate the predicates (or CR bit indices) of the users. This, however, is not the same as inverting the predicate (negating LT -> GT, but inverting LT -> GE, for example). The ARM backend seems to do this correctly, but when I adapted the code for the PPC backend, I introduced an error in this logic. Comparison optimization is now enabled again by default. llvm-svn: 179899	2013-04-19 22:08:38 +00:00
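The distinction the commit above draws between swapping a predicate for swapped operands (which it calls negating) and inverting it can be checked with a small table. This is a sketch in Python with hypothetical names; it models ordinary integer comparisons rather than PPC condition-register bits.

```python
import operator

PRED = {
    "LT": operator.lt, "LE": operator.le,
    "GT": operator.gt, "GE": operator.ge,
    "EQ": operator.eq, "NE": operator.ne,
}

# Predicate to use when the two compare operands are swapped (LT -> GT).
SWAPPED = {"LT": "GT", "GT": "LT", "LE": "GE", "GE": "LE", "EQ": "EQ", "NE": "NE"}

# Predicate that is the logical opposite on the same operands (LT -> GE).
INVERTED = {"LT": "GE", "GE": "LT", "GT": "LE", "LE": "GT", "EQ": "NE", "NE": "EQ"}


def holds(pred, a, b):
    """Evaluate the named predicate on (a, b)."""
    return PRED[pred](a, b)
```

For any `a` and `b`, `holds(p, a, b)` equals `holds(SWAPPED[p], b, a)`, whereas `holds(INVERTED[p], a, b)` is its logical opposite; confusing the two tables is the kind of error the commit describes.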
Bill Wendling	81c8cf5ef9	Try explicitly setting the target triple to see if this gets it to pass on ARM. llvm-svn: 179890	2013-04-19 21:24:51 +00:00
Anton Korobeynikov	9c0df1695d	Do not mangle in MS-way the globals with magic \001 in the name. Based on the patch by David Nadlinger! llvm-svn: 179889	2013-04-19 21:20:56 +00:00
Bill Wendling	b1f0f71735	Make test slightly more readable. llvm-svn: 179888	2013-04-19 21:14:59 +00:00
Bill Wendling	ae230c11cc	Add a testcase to make sure we generate the proper compact unwind section for a function that cannot produce a compact unwind encoding. llvm-svn: 179887	2013-04-19 21:07:11 +00:00
Chad Rosier	11ebe05643	Attempt to pacify this test for the buildbots. llvm-svn: 179874	2013-04-19 19:27:33 +00:00
Akira Hatanaka	c68fd9f4f1	[mips] Fix InstAlias of XOR and OR macros. Set EmitAlias flag and change operand type to uimm16. Patch by Vladimir Medic. llvm-svn: 179872	2013-04-19 18:47:40 +00:00
Bill Wendling	b670649067	Add test to make sure that an int-to-ptr can be merged correctly. llvm-svn: 179869	2013-04-19 18:16:06 +00:00
Benjamin Kramer	ec1bb4fdaf	ConstantFolding: ComputeMaskedBits wants the scalar size for vectors. Fixes PR15791. llvm-svn: 179859	2013-04-19 16:56:24 +00:00
Tim Northover	27ff504653	ARM: Permit "sp" in ARM variant of STREXD instructions Patch from Mihail Popa llvm-svn: 179854	2013-04-19 15:44:32 +00:00
Rafael Espindola	c6555c1956	Only run the tests in test/Object/ARM if we have ARM support. llvm-svn: 179850	2013-04-19 12:47:53 +00:00
Benjamin Kramer	0baf8f4279	Attributes: Don't print trailing whitespace on the function attribute comment. llvm-svn: 179849	2013-04-19 11:43:21 +00:00
Rafael Espindola	feef8c2469	Don't read one command past the end. Thanks to Evgeniy Stepanov for reporting this. It might be a good idea to add a command iterator abstraction to MachO.h, but this fixes the bug for now. llvm-svn: 179848	2013-04-19 11:36:47 +00:00
Tim Northover	a155ab2dd2	ARM: permit "sp" in ARM variants of MOVW/MOVT instructions llvm-svn: 179847	2013-04-19 09:58:09 +00:00
Jakub Staszak	9b59d14fc4	Revert 179826. Tests were worthless. llvm-svn: 179845	2013-04-19 09:32:30 +00:00
Eric Christopher	0e89ade8ff	Revert "PR14606: debug info imported_module support" This reverts commit r179836 as it seems to have caused test failures. llvm-svn: 179840	2013-04-19 07:47:16 +00:00
David Blaikie	88564f3cf7	PR14606: debug info imported_module support Adding another CU-wide list, in this case of imported_modules (since they should be relatively rare, it seemed better to add a list where each element had a "context" value, rather than add a (usually empty) list to every scope). This takes care of DW_TAG_imported_module, but to fully address PR14606 we'll need to expand this to cover DW_TAG_imported_declaration too. llvm-svn: 179836	2013-04-19 06:57:04 +00:00
Tom Stellard	9d10c4ce86	R600: Add pattern for the BFI_INT instruction llvm-svn: 179830	2013-04-19 02:11:06 +00:00
Tom Stellard	5a6b0d828b	R600: Reorganize lit tests and document how they should be organized llvm-svn: 179828	2013-04-19 02:10:53 +00:00
Jakub Staszak	2c1daf75b9	Don't run expensive -O2 and -O3 in tests. llvm-svn: 179825	2013-04-19 01:10:45 +00:00
Chad Rosier	f8fb2bc2f3	[ms-inline asm] Apply the condition code mnemonic aliases to both the Intel and AT&T dialect. Test case for r179804 as well. rdar://13674398 and PR13340. llvm-svn: 179813	2013-04-18 23:16:12 +00:00
Hal Finkel	b12da6be75	Disable PPC comparison optimization by default This seems to cause a stage-2 LLVM compile failure (by crashing TableGen); so I'm disabling this for now. llvm-svn: 179807	2013-04-18 22:54:25 +00:00
Hal Finkel	82656cb200	Implement optimizeCompareInstr for PPC Many PPC instructions have a so-called 'record form' which stores to a specific condition register the result of comparing the result of the instruction with zero (always as a signed comparison). For integer operations on PPC64, this is always a 64-bit comparison. This implementation is derived from the implementation in the ARM backend; there are some differences because PPC condition registers are allocatable virtual registers (although the record forms always use a specific one), and we look for a matching subtraction instruction after the compare (but before the first use) in addition to before it. llvm-svn: 179802	2013-04-18 22:15:08 +00:00
Benjamin Kramer	c557828805	X86: Add an SSE2 lowering for 64 bit compares when pcmpgtq (SSE4.2) isn't available. This pattern started popping up in vectorized min/max reductions. llvm-svn: 179797	2013-04-18 21:37:45 +00:00
Anat Shemer	5570318f43	In the function InstCombiner::visitExtractElementInst() removed the limitation that extract is promoted over a cast only if the cast has only one use. llvm-svn: 179786	2013-04-18 19:56:44 +00:00
Anat Shemer	0c95efad7e	Added a function scalarizePHI() that scalarizes a vector phi instruction if it has only 2 uses: one to promote the vector phi in a loop and the other use is an extract operation of one element at a constant location. llvm-svn: 179783	2013-04-18 19:35:39 +00:00
Rafael Espindola	56f976f6bd	At Jim Grosbach's request detemplate Object/MachO.h. We are still able to handle mixed endian objects by swapping one struct at a time. llvm-svn: 179778	2013-04-18 18:08:55 +00:00
Derek Schuff	a403d243d1	Allow misaligned stores in x86 fast-isel. In X86FastISel::X86SelectStore(), improperly aligned stores are rejected and handled by the DAG-based ISel. However, X86FastISel::X86SelectLoad() makes no such requirement. There doesn't appear to be an x86 architectural correctness issue with allowing potentially unaligned store instructions. This patch removes this restriction. Patch by Jim Stichnoth. llvm-svn: 179774	2013-04-18 17:41:08 +00:00
Arnold Schwaighofer	4cd6aa110c	LoopVectorizer: Recognize min/max reductions A min/max operation is represented by a select(cmp(lt/le/gt/ge, X, Y), X, Y) sequence in LLVM. If we see such a sequence we can treat it just as any other commutative binary instruction and reduce it. This appears to help bzip2 by about 1.5% on an imac12,2. radar://12960601 llvm-svn: 179773	2013-04-18 17:22:34 +00:00
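The select(cmp) shape of a min reduction mentioned above, and why it can be split across lanes like any other commutative reduction, can be illustrated as follows. This is a sketch in Python, assuming a two-lane "vector" and an even-length input; the function names are hypothetical and do not correspond to LoopVectorizer code.

```python
def select_min(x, y):
    """min expressed as select(cmp(lt, X, Y), X, Y)."""
    return x if x < y else y


def scalar_min_reduction(values):
    """Reduce one element at a time, as the scalar loop would."""
    acc = values[0]
    for v in values[1:]:
        acc = select_min(acc, v)
    return acc


def two_lane_min_reduction(values):
    """Keep one partial minimum per lane, then combine the lanes at the end.

    Assumes len(values) is even and at least 2.
    """
    lanes = [values[0], values[1]]
    for i in range(2, len(values), 2):
        lanes = [select_min(lanes[0], values[i]),
                 select_min(lanes[1], values[i + 1])]
    return select_min(lanes[0], lanes[1])
```

Because min is associative and commutative, the order in which elements are folded does not change the result, which is what makes the lane-wise form legal.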
Benjamin Kramer	8df2cfb858	LoopVectorize: Use a set to avoid longer cycles in the reduction chain too. Fixes PR15748. llvm-svn: 179757	2013-04-18 14:29:13 +00:00
Hao Liu	a2ff69863e	Fix for PR14824, An ARM Load/Store Optimization bug llvm-svn: 179751	2013-04-18 09:11:08 +00:00
David Majnemer	81af06e003	Revert "Combine bit test + conditional or into simple math" It is causing stage2 builds to fail, let's get them running again. llvm-svn: 179750	2013-04-18 08:42:33 +00:00
David Majnemer	bdf0caf6b1	Combine bit test + conditional or into simple math Simplify: (select (icmp eq (and X, C1), 0), Y, (or Y, C2)) Into: (or (shl (and X, C1), C3), Y) Where: C3 = Log(C2) - Log(C1) If: C1 and C2 are both powers of two llvm-svn: 179748	2013-04-18 07:30:07 +00:00
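The fold above can be checked exhaustively on small values. This is a Python sketch with hypothetical function names; it only covers the case where C2 is at least C1, so that C3 is a non-negative left-shift amount, which is an assumption of this sketch rather than something the commit message states.

```python
def select_form(x, y, c1, c2):
    """(select (icmp eq (and X, C1), 0), Y, (or Y, C2))"""
    return y if (x & c1) == 0 else (y | c2)


def math_form(x, y, c1, c2):
    """(or (shl (and X, C1), C3), Y) with C3 = log2(C2) - log2(C1)."""
    c3 = (c2.bit_length() - 1) - (c1.bit_length() - 1)
    if c3 < 0:
        raise ValueError("this sketch only handles C2 >= C1")
    return ((x & c1) << c3) | y
```

When the C1 bit of X is set, `(x & c1) << c3` is exactly C2, so the `or` sets the same bit the select would have; when it is clear, the shifted value is zero and Y passes through unchanged.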
Michael Gottesman	323964ca9e	[objc-arc] Do not mismatch up retains inside a for loop with releases outside said for loop in the presence of differing provenance caused by escaping blocks. This occurs due to an alloca representing a separate ownership from the original pointer. Thus consider the following pseudo-IR: objc_retain(%a) for (...) { objc_retain(%a) %block <- %a F(%block) objc_release(%block) } objc_release(%a) From the perspective of the optimizer, the %block is a separate provenance from the original %a. Thus the optimizer pairs up the inner retain for %a and the outer release from %a, resulting in segfaults. This is fixed by noting that the signature of a mismatch of retain/releases inside the for loop is a Use/CanRelease top down with a None bottom up (since bottom up the Retain-CanRelease-Use-Release sequence is completed by the inner objc_retain, but top down due to the differing provenance from the objc_release said sequence is not completed). In said case in CheckForCFGHazards, we now clear the state of %a implying that no pairing will occur. Additionally a test case is included. rdar://12969722 llvm-svn: 179747	2013-04-18 05:39:45 +00:00
Michael Gottesman	a15ab25238	Streamline arc-annotation test (removing some cases which do not add any extra coverage) and set it up to use FileCheck variables to make the test more robust. llvm-svn: 179745	2013-04-18 04:34:06 +00:00
Akira Hatanaka	59bfaf774b	[mips] DSP-ASE move from HI/LO register instructions. llvm-svn: 179739	2013-04-18 00:52:44 +00:00
Peter Collingbourne	2f495b93ee	Add support for subsections to the ELF assembler. Fixes PR8717. Differential Revision: http://llvm-reviews.chandlerc.com/D598 llvm-svn: 179725	2013-04-17 21:18:16 +00:00
Chad Rosier	3124627aa8	[ms-inline asm] Add support for the minus unary operator. Previously, we were unable to handle cases such as __asm mov eax, 8*-8. This patch also attempts to simplify the state machine. Further, the error reporting has been improved. Test cases included, but more will be added to the clang side shortly. rdar://13668445 llvm-svn: 179719	2013-04-17 21:01:45 +00:00
Eli Bendersky	24a36eb331	This patch teaches x86 fast-isel to generate the native div/idiv instructions for the sdiv/srem/udiv/urem bitcode instructions. This is done for the i8, i16, and i32 types, as well as i64 for the x86_64 target. Patch by Jim Stichnoth llvm-svn: 179715	2013-04-17 20:10:13 +00:00
Arnold Schwaighofer	c0c7ff4ac0	X86 cost model: Exit before calling getSimpleVT on non-simple VTs getSimpleVT can only handle simple value types. radar://13676022 llvm-svn: 179714	2013-04-17 20:04:53 +00:00
Quentin Colombet	6f03f624df	Fix treatment of ARM unallocated hint instructions. The reference manual defines only 5 permitted values for the immediate field of the "hint" instruction: 1. nop (imm == 0) 2. yield (imm == 1) 3. wfe (imm == 2) 4. wfi (imm == 3) 5. sev (imm == 4) Therefore, restrict the permitted values for the "hint" instruction to 0 through 4. Patch by Mihail Popa <Mihail.Popa@arm.com> llvm-svn: 179707	2013-04-17 18:46:12 +00:00
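The five permitted immediates listed in the commit above amount to a small lookup table. This is a Python sketch with a hypothetical `decode_hint` helper; it mirrors only the values quoted in the commit message and is not the ARM backend's decoder.

```python
# Immediate values the commit message lists as permitted for "hint".
HINT_MNEMONICS = {0: "nop", 1: "yield", 2: "wfe", 3: "wfi", 4: "sev"}


def decode_hint(imm):
    """Return the mnemonic for a permitted immediate, or None if unallocated."""
    return HINT_MNEMONICS.get(imm)
```

Any immediate outside 0 through 4 yields `None` here, standing in for the rejection the commit introduces.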
Vincent Lejeune	2d5c341cee	R600: Make Export Instruction not duplicable llvm-svn: 179686	2013-04-17 15:17:39 +00:00
Eric Christopher	6d1fcb46af	This appears to be no longer necessary for the testsuite. llvm-svn: 179667	2013-04-17 06:37:30 +00:00
David Blaikie	a205ea3151	PR15149/r174304 improvement - print hex for unknown dwarf language codes & add a test case CR feedback from Rafael Espindola and Paul Robinson. llvm-svn: 179664	2013-04-17 03:41:36 +00:00
Peter Collingbourne	37ae72b508	Do not optimise fprintf() calls if its return value is used. Differential Revision: http://llvm-reviews.chandlerc.com/D620 llvm-svn: 179661	2013-04-17 02:01:10 +00:00
Jack Carter	b5cf5909ac	Mips assembler: Enable handling of nested expressions This patch allows the Mips assembler to parse and emit nested expressions as instruction operands. It also extends the expansion of memory instructions when an offset is given as an expression. Contributor: Vladimir Medic llvm-svn: 179657	2013-04-17 00:18:04 +00:00
Richard Osborne	ba79dfc390	[XCore] Extend test to check positive offsets are folded into addresses. llvm-svn: 179621	2013-04-16 20:05:52 +00:00
Richard Osborne	f29919d3f8	[XCore] Give test more generic name. I intend to extend the test with more offset folding checks llvm-svn: 179620	2013-04-16 19:56:55 +00:00
Richard Osborne	9a61da5ed4	[XCore] Convert a couple of tests to FileCheck. llvm-svn: 179619	2013-04-16 19:41:19 +00:00
Logan Chien	d8bb4b7e06	Implement ARM unwind opcode assembler. llvm-svn: 179591	2013-04-16 12:02:21 +00:00
Alexey Samsonov	209095cd9f	llvm-objdump: Don't print contents of BSS sections: it makes no sense and crashes llvm-objdump on relocated objects with large bss llvm-svn: 179589	2013-04-16 10:53:11 +00:00
Hans Wennborg	c9e1d99279	simplifycfg: Fix integer overflow converting switch into icmp. If a switch instruction has a case for every possible value of its type, with the same successor, SimplifyCFG would replace it with an icmp ult, but the computation of the bound overflows in that case, which inverts the test. Patch by Jed Davis! llvm-svn: 179587	2013-04-16 08:35:36 +00:00
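The overflow described above can be reproduced with fixed-width arithmetic. This is a Python sketch that models an N-bit integer by masking; the helper names are hypothetical and the range check is a simplified stand-in for what SimplifyCFG emits.

```python
def bound_nbits(lo, hi, bits):
    """Number of cases in [lo, hi], computed in `bits`-wide wrapping arithmetic."""
    return (hi - lo + 1) & ((1 << bits) - 1)


def in_range_via_ult(x, lo, hi, bits):
    """Range test of the form (x - lo) <u bound, all in `bits`-wide arithmetic."""
    mask = (1 << bits) - 1
    return ((x - lo) & mask) < bound_nbits(lo, hi, bits)
```

For an 8-bit switch covering every value, `bound_nbits(0, 255, 8)` wraps to 0, so the unsigned comparison is false for every input even though every input is in range, which is the inverted test the commit fixes.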
Jakob Stoklund Olesen	73d1739bc4	Add 64-bit multiply and divide instructions for SPARC v9. llvm-svn: 179582	2013-04-16 02:57:02 +00:00
Jim Grosbach	9b81a4f0f1	ARM: Add VACLT and VACLE assembly aliases. These are aliases for VACGT and VACGE, respectively, with the source operands reversed. rdar://13638090 llvm-svn: 179575	2013-04-15 22:42:50 +00:00
Bill Wendling	3789171972	We are not able to bitcast a pointer to an integral value. Two return types are not equivalent if one is a pointer and the other is an integral. This is because we cannot bitcast a pointer to an integral value. PR15185 llvm-svn: 179569	2013-04-15 22:33:50 +00:00
Jack Carter	2ad73da02b	Mips assembler: Explicit floating point condition register recognition. This patch allows the assembler to recognize $fcc0 as a valid register for conditional move instructions. Corresponding test cases have been added. Contributor: Vladimir Medic llvm-svn: 179567	2013-04-15 22:21:55 +00:00
Nadav Rotem	b9116e6966	SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops. llvm-svn: 179562	2013-04-15 22:00:26 +00:00
Tom Stellard	cb97e3acfa	R600/SI: Emit config values in register value pairs. Instead of emitting config values in a predefined order, the code emitter will now emit a 32-bit register index followed by the 32-bit config value. llvm-svn: 179546	2013-04-15 17:51:35 +00:00
Tom Stellard	3a7beafb32	R600/SI: Emit configuration value in the .AMDGPU.config ELF section llvm-svn: 179545	2013-04-15 17:51:30 +00:00
Tom Stellard	9991659fab	R600: Emit ELF formatted code rather than raw ISA. llvm-svn: 179544	2013-04-15 17:51:21 +00:00
Tim Northover	943e9293b3	Avoid outputting temporary test file into source tree. llvm-svn: 179532	2013-04-15 15:49:13 +00:00
Eric Christopher	13637e900e	Revert "Recommit r179497 after fixing uninitialized variable." until I can fix the testcases here: http://lab.llvm.org:8011/builders/clang-native-arm-cortex-a9/builds/6952 This reverts commit r179512 due to testcases specifying triples that they didn't actually mean and causing failures on other platforms. llvm-svn: 179513	2013-04-15 07:31:37 +00:00
Eric Christopher	fc2beaa136	Recommit r179497 after fixing uninitialized variable. llvm-svn: 179512	2013-04-15 07:07:21 +00:00
Nadav Rotem	5d393c416f	SLPVectorizer: Add support for vectorizing trees that start at compare instructions. llvm-svn: 179504	2013-04-15 04:25:27 +00:00
Hal Finkel	6736988ae2	Fix PPC64 CR spill location for callee-saved registers This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition registers, the spill location is specified relative to the stack pointer (SP + 8). However, this is not relative to the SP after the new stack frame is established, but instead relative to the caller's stack pointer (it is stored into the linkage area of the parent's stack frame). So, like with the link register, we don't directly spill the CRs with other callee-saved registers, but just mark them to be spilled during prologue generation. In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32). llvm-svn: 179500	2013-04-15 02:07:05 +00:00
Eric Christopher	1f140317e3	Revert "Remove some unused triple and data layout." This reverts commit r179497 and the accompanying commit as it broke random platforms that aren't osx. llvm-svn: 179499	2013-04-14 23:35:36 +00:00
Eric Christopher	4eebd14ad0	Remove some unused triple and data layout. llvm-svn: 179498	2013-04-14 23:32:44 +00:00
Nico Rieck	334c7bc7eb	Use object file specific section type for initial text section llvm-svn: 179494	2013-04-14 21:18:36 +00:00
David Majnemer	1fae195557	Reorders two transforms that collide with each other One performs: (X == 13 | X == 14) -> X-13 <u 2 The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1 The problem is that there are certain values of C1 and C2 that trigger both transforms but the first one blocks out the second, this generates suboptimal code. Reordering the transforms should be better in every case and allows us to do interesting stuff like turn: %shr = lshr i32 %X, 4 %and = and i32 %shr, 15 %add = add i32 %and, -14 %tobool = icmp ne i32 %add, 0 into: %and = and i32 %X, 240 %tobool = icmp ne i32 %and, 224 llvm-svn: 179493	2013-04-14 21:15:43 +00:00
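Both transforms and the before/after IR from the commit above can be checked numerically. This is a Python sketch modelling i32 with a mask; the function names are hypothetical. The mask form is only exercised here for constants where C2 is C1 with exactly one extra bit set, which is an assumption of this sketch about when that rewrite is valid.

```python
MASK32 = 0xFFFFFFFF


def range_form(x, lo, count):
    """(X - lo) <u count, using 32-bit wrapping subtraction."""
    return ((x - lo) & MASK32) < count


def mask_form(a, c1, c2):
    """(A & ~(C1 ^ C2)) == C1; assumes C2 is C1 plus one extra set bit."""
    return (a & ~(c1 ^ c2) & MASK32) == c1


def before(x):
    """The original IR sequence: lshr 4, and 15, add -14, icmp ne 0."""
    shr = x >> 4
    and_ = shr & 15
    add = (and_ - 14) & MASK32
    return add != 0


def after(x):
    """The simplified IR sequence: and 240, icmp ne 224."""
    return (x & 240) != 224
```

The pair 14 and 15 is an example that fits both shapes: `range_form(x, 14, 2)` and `mask_form(x, 14, 15)` agree for every input.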
Nadav Rotem	6ebddae118	Make the command line triple match the module triple. llvm-svn: 179492	2013-04-14 20:13:05 +00:00
Jakob Stoklund Olesen	eed1072ff8	Use i32 for all SPARC shift amounts, even in 64-bit mode. Test case by llvm-stress. llvm-svn: 179477	2013-04-14 05:48:50 +00:00
Nadav Rotem	029208ceeb	Remove unused function attributes. llvm-svn: 179476	2013-04-14 05:47:04 +00:00
Nadav Rotem	54b413d157	SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree. llvm-svn: 179475	2013-04-14 05:15:53 +00:00
Jakob Stoklund Olesen	c3c28f8599	Add support for the abs64 SPARC v9 code model. For when 16 TB just isn't enough. llvm-svn: 179474	2013-04-14 05:10:36 +00:00
Jakob Stoklund Olesen	c8fc76b078	Add support for the SPARC v9 abs44 code model. This is the default model for non-PIC 64-bit code. It supports text+data+bss linked anywhere in the low 16 TB of the address space. llvm-svn: 179473	2013-04-14 04:57:51 +00:00
Jakob Stoklund Olesen	e0fc832b77	Also put target flags on SPARC constant pool references. Constant pool entries are accessed exactly the same way as global variables. llvm-svn: 179471	2013-04-14 04:35:16 +00:00
Nadav Rotem	0b9cf8567b	SLPVectorizer: add initial support for reduction variable vectorization. llvm-svn: 179470	2013-04-14 03:22:20 +00:00
Jakob Stoklund Olesen	dc1ed57858	Fix patterns for 64-bit pointers. This fixes the pic32 code model for SPARC v9. llvm-svn: 179469	2013-04-14 01:53:23 +00:00
Jakob Stoklund Olesen	15b3e90081	Define SPARC code models. Currently, only abs32 and pic32 are implemented. Add a test case for abs32 with 64-bit code. 64-bit PIC code is currently broken. llvm-svn: 179463	2013-04-13 19:02:23 +00:00
Benjamin Kramer	adc1727c39	GlobalDCE: Fix an oversight in my last commit that could lead to crashes. There is a Constant with non-constant operands: blockaddress. llvm-svn: 179460	2013-04-13 16:11:14 +00:00
Benjamin Kramer	89ca4bc6d4	Fix a scalability issue with complex ConstantExprs. This is basically the same fix in three different places. We use a set to avoid walking the whole tree of a big ConstantExpr multiple times. For example: (select cmp, (add big_expr 1), (add big_expr 2)) We don't want to visit big_expr twice here, it may consist of thousands of nodes. The testcase exercises this by creating an insanely large ConstantExpr out of a loop. It's questionable if the optimizer should ever create those, but this can be triggered with real C code. Fixes PR15714. llvm-svn: 179458	2013-04-13 12:53:18 +00:00
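The effect of the visited set mentioned above is easy to see on an expression graph with shared operands. This is a Python sketch with a hypothetical `Expr` class standing in for ConstantExpr nodes; it is not the LLVM implementation.

```python
class Expr:
    """A node whose operands may be shared with other nodes."""

    def __init__(self, *operands):
        self.operands = operands


def count_visits_naive(root):
    """Walk the graph as if it were a tree, revisiting shared operands."""
    return 1 + sum(count_visits_naive(op) for op in root.operands)


def count_visits_with_set(root):
    """Walk each distinct node once by remembering what was already seen."""
    visited = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in visited:
            continue
        visited.add(id(node))
        stack.extend(node.operands)
    return len(visited)


def build_chain(depth):
    """Build `depth` levels where each node uses the previous node twice."""
    node = Expr()
    for _ in range(depth):
        node = Expr(node, node)
    return node
```

With ten levels of sharing, the naive walk touches 2047 nodes while the set-based walk touches 11, and the gap doubles with every additional level.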
Hal Finkel	d85a04b3df	Spill and restore PPC CR registers using the FP when we have one For functions that need to spill CRs, and have dynamic stack allocations, the value of the SP during the restore is not what it was during the save, and so we need to use the FP in these cases (as for all of the other spills and restores, but the CR restore has a special code path because its reserved slot, like the link register, is specified directly relative to the adjusted SP). llvm-svn: 179457	2013-04-13 08:09:20 +00:00
Andrew Trick	3d957c0ead	Further generalize this scheduler test. The order of copies depends on queue order, which is not very stable. llvm-svn: 179456	2013-04-13 07:37:27 +00:00
Andrew Trick	e6f9fc0cdb	Fix a dyslexic regex. llvm-svn: 179455	2013-04-13 07:29:21 +00:00
Andrew Trick	88a1285b4f	Add a missing REQUIRES: asserts llvm-svn: 179453	2013-04-13 06:12:46 +00:00
Andrew Trick	e833e1cd6e	MI-Sched: schedule physreg copies. The register allocator expects minimal physreg live ranges. Schedule physreg copies accordingly. This is slightly tricky when they occur in the middle of the scheduling region. For now, this is handled by rescheduling the copy when its associated instruction is scheduled. Eventually we may instead bundle them, but only if we can preserve the bundles as parallel copies during regalloc. llvm-svn: 179449	2013-04-13 06:07:40 +00:00
Rafael Espindola	9b709259e1	Finish templating MachObjectFile over endianness. We are now able to handle big endian macho files in llvm-readobject. Thanks to David Fang for providing the object files. llvm-svn: 179440	2013-04-13 01:45:40 +00:00
Akira Hatanaka	2f08822f9d	[mips] Reapply r179420 and r179421. llvm-svn: 179434	2013-04-13 00:55:41 +00:00
Akira Hatanaka	8ed2892c1c	Revert r179420 and r179421. llvm-svn: 179422	2013-04-12 22:40:07 +00:00
Akira Hatanaka	931ad87f6a	[mips] Instruction selection patterns for carry-setting and using add instructions. llvm-svn: 179421	2013-04-12 22:24:52 +00:00
Akira Hatanaka	8f41dd923e	[mips] v4i8 and v2i16 add, sub and mul instruction selection patterns. llvm-svn: 179420	2013-04-12 22:14:24 +00:00
Nadav Rotem	4e4d45e507	Revert r179409 because it caused some warnings and some of the build bots fail. llvm-svn: 179418	2013-04-12 22:02:26 +00:00
Benjamin Kramer	e89c705030	InstCombine: Check the operand types before merging fcmp ord & fcmp ord. Fixes PR15737. llvm-svn: 179417	2013-04-12 21:56:23 +00:00
Nadav Rotem	8543ba3e52	SLPVectorizer: add support for vectorization of diamond shaped trees. We now perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from. llvm-svn: 179414	2013-04-12 21:16:54 +00:00
Nadav Rotem	87a0af6e1b	CostModel: increase the default cost of supported floating point operations from one to two. Fixed a few tests that changed because now the cost of one insert + a vector operation on two doubles is lower than two scalar operations on doubles. llvm-svn: 179413	2013-04-12 21:15:03 +00:00
Nadav Rotem	e4b8aa001c	Add support for additional vector instructions in the interpreter. patch by Veselov, Yuri <Yuri.Veselov@intel.com>. llvm-svn: 179409	2013-04-12 20:45:20 +00:00
Quentin Colombet	c313220b18	ARM: Correct printing of pre-indexed operands. According to the ARM reference manual, constant offsets are mandatory for pre-indexed addressing modes. The MC disassembler was not obeying this when the offset is 0. It was producing instructions like: str r0, [r1]!. Correct syntax is: str r0, [r1, #0]!. This change modifies the dumping of operands so that the offset is always printed, regardless of its value, when pre-indexed addressing mode is used. Patch by Mihail Popa <Mihail.Popa@arm.com> llvm-svn: 179398	2013-04-12 18:47:25 +00:00
David Majnemer	1a08accbb7	Simplify (A & ~B) in icmp if A is a power of 2 The transform will execute like so: (A & ~B) == 0 --> (A & B) != 0 (A & ~B) != 0 --> (A & B) == 0 llvm-svn: 179386	2013-04-12 17:25:07 +00:00
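The rewrite above, and its dependence on A being a power of two, can be verified by brute force. This is a Python sketch with hypothetical function names.

```python
def original(a, b):
    """(A & ~B) == 0"""
    return (a & ~b) == 0


def simplified(a, b):
    """(A & B) != 0"""
    return (a & b) != 0
```

When A has a single bit set, `A & ~B` is zero exactly when B has that bit set, which is the same condition as `A & B` being non-zero. With more than one bit set in A the two forms can disagree, for example A = 3 and B = 1.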
Arnold Schwaighofer	f9cea17f75	LoopVectorizer: integer division is not a reduction operation Don't classify idiv/udiv as a reduction operation. Integer division is lossy. For example : (1 / 2) * 4 != 4/2. Example: int a[] = { 2, 5, 2, 2} int x = 80; for() x /= a[i]; Scalar: x /= 2 // = 40 x /= 5 // = 8 x /= 2 // = 4 x /= 2 // = 2 Vectorized: <80, 1> / <2,5> //= <40,0> <40, 0> / <2,2> //= <20,0> 20*0 = 0 radar://13640654 llvm-svn: 179381	2013-04-12 15:15:19 +00:00
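The worked example in the commit above can be replayed directly. This is a Python sketch, assuming a two-lane vector whose second lane starts at the identity value 1 and whose lanes are combined by multiplication at the end, as in the commit's own arithmetic; the function names are hypothetical.

```python
def scalar_div_reduction(x, a):
    """x /= a[i] one element at a time, with integer division."""
    for v in a:
        x //= v
    return x


def two_lane_div_reduction(x, a):
    """Naively vectorized form: per-lane division, lanes multiplied at the end.

    Assumes len(a) is even.
    """
    lanes = [x, 1]
    for i in range(0, len(a), 2):
        lanes = [lanes[0] // a[i], lanes[1] // a[i + 1]]
    return lanes[0] * lanes[1]
```

With `x = 80` and `a = [2, 5, 2, 2]` the scalar loop yields 2 while the two-lane form yields 0, because the second lane truncates 1/5 to 0 and that loss cannot be undone when the lanes are combined.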
Tim Northover	caf3e95e97	AArch64: use full triple for ELF tests These tests rely specifically on the names of ELF relocations, let alone any other detail. There's no way they'd work if LLVM was emitting something else by default. llvm-svn: 179376	2013-04-12 12:54:58 +00:00
Tim Northover	c13d2675e0	AArch64: remove over-zealous use of CHECK-NEXT It turns out some platforms (e.g. Windows) lay out their llvm-mc slightly differently with extra newlines; there was no real reason for the test lines to be consecutive, so this relaxes the FileCheck. llvm-svn: 179375	2013-04-12 12:54:49 +00:00
Nico Rieck	d6df0547fe	Teach llvm-readobj to print ELF program headers llvm-svn: 179363	2013-04-12 04:07:39 +00:00
Nico Rieck	e85c663f19	Remove obsolete object file dumpers llvm-svn: 179362	2013-04-12 04:07:13 +00:00
Nico Rieck	ba848e3bca	Replace coff-/elf-dump with llvm-readobj llvm-svn: 179361	2013-04-12 04:06:46 +00:00
Nico Rieck	e351732942	Add extensive relocation tests for llvm-readobj This test ensures that relocation type names returned by libObject match the raw relocation type value. llvm-svn: 179360	2013-04-12 04:02:23 +00:00
Nadav Rotem	25a23bc0ef	Fix the test on linux by setting the triple and the align format llvm-svn: 179354	2013-04-12 01:07:16 +00:00
Nadav Rotem	c3b0f50ac2	Add a flag to align all basic blocks in the function. When debugging performance regressions we often ask ourselves if the regression that we see is due to poor isel/sched/ra or due to some micro-architectural problem. When comparing two code sequences one good way to rule out front-end bottlenecks (and other issues) is to force code alignment. This pass adds a flag that forces the alignment of all of the basic blocks in the program. llvm-svn: 179353	2013-04-12 00:48:32 +00:00
Rafael Espindola	ecf1320579	Add 179294 back, but don't use bit fields so that it works on big endian hosts. Original message: Print more information about relocations. With this patch llvm-readobj now prints if a relocation is pcrel, its length, if it is extern and if it is scattered. It also refactors the code a bit to use bit fields instead of shifts and masks all over the place. llvm-svn: 179345	2013-04-12 00:17:33 +00:00
Manman Ren	06a9d50a35	Aliasing rules for struct-path aware TBAA. Added PathAliases to check if two struct-path tags can alias. Added command line option -struct-path-tbaa. llvm-svn: 179337	2013-04-11 23:24:18 +00:00
Preston Gurd	6bda0db299	Use FileCheck instead of grep. llvm-svn: 179322	2013-04-11 21:39:01 +00:00
David Majnemer	b81cd63c4b	Optimize icmp involving addition better Allows LLVM to optimize sequences like the following: %add = add nsw i32 %x, 1 %cmp = icmp sgt i32 %add, %y into: %cmp = icmp sge i32 %x, %y as well as: %add1 = add nsw i32 %x, 20 %add2 = add nsw i32 %y, 57 %cmp = icmp sge i32 %add1, %add2 into: %add = add nsw i32 %y, 37 %cmp = icmp sle i32 %add, %x llvm-svn: 179316	2013-04-11 20:05:46 +00:00
Jack Carter	a16fa808d3	Mips specific inline asm memory operand modifier test case These changes are based on commit responses for r179135. llvm-svn: 179315	2013-04-11 19:39:19 +00:00
Rafael Espindola	e2742a038c	Revert my last two commits while I debug what is wrong in a big endian host. llvm-svn: 179303	2013-04-11 17:46:10 +00:00
Rafael Espindola	708a44d464	Print more information about relocations. With this patch llvm-readobj now prints if a relocation is pcrel, its length, if it is extern and if it is scattered. It also refactors the code a bit to use bit fields instead of shifts and masks all over the place. llvm-svn: 179294	2013-04-11 16:31:37 +00:00
Benjamin Kramer	a95f87494a	Fix for wrong instcombine on vector insert/extract When trying to collapse sequences of insertelement/extractelement instructions into single shuffle instructions, there is one specific case where the Instruction Combiner wrongly updates the resulting Mask of shuffle indexes. The problem is in function CollectShuffleElements. If we have a sequence of insert/extract element instructions like the one below: %tmp1 = extractelement <4 x float> %LHS, i32 0 %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1 %tmp3 = extractelement <4 x float> %RHS, i32 2 %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3 Where: . %RHS will have a mask of [4,5,6,7] . %LHS will have a mask of [0,1,2,3] The Mask of shuffle indexes is wrongly computed to [4,1,6,7] instead of [4,0,6,7]. When analyzing %tmp2 in order to compute the Mask for the resulting shuffle instruction, the algorithm forgets to update the mask index at position 1 with the index associated to the element extracted from %LHS by instruction %tmp1. Patch by Andrea DiBiagio! llvm-svn: 179291	2013-04-11 15:10:09 +00:00
Eli Bendersky	0840082c02	Add a CHECK-NOT for a more faithful translation of the original grep | count 2. Thanks to Reid Kleckner for catching this. llvm-svn: 179289	2013-04-11 14:43:19 +00:00
Benjamin Kramer	b50682e156	Add missing colons to check lines. llvm-svn: 179277	2013-04-11 12:41:41 +00:00
Benjamin Kramer	3960c1cd56	FileCheckize a bunch of tests. llvm-svn: 179276	2013-04-11 12:32:23 +00:00
Michael Liao	55658d4222	Optimize vector select from all 0s or all 1s As packed comparisons in AVX/SSE produce all 0s or all 1s in each SIMD lane, vector select could be simplified to AND/OR or removed if one or both values being selected is all 0s or all 1s. llvm-svn: 179267	2013-04-11 05:15:54 +00:00
Michael Liao	95d9440348	Add CLAC/STAC instruction encoding/decoding support As these two instructions in the SMAP extension are privileged, special-purpose instructions, they are only expected to be used in inline assembly. llvm-svn: 179266	2013-04-11 04:52:28 +00:00
Michael Liao	f7bf87051a	Enhance bool simplification in X86 to handle more cases This patch is revised based on patch from Victor Umansky <victor.umansky@intel.com>. More cases are handled in X86's bool simplification, i.e. - SETCC_CARRY - value is truncated to i1 with AND As a by-product, PR5443 is also fixed. llvm-svn: 179265	2013-04-11 04:43:09 +00:00
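
The integer-division example from r179381 (f9cea17f75) listed above can be replayed outside LLVM. The following is a minimal Python sketch, not the vectorizer's code: the function names `scalar_reduce` and `vectorized_reduce` are hypothetical, the two-lane layout and the final multiply of the lanes follow the commit message, and the array length is assumed to be a multiple of the lane count. All values are positive, so Python's `//` matches C integer division here.

```python
def scalar_reduce(x, a):
    """Sequential loop from the commit message: x /= a[i] for each i."""
    for d in a:
        x //= d
    return x


def vectorized_reduce(x, a, width=2):
    """Treat the division as if it were a reduction (the rejected behavior).

    Lane 0 starts with x and the remaining lanes start with 1, as in the
    commit message. Assumes len(a) is a multiple of width.
    """
    lanes = [x] + [1] * (width - 1)
    for i in range(0, len(a), width):
        chunk = a[i:i + width]
        # Each lane divides by its own element of the chunk.
        lanes = [lane // d for lane, d in zip(lanes, chunk)]
    # Combine the lanes the way the commit message does (20 * 0).
    result = 1
    for lane in lanes:
        result *= lane
    return result


a = [2, 5, 2, 2]
print(scalar_reduce(80, a))      # 2
print(vectorized_reduce(80, a))  # 0
```

The two results differ because integer division truncates: the lane that starts at 1 becomes 0 after the first step, which zeroes the combined result.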

19345 Commits