llvm-project

Commit Graph

Author	SHA1	Message	Date
Preston Gurd	52dacca977	This patch addresses a problem with the Post RA scheduler generating an incorrect instruction sequence due to it not being aware that an inline assembly instruction may reference memory. This patch fixes the problem by causing the scheduler to always assume that any inline assembly code instruction could access memory. This is necessary because the internal representation of the inline instruction does not include any information about memory accesses. This should fix PR13504. llvm-svn: 166929	2012-10-29 15:01:23 +00:00
Jakob Stoklund Olesen	57143f7e78	Never attempt to join an early-clobber def with a regular kill. This fixes PR14194. llvm-svn: 166880	2012-10-27 17:41:27 +00:00
Michael Liao	8fe3a6bda4	Add test for ATOM ISA SSSE3 - Remove SSE4.1 feature in other ATOM-based test cases llvm-svn: 166699	2012-10-25 17:50:05 +00:00
Elena Demikhovsky	e0e69a33a0	The test avx-intel-ocl.ll failed. I can't reproduce on any of my machines. I added -mcpu flag, maybe it will fix the problem llvm-svn: 166669	2012-10-25 08:38:42 +00:00
Chad Rosier	dd5eada241	[ms-inline asm] Add back-end test case for r166632. Make sure we emit the correct .s output as well as get the correct encoding by the integrated assembler. llvm-svn: 166638	2012-10-24 23:10:28 +00:00
Elena Demikhovsky	d6afb03bc9	Special calling conventions for Intel OpenCL built-in library. llvm-svn: 166566	2012-10-24 14:46:16 +00:00
Michael Liao	5922979e49	Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x)) - If more than 1 element is defined and target supports the vectorized conversion, use the vectorized one instead to reduce the strength on conversion operation. llvm-svn: 166546	2012-10-24 04:14:18 +00:00
Michael Liao	c5af149e70	Add custom conversion from v2u32 to v2f32 in 32-bit mode - As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to v2f32 is added to improve the efficiency of the code generated. llvm-svn: 166545	2012-10-24 04:09:32 +00:00
Rafael Espindola	4e6e537314	Change x86_fastcallcc to require inreg markers. This allows it to know the difference between "int x" (which should go in registers) and "struct y {int x;}" (which should not). Clang will be updated in the next patches. llvm-svn: 166536	2012-10-24 01:58:48 +00:00
Michael Liao	2843625bb5	Fix PR14161 - Check index being extracted to be constant 0 before simplifying. Otherwise, retain the original sequence. llvm-svn: 166504	2012-10-23 21:40:15 +00:00
Michael Liao	1be96bb5ce	Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1 llvm-svn: 166486	2012-10-23 17:34:00 +00:00
Michael Liao	4b7ccfcaad	Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86 - If INSERT_VECTOR_ELT is supported (above SSE2, either by custom sequence of legal insn), transform BUILD_VECTOR into SHUFFLE + INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few (so far 1) elements being inserted. llvm-svn: 166288	2012-10-19 17:15:18 +00:00
Sebastian Pop	127777d686	Clear unknown mem ops when merging stack slots (pr14090) When merging stack slots, if StackColoring::remapInstructions gets a value back from GetUnderlyingObject that it does not know about or is not itself a stack slot, clear the memory operand in case it aliases the merged slot. This prevents the introduction of incorrect aliasing information. Author: Matthew Curtis <mcurtis@codeaurora.org> llvm-svn: 166216	2012-10-18 19:53:48 +00:00
Nadav Rotem	d5f8859672	In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other. rdar://12513091 llvm-svn: 166196	2012-10-18 18:06:48 +00:00
Michael Liao	3ac8201ea4	Revert part of r166049 back and enable test case in r166125. - Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid when '...' are all 'undef's. - r166125 relies on this transformation. llvm-svn: 166155	2012-10-17 23:45:54 +00:00
Michael Liao	f630f4cadc	Disable extract-concat test case temporarily llvm-svn: 166141	2012-10-17 23:08:19 +00:00
Michael Liao	c87d98dbc8	Revert r166049 - In general, it's unsafe for this transformation. llvm-svn: 166135	2012-10-17 22:41:15 +00:00
Michael Liao	7a442c8031	Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i - If the extracted vector has the same type as all vectors being concatenated together, it should be simplified directly into v_i, where i is the index of the element being extracted. llvm-svn: 166125	2012-10-17 20:48:33 +00:00
Michael Liao	6f7206132f	Fix setjmp on models with non-Small code model or non-Static relocation model - MBB address is only valid as an immediate value in Small & Static code/relocation models. On other models, LEA is needed to load IP address of the restore MBB. - A minor fix of MBB in MC lowering is added as well to enable target relocation flag being propagated into MC. llvm-svn: 166084	2012-10-17 02:22:27 +00:00
Jakob Stoklund Olesen	4df59a9ff8	Avoid rematerializing a redef immediately after the old def. PR14098 contains an example where we would rematerialize a MOV8ri immediately after the original instruction: %vreg7:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7 %vreg22:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7 Besides being pointless, it is also wrong since the original instruction only redefines part of the register, and the value read by the new instruction is wrong. The problem was the LiveRangeEdit::allUsesAvailableAt() didn't special-case OrigIdx == UseIdx and found the wrong SSA value. llvm-svn: 166068	2012-10-16 22:51:58 +00:00
Jakob Stoklund Olesen	2043329e67	Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit" A fix for PR14098, including the test case is in the next commit. llvm-svn: 166067	2012-10-16 22:51:55 +00:00
Michael Liao	19006206a1	Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x) llvm-svn: 166049	2012-10-16 19:38:35 +00:00
Rafael Espindola	b58be2c593	Switch back to the old coalescer for now to fix the 32 bit llvm+clang+compiler-rt bootstrap. llvm-svn: 166046	2012-10-16 19:34:06 +00:00
NAKAMURA Takumi	1705a999fa	Reapply r165661, Patch by Shuxin Yang <shuxin.llvm@gmail.com>. Original message: The attached is the fix to radar://11663049. The optimization can be outlined by following rules: (select (x != c), e, c) -> (select (x != c), e, x), (select (x == c), c, e) -> (select (x == c), x, e) where the <c> is an integer constant. The reason for this change is that, on x86, conditional-move-from-constant needs two instructions; however, conditional-move-from-register needs only one instruction. While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscures some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase. The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource". Original message since r165661: My previous change has a bug: I negated the condition code of a CMOV, and go ahead creating a new CMOV using the ORIGINAL condition code. llvm-svn: 166017	2012-10-16 06:28:34 +00:00
Rafael Espindola	7f4f79a5bc	Fix the cpu name and add -verify-machineinstrs. llvm-svn: 166003	2012-10-16 01:13:06 +00:00
Andrew Trick	d9d4be0d57	misched: Added handleMove support for updating all kill flags, not just for allocatable regs. This is a medium term workaround until we have a more robust solution in the form of a register liveness utility for postRA passes. llvm-svn: 166001	2012-10-16 00:22:51 +00:00
Michael Liao	97bf363a9e	Add __builtin_setjmp/_longjmp support in X86 backend - Besides being used in SjLj exception handling, __builtin_setjmp/__longjmp is also used as a light-weight replacement of setjmp/longjmp which are used to implement continuations, user-level threading, etc. The support added in this patch ONLY addresses this usage and is NOT intended to support SjLj exception handling as zero-cost DWARF exception handling is used by default in X86. llvm-svn: 165989	2012-10-15 22:39:43 +00:00
Andrew Trick	6e5f49d7b7	Check output of the misched unit tests llvm-svn: 165959	2012-10-15 20:33:14 +00:00
Rafael Espindola	aee00b5e72	Add a cpu to try to fix the atom builder. llvm-svn: 165956	2012-10-15 19:25:43 +00:00
Rafael Espindola	b41459a3c0	Add testcase for pr14088. llvm-svn: 165954	2012-10-15 19:00:10 +00:00
Andrew Trick	5a89e0ef07	misched tests: add a triple to speculatively fix windows builders. llvm-svn: 165952	2012-10-15 18:21:08 +00:00
Andrew Trick	90f711da9a	misched: ILP scheduler for experimental heuristics. llvm-svn: 165950	2012-10-15 18:02:27 +00:00
Benjamin Kramer	ecd15d7f6c	X86: Fix accidentally swapped operands. llvm-svn: 165871	2012-10-13 12:50:19 +00:00
Benjamin Kramer	d6b9362fc2	X86: Promote i8 cmov when both operands are coming from truncates of the same width. X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this level is often not a good idea because it's too late for many optimizations to kick in. This solution doesn't add any extensions (truncs are free) and tries to avoid introducing partial register stalls by filtering direct copyfromregs. I'm seeing a ~10% speedup on reading a random .png file with libpng15 via graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture. llvm-svn: 165868	2012-10-13 10:39:49 +00:00
Jakob Stoklund Olesen	5ac781c576	Fix buildbots: -misched=shuffle is only available in +Asserts builds. llvm-svn: 165846	2012-10-12 23:01:33 +00:00
Jakob Stoklund Olesen	1a87a29d08	Use a transposed algorithm for handleMove(). Completely update one interval at a time instead of collecting live range fragments to be updated. This avoids building data structures, except for a single SmallPtrSet of updated intervals. Also share code between handleMove() and handleMoveIntoBundle(). Add support for moving dead defs across other live values in the interval. The MI scheduler can do that. llvm-svn: 165824	2012-10-12 21:31:57 +00:00
Jakob Stoklund Olesen	1a3eb878f6	Fix coalescing with IMPLICIT_DEF values. PHIElimination inserts IMPLICIT_DEF instructions to guarantee that all PHI predecessors have a live-out value. These IMPLICIT_DEF values are not considered to be real interference when coalescing virtual registers: %vreg1 = IMPLICIT_DEF %vreg2 = MOV32r0 When joining %vreg1 and %vreg2, the IMPLICIT_DEF instruction and its value number should simply be erased since the %vreg2 value number now provides a live-out value for the PHI predecessor block. llvm-svn: 165813	2012-10-12 18:03:04 +00:00
Jakob Stoklund Olesen	d0d7860f40	Pass an explicit operand number to addLiveIns. Not all instructions define a virtual register in their first operand. Specifically, INLINEASM has a different format. <rdar://problem/12472811> llvm-svn: 165721	2012-10-11 16:46:07 +00:00
NAKAMURA Takumi	da0730c2d7	Revert r165661, "Patch by Shuxin Yang <shuxin.llvm@gmail.com>." It broke stage2 clang and test-suite/MultiSource/Benchmarks/mediabench/g721/g721encode. llvm-svn: 165692	2012-10-11 02:02:05 +00:00
Nadav Rotem	17418964f8	Patch by Shuxin Yang <shuxin.llvm@gmail.com>. Original message: The attached is the fix to radar://11663049. The optimization can be outlined by following rules: (select (x != c), e, c) -> (select (x != c), e, x), (select (x == c), c, e) -> (select (x == c), x, e) where the <c> is an integer constant. The reason for this change is that, on x86, conditional-move-from-constant needs two instructions; however, conditional-move-from-register needs only one instruction. While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscures some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase. The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource". llvm-svn: 165661	2012-10-10 21:31:55 +00:00
Michael Liao	e26b0313de	Specify CPU model to avoid breaking ATOM builds llvm-svn: 165638	2012-10-10 18:04:52 +00:00
Michael Liao	e999b865dd	Add support for FP_ROUND from v2f64 to v2f32 - Due to the current matching vector elements constraints in ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from v2f32) is scalarized. Add a customized v2f32 widening to convert it into a target-specific X86ISD::VFPROUND to work around this constraint. llvm-svn: 165631	2012-10-10 16:53:28 +00:00
Evan Cheng	3903e1be01	When expanding atomic load arith instructions, do not lose target flags. rdar://12453106 llvm-svn: 165568	2012-10-09 23:48:33 +00:00
Jakob Stoklund Olesen	9d1173a86e	Don't crash on extra evil irreducible control flow. When the CFG contains a loop with multiple entry blocks, the traces computed by MachineTraceMetrics don't always have the same nice properties. Loop back-edges are normally excluded from traces, but MachineLoopInfo doesn't recognize loops with multiple entry blocks, so those back-edges may be included. Avoid asserting when that happens by adding an isEarlierInSameTrace() function that accurately determines if a dominating block is part of the same trace AND is above the current block in the trace. llvm-svn: 165434	2012-10-08 22:06:44 +00:00
Benjamin Kramer	302178bf13	X86: fcmov doesn't handle all possible EFLAGS, fall back to a branch for the others. Otherwise it will try to use SSE patterns and fail horribly if sse is disabled. Fixes PR14035. llvm-svn: 165377	2012-10-07 15:34:27 +00:00
Evan Cheng	847ad4460a	Follow up to r165072. Try a different approach: only move the load when it's going to be folded into the call. rdar://12437604 llvm-svn: 165287	2012-10-05 01:48:22 +00:00
Nadav Rotem	b27777ff02	When merging consecutive stores, use vectors to store the constant zero. llvm-svn: 165267	2012-10-04 22:35:15 +00:00
Chad Rosier	271623f8ae	[ms-inline asm] Add support in the X86AsmPrinter for printing memory references in the Intel syntax. The MC layer supports emitting in the Intel syntax, but this would require the inline assembly MachineInstr to be lowered to an MCInst before emission. This is potential future work, but for now emitting directly from the MachineInstr suffices. llvm-svn: 165173	2012-10-03 22:06:44 +00:00
Nadav Rotem	ac92066b0c	Fix a cycle in the DAG. In this code we replace multiple loads with a single load and multiple stores with a single store. We create the wide loads and stores (and their chains) before we remove the scalar loads and stores and fix the DAG chain. We attempted to merge loads with a different chain. When that happened, the assumption that it is safe to RAUW broke and a cycle was introduced. llvm-svn: 165148	2012-10-03 19:30:31 +00:00
Nadav Rotem	7cbc12a41d	A DAGCombine optimization for merging consecutive stores to memory. The optimization is not profitable in many cases because modern processors perform multiple stores in parallel and merging stores prior to merging requires extra work. We handle two main cases: 1. Store of multiple consecutive constants: q->a = 3; q->b = 5; In this case we store a single legal wide integer. 2. Store of multiple consecutive loads: int a = p->a; int b = p->b; q->a = a; q->b = b; In this case we load/store either legal vector registers or legal wide integer registers. llvm-svn: 165125	2012-10-03 16:11:15 +00:00
Jakob Stoklund Olesen	0f6e8bb5e0	The early if conversion pass is ready to be used as an opt-in. Enable the pass by default for targets that request it, and change the -enable-early-ifcvt to the opposite -disable-early-ifcvt. There are still some x86 regressions when enabling early if-conversion because of the missing machine models. Disable the pass for x86 until machine models are added. llvm-svn: 165075	2012-10-03 00:51:32 +00:00
Evan Cheng	214156c43d	Fix a serious X86 instruction selection bug. In X86DAGToDAGISel::PreprocessISelDAG(), isel is moving load inside callseq_start / callseq_end so it can be folded into a call. This can create a cycle in the DAG when the call is glued to a copytoreg. We have been lucky this hasn't caused too many issues because the pre-ra scheduler has special handling of call sequences. However, it has caused a crash in a specific tailcall case. rdar://12393897 llvm-svn: 165072	2012-10-02 23:49:13 +00:00
Nick Lewycky	f8fc892b6c	Make sure to put our sret argument into %rax on x86-64. Fixes PR13563! llvm-svn: 165063	2012-10-02 22:45:06 +00:00
Duncan Sands	f97cb15aee	Fix PR13991: legalizing an overflowing multiplication operation is harder than the add/sub case since in the case of multiplication you also have to check that the operation in the larger type did not overflow. llvm-svn: 165017	2012-10-02 15:03:49 +00:00
NAKAMURA Takumi	1d24e56bad	test/CodeGen/X86/red-zone2.ll: Add -mtriple=x86_64-linux, and FileCheck-ize. llvm-svn: 164975	2012-10-01 22:48:07 +00:00
Michael Liao	d756f06063	Fix PR13899 - Update maximal stack alignment when stack arguments are prepared before a call. - Test cases are enhanced to show it's not a Win32 specific issue but a generic one. llvm-svn: 164946	2012-10-01 16:44:04 +00:00
Nadav Rotem	abbe665154	Revert r164910 because it causes failures to several phase2 builds. llvm-svn: 164911	2012-09-30 07:17:56 +00:00
Nadav Rotem	45715b25f7	A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases because modern processors can store multiple values in parallel, and preparing the consecutive store requires some work. We only handle these cases: 1. Consecutive stores where the values are consecutive loads. For example: int a = p->a; int b = p->b; q->a = a; q->b = b; 2. Consecutive stores where the values are constants. For example: q->a = 4; q->b = 5; llvm-svn: 164910	2012-09-30 06:24:14 +00:00
Duncan Sands	fb9d30dd64	Speculatively revert commit 164885 (nadav) in the hope of resurrecting a pile of buildbots. Original commit message: A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases because modern processors can store multiple values in parallel, and preparing the consecutive store requires some work. We only handle these cases: 1. Consecutive stores where the values are consecutive loads. For example: int a = p->a; int b = p->b; q->a = a; q->b = b; 2. Consecutive stores where the values are constants. For example: q->a = 4; q->b = 5; llvm-svn: 164890	2012-09-29 10:25:35 +00:00
Nadav Rotem	a2e7ea2f18	A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases because modern processors can store multiple values in parallel, and preparing the consecutive store requires some work. We only handle these cases: 1. Consecutive stores where the values are consecutive loads. For example: int a = p->a; int b = p->b; q->a = a; q->b = b; 2. Consecutive stores where the values are constants. For example: q->a = 4; q->b = 5; llvm-svn: 164885	2012-09-29 06:33:25 +00:00
Evan Cheng	64a223aed8	Do not delete BBs if their addresses are taken. rdar://12396696 llvm-svn: 164866	2012-09-28 23:58:57 +00:00
Manman Ren	009cd82c59	Testcase for r164835 llvm-svn: 164842	2012-09-28 20:26:33 +00:00
Jakob Stoklund Olesen	1d19582a8f	Avoid dereferencing a NULL pointer. Fixes PR13943. llvm-svn: 164778	2012-09-27 16:34:19 +00:00
NAKAMURA Takumi	63f9adb98b	llvm/test/CodeGen/X86/mulx*.ll: Fix copypasto. llvm-svn: 164681	2012-09-26 09:24:12 +00:00
Michael Liao	2b425e1e24	Add SARX/SHRX/SHLX code generation support llvm-svn: 164675	2012-09-26 08:26:25 +00:00
Michael Liao	2de86af22d	Add RORX code generation support llvm-svn: 164674	2012-09-26 08:24:51 +00:00
Michael Liao	f9f7b5518a	Add MULX code generation support llvm-svn: 164673	2012-09-26 08:22:37 +00:00
Michael Liao	de51caf2a0	Add missing i64 max/min/umax/umin on 32-bit target - Turn on atomic6432.ll and add specific test case as well llvm-svn: 164616	2012-09-25 18:08:13 +00:00
Evan Cheng	446ff28df1	Fix an illegal tailcall opt where the callee returns a double via xmm while caller returns x86_fp80 via st0. rdar://12229511 llvm-svn: 164588	2012-09-25 05:32:34 +00:00
Jim Grosbach	361ca34270	Mark jump tables in code sections with DataRegion directives. Even out-of-line jump tables can be in the code section, so mark them as data-regions for those targets which support the directives. rdar://12362871&12362974 llvm-svn: 164571	2012-09-24 23:06:27 +00:00
Michael Liao	8d62212e93	Revise test to avoid using of 'grep' llvm-svn: 164472	2012-09-23 02:41:47 +00:00
Michael Liao	1b823fc1ba	Enhance test case of atomic16 to verify inst encoding fixed in r164453. llvm-svn: 164465	2012-09-22 21:07:59 +00:00
NAKAMURA Takumi	be9ad01d27	llvm/test/CodeGen/X86/pr5145.ll: Tweak expressions to match for darwin target. .LBB0_1: # Linux LBB0_1: # Darwin llvm-svn: 164362	2012-09-21 05:19:19 +00:00
Michael Liao	a880186030	Add missing i8 max/min/umax/umin support - Fix PR5145 and turn on test 8-bit atomic ops llvm-svn: 164358	2012-09-21 03:18:52 +00:00
Benjamin Kramer	8554206652	Fix broken check lines. llvm-svn: 164317	2012-09-20 19:54:13 +00:00
Michael Liao	83bc2119dc	Specify CPU to prevent failure on ATOM due to different code scheduling llvm-svn: 164283	2012-09-20 03:34:04 +00:00
Michael Liao	3237662b65	Re-work X86 code generation of atomic ops with spin-loop - Rewrite/merge pseudo-atomic instruction emitters to address the following issues: * Reduce one unnecessary load in spin-loop. Previously the spin-loop looked like: thisMBB: newMBB: ld t1 = [bitinstr.addr] op t2 = t1, [bitinstr.val] not t3 = t2 (if Invert) mov EAX = t1 lcs dest = [bitinstr.addr], t3 [EAX is implicit] bz newMBB fallthrough -->nextMBB The 'ld' at the beginning of newMBB should be lifted out of the loop as lcs (or CMPXCHG on x86) will load the current memory value into EAX. This loop is refined as: thisMBB: EAX = LOAD [MI.addr] mainMBB: t1 = OP [MI.val], EAX LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined] JNE mainMBB sinkMBB: * Remove immopc as, so far, all pseudo-atomic instructions have all-register form only; there is no immediate operand. * Remove unnecessary attributes/modifiers in pseudo-atomic instruction td * Fix issues in PR13458 - Add comprehensive tests on atomic ops on various data types. NOTE: Some of them are turned off due to missing functionality. - Revise tests due to the new spin-loop generated. llvm-svn: 164281	2012-09-20 03:06:15 +00:00
Michael Liao	8372539543	Unify the logic in SelectAtomicLoadAdd and SelectAtomicLoadArith - Merge the processing of LOAD_ADD with other atomic load-arith operations - Separate the logic getting target constant for atomic-load-op and add an optimization for atomic-load-add on i16 with negative value - Optimize a minor case for atomic-fetch-add i16 with negative operand. Test case is revised. llvm-svn: 164243	2012-09-19 19:36:58 +00:00
Jan Wen Voung	4ce1d7b4f1	Add some cases to x86 OptimizeCompare to handle DEC and INC, too. While we are setting the earlier def to true, also make it live. llvm-svn: 164056	2012-09-17 22:04:23 +00:00
Michael Liao	b503b323f3	Fix PR13859 - Preserve the original NOutVT during casting from vector to integer by extracting vector elements. llvm-svn: 164042	2012-09-17 18:05:20 +00:00
Nadav Rotem	ae6809b19a	Fix the testcase to work on all platforms. llvm-svn: 163997	2012-09-16 07:58:47 +00:00
Nadav Rotem	37521aa89c	The PMOVZXWD family of functions extends narrow vector types to wide vector types. It had patterns for zext-loading and extending. This commit adds patterns for loading a wide type, performing a bitcast, and extending. This is an odd pattern, but it is commonly used when writing code with intrinsics. rdar://11897677 llvm-svn: 163995	2012-09-16 07:39:07 +00:00
Benjamin Kramer	ece434252c	X86: Emitting x87 fsin/fcos for sinf/cosf is not safe without unsafe fp math. This was only an issue if sse is disabled. llvm-svn: 163967	2012-09-15 12:44:27 +00:00
Eric Christopher	b83dba2b84	Fix both the test for zero and what we do if we have a zero for umulo legalization. Fixes PR13839 llvm-svn: 163856	2012-09-13 23:24:02 +00:00
Michael Liao	137f8aedea	Add wider vector/integer support for PR12312 - Enhance the fix to PR12312 to support wider integer, such as 256-bit integer. If more than 1 fully evaluated vectors are found, POR them first followed by the final PTEST. llvm-svn: 163832	2012-09-13 20:24:54 +00:00
Michael Liao	460fc46e0f	Enhance type legalization on bitcast from vector to integer - Find a legal vector type before casting and extracting element from it. - As the new vector type may have more than 2 elements, build the final hi/lo pair by BFS pairing them from bottom to top. llvm-svn: 163830	2012-09-13 19:58:21 +00:00
Jakob Stoklund Olesen	32a56fa3ba	Fix test case to avoid PIC magic. llvm-svn: 163827	2012-09-13 19:47:45 +00:00
Jakob Stoklund Olesen	3cf3ffce24	Fix the TCRETURNmi64 bug differently. Add a PatFrag to match X86tcret using 6 fixed registers or less. This avoids folding loads into TCRETURNmi64 using 7 or more volatile registers. <rdar://problem/12282281> llvm-svn: 163819	2012-09-13 18:31:27 +00:00
Jakob Stoklund Olesen	78b9f8fc67	Revert r163761 "Don't fold indexed loads into TCRETURNmi64." The patch caused "Wrong topological sorting" assertions. llvm-svn: 163810	2012-09-13 16:52:17 +00:00
Nadav Rotem	24a822a5cb	Fix a dagcombine optimization. The optimization attempts to optimize a bitcast of fneg to integers by xoring the high-bit. This fails if the source operand is a vector because we need to negate each of the elements in the vector. Fix rdar://12281066 PR13813. llvm-svn: 163802	2012-09-13 14:54:28 +00:00
Nadav Rotem	4e9ad06617	Stack Coloring: We have code that checks that all of the uses of allocas are within the lifetime zone. Sometimes legitimate usages of allocas are hoisted outside of the lifetime zone. For example, GEPS may calculate the address of a member of an allocated struct. This commit makes sure that we only check (abort regions or assert) for instructions that read and write memory using stack frames directly. Notice that by allowing legitimate usages outside the lifetime zone we also stop checking for instructions which use derivatives of allocas. We will catch fewer bugs in user code and in the compiler itself. llvm-svn: 163791	2012-09-13 12:38:37 +00:00
Jakob Stoklund Olesen	bfacef45eb	Don't fold indexed loads into TCRETURNmi64. We don't have enough GR64_TC registers when calling a varargs function with 6 arguments. Since %al holds the number of vector registers used, only %r11 is available as a scratch register. This means that addressing modes using both base and index registers can't be folded into TCRETURNmi64. <rdar://problem/12282281> llvm-svn: 163761	2012-09-13 00:25:00 +00:00
Michael Liao	abb87d4857	Fix PR11985 - BlockAddress has no support of BA + offset form and there is no way to propagate that offset into machine operand; - Add BA + offset support and a new interface 'getTargetBlockAddress' to simplify target block address forming; - All targets are modified to use new interface and X86 backend is enhanced to support BA + offset addressing. llvm-svn: 163743	2012-09-12 21:43:09 +00:00
Nadav Rotem	8ff00989fc	Stack coloring: remove lifetime intervals which contain escaped allocas. The input program may contain instructions which are not inside lifetime markers. This can happen due to a bug in the compiler or due to a bug in user code (for example, returning a reference to a local variable). This commit adds checks on all of the instructions in the function and invalidates lifetime ranges which do not contain all of the instructions. llvm-svn: 163678	2012-09-12 04:57:37 +00:00
Chad Rosier	1778831a3d	[ms-inline asm] Split the parsing of IR asm strings into GCC and MS variants. Add support in the EmitMSInlineAsmStr() function for handling integer consts. llvm-svn: 163645	2012-09-11 19:09:56 +00:00
Chad Rosier	ab51c9de34	Formatting. No functional change intended. llvm-svn: 163627	2012-09-11 16:33:10 +00:00
Nadav Rotem	65ba95ebf9	Stack Coloring: Don't crash on dbg values which use stack frames. llvm-svn: 163616	2012-09-11 12:34:27 +00:00
NAKAMURA Takumi	8c72306cdb	test/CodeGen/X86/ms-inline-asm.ll: Relax for non-darwin x86 targets. '##InlineAsm' could not be seen in other hosts. llvm-svn: 163554	2012-09-10 22:04:54 +00:00
Chad Rosier	7641f58784	[ms-inline asm] Properly emit the asm directives when the AsmPrinterVariant and InlineAsmVariant don't match. llvm-svn: 163550	2012-09-10 21:36:05 +00:00
Chad Rosier	1c1319b9e7	Update test case for Release builds. llvm-svn: 163549	2012-09-10 21:31:43 +00:00


3699 Commits