llvm-project

Commit Graph

Author	SHA1	Message	Date
Jim Grosbach	8d77bb5f06	Use regex to remove false dependencies on register allocation. llvm-svn: 138137	2011-08-19 23:10:31 +00:00
Jim Grosbach	066e9ec1e4	Update tests. llvm-svn: 138116	2011-08-19 22:19:48 +00:00
Ivan Krasin	d7cbd4c518	FastISel: avoid function calls between the materialization of the constant and its use. llvm-svn: 137993	2011-08-18 22:06:10 +00:00
Jim Grosbach	e2a0404a69	Thumb assembly parsing and encoding for ADR. llvm-svn: 137864	2011-08-17 20:37:40 +00:00
Eric Christopher	8c5f3f7624	Fix this test to avoid leaving a temporary file behind. llvm-svn: 137651	2011-08-15 20:55:03 +00:00
Bob Wilson	d1de7764be	Expand VMOVQQQQ pseudo instructions. Apparently we never added code to expand these pseudo instructions, and in over a year, no one has noticed. Our register allocator must be awesome! llvm-svn: 137551	2011-08-13 05:14:55 +00:00
Jim Grosbach	27ad83d8a9	ARM push of a single register encodes as pre-indexed STR. Per the ARM ARM, a 'push' of a single register encodes as an STR, not an STM. llvm-svn: 137318	2011-08-11 18:07:11 +00:00
Jim Grosbach	8ba76c6d5c	ARM pop of a single register encodes as post-indexed LDR. Per the ARM ARM, a 'pop' of a single register encodes as an LDR, not an LDM. llvm-svn: 137316	2011-08-11 17:35:48 +00:00
Devang Patel	37a62058fe	While extending the definition range of a debug variable, consult lexical scopes also. There is no point extending a debug variable outside its lexical block. This provides a 6x compile time speedup in some cases. llvm-svn: 137250	2011-08-10 21:25:34 +00:00
Rafael Espindola	36a3abc671	Add support for the R and Q constraints. llvm-svn: 137217	2011-08-10 16:26:42 +00:00
Jakob Stoklund Olesen	53910d6aae	Inflate register classes after coalescing. Coalescing can remove copy-like instructions with sub-register operands that constrained the register class. Examples are: x86: GR32_ABCD:sub_8bit_hi -> GR32 arm: DPR_VFP2:ssub0 -> DPR Recompute the register class of any virtual registers that are used by fewer instructions after coalescing. This affects code generation for the Cortex-A8 where we use NEON instructions for f32 operations, c.f. fp_convert.ll: vadd.f32 d16, d1, d0 vcvt.s32.f32 d0, d16 The register allocator is now free to use d16 for the temporary, and that comes first in the allocation order because it doesn't interfere with any s-registers. llvm-svn: 137133	2011-08-09 18:19:41 +00:00
Rafael Espindola	9bc32a96be	print st_shndx with the correct number of bits. llvm-svn: 136880	2011-08-04 15:50:13 +00:00
Rafael Espindola	9528995e3f	print st_other with the correct number of bits. llvm-svn: 136877	2011-08-04 15:38:19 +00:00
Rafael Espindola	96df560ce1	print st_type with the correct number of bits. llvm-svn: 136875	2011-08-04 15:24:00 +00:00
Rafael Espindola	79ef75dc49	Print st_bind with the correct number of bits. llvm-svn: 136874	2011-08-04 15:10:35 +00:00
Rafael Espindola	1848231ad1	Print r_sym with the correct number of bits. llvm-svn: 136873	2011-08-04 14:48:27 +00:00
Rafael Espindola	260af5cef6	Print r_type with the correct number of bits. llvm-svn: 136872	2011-08-04 14:39:30 +00:00
Rafael Espindola	65c559c5fb	Change another counter to decimal. llvm-svn: 136870	2011-08-04 14:01:03 +00:00
Rafael Espindola	cad9e7f094	Don't print a counter in hex. llvm-svn: 136869	2011-08-04 13:39:15 +00:00
Benjamin Kramer	3c7e9ee480	Remove underscore that's breaking linux buildbots. llvm-svn: 136833	2011-08-03 23:13:01 +00:00
Jakub Staszak	15e5b742ad	Use MachineBranchProbabilityInfo in If-Conversion instead of its own heuristics. llvm-svn: 136826	2011-08-03 22:34:43 +00:00
Devang Patel	dc9cbaaf23	Use byte offset, instead of element number, to access merged global. llvm-svn: 136759	2011-08-03 01:25:46 +00:00
Eric Christopher	aa5030066f	Add support for the 'Q' constraint. Fixes rdar://9866494 llvm-svn: 136523	2011-07-29 21:18:58 +00:00
Jakob Stoklund Olesen	b28ee4115d	Transfer implicit operands in NEONMoveFixPass. Later passes /are/ using this information when running the register scavenger. This fixes the second problem in PR10520. llvm-svn: 136440	2011-07-29 00:27:35 +00:00
Jakob Stoklund Olesen	9c3badceba	Add -verify-arm-pseudo-expand. This hidden llc option runs the machine code verifier after expanding ARM pseudo-instructions, but before if-conversion. The machine code verifier is much better at pointing out liveness errors that can trip up the register scavenger. llvm-svn: 136439	2011-07-29 00:27:32 +00:00
Jakob Stoklund Olesen	b16081ce8c	Handle REG_SEQUENCE with implicitly defined operands. Code like that would only be produced by bugpoint, but we should still handle it correctly. When a register is defined by a REG_SEQUENCE of undefs, the register itself is undef. Previously, we would create a register with uses but no defs. Fixes part of PR10520. llvm-svn: 136401	2011-07-28 21:38:51 +00:00
Jim Grosbach	73a8393a47	FileCheck'ize test. llvm-svn: 136135	2011-07-26 20:49:44 +00:00
Jakob Stoklund Olesen	c45d38e14a	Fix a crash when building 177.mesa for armv6. When splitting a live range immediately before an LDR_POST instruction that redefines the address register, make sure to use the correct value number in leaveIntvBefore. We need the value number entering the instruction. <rdar://problem/9793765> llvm-svn: 135413	2011-07-18 18:47:13 +00:00
Owen Anderson	454e1c7abb	Remove VMOVDneon and VMOVQ, which are just aliases for VORR. This continues to simplify the path towards an auto-generated disassembler. llvm-svn: 135290	2011-07-15 18:46:47 +00:00
Eric Christopher	0c666b4664	Add a testcase for r135123. Part of rdar://9761830 llvm-svn: 135133	2011-07-14 06:23:09 +00:00
Evan Cheng	f863e3fb73	Improve codegen for select's: if (x != 0) x = 1 if (x == 1) x = 1 Previous codegen looks like this: mov r1, r0 cmp r1, #1 mov r0, #0 moveq r0, #1 The naive lowering selects between two different values. It should recognize that the test is an equality test, so it is more a conditional move than a select: cmp r0, #1 movne r0, #0 rdar://9758317 llvm-svn: 135017	2011-07-13 00:42:17 +00:00
Jim Grosbach	581da64241	Simplify printing of ARM shifted immediates. Print shifted immediate values directly rather than as a payload+shifter value pair. This makes for more readable output assembly code, simplifies the instruction printer, and is consistent with how Thumb immediates are displayed. llvm-svn: 134902	2011-07-11 16:48:36 +00:00
Cameron Zwarich	61715740cd	Add a missing test for r134882. llvm-svn: 134889	2011-07-11 08:35:17 +00:00
Jakob Stoklund Olesen	4931bbc671	Be more aggressive about following hints. RAGreedy::tryAssign will now evict interference from the preferred register even when another register is free. To support this, add the EvictionCost struct that counts how many hints are broken by an eviction. We don't want to break one hint just to satisfy another. Rename canEvict to shouldEvict, and add the first bit of eviction policy that doesn't depend on spill weights: Always make room in the preferred register as long as the evictees can be split and aren't already assigned to their preferred register. Also make the CSR avoidance more accurate. When looking for a cheaper register it is OK to use a new volatile register. Only CSR aliases that have never been used before should be avoided. llvm-svn: 134735	2011-07-08 20:46:18 +00:00
Jim Grosbach	dbfb29d6c0	Use ARMPseudoExpand for ARM tail calls. llvm-svn: 134719	2011-07-08 18:50:22 +00:00
Evan Cheng	8b2bda09a5	Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. Also fix tests. llvm-svn: 134590	2011-07-07 03:55:05 +00:00
Chandler Carruth	959fe548d7	FileCheck-ize and simplify RUN lines. llvm-svn: 134352	2011-07-02 20:43:11 +00:00
Eric Christopher	29f1db85dd	Add support for the 'j' immediate constraint. This is conditionalized on supporting the instruction that the constraint is for 'movw'. Part of rdar://9119939 llvm-svn: 134222	2011-07-01 01:00:07 +00:00
Eric Christopher	c011d31543	Add support for the ARM 't' register constraint. And another testcase for the 'x' register constraint. Part of rdar://9119939 llvm-svn: 134220	2011-07-01 00:30:46 +00:00
Eric Christopher	f1c74595aa	Add support for the 'x' constraint. Part of rdar://9307836 and rdar://9119939 llvm-svn: 134215	2011-07-01 00:14:47 +00:00
Cameron Zwarich	34c8f51d65	In the ARM global merging pass, allow extraneous alignment specifiers. This pass already makes the assumption, which is correct on ARM, that a type's alignment is less than its alloc size. This improves codegen with Clang (which inserts a lot of extraneous alignment specifiers) and fixes <rdar://problem/9695089>. llvm-svn: 134106	2011-06-29 22:24:25 +00:00
Benjamin Kramer	d2a84f6a63	Don't depend on the optimization reverted in r134067. llvm-svn: 134068	2011-06-29 14:07:18 +00:00
Eric Christopher	aff7ed55bc	Allow lr in the register options here. llvm-svn: 133935	2011-06-27 20:31:01 +00:00
Chad Rosier	fa8d89327f	The Neon VCVT (between floating-point and fixed-point, Advanced SIMD) instructions can be used to match combinations of multiply/divide and VCVT (between floating-point and integer, Advanced SIMD). Basically the VCVT immediate operand that specifies the number of fraction bits corresponds to a floating-point multiply or divide by the corresponding power of 2. For example, VCVT (floating-point to fixed-point, Advanced SIMD) can replace a combination of VMUL and VCVT (floating-point to integer) as follows: Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>): vmul.f32 d16, d17, d16 vcvt.s32.f32 d16, d16 becomes: vcvt.s32.f32 d16, d16, #3 Similarly, VCVT (fixed-point to floating-point, Advanced SIMD) can replace a combination of VCVT (integer to floating-point) and VDIV as follows: Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>): vcvt.f32.s32 d16, d16 vdiv.f32 d16, d17, d16 becomes: vcvt.f32.s32 d16, d16, #3 llvm-svn: 133813	2011-06-24 19:23:04 +00:00
Nick Lewycky	90e6a4e5d5	Needs a triple. llvm-svn: 133634	2011-06-22 19:42:14 +00:00
Nick Lewycky	6208a2fd66	Emit trailing padding on constant vectors when TargetData says that the vector is larger than the sum of the elements (including per-element padding). llvm-svn: 133631	2011-06-22 18:55:03 +00:00
Devang Patel	c93ef81e24	Test case for r133560. llvm-svn: 133585	2011-06-22 00:03:42 +00:00
Evan Cheng	4c0bd9629d	Teach dag combine to match halfword byteswap patterns. 1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8) => (bswap x) >> 16 2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8)) => (rotl (bswap x) 16) This allows us to eliminate most of the def : Pat patterns for ARM rev16 revsh instructions. It catches many more cases for ARM and x86. rdar://9609108 llvm-svn: 133503	2011-06-21 06:01:08 +00:00
Chris Lattner	80ed9dc9e5	rip out a ton of intrinsic modernization logic from AutoUpgrade.cpp, which is for pre-2.9 bitcode files. We keep x86 unaligned loads, movnt, crc32, and the target indep prefetch change. As usual, updating the testsuite is a PITA. llvm-svn: 133337	2011-06-18 06:05:24 +00:00
Evan Cheng	7552a62af5	Add an alternative rev16 pattern. We should figure out a better way to handle these complex rev patterns. rdar://9609108 llvm-svn: 133289	2011-06-17 20:47:21 +00:00
Chris Lattner	5756c16cdf	make the asmparser reject function and type redefinitions. 'Merging' hasn't been needed since llvm-gcc 3.4 days. llvm-svn: 133248	2011-06-17 07:06:44 +00:00
Chris Lattner	def1949c00	Remove support for using "foo" as symbols instead of %"foo". This is ancient syntax and has been long obsolete. As usual, updating the tests is the nasty part of this. llvm-svn: 133242	2011-06-17 06:36:20 +00:00
Chris Lattner	b90ed2233c	manually upgrade a bunch of tests to modern syntax, and remove some that are either unreduced or only test old syntax. llvm-svn: 133228	2011-06-17 03:14:27 +00:00
Cameron Zwarich	033026ffc0	Update an insertion point iterator after replacing a return instruction with a tail call pseudoinstruction. This fixes <rdar://problem/9624333>. llvm-svn: 133227	2011-06-17 02:16:43 +00:00
Eli Friedman	575d0163bb	Force a triple here so this test doesn't fail on EABI hosts (like clang-native-arm-cortex-a9). llvm-svn: 133134	2011-06-16 01:49:31 +00:00
Chad Rosier	aed609da92	Typos. llvm-svn: 133128	2011-06-16 01:24:24 +00:00
Chad Rosier	2730162bee	Revision r128665 added an optimization to make use of NEON multiplier accumulator forwarding. Specifically (from SVN log entry): Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier accumulator forwarding: vadd d3, d0, d1 vmul d3, d3, d2 => vmul d3, d0, d2 vmla d3, d1, d2 Make sure it catches cases where operand 1 is add/fadd/sub/fsub, which was intended in the original revision. llvm-svn: 133127	2011-06-16 01:21:54 +00:00
Rafael Espindola	10028230cf	Testcase for previous commit. llvm-svn: 133089	2011-06-15 21:18:51 +00:00
Evan Cheng	678b691aa3	Another revsh pattern. rdar://9609059 llvm-svn: 133064	2011-06-15 17:17:48 +00:00
Evan Cheng	6d02d9044b	PerformBFICombine - (bfi A, (and B, Mask1), Mask2) -> (bfi A, B, Mask2) iff the bits being cleared by the AND are not demanded by the BFI. The previous BFI dag combine rule was actually incorrect (or used to be correct until BFI representation changed). rdar://9609030 llvm-svn: 133034	2011-06-15 01:12:31 +00:00
Tanya Lattner	e9e6705cf9	Add an optimization that looks for a specific pair-wise add pattern and generates a vpaddl instruction instead of scalarizing the add. Includes a test case. llvm-svn: 133027	2011-06-14 23:48:48 +00:00
Bruno Cardoso Lopes	29386fb10d	Since ARM's prefetch implementation predicted the presence of an instruction cache prefetch and now that the info from "prefetch" to "ARMPreload" is present, only add a testcase for PLI. llvm-svn: 132978	2011-06-14 05:11:46 +00:00
Bruno Cardoso Lopes	dc9ff3a4b1	Add one more argument to the prefetch intrinsic to indicate whether it's a data or instruction cache access. Update the targets to match it and also teach autoupgrade. llvm-svn: 132976	2011-06-14 04:58:37 +00:00
Jakob Stoklund Olesen	fb03a92c33	Be less aggressive about hinting in RAFast. In particular, don't spill dirty registers only to satisfy a hint. It is not worth it. The attached test case provides an example where the fast allocator would spill a register when other registers are available. llvm-svn: 132900	2011-06-13 03:26:46 +00:00
Cameron Zwarich	361548d4b4	A CCState was being created without setting whether it is in the Call or Prologue state, causing an assertion failure downstream. This fixes <rdar://problem/9562908>. This really seems like it should always be set at CCState creation time, so mistakes like this can never happen. I'll take a look at doing that. llvm-svn: 132811	2011-06-09 22:30:07 +00:00
Eric Christopher	1e3e8933ed	Another possible bug. Stopgap until we can autogenerate tables and constraint lengths. Part of rdar://9037836 and rdar://9119939 llvm-svn: 132598	2011-06-03 22:09:12 +00:00
Eric Christopher	761a5d4280	Fix an off by one error. Part of rdar://9037836 and rdar://9119939 llvm-svn: 132590	2011-06-03 20:44:52 +00:00
Eric Christopher	354b2a25f3	Make the Uv constraint a memory operand. This doesn't solve the addressing mode problem mentioned in r132559. Backend part of rdar://9037836 and part of rdar://9119939 llvm-svn: 132561	2011-06-03 17:24:37 +00:00
Eli Friedman	86585798af	Add ARM fast-isel support for materializing the address of a global in cases where the global uses an indirect symbol. rdar://9431157 llvm-svn: 132522	2011-06-03 01:13:19 +00:00
Devang Patel	e5feef0fe1	During post RA scheduling, do not try to chase reg defs. to preserve DBG_VALUEs. This approach has several downsides, for example, it does not work when dbg value is a constant integer, it does not work if reg is defined more than once, it places end of debug value range markers in the wrong place. It even causes misleading incorrect debug info when duplicate DBG_VALUE instructions point to same reg def. Instead, use a simpler approach and let DBG_VALUE follow its predecessor instruction. After live debug value analysis pass, all DBG_VALUE instructions are placed at the right place. Thanks Jakob for the hint! llvm-svn: 132483	2011-06-02 20:07:12 +00:00
Eric Christopher	690030c116	Allow bitcasts between valid types of the same size and vector types if the vector type is legal. Fixes rdar://9306086 llvm-svn: 132420	2011-06-01 19:55:10 +00:00
John McCall	7d84ece09b	On Darwin ARM, set the UNWIND_RESUME libcall to _Unwind_SjLj_Resume. This is important for the correct lowering of unwind instructions (which doesn't matter at all) and llvm.eh.resume calls (which does). Take 2, now with more basic competence. llvm-svn: 132295	2011-05-29 19:50:32 +00:00
John McCall	e64371b932	I didn't mean to commit these residues of a personal project. llvm-svn: 132293	2011-05-29 19:41:56 +00:00
John McCall	085d891d80	On Darwin ARM, set the UNWIND_RESUME libcall to _Unwind_SjLj_Resume. This is important for the correct lowering of unwind instructions (which doesn't matter at all) and llvm.eh.resume calls (which does). llvm-svn: 132291	2011-05-29 19:39:04 +00:00
Bruno Cardoso Lopes	325110f30d	Add support for ARM ldrexd/strexd intrinsics. They both use i32 register pairs to load/store i64 values. Since there's no current support to explicitly declare such restrictions, implement it by using specific hardcoded register pairs during isel. llvm-svn: 132248	2011-05-28 04:07:29 +00:00
Eric Christopher	d00e8ad803	Implement the 'M' output modifier for arm inline asm. This is fairly register allocation dependent and will occasionally break. WIP in the register allocator to model paired/etc registers. rdar://9119939 llvm-svn: 132242	2011-05-28 01:40:44 +00:00
Cameron Zwarich	1d553a2cc4	Fix the remaining atomic intrinsics to use the right register classes on Thumb2, and add some basic tests for them. llvm-svn: 132235	2011-05-27 23:54:00 +00:00
Rafael Espindola	d23bfb8a7a	Make size computation less brittle. llvm-svn: 132222	2011-05-27 22:05:41 +00:00
Jakob Stoklund Olesen	2348f3133f	Make room for register allocation to improve. llvm-svn: 132213	2011-05-27 20:15:06 +00:00
Evan Cheng	518bcd0ef4	Don't use movw / movt for iOS static codegen for now to workaround some tools issues. rdar://9514789 llvm-svn: 132211	2011-05-27 20:11:27 +00:00
Evan Cheng	97c9f84f68	Add iOS test llvm-svn: 132203	2011-05-27 19:04:21 +00:00
Eli Friedman	3a8d9625b0	And fix the test in r132194. llvm-svn: 132196	2011-05-27 18:14:28 +00:00
Eli Friedman	fe84bd659c	Fix a silly mistake (which trips over an assertion) in r132099. rdar://9515076 llvm-svn: 132194	2011-05-27 18:02:04 +00:00
Devang Patel	42ddaa10d3	During branch folding avoid inserting redundant DBG_VALUE machine instructions. llvm-svn: 132148	2011-05-26 21:47:59 +00:00
Eli Friedman	c70355195c	Rewrite fast-isel integer cast handling to handle more cases, and to be simpler and more consistent. The practical effects here are that x86-64 fast-isel can now handle trunc from i8 to i1, and ARM fast-isel can handle many more constructs involving integers narrower than 32 bits (including loads, stores, and many integer casts). rdar://9437928 . llvm-svn: 132099	2011-05-25 23:49:02 +00:00
Eric Christopher	8c5e4192e6	Implement the 'm' modifier. Note that it only works for memory operands. Part of rdar://9119939 llvm-svn: 132081	2011-05-25 20:51:58 +00:00
Cameron Zwarich	3088e0a179	Make tTAILJMPr/tTAILJMPrND emit a tBX without a preceding MOV of PC to LR. This fixes <rdar://problem/9495913> llvm-svn: 132042	2011-05-25 04:45:27 +00:00
Eric Christopher	1b724948e9	Implement the arm 'L' asm modifier. Part of rdar://9119939 llvm-svn: 132024	2011-05-24 23:27:13 +00:00
Eric Christopher	b1dda56ac2	Implement the immediate part of the 'B' modifier. Part of rdar://9119939 llvm-svn: 132023	2011-05-24 23:15:43 +00:00
Eric Christopher	7617883ce3	Add support for the arm 'y' asm modifier. Fixes part of rdar://9444657 llvm-svn: 132011	2011-05-24 22:10:34 +00:00
Cameron Zwarich	bc90690b24	Fix <rdar://problem/9476260> by having tail calls always generate 32-bit branches in Darwin Thumb2 code. Tail calls are already disabled on Thumb1. llvm-svn: 131894	2011-05-23 01:57:17 +00:00
Renato Golin	4cd5187f5b	RTABI chapter 4.3.4 specifies __eabi_mem* calls. Specifically, __eabi_memset accepts parameters (ptr, size, value) in a different order than GNU's memset (ptr, value, size), therefore the special lowering in AAPCS mode. Implementation by Evzen Muller. llvm-svn: 131868	2011-05-22 21:41:23 +00:00
Tanya Lattner	1d11720ae4	Handle perfect shuffle case that generates a vrev for vectors of floats. Add test case. llvm-svn: 131582	2011-05-18 21:44:54 +00:00
Tanya Lattner	48b182c3a4	In r131488 I misunderstood how VREV works. It splits the vector in half and splits each half. Therefore, the real problem was that we were using a VREV64 for a 4xi16, when we should have been using a VREV32. Updated test case and reverted change to the PerfectShuffle Table. llvm-svn: 131529	2011-05-18 06:42:21 +00:00
Tanya Lattner	c7e291b354	vrev is incorrectly defined in the perfect shuffle table. The ordering is backwards (should be 0x3210 versus 0x1032) which exposed a bug when doing a shuffle on a 4xi16. I've attached a test case. llvm-svn: 131488	2011-05-17 20:48:40 +00:00
Jakob Stoklund Olesen	4edf17d91f	Teach LiveInterval::isZeroLength about null SlotIndexes. When instructions are deleted, they leave tombstone SlotIndex entries. The isZeroLength method should ignore these null indexes. This causes RABasic to sometimes spill a callee-saved register in the abi-isel.ll test, so don't run that test with -regalloc=basic. Prioritizing register allocation according to spill weight can cause more registers to be used. llvm-svn: 131436	2011-05-16 23:50:05 +00:00
Galina Kistanova	9e56e51fab	Correction. Use explicit target triple in the test. llvm-svn: 131252	2011-05-12 21:55:34 +00:00
Nadav Rotem	8a7beb80f0	Fixes a bug in the DAGCombiner. LoadSDNodes have two values (data, chain). If there is a store after the load node, then there is a chain, which means that there is another user. Thus, asking hasOneUser would fail. Instead we ask hasNUsesOfValue on the 'data' value. llvm-svn: 131183	2011-05-11 14:40:50 +00:00
Rafael Espindola	19c1a56287	Produce a __debug_frame section on darwin ARM when appropriate. llvm-svn: 131151	2011-05-10 21:04:45 +00:00
Dan Gohman	dd550305e6	Give this test an explicit register allocator, so that it can work even if the default register allocator is changed. llvm-svn: 130883	2011-05-04 23:14:02 +00:00
Bill Wendling	2a40131f6b	SjLj EH could produce a machine basic block that legitimately has more than one landing pad as its successor. SjLj exception handling jumps to the correct landing pad via a switch statement that's generated right before code-gen. Loosen the constraint in the machine instruction verifier to allow for this. Note, this isn't the most rigorous check since we cannot determine where that switch statement came from. But it's marginally better than turning this check off when SjLj exceptions are used. <rdar://problem/9187612> llvm-svn: 130881	2011-05-04 22:54:05 +00:00
Galina Kistanova	e53ae508ec	This test fails on ARM. The test shouldn't explicitly specify alignment (and alignment 4 is wrong) and requires hard-float. llvm-svn: 130875	2011-05-04 21:57:44 +00:00
Devang Patel	39ecf816c5	Do not emit location expression size twice. llvm-svn: 130854	2011-05-04 19:00:57 +00:00
Jakob Stoklund Olesen	51b35f7bb1	Fix a bunch of ARM tests to be register allocation independent. llvm-svn: 130800	2011-05-03 22:31:21 +00:00
Evan Cheng	93b5cdc5ab	Make the test less likely to fail with minor changes. llvm-svn: 130778	2011-05-03 19:09:32 +00:00
Bob Wilson	c5242b0e78	Remove test for iOS divmod function, since that is disabled for now. llvm-svn: 130769	2011-05-03 17:54:49 +00:00
Bruno Cardoso Lopes	168c9005b5	Add a few ARM coprocessor intrinsics. Testcases included llvm-svn: 130763	2011-05-03 17:29:22 +00:00
Dan Gohman	6136e94897	Add an unfolded offset field to LSR's Formula record. This is used to model constants which can be added to base registers via add-immediate instructions which don't require an additional register to materialize the immediate. llvm-svn: 130743	2011-05-03 00:46:49 +00:00
Jakob Stoklund Olesen	edfabc9aad	Weekly fix of register allocation dependent unit tests. llvm-svn: 130567	2011-04-30 01:37:52 +00:00
Eli Friedman	4105ed1523	Make FastEmit_ri_ try a bit harder to succeed for supported operations; FastEmit_i can fail for non-Thumb2 ARM. Makes ARMSimplifyAddress work correctly, and reduces the number of fast-isel bailouts on non-Thumb ARM. llvm-svn: 130560	2011-04-29 23:34:52 +00:00
Eli Friedman	328bad02fa	Switch to ImmLeaf (which can be used by FastISel) for a few more common ARM/Thumb2 patterns. llvm-svn: 130552	2011-04-29 22:48:03 +00:00
Eli Friedman	dd937843d3	Fix run-line, again. :( llvm-svn: 130540	2011-04-29 21:33:03 +00:00
Eli Friedman	86caced370	Re-committing r130454, which does not in fact break anything. Fix a rather obscure crash caused by ARM fast-isel generating code which redefines a register. rdar://problem/9338332 . llvm-svn: 130539	2011-04-29 21:22:56 +00:00
Eric Christopher	8d46b47787	Add trunc->branch support, this won't help with clang's i8->i1 truncations for bools, but is a start. llvm-svn: 130534	2011-04-29 20:02:39 +00:00
Eli Friedman	517728b1ae	Revert r130454; apparently this doesn't actually work. llvm-svn: 130462	2011-04-28 23:55:14 +00:00
Eli Friedman	37b9ede969	Fix runline. llvm-svn: 130455	2011-04-28 23:12:24 +00:00
Eli Friedman	e4ecd42926	Fix a rather obscure crash caused by ARM fast-isel generating code which redefines a register. rdar://problem/9338332 . llvm-svn: 130454	2011-04-28 23:03:25 +00:00
Devang Patel	3e021533cd	Teach dwarf writer to handle complex address expression for .debug_loc entries. This fixes clang generated blocks' variables' debug info. Radar 9279956. llvm-svn: 130373	2011-04-28 02:22:40 +00:00
Evan Cheng	9808d31b9e	If converter was being too cute. It looks for root BBs (which don't have successors) and uses inverse depth first search to traverse the BBs. However that doesn't work when the CFG has infinite loops. Simply doing a linear traversal of all BBs works just fine. rdar://9344645 llvm-svn: 130324	2011-04-27 19:32:43 +00:00
Jakob Stoklund Olesen	71d3b895ba	Also add <imp-def> operands for defined and dead super-registers when rewriting. We cannot rely on the <imp-def> operands added by LiveIntervals in all cases as demonstrated by the test case. llvm-svn: 130313	2011-04-27 17:42:31 +00:00
Evan Cheng	1355bbdd11	Be careful about scheduling nodes above previous calls. It increases usage of callee-saved registers and introduces copies. Only allow it if scheduling a node above calls would end up lessening register pressure. Call operands also have added ABI restrictions for register allocation, so be extra careful with hoisting them above calls. rdar://9329627 llvm-svn: 130245	2011-04-26 21:31:35 +00:00
Evan Cheng	dbb86b8108	This test should be in MC. It breaks with changes to scheduling / register allocation so it's being removed. llvm-svn: 130243	2011-04-26 21:09:04 +00:00
Chris Lattner	189ca1498f	don't emit the symbol name twice for local bss and common symbols. For example, don't emit: .comm _i,4,2 ## @i ## @i instead emit: .comm _i,4,2 ## @i llvm-svn: 130192	2011-04-26 06:14:13 +00:00
Eric Christopher	238a21f2d5	Make this test disable fast isel as it's not needed. llvm-svn: 130165	2011-04-25 22:39:46 +00:00
Benjamin Kramer	ba446cc12a	Make tests more useful. lit needs a linter ... llvm-svn: 130126	2011-04-25 10:12:01 +00:00
Andrew Trick	0ed5778a1e	Thumb2 and ARM add/subtract with carry fixes. Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>. t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the assembly printer correctly prints the 's' suffix. Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags. Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS. Fixes ARM SBC lowering to check for live carry (potential bug). llvm-svn: 130048	2011-04-23 03:55:32 +00:00
Devang Patel	94ad6ac13c	Fix DWARF description of Q registers. llvm-svn: 129952	2011-04-21 23:22:35 +00:00
Devang Patel	3712c14be9	Fix DWARF description of S registers. llvm-svn: 129947	2011-04-21 22:48:26 +00:00
Devang Patel	be22131c28	Test case for r129922 llvm-svn: 129934	2011-04-21 20:16:43 +00:00
Evan Cheng	5f1ba4cd2d	Remove -use-divmod-libcall. Let targets opt in when they are available. llvm-svn: 129884	2011-04-20 22:20:12 +00:00
Eric Christopher	bcaedb5ce0	Rewrite the expander for umulo/smulo to remember to sign extend the input manually and pass all (now) 4 arguments to the mul libcall. Add a new ExpandLibCall for just this (copied gratuitously from type legalization). Fixes rdar://9292577 llvm-svn: 129842	2011-04-20 01:19:45 +00:00
Daniel Dunbar	4a7783b0c2	CodeGen: Eliminate a use of getDarwinMajorNumber(). - There is a minor semantic change here (evidenced by the test change) for Darwin triples that have no version component. I debated changing the default behavior of isOSVersionLT, but decided it made more sense for triples to be explicit. llvm-svn: 129802	2011-04-19 20:32:39 +00:00
Bob Wilson	0858c3aaed	This patch combines several changes from Evan Cheng for rdar://8659675. Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by vmla is a special case. Obviously issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Enable these fp vmlx codegen changes for Cortex-A9. llvm-svn: 129775	2011-04-19 18:11:57 +00:00
Bob Wilson	d04a83f8f2	Add -mcpu=cortex-a9-mp. It's cortex-a9 with MP extension. rdar://8648637. llvm-svn: 129774	2011-04-19 18:11:52 +00:00
Bob Wilson	a2881ee8a4	Avoid some 's' 16-bit instructions which partially update CPSR (and add false dependency) when they aren't dependent on the last CPSR defining instruction. rdar://8928208 llvm-svn: 129773	2011-04-19 18:11:49 +00:00
Bob Wilson	df612ba006	Avoid write-after-write issue hazards for Cortex-A9. Add an avoidWriteAfterWrite() target hook to identify register classes that suffer from write-after-write hazards. For those register classes, try to avoid writing the same register in two consecutive instructions. This is currently disabled by default. We should not spill to avoid hazards! The command line flag -avoid-waw-hazard can be used to enable waw avoidance. llvm-svn: 129772	2011-04-19 18:11:45 +00:00
Jakob Stoklund Olesen	fb1249548f	Tighten test case a bit. Ideally, we would match an S-register to its containing D-register, but that requires arithmetic (divide by 2). llvm-svn: 129756	2011-04-19 06:14:45 +00:00
Jakob Stoklund Olesen	bf78618db6	Make tests register allocation independent again. llvm-svn: 129739	2011-04-19 00:14:43 +00:00
Evan Cheng	4079133796	Do not lose mem_operands while lowering VLD / VST intrinsics. llvm-svn: 129738	2011-04-19 00:04:03 +00:00
Eric Christopher	c37aa0b26a	Fix a bug where we were counting the alias sets as completely used registers for fast allocation a different way. This has us updating used registers only when we're using that exact register. Fixes rdar://9207598 llvm-svn: 129711	2011-04-18 19:26:25 +00:00
Evan Cheng	b14ce09fca	Fix divmod libcall lowering. Convert to {S|U}DIVREM first and then expand the node to a libcall. rdar://9280991 llvm-svn: 129633	2011-04-16 03:08:26 +00:00
Cameron Zwarich	9c65e4d69c	Add ORR and EOR to the CMP peephole optimizer. It's hard to get isel to generate a case involving EOR, so I only added a test for ORR. llvm-svn: 129610	2011-04-15 21:24:38 +00:00
Cameron Zwarich	0829b3065a	The AND instruction leaves the V flag unmodified, so it falls victim to the same problem as all of the other instructions we fold with CMPs. llvm-svn: 129602	2011-04-15 20:45:00 +00:00
Cameron Zwarich	93eae1571c	Add missing register forms of instructions to the ARM CMP-folding code. This fixes <rdar://problem/9287901>. llvm-svn: 129599	2011-04-15 20:28:28 +00:00
Evan Cheng	12bb05b75b	Fix another fcopysign lowering bug. If src is f64 and destination is f32, don't forget to right shift the source by 32 first. rdar://9287902 llvm-svn: 129556	2011-04-15 01:31:00 +00:00
Cameron Zwarich	415b5e8341	Fix a typo in an ARM-specific DAG combine. This fixes <rdar://problem/9278274>. llvm-svn: 129468	2011-04-13 21:01:19 +00:00
Cameron Zwarich	70be27e913	Fix an obvious problem with an alignment computation. AsmPrinter actually does the max itself, so it is not easy to write a test case for this, but I added a test case that would fail if the code in AsmPrinter were removed. llvm-svn: 129432	2011-04-13 09:02:43 +00:00
Cameron Zwarich	cdf59f7016	If a global variable has a specified alignment that is less than the preferred alignment for its type, use the minimum of the specified alignment and the ABI alignment. This fixes <rdar://problem/9275290>. llvm-svn: 129428	2011-04-13 06:03:16 +00:00
Andrew Trick	b53a00d2cb	Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency. Additional fixes: Do something reasonable for subtargets with generic itineraries by handling node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the node latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421	2011-04-13 00:38:32 +00:00
Eric Christopher	28f4c729f7	Temporarily revert r129408 to see if it brings the bots back. llvm-svn: 129417	2011-04-13 00:20:59 +00:00

1115 Commits