llvm-project

Commit Graph

Author	SHA1	Message	Date
Jakob Stoklund Olesen	d9b66506a3	Reapply r161633-161634 "Partition use lists so defs always come before uses." No changes to these patches, MRI needed to be notified when changing uses into defs and vice versa. llvm-svn: 161644	2012-08-10 00:21:30 +00:00
Jakob Stoklund Olesen	acd27c9279	Revert r161633-161634 "Partition use lists so defs always come before uses." These commits broke a number of buildbots. llvm-svn: 161640	2012-08-09 23:31:36 +00:00
Jakob Stoklund Olesen	df01e00710	Partition use lists so defs always come before uses. This makes it possible to speed up def_iterator by stopping at the first use. This makes def_empty() and getUniqueVRegDef() much faster when there are many uses. In a +Asserts build, LiveVariables is 100x faster in one case because getVRegDef() has an assertion that would scan to the end of a def_iterator chain. Spill weight calculation is significantly faster (300x in one case) because isTriviallyReMaterializable() calls MRI->isConstantPhysReg(%RIP) which calls def_empty(%RIP). llvm-svn: 161634	2012-08-09 22:49:46 +00:00
Jakob Stoklund Olesen	7d7051ca3c	Don't use pointer-pointers for the register use lists. Use a more conventional doubly linked list where the Prev pointers form a cycle. This means it is no longer necessary to adjust the Prev pointers when reallocating the VRegInfo array. The test changes are required because the register allocation hint is using the use-list order to break ties. llvm-svn: 161633	2012-08-09 22:49:42 +00:00
Jakob Stoklund Olesen	4238a89db8	Don't modify MO while use_iterator is still pointing to it. llvm-svn: 161626	2012-08-09 22:08:24 +00:00
Arnold Schwaighofer	81b2eec1ab	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 161581	2012-08-09 15:25:52 +00:00
Nadav Rotem	e0f84d31c8	Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly handle the cases where the memory value type was illegal. PR 13111. llvm-svn: 161565	2012-08-09 01:56:44 +00:00
Bob Wilson	4c65c505e0	Add test triples to fix win32 failures. Revert workaround from r161292. I don't have a win32 system to test, so hopefully I got them all fixed here. llvm-svn: 161519	2012-08-08 20:31:37 +00:00
Manman Ren	1be131ba27	X86: enable CSE between CMP and SUB We perform the following: 1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering. 2> Modify MachineCSE to correctly handle implicit defs. 3> Convert SUB back to CMP if possible at peephole. Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled by peephole now. rdar://11873276 llvm-svn: 161462	2012-08-08 00:51:41 +00:00
Evan Cheng	fbdd25c135	X86 cmp lowering is looking past truncate on the condition node. It should only do so when the high bits are known zero. This caused a subtle miscompilation. rdar://12027825 llvm-svn: 161451	2012-08-07 22:21:00 +00:00
Chandler Carruth	881d0a7966	Add a much more conservative strategy for aligning branch targets. Previously, MBP essentially aligned every branch target it could. This bloats code quite a bit, especially non-looping code which has no real reason to prefer aligned branch targets so heavily. As Andy said in review, it's still a bit odd to do this without a real cost model, but this at least has much more plausible heuristics. Fixes PR13265. llvm-svn: 161409	2012-08-07 09:45:24 +00:00
Manman Ren	cb36b8c2e6	MachineCSE: Update the heuristics for isProfitableToCSE. If the result of a common subexpression is used at all uses of the candidate expression, CSE should not increase the live range of the common subexpression. rdar://11393714 and rdar://11819721 llvm-svn: 161396	2012-08-07 06:16:46 +00:00
Hal Finkel	33e529d56b	MFTB on PPC64 should really be encoded using MFSPR. The MFTB instruction itself is being phased out, and its functionality is provided by MFSPR. According to the ISA docs, using MFSPR works on all known chips except for the 601 (which did not have a timebase register anyway) and the POWER3. Thanks to Adhemerval Zanella for pointing this out! llvm-svn: 161346	2012-08-06 21:21:44 +00:00
Craig Topper	ab47fe4e16	Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318	2012-08-06 06:22:36 +00:00
Craig Topper	812005e562	Update test to check for r161305 llvm-svn: 161307	2012-08-05 09:06:28 +00:00
Hal Finkel	70381a7b18	Add readcyclecounter lowering on PPC64. On PPC64, this can be done with a simple TableGen pattern. To enable this, I've added the (otherwise missing) readcyclecounter SDNode definition to TargetSelectionDAG.td. llvm-svn: 161302	2012-08-04 14:10:46 +00:00
Anton Korobeynikov	218aaf6d04	Add stack spill / reload instructions for DTriple and DQuad register classes, which were missed for no reason. This fixes PR13377 llvm-svn: 161299	2012-08-04 13:16:12 +00:00
Bob Wilson	874886cd66	Refactor and check "onlyReadsMemory" before optimizing builtins. This patch is mostly just refactoring a bunch of copy-and-pasted code, but it also adds a check that the call instructions are readnone or readonly. That check was already present for sin, cos, sqrt, log2, and exp2 calls, but it was missing for the rest of the builtins being handled in this code. llvm-svn: 161282	2012-08-03 23:29:17 +00:00
Akira Hatanaka	22bec282e9	1. Redo mips16 instructions to avoid multiple opcodes for same instruction. Change these to patterns. 2. Add another 16 instructions. Patch by Reed Kotler. llvm-svn: 161272	2012-08-03 22:57:02 +00:00
Bob Wilson	fa59485b94	Fix memcmp code-gen to honor -fno-builtin. I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp in TargetLibraryInfo, so that it would use custom code for memcmp calls even with -fno-builtin. I also had to add a new -disable-simplify-libcalls option to llc so that I could write a test for this. llvm-svn: 161262	2012-08-03 21:26:18 +00:00
Bob Wilson	3e6fa462f3	Fall back to selection DAG isel for calls to builtin functions. Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232	2012-08-03 04:06:28 +00:00
Jush Lu	4705da9020	[arm-fast-isel] Add support for shl, lshr, and ashr. llvm-svn: 161230	2012-08-03 02:37:48 +00:00
Manman Ren	ba8122cc25	X86 Peephole: fold loads to the source register operand if possible. Add more comments and use early returns to reduce nesting in isLoadFoldable. Also disable folding for V_SET0 to avoid introducing a const pool entry and a const pool load. rdar://10554090 and rdar://11873276 llvm-svn: 161207	2012-08-02 19:37:32 +00:00
Akira Hatanaka	fffad897f2	Set transient stack alignment in constructor of MipsFrameLowering and re-enable test o32_cc_vararg.ll. llvm-svn: 161189	2012-08-02 18:15:13 +00:00
NAKAMURA Takumi	7020f51622	llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Make sure this is testing without +avx. FIXME: Could +avx be checked here too? llvm-svn: 161156	2012-08-02 06:36:56 +00:00
NAKAMURA Takumi	aaca1e690d	llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Rewrite expressions to pass regardless of PR11031. - Relax to match even if epilogue (pop %ebp) were emitted. - Assume the return value is stored to %xmm0. llvm-svn: 161155	2012-08-02 06:33:58 +00:00
Manman Ren	5759d01230	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. This patch is a rework of r160919 and was tested on clang self-host on my local machine. rdar://10554090 and rdar://11873276 llvm-svn: 161152	2012-08-02 00:56:42 +00:00
Matt Beaumont-Gay	7947aecaf1	Line endings. llvm-svn: 161117	2012-08-01 16:42:35 +00:00
Elena Demikhovsky	3cb3b0045c	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Akira Hatanaka	d1c43cee24	Add definitions of two subclasses of MipsFrameLowering, Mips16FrameLowering and MipsSEFrameLowering. Implement MipsSEFrameLowering::hasReservedCallFrame. Call frames will not be reserved if there is a call with a large call frame or there are variable sized objects on the stack. llvm-svn: 161090	2012-07-31 22:50:19 +00:00
Akira Hatanaka	02de0e4425	Let PEI::calculateFrameObjectOffsets compute the final stack size rather than computing it in MipsFrameLowering::emitPrologue. llvm-svn: 161078	2012-07-31 21:28:49 +00:00
Akira Hatanaka	33a25af5a8	Expand DYNAMIC_STACKALLOC nodes rather than doing custom-lowering. The frame object which points to the dynamically allocated area will not be needed after changes are made to cease reserving call frames. llvm-svn: 161076	2012-07-31 20:54:48 +00:00
Akira Hatanaka	beda2241a4	When store nodes or memcpy nodes are created to copy the function call arguments to the stack in MipsISelLowering::LowerCall, use stack pointer and integer offset operands rather than frame object operands. llvm-svn: 161068	2012-07-31 18:46:41 +00:00
Chad Rosier	710be7df71	[x86 frame lowering] In 32-bit mode, use ESI as the base pointer. Previously, we were using EBX, but PIC requires the GOT to be in EBX before function calls via PLT GOT pointer. llvm-svn: 161066	2012-07-31 18:29:21 +00:00
Akira Hatanaka	4ce7c4060d	Fix type of LUXC1 and SUXC1. These instructions were incorrectly defined as single-precision load and store. Also avoid selecting LUXC1 and SUXC1 instructions during isel. It is incorrect to map unaligned floating point load/store nodes to these instructions. llvm-svn: 161063	2012-07-31 18:16:49 +00:00
Manman Ren	8c549b586c	MachineSink: Sort the successors before trying to find SuccToSinkTo. One motivating example is to sink an instruction from a basic block which has two successors: one outside the loop, the other inside the loop. We should try to sink the instruction outside the loop. rdar://11980766 llvm-svn: 161062	2012-07-31 18:10:39 +00:00
Jakob Stoklund Olesen	0c807dfae2	Clear kill flags in removeCopyByCommutingDef(). We are extending live ranges, so kill flags are not accurate. They aren't needed until they are recomputed after RA anyway. <rdar://problem/11950722> llvm-svn: 161023	2012-07-31 02:47:24 +00:00
Manman Ren	2b6a0dfd4c	Reverse order of the two branches at end of a basic block if it is profitable. We branch to the successor with higher edge weight first. Convert from "je LBB4_8 (to outer loop); jmp LBB4_14 (to inner loop)" to "jne LBB4_14; jmp LBB4_8". PR12750 rdar: 11393714 llvm-svn: 161018	2012-07-31 01:11:07 +00:00
Pete Cooper	91244268d7	Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd llvm-svn: 160987	2012-07-30 20:23:19 +00:00
Manman Ren	f87dd7c01b	Revert r160920 and r160919 due to dragonegg and clang selfhost failure llvm-svn: 160927	2012-07-29 02:44:09 +00:00
Manman Ren	9de95e779c	X86 Peephole: fold loads to the source register operand if possible. Trying to fix the bot by specifying a triple in the failing testing cases. llvm-svn: 160920	2012-07-28 17:51:24 +00:00
Manman Ren	0fa3ab88ba	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. rdar://10554090 and rdar://11873276 llvm-svn: 160919	2012-07-28 16:48:01 +00:00
Manman Ren	32367c063b	X86 Peephole: fix PR13475 in optimizeCompare. It is possible that an instruction can use and update EFLAGS. When checking the safety, we should check the usage of EFLAGS first before declaring it is safe to optimize due to the update. llvm-svn: 160912	2012-07-28 03:15:46 +00:00
Evan Cheng	249716e8ae	Teach CodeGenPrep to look past bitcast when it's duplicating return instruction into predecessor blocks to enable tail call optimization. rdar://11958338 llvm-svn: 160894	2012-07-27 21:21:26 +00:00
Jakob Stoklund Olesen	bc65e8f94e	Add <imp-def> of super-register when lowering SUBREG_TO_REG. Patch by Tyler Nowicki! llvm-svn: 160888	2012-07-27 20:19:49 +00:00
Jakob Stoklund Olesen	ceee4a9d0c	Eliminate a batch of uses of sub_ss and sub_sd in the X86 target. These idempotent sub-register indices don't do anything --- They simply map XMM registers to themselves. They no longer affect register classes either since the SubRegClasses field has been removed from Target.td. This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns with COPY_TO_REGCLASS patterns which simply become COPY instructions. The number of IMPLICIT_DEF instructions before register allocation is reduced, and that is the cause of the test case changes. llvm-svn: 160816	2012-07-26 21:40:42 +00:00
Akira Hatanaka	64626fc20f	Fix call setup for PIC. Patch by Reed Kotler. llvm-svn: 160774	2012-07-26 02:24:43 +00:00
Manman Ren	e8c6b15137	Update testing case for Atom when disabling rematerialization in TwoAddressInstructionPass. The generated code for Atom has a different code sequence. This is related to commit r160749. llvm-svn: 160755	2012-07-25 20:17:14 +00:00
Manman Ren	cc1dc6dc11	Disable rematerialization in TwoAddressInstructionPass. It is redundant; RegisterCoalescer will do the remat if it can't eliminate the copy. Collected instruction counts before and after this. A few extra instructions are generated due to spilling but it is normal to see these kinds of changes with almost any small codegen change, according to Jakob. This also fixed rdar://11830760 where xor is expected instead of movi0. llvm-svn: 160749	2012-07-25 18:28:13 +00:00
Rafael Espindola	11c38b9657	When a return struct pointer is passed in registers, the callee has nothing to pop. llvm-svn: 160725	2012-07-25 13:41:10 +00:00

6209 Commits