llvm-project

Commit Graph

Author	SHA1	Message	Date
Chad Rosier	f3e73ad5da	Add fast-isel support for returning i1, i8, and i16. llvm-svn: 143669	2011-11-04 00:50:21 +00:00
Daniel Dunbar	e6d40de414	Speculatively revert "DeadStoreElimination can now trim the size of a store if the end of it is dead.", which appears to break bootstrapping LLVM. llvm-svn: 143668	2011-11-04 00:48:26 +00:00
Dan Gohman	198b7ffc11	Reapply r143206, with fixes. Disallow physical register lifetimes across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660	2011-11-03 21:49:52 +00:00
Pete Cooper	65ba66c660	Reverted r143600 - selector reference change llvm-svn: 143646	2011-11-03 20:47:50 +00:00
Dan Bailey	b68515c232	fixed global array handling for ptx to use the correct bit widths llvm-svn: 143640	2011-11-03 19:24:46 +00:00
Pete Cooper	8a95aedb5d	DeadStoreElimination can now trim the size of a store if the end of it is dead. Only currently done if the later store is writing to a power of 2 address or has the same alignment as the earlier store as then it's likely to not break up large stores into smaller ones Fixes <rdar://problem/10140300> llvm-svn: 143630	2011-11-03 18:01:56 +00:00
Craig Topper	0e7cbbabea	Add new X86 AVX2 VBROADCAST instructions. llvm-svn: 143612	2011-11-03 07:35:53 +00:00
Chad Rosier	bf5f4bec1a	Add support for sign-extending non-legal types in SelectSIToFP(). llvm-svn: 143603	2011-11-03 02:04:59 +00:00
Pete Cooper	e6173d81ae	Treat objc selector reference globals as invariant so that MachineLICM can hoist them out of loops. Fixes <rdar://problem/6027699> llvm-svn: 143600	2011-11-03 00:56:36 +00:00
Lang Hames	9929c423a1	Try to lower memset/memcpy/memmove to vector instructions on ARM where the alignment permits. llvm-svn: 143582	2011-11-02 22:52:45 +00:00
Nick Lewycky	000307fef9	I added the first test to run llvm-dwarfdump. llvm-svn: 143571	2011-11-02 21:02:27 +00:00
Nick Lewycky	d1ee7f8cf1	Don't emit a directory entry for the value in DW_AT_comp_dir, that is always implied by directory index zero. llvm-svn: 143570	2011-11-02 20:55:33 +00:00
Chad Rosier	9cf803c4bf	Add support for comparing integer non-legal types. llvm-svn: 143559	2011-11-02 18:08:25 +00:00
Owen Anderson	fbb704f551	Fix the issue that r143552 was trying to address the _right_ way. One-register lists are legal on LDM/STM instructions, but we should not print the PUSH/POP aliases when they appear. This fixes round tripping on this instruction. llvm-svn: 143557	2011-11-02 18:03:14 +00:00
Daniel Dunbar	5aba1b4ea3	tests: Clean up tests/CMakeLists.txt to drop some variable configuration we no longer need substitutions for. llvm-svn: 143555	2011-11-02 17:54:51 +00:00
Andrew Trick	c2c79c90f2	Rewrite LinearFunctionTestReplace to handle pointer-type IVs. We've been hitting asserts in this code due to the many supported combinations of modes (iv-rewrite/no-iv-rewrite) and IV types. This second rewrite of the code attempts to deal with these cases systematically. llvm-svn: 143546	2011-11-02 17:19:57 +00:00
Craig Topper	a47b05c7f3	More AVX2 instructions and intrinsics. llvm-svn: 143536	2011-11-02 06:54:17 +00:00
Craig Topper	682b850602	Add a bunch more X86 AVX2 instructions and their corresponding intrinsics. llvm-svn: 143529	2011-11-02 04:42:13 +00:00
Andrew Trick	0dae890346	Broaden an assert to handle enable-iv-rewrite=true following r143183. Narrowest possible fix for PR11279. llvm-svn: 143522	2011-11-02 00:02:45 +00:00
Kevin Enderby	82ed3be1fb	Fixed a bug in the code to create a dwarf file and directory table entries when it is separating the directory part from the basename of the FileName. Noticed that this: .file 1 "dir/foo" when assembled got the two parts switched. Using the Mac OS X dwarfdump tool it can be seen easily: % dwarfdump -a a.out include_directories[ 1] = 'foo' Dir Mod Time File Len File Name ---- ---------- ---------- --------------------------- file_names[ 1] 1 0x00000000 0x00000000 dir ... Which should be: ... include_directories[ 1] = 'dir' Dir Mod Time File Len File Name ---- ---------- ---------- --------------------------- file_names[ 1] 1 0x00000000 0x00000000 foo llvm-svn: 143521	2011-11-01 23:39:05 +00:00
Owen Anderson	69e54a740c	Fix disassembly of some VST1 instructions. llvm-svn: 143507	2011-11-01 22:18:13 +00:00
Eli Friedman	3f5eccbe7a	Teach the x86 backend a couple tricks for dealing with v16i8 sra by a constant splat value. Fixes PR11289. llvm-svn: 143498	2011-11-01 21:18:39 +00:00
Richard Osborne	56ce0932db	Don't fold negative offsets into cp / dp accesses to avoid relocation errors. This can happen if the address + addend is less than the start of the cp / dp. llvm-svn: 143459	2011-11-01 11:31:53 +00:00
Richard Osborne	37fe7d6641	Combine various XCore tests for floating point intrinsic support into a single test. llvm-svn: 143458	2011-11-01 10:51:48 +00:00
Richard Osborne	8591b6b0ab	Move various XCore tests to FileCheck llvm-svn: 143457	2011-11-01 10:41:28 +00:00
Craig Topper	fec80c6ad2	Fix operand type for x86 pmadd_ub_sw intrinsic. llvm-svn: 143455	2011-11-01 07:25:22 +00:00
Eli Friedman	a49b828f8f	Make sure we use the right insertion point when instcombine replaces a PHI with another instruction. (Specifically, don't insert an arbitrary instruction before a PHI.) Fixes PR11275. llvm-svn: 143437	2011-11-01 04:49:29 +00:00
Eli Friedman	0eb88775ef	Move x86-specific tests into X86 folder. llvm-svn: 143424	2011-11-01 03:21:48 +00:00
Eli Friedman	6185a2aa7c	Move another test requiring x86 into X86 directory. llvm-svn: 143421	2011-11-01 03:12:47 +00:00
Eli Friedman	2cd281ea67	Move test requiring x86 backend into X86 directory. llvm-svn: 143420	2011-11-01 03:11:41 +00:00
Matt Beaumont-Gay	1c1a2b8123	Change the actual tests to match the input directory rename (duh) llvm-svn: 143404	2011-10-31 23:56:52 +00:00
Matt Beaumont-Gay	da5e57cba1	Rename "TestObjectFiles" to "Inputs" (like the pattern for Clang tests) llvm-svn: 143400	2011-10-31 23:46:38 +00:00
Rafael Espindola	300dcb8e37	Move test to the X86 directory, note the PR number and only run MC once. llvm-svn: 143352	2011-10-31 17:23:09 +00:00
Owen Anderson	40703f4252	More not-crashing NEON disassembly updates for the vld refactoring. llvm-svn: 143351	2011-10-31 17:17:32 +00:00
Craig Topper	9821e75e64	Fix operand type for int_x86_ssse3_phadd_sw_128 intrinsic llvm-svn: 143336	2011-10-31 07:16:37 +00:00
Craig Topper	242d1f8c73	Test case for X86 FS/GS Base intrinsics llvm-svn: 143332	2011-10-31 02:15:47 +00:00
Craig Topper	cfcfdf2aab	Begin adding AVX2 instructions. No selection support yet other than intrinsics. llvm-svn: 143331	2011-10-31 02:15:10 +00:00
Nick Lewycky	aab6169ef6	Switch new .file directive emission off by default, change llc's flag for it to -enable-dwarf-directory. llvm-svn: 143326	2011-10-31 01:06:02 +00:00
Duncan Sands	3d5692a475	Reapply commit 143214 with a fix: m_ICmp doesn't match conditions with the given predicate, it matches any condition and returns the predicate - d'oh! Original commit message: The expression icmp eq (select (icmp eq x, 0), 1, x), 0 folds to false. Spotted by my super-optimizer in 186.crafty and 450.soplex. We really need a proper infrastructure for handling generalizations of this kind of thing (which occur a lot), however this case is so simple that I decided to go ahead and implement it directly. llvm-svn: 143318	2011-10-30 19:56:36 +00:00
Benjamin Kramer	7402ee6ec2	X86: Emit logical shift by constant splat of <16 x i8> as a <8 x i16> shift and zero out the bits where zeros should've been shifted in. llvm-svn: 143315	2011-10-30 17:31:21 +00:00
Craig Topper	9cdb9ffa43	Fix return type for X86 mpsadbw intrinsic. The instruction takes in a vector of 8-bit integers, but produces a vector of 16-bit integers. llvm-svn: 143313	2011-10-30 17:22:45 +00:00
Nadav Rotem	c602b2c4de	Fix pr11266. On x86: (shl V, 1) -> add V,V Hardware support for vector-shift is sparse and in many cases we scalarize the result. Additionally, on sandybridge padd is faster than shl. llvm-svn: 143311	2011-10-30 13:24:22 +00:00
Nadav Rotem	1dda6a8ce1	Stabilize the test by specifying an exact cpu target llvm-svn: 143307	2011-10-30 08:07:50 +00:00
Nadav Rotem	bf6568b5d6	Add a new DAGCombine optimization for BUILD_VECTOR. If all of the inputs are zero/any_extended, create a new simple BV which can be further optimized by other BV optimizations. llvm-svn: 143297	2011-10-29 21:23:04 +00:00
Benjamin Kramer	932de2bc86	Force SSE for this test. llvm-svn: 143291	2011-10-29 19:43:44 +00:00
Benjamin Kramer	594ee77964	SimplifyLibCalls: Use IRBuilder.CreateGlobalString when creating a string for printf->puts, which correctly sets the unnamed_addr bit on the resulting GlobalVariable. Fixes PR11264. llvm-svn: 143289	2011-10-29 19:43:31 +00:00
Eli Friedman	3af3c046a9	Revert r143214; it's breaking a bunch of stuff. llvm-svn: 143265	2011-10-29 00:56:07 +00:00
Dan Gohman	9b9c970148	Revert r143206, as there are still some failing tests. llvm-svn: 143262	2011-10-29 00:41:52 +00:00
NAKAMURA Takumi	6e315dd8ba	test/CodeGen/PowerPC/2008-10-17-AsmMatchingOperands.ll: [PR11218] Mark "REQUIRES: asserts" for now. llvm-svn: 143247	2011-10-28 23:11:03 +00:00
Jim Grosbach	b009a872d7	Add Thumb2 alias for "mov Rd, #imm" to "mvn Rd, #~imm". When '~imm' is encodable as a t2_so_imm but plain 'imm' is not. For example, mov r2, #-3 becomes mvn r2, #2 rdar://10349224 llvm-svn: 143235	2011-10-28 22:36:30 +00:00
Owen Anderson	5524ce7d82	Fix illegal disassembly testcase. llvm-svn: 143231	2011-10-28 21:45:09 +00:00
Duncan Sands	280bc553b3	The expression icmp eq (select (icmp eq x, 0), 1, x), 0 folds to false. Spotted by my super-optimizer in 186.crafty and 450.soplex. We really need a proper infrastructure for handling generalizations of this kind of thing (which occur a lot), however this case is so simple that I decided to go ahead and implement it directly. llvm-svn: 143214	2011-10-28 19:01:20 +00:00
Duncan Sands	985ba6386d	A shift of a power of two is a power of two or zero. For completeness - not spotted in the wild. llvm-svn: 143211	2011-10-28 18:30:05 +00:00
Duncan Sands	92af0a8a7f	Fold icmp ugt (udiv X, Y), X to false. Spotted by my super-optimizer in 186.crafty. llvm-svn: 143209	2011-10-28 18:17:44 +00:00
Owen Anderson	dde461c8b1	Reapply r143202, with a manual decoding hook for SWP. This change inadvertently exposed a decoding ambiguity between SWP and CPS that the auto-generated decoder can't handle. llvm-svn: 143208	2011-10-28 18:02:13 +00:00
Dan Gohman	73057ad24f	Reapply r143177 and r143179 (reverting r143188), with scheduler fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206	2011-10-28 17:55:38 +00:00
Jim Grosbach	7a49575d7f	Thumb2 ADD/SUB instructions encoding selection outside IT block. Outside an IT block, "add r3, #2" should select a 32-bit wide encoding rather than generating an error indicating the 16-bit encoding is only legal in an IT block (outside, the 'S' suffix is required for the 16-bit encoding). rdar://10348481 llvm-svn: 143201	2011-10-28 16:57:07 +00:00
NAKAMURA Takumi	7636f55348	test/MC/AsmParser/2011-09-06-NoNewline.s: Add explicit -mtriple=i386. It uses X86 instruction. FIXME: Would it be reproduced without target-specific operands? FIXME: Why run llvm-mc as the same input by 3 times? llvm-svn: 143195	2011-10-28 14:12:30 +00:00
NAKAMURA Takumi	29ccdd8207	Dwarf: [PR11022] Fix emitting DW_AT_const_value(>i64), to be host-endian-neutral. Don't assume APInt::getRawData() would hold target-aware endianness nor host-compliant endianness. rawdata[0] holds most lower i64, even on big endian host. FIXME: Add a testcase for big endian target. FIXME: Ditto on CompileUnit::addConstantFPValue() ? llvm-svn: 143194	2011-10-28 14:12:22 +00:00
NAKAMURA Takumi	88dd835f09	test/CodeGen/X86/2010-08-10-DbgConstant.ll: Add explicit -mtriple=i686-linux. It must be for elf! llvm-svn: 143189	2011-10-28 10:50:52 +00:00
Duncan Sands	225a7037d6	Speculatively disable Dan's commits 143177 and 143179 to see if it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188	2011-10-28 09:55:57 +00:00
Nick Lewycky	cc64ae140d	Always use the string pool, even when it makes the .o larger. This may help tools that read the debug info in the .o files by making the DIE sizes more consistent. llvm-svn: 143186	2011-10-28 05:29:47 +00:00
Andrew Trick	effdca9441	LFTR should avoid a type mismatch with null pointer IVs. Fixes rdar://10359193 Indvar LinearFunctionTestReplace assertion llvm-svn: 143183	2011-10-28 03:45:11 +00:00
Dan Gohman	4db3f7dd83	Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177	2011-10-28 01:29:32 +00:00
Jim Grosbach	080a499ee0	ARM Allow 'q' registers in VLD/VST vector lists. Just treat it as if the constituent D registers were specified. rdar://10348896 llvm-svn: 143167	2011-10-28 00:06:50 +00:00
Dan Gohman	4c9fca99c9	Remove the Alpha backend. llvm-svn: 143164	2011-10-27 22:56:32 +00:00
Owen Anderson	f211416dde	Add testcase for r143162. llvm-svn: 143163	2011-10-27 22:54:14 +00:00
Jakob Stoklund Olesen	e5a6adceac	Also set addrmode6 alignment when align==size. Previously, we were only setting the alignment bits on over-aligned loads and stores. llvm-svn: 143160	2011-10-27 22:39:16 +00:00
Evan Cheng	f4807a19e8	Avoid partial CPSR dependency from loop backedges. rdar://10357570 llvm-svn: 143145	2011-10-27 21:21:05 +00:00
Daniel Dunbar	a054790390	tests: Rip out a bunch of now unused test code relating to use of llvm-gcc in LLVM tests. llvm-svn: 143143	2011-10-27 20:59:26 +00:00
Daniel Dunbar	f7223fcc80	tests: Remove llvm2cpp, I'm pretty sure no one uses this. llvm-svn: 143142	2011-10-27 20:59:21 +00:00
Duncan Sands	7cb61e5a0e	Reapply commit 143028 with a fix: the problem was casting a ConstantExpr Mul using BinaryOperator (which only works for instructions) when it should have been a cast to OverflowingBinaryOperator (which also works for constants). While there, correct a few other dubious looking uses of BinaryOperator. Thanks to Chad Rosier for the testcase. Original commit message: My super-optimizer noticed that we weren't folding this expression to true: (x *nsw x) sgt 0, where x = (y | 1). This occurs in 464.h264ref. llvm-svn: 143125	2011-10-27 19:16:21 +00:00
Benjamin Kramer	652f576a70	2>&1 doesn't work here, it just creates an empty file called "&1" llvm-svn: 143117	2011-10-27 18:27:45 +00:00
Pete Cooper	ce63700797	Changed test to check for correct load size instead of shift as the shift might change if optimised llvm-svn: 143116	2011-10-27 18:15:58 +00:00
Kevin Enderby	49e6a0da7e	Change the sysexit mnemonic (and sysexitl) to never have the REX.W prefix and not depend on In32BitMode. Use the sysexitq mnemonic for the version with the REX.W prefix and allow it only In64BitMode. rdar://9738584 llvm-svn: 143112	2011-10-27 17:40:41 +00:00
Jim Grosbach	6ed3845530	Thumb2 t2LDMDB[_UPD] assembly parsing to recognize .w suffix. rdar://10348844 llvm-svn: 143110	2011-10-27 17:33:59 +00:00
Jim Grosbach	ba7f90c7df	Thumb2 t2MVNi assembly parsing to recognize ".w" suffix. rdar://10348584 llvm-svn: 143108	2011-10-27 17:16:55 +00:00
Bob Wilson	1455ce27e4	Revert Duncan's r143028 expression folding which appears to be the culprit behind a compile failure on 483.xalancbmk. llvm-svn: 143102	2011-10-27 15:47:25 +00:00
Nick Lewycky	d59c0cac6c	Teach our Dwarf emission to use the string pool. llvm-svn: 143097	2011-10-27 06:44:11 +00:00
Eli Friedman	e9e356ad6b	Don't crash on 128-bit sdiv by constant. Found by inspection. llvm-svn: 143095	2011-10-27 02:06:39 +00:00
Eli Friedman	73beaf7bbc	It is not safe to sink an alloca into a stacksave/stackrestore pair, so don't do that. <rdar://problem/10352360> llvm-svn: 143093	2011-10-27 01:33:51 +00:00
Chad Rosier	d24e7e1d9b	A branch predicated on a constant can just FastEmit an unconditional branch. llvm-svn: 143086	2011-10-27 00:21:16 +00:00
Jim Grosbach	61fdba048f	Thumb2 ldr pc-relative encoding fixes. We were parsing label references to the i12 encoding, which isn't right. They need to go to the pci variant instead. More of rdar://10348687 llvm-svn: 143068	2011-10-26 22:22:01 +00:00
Rafael Espindola	f5a15529a7	Run test with -verify-machineinstrs. Patch by Sanjoy Das. llvm-svn: 143066	2011-10-26 21:20:26 +00:00
Rafael Espindola	b3285224cd	Fixes an issue reported by -verify-machineinstrs. Patch by Sanjoy Das. llvm-svn: 143064	2011-10-26 21:16:41 +00:00
Rafael Espindola	66393c127d	This commit introduces two fake instructions MORESTACK_RET and MORESTACK_RET_RESTORE_R10; which are lowered to a RET and a RET followed by a MOV respectively. Having a fake instruction prevents the verifier from seeing a MachineBasicBlock end with a non-terminator (MOV). It also prevents the rather eccentric case of a MachineBasicBlock ending with RET but having successors nevertheless. Patch by Sanjoy Das. llvm-svn: 143062	2011-10-26 21:12:27 +00:00
Lang Hames	c47e283430	Make sure short memsets on ARM lower to stores, even when optimizing for size. llvm-svn: 143055	2011-10-26 20:56:52 +00:00
Duncan Sands	ba286d7c73	The maximum power of 2 dividing a power of 2 is itself. This occurs in 403.gcc and was spotted by my super-optimizer. llvm-svn: 143054	2011-10-26 20:55:21 +00:00
Jim Grosbach	25d4707c4d	Thumb2 remove redundant ".w" suffix from t2MVNCCi pattern. llvm-svn: 143034	2011-10-26 17:28:15 +00:00
Duncan Sands	1d2bb9882d	My super-optimizer noticed that we weren't folding this expression to true: (x *nsw x) sgt 0, where x = (y | 1). This occurs in 464.h264ref. llvm-svn: 143028	2011-10-26 15:31:51 +00:00
James Molloy	dd9137aa56	Revert r142530 at least temporarily while a discussion is had on llvm-commits regarding exactly how much optsize should optimize for size over performance. llvm-svn: 143023	2011-10-26 08:53:19 +00:00
Evan Cheng	043c9d3f7a	Revert part of r142530. The patch potentially hurts performance especially on Darwin platforms where -Os means optimize for size without hurting performance. llvm-svn: 143002	2011-10-26 01:17:44 +00:00
Mon P Wang	6ebf401412	The bitcode reader can create a shuffle with a placeholder mask which it will fix up later. For this special case, allow such a mask to be considered valid. <rdar://problem/8622574> llvm-svn: 142992	2011-10-26 00:34:48 +00:00
Michael J. Spencer	8ab7b036f7	Object: change test to create archive. llvm-svn: 142982	2011-10-25 22:30:58 +00:00
Chad Rosier	67a5df5329	Add a few test cases to ensure the bitcode reader is backward compatible with LLVM 2.9. My understanding is that we plan to maintain compatibility with 2.9 until the 3.1 release. At that time we can generate new test cases using LLVM 3.0. llvm-svn: 142958	2011-10-25 20:33:19 +00:00
Chad Rosier	1248500425	Simplify tests by not piping them through llvm-dis. llvm-svn: 142948	2011-10-25 19:59:50 +00:00
Duncan Sands	a370f3e34e	Restore commits 142790 and 142843 - they weren't breaking the build bots. Original commit messages: - Reapply r142781 with fix. Original message: Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the loop header when computing the trip count. With this, we now constant evaluate: struct ListNode { const struct ListNode *next; int i; }; static const struct ListNode node1 = {0, 1}; static const struct ListNode node2 = {&node1, 2}; static const struct ListNode node3 = {&node2, 3}; int test() { int sum = 0; for (const struct ListNode *n = &node3; n != 0; n = n->next) sum += n->i; return sum; } - Now that we look at all the header PHIs, we need to consider all the header PHIs when deciding that the loop has stopped evolving. Fixes miscompile in the gcc torture testsuite! llvm-svn: 142919	2011-10-25 12:28:52 +00:00
Chandler Carruth	32f46e7c07	Fix the API usage in loop probability heuristics. It was incorrectly classifying many edges as exiting which were in fact not. These mainly formed edges into sub-loops. It was also not correctly classifying all returning edges out of loops as leaving the loop. With this match most of the loop heuristics are more rational. Several serious regressions on loop-intensive benchmarks like perlbench's loop tests when built with -enable-block-placement are fixed by these updated heuristics. Unfortunately they in turn uncover some other regressions. There are still several improvements that should be made to loop heuristics including trip-count, and early back-edge management. llvm-svn: 142917	2011-10-25 09:47:41 +00:00
Duncan Sands	805c5b92c8	Speculatively revert commits 142790 and 142843 to see if it fixes the dragonegg and llvm-gcc self-host buildbots. Original commit messages: - Reapply r142781 with fix. Original message: Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the loop header when computing the trip count. With this, we now constant evaluate: struct ListNode { const struct ListNode *next; int i; }; static const struct ListNode node1 = {0, 1}; static const struct ListNode node2 = {&node1, 2}; static const struct ListNode node3 = {&node2, 3}; int test() { int sum = 0; for (const struct ListNode *n = &node3; n != 0; n = n->next) sum += n->i; return sum; } - Now that we look at all the header PHIs, we need to consider all the header PHIs when deciding that the loop has stopped evolving. Fixes miscompile in the gcc torture testsuite! llvm-svn: 142916	2011-10-25 09:26:43 +00:00
Chad Rosier	48d436618d	Fix these test cases to not use .bc files. Otherwise, we run into issues with bitcode reader/writer backward compatibility. llvm-svn: 142896	2011-10-25 01:22:20 +00:00
Jim Grosbach	17ec1a19e5	ARM assembly parsing and encoding for VLD1 with writeback. Four entry register lists. llvm-svn: 142882	2011-10-25 00:14:01 +00:00
Dan Gohman	b43c36f391	Remove the Blackfin backend. llvm-svn: 142880	2011-10-25 00:05:42 +00:00
Dan Gohman	dfc96aea90	Remove the SystemZ backend. llvm-svn: 142878	2011-10-24 23:48:32 +00:00
Jim Grosbach	92fd05ecdc	ARM assembly parsing and encoding for VLD1 w/ writeback. Three entry register list variation. llvm-svn: 142876	2011-10-24 23:26:05 +00:00
Eli Friedman	a5e244c08d	Don't crash on variable insertelement on ARM. PR10258. llvm-svn: 142871	2011-10-24 23:08:52 +00:00
Bill Wendling	57e3aaad89	Check the visibility of the global variable before placing it into the stubs table. A hidden variable could potentially end up in both lists. <rdar://problem/10336715> llvm-svn: 142869	2011-10-24 23:05:43 +00:00
Jim Grosbach	3ea0657d54	ARM assembly parsing and encoding for VLD1 w/ writeback. One and two length register list variants. llvm-svn: 142861	2011-10-24 22:16:58 +00:00
Nick Lewycky	a58fb48a55	Now that we look at all the header PHIs, we need to consider all the header PHIs when deciding that the loop has stopped evolving. Fixes miscompile in the gcc torture testsuite! llvm-svn: 142843	2011-10-24 21:02:38 +00:00
Owen Anderson	295b1e84ce	Fix a NEON disassembly case that was broken in the recent refactorings. As more of this code gets refactored, a lot of these manual decoding hooks should get smaller and/or go away entirely. llvm-svn: 142817	2011-10-24 18:04:29 +00:00
Dan Gohman	2c9bda1512	Remove the explicit request for "Latency" scheduling from MSP430, as the Latency scheduler is going away. llvm-svn: 142811	2011-10-24 17:53:16 +00:00
Dan Gohman	c32af340fc	Change the default scheduler from Latency to ILP, since Latency is going away. llvm-svn: 142810	2011-10-24 17:45:02 +00:00
Jim Grosbach	3adec13c3e	Update test for r142801. llvm-svn: 142806	2011-10-24 17:26:26 +00:00
Benjamin Kramer	d48d52e7b2	XFAIL test on leak checkers. llvm-svn: 142804	2011-10-24 17:24:05 +00:00
Chandler Carruth	7111f4564c	Remove return heuristics from the static branch probabilities, and introduce no-return or unreachable heuristics. The return heuristics from the Ball and Larus paper don't work well in practice as they pessimize early return paths. The only good hitrate return heuristics are those for: - NULL return - Constant return - negative integer return Only the last of these three can possibly require significant code for the returning block, and even the last is fairly rare and usually also a constant. As a consequence, even for the cold return paths, there is little code on that return path, and so little code density to be gained by sinking it. The places where sinking these blocks is valuable (inner loops) will already be weighted appropriately as the edge is a loop-exit branch. All of this aside, early returns are nearly as common as all three of these return categories, and should actually be predicted as taken! Rather than muddy the waters of the static predictions, just remain silent on returns and let the CFG itself dictate any layout or other issues. However, the return heuristic was flagging one very important case: unreachable. Unfortunately it still gave a 1/4 chance of the branch-to-unreachable occurring. It also didn't do a rigorous job of finding those blocks which post-dominate an unreachable block. This patch builds a more powerful analysis that should flag all branches to blocks known to then reach unreachable. It also has better worst-case runtime complexity by not looping through successors for each block. The previous code would perform an N^2 walk in the event of a single entry block branching to N successors with a switch where each successor falls through to the next and they finally fall through to a return. Test case added for noreturn heuristics. Also doxygen comments improved along the way. llvm-svn: 142793	2011-10-24 12:01:08 +00:00
Nick Lewycky	9be7f277e4	Reapply r142781 with fix. Original message: Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the loop header when computing the trip count. With this, we now constant evaluate: struct ListNode { const struct ListNode *next; int i; }; static const struct ListNode node1 = {0, 1}; static const struct ListNode node2 = {&node1, 2}; static const struct ListNode node3 = {&node2, 3}; int test() { int sum = 0; for (const struct ListNode *n = &node3; n != 0; n = n->next) sum += n->i; return sum; } llvm-svn: 142790	2011-10-24 06:57:05 +00:00
Nick Lewycky	dd1d3df524	A dead malloc, a free(NULL) and a free(undef) are all trivially dead instructions. This doesn't introduce any optimizations we weren't doing before (except potentially due to pass ordering issues), now passes will eliminate them sooner as part of their own cleanups. llvm-svn: 142787	2011-10-24 04:35:36 +00:00
Nick Lewycky	9d28c26d77	Speculatively revert r142781. Bots are showing Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed. coming out of indvars. llvm-svn: 142786	2011-10-24 04:00:25 +00:00
Nick Lewycky	1700007ecc	Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the loop header when computing the trip count. With this, we now constant evaluate: struct ListNode { const struct ListNode *next; int i; }; static const struct ListNode node1 = {0, 1}; static const struct ListNode node2 = {&node1, 2}; static const struct ListNode node3 = {&node2, 3}; int test() { int sum = 0; for (const struct ListNode *n = &node3; n != 0; n = n->next) sum += n->i; return sum; } llvm-svn: 142781	2011-10-23 23:43:14 +00:00
Craig Topper	b05d9e9bea	Add X86 SARX, SHRX, and SHLX instructions. llvm-svn: 142779	2011-10-23 22:18:24 +00:00
Chandler Carruth	1c8ace0e89	Teach the BranchProbabilityInfo pass to print its results, and use that to bring it under direct test instead of merely indirectly testing it in the BlockFrequencyInfo pass. The next step is to start adding tests for the various heuristics employed, and to start fixing those heuristics once they're under test. llvm-svn: 142778	2011-10-23 21:21:50 +00:00
Chandler Carruth	bd1be4d01c	Completely re-write the algorithm behind MachineBlockPlacement based on discussions with Andy. Fundamentally, the previous algorithm is both counterproductive on several fronts and prioritizing things which aren't necessarily the most important: static branch prediction. The new algorithm uses the existing loop CFG structure information to walk through the CFG itself to layout blocks. It coalesces adjacent blocks within the loop where the CFG allows based on the most likely path taken. Finally, it topologically orders the block chains that have been formed. This allows it to choose a (mostly) topologically valid ordering which still prioritizes fallthrough within the structural constraints. As a final twist in the algorithm, it does violate the CFG when it discovers a "hot" edge, that is an edge that is more than 4x hotter than the competing edges in the CFG. These are forcibly merged into a fallthrough chain. Future transformations that need to be added are rotation of loop exit conditions to be fallthrough, and better isolation of cold block chains. I'm also planning on adding statistics to model how well the algorithm does at laying out blocks based on the probabilities it receives. The old tests mostly still pass, and I have some new tests to add, but the nested loops are still behaving very strangely. This almost seems like working-as-intended as it rotated the exit branch to be fallthrough, but I'm not convinced this is actually the best layout. It is well supported by the probabilities for loops we currently get, but those are pretty broken for nested loops, so this may change later. llvm-svn: 142743	2011-10-23 09:18:45 +00:00
Craig Topper	980d59832a	Add X86 RORX instruction llvm-svn: 142741	2011-10-23 07:34:00 +00:00
Cameron Zwarich	057fbb1a10	The element insertion code in scalar replacement doesn't handle incorrect element types, even though the element extraction code does. It is surprising that this bug has been here for so long. Fixes <rdar://problem/10318778>. llvm-svn: 142740	2011-10-23 07:02:10 +00:00
Craig Topper	e94d277db8	Add X86 MULX instruction for disassembler. llvm-svn: 142738	2011-10-23 00:33:32 +00:00
Nick Lewycky	52340ac5f8	Oops! Fix test I forgot to submit as part of r142735. llvm-svn: 142736	2011-10-22 22:07:31 +00:00
Nick Lewycky	32f8051d66	A non-escaping malloc in the entry block is not unlike an alloca. Do dead-store elimination on them too. llvm-svn: 142735	2011-10-22 21:59:35 +00:00
Nick Lewycky	a6674c7fc9	Make SCEV's brute force analysis stronger in two ways. Firstly, we should be able to constant fold load instructions where the argument is a constant. Second, we should be able to watch multiple PHI nodes through the loop; this patch only supports PHIs in loop headers, more can be done here. With this patch, we now constant evaluate: static const int arr[] = {1, 2, 3, 4, 5}; int test() { int sum = 0; for (int i = 0; i < 5; ++i) sum += arr[i]; return sum; } llvm-svn: 142731	2011-10-22 19:58:20 +00:00
Nadav Rotem	e649d66552	Fix pr11193. SHL inserts zeros from the right, thus even when the original sign_extend_inreg value was of 1-bit, we need to sra. llvm-svn: 142724	2011-10-22 12:39:25 +00:00
Jim Grosbach	11c0b347c6	Assembly parsing for 4-register sequential variant of VLD2. llvm-svn: 142704	2011-10-21 23:58:57 +00:00
Jim Grosbach	118b38cbf1	Assembly parsing for 2-register sequential variant of VLD2. llvm-svn: 142691	2011-10-21 22:21:10 +00:00
Eli Friedman	688db1d6d0	Remap blockaddress correctly when inlining a function. Fixes PR10162. llvm-svn: 142684	2011-10-21 20:45:19 +00:00
Jim Grosbach	846bcff7c7	Assembly parsing for 4-register variant of VLD1. llvm-svn: 142682	2011-10-21 20:35:01 +00:00
Jim Grosbach	c4360fe575	Assembly parsing for 3-register variant of VLD1. llvm-svn: 142675	2011-10-21 20:02:19 +00:00
Eli Friedman	ce818277fc	Extend instcombine's shufflevector simplification to handle more cases where the input and output vectors have different sizes. Patch by Xiaoyi Guo. llvm-svn: 142671	2011-10-21 19:06:29 +00:00
Jim Grosbach	2f2e3c4737	ARM VLD parsing and encoding. Next step in the ongoing saga of NEON load/store assembly parsing. Handle VLD1 instructions that take a two-register register list. Adjust the instruction definitions to only have the single encoded register as an operand. The super-register from the pseudo is kept as an implicit def, so passes which come after pseudo-expansion still know that the instruction defines the other subregs. llvm-svn: 142670	2011-10-21 18:54:25 +00:00
Nadav Rotem	5e00bb5feb	Fix pr11194. When promoting and splitting integers we need to use ZExtPromotedInteger and SExtPromotedInteger based on the operation we legalize. SetCC return type needs to be legalized via PromoteTargetBoolean. llvm-svn: 142660	2011-10-21 17:35:19 +00:00
Chandler Carruth	70a38058b1	Don't hard code the desired alignment for loops -- it isn't 16-bytes on all x86 systems. Sorry for the breakage. llvm-svn: 142656	2011-10-21 16:41:39 +00:00
Nadav Rotem	d315157f12	1. Fix the widening of SETCC in WidenVecOp_SETCC. Use the correct return CC type. 2. Fix a typo in CONCAT_VECTORS which exposed the bug in #1. llvm-svn: 142648	2011-10-21 11:42:07 +00:00
Chandler Carruth	8b9737cb54	Add loop aligning to MachineBlockPlacement based on review discussion so it's a bit more plausible to use this instead of CodePlacementOpt. The code for this was shamelessly stolen from CodePlacementOpt, and then trimmed down a bit. There doesn't seem to be much utility in returning true/false from this pass as we may or may not have rewritten all of the blocks. Also, the statistic of counting how many loops were aligned doesn't seem terribly important so I removed it. If folks would like it to be included, I'm happy to add it back. This was probably the most egregious of the missing features, and now I'm going to start gathering some performance numbers and looking at specific loop structures that have different layout between the two. Test is updated to include both basic loop alignment and nested loop alignment. llvm-svn: 142645	2011-10-21 08:57:37 +00:00
Chandler Carruth	ddfeaafdfb	Add a very basic test for MachineBlockPlacement. This is essentially the canonical example I used when developing it, and is one of the primary motivating real-world use cases for __builtin_expect (when buried under a macro). I'm working on more test cases here, but I'm trying to make sure both that the pass is doing the right thing with the test cases and that they aren't too brittle to changes elsewhere in the code generation pipeline. Feedback and/or suggestions on how to test this are very welcome. Especially feedback on whether testing the block comments is a good strategy; I couldn't find any good examples to steal from but all the other ideas I had were a lot uglier or more fragile. llvm-svn: 142644	2011-10-21 08:01:56 +00:00
Craig Topper	039a79067a	Remove intrinsics for X86 BLSI, BLSMSK, and BLSR and replace with custom isel lowering code. llvm-svn: 142642	2011-10-21 06:55:01 +00:00
Owen Anderson	16c8fc5191	Revert r142618, r142622, and r142624, which were based on an incorrect reading of the ARMv7 docs. llvm-svn: 142626	2011-10-20 22:23:58 +00:00
Owen Anderson	608c60c773	Fix decoding tests for fixed MSR encodings. llvm-svn: 142624	2011-10-20 22:01:48 +00:00
Owen Anderson	48da0ed477	Fix tests for corrected MSR encodings. llvm-svn: 142622	2011-10-20 21:53:19 +00:00
Jim Grosbach	9036c5cf2b	ARM VLD1/VST1 (one register, no writeback) assembly parsing and encoding. llvm-svn: 142583	2011-10-20 15:04:25 +00:00
Jim Grosbach	3ad44e50b3	Tidy up formatting. llvm-svn: 142582	2011-10-20 14:57:47 +00:00
Jim Grosbach	8db25984a9	ARM VTBX (one register) assembly parsing and encoding. llvm-svn: 142581	2011-10-20 14:48:50 +00:00
Eli Friedman	1923a330e6	Refactor code from inlining and globalopt that checks whether a function definition is unused, and enhance it so it can tell that functions which are only used by a blockaddress are in fact dead. This probably doesn't happen much on most code, but the Linux kernel's _THIS_IP_ can trigger this issue with blockaddress. (GlobalDCE can also handle the given testcase, but we only run that at -O3.) Found while looking at PR11180. llvm-svn: 142572	2011-10-20 05:23:42 +00:00
Nick Lewycky	462098824f	"@string = constant i8 0" is a valid i8* string of length zero. Analyze that correctly in GetStringLength, fixing PR11181! llvm-svn: 142558	2011-10-20 00:34:35 +00:00
Chad Rosier	add38c12b8	Revert 142337. Thumb1 still doesn't support dynamic stack realignment. :( llvm-svn: 142557	2011-10-20 00:07:12 +00:00
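
The loop quoted in the r142731 message above (a6674c7fc9, the SCEV brute-force analysis change) is easier to read written out as C. The sketch below only reformats the code from that message; the comments are added here for illustration. Whether a particular compiler build folds the loop to a constant depends on its version and optimization level, which this listing does not state.

```c
/* Loop quoted in the r142731 commit message, reformatted for readability. */
static const int arr[] = {1, 2, 3, 4, 5};

int test(void) {
    int sum = 0;
    /* The trip count is a constant and every load reads from a constant
       array, which is the situation the commit message describes. */
    for (int i = 0; i < 5; ++i)
        sum += arr[i];
    return sum;
}
```

Calling test() returns 15, the sum of the five array elements; that is the value the message says can now be constant evaluated.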

14953 Commits