llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexey Samsonov	dcc1291d17	This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment. It is intended to fix PR11468. Old prologue and epilogue looked like this: push %rbp mov %rsp, %rbp and $alignment, %rsp push %r14 push %r15 ... pop %r15 pop %r14 mov %rbp, %rsp pop %rbp The problem was to reference the locations of callee-saved registers in exception handling: locations of callee-saved registers had to be re-calculated regarding the stack alignment operation. It would take some effort to implement this in LLVM, as currently MachineLocation can only have the form "Register + Offset". Function prologue and epilogue are now changed to: push %rbp mov %rsp, %rbp push %r14 push %r15 and $alignment, %rsp ... lea -$size_of_saved_registers(%rbp), %rsp pop %r15 pop %r14 pop %rbp Reviewed by Chad Rosier. llvm-svn: 160248	2012-07-16 06:54:09 +00:00
Nadav Rotem	3050e07108	Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160235	2012-07-15 20:39:08 +00:00
Nadav Rotem	eec74c7279	Teach getTargetVShiftNode about TargetConstant nodes. llvm-svn: 160234	2012-07-15 20:27:43 +00:00
NAKAMURA Takumi	032dc0a06c	llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Rewrite expressions to fit various targets. - Make sure existence of "barrier". - Confirm reload corresponding to spill. llvm-svn: 160232	2012-07-15 14:38:35 +00:00
Nadav Rotem	ee3552f88d	Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention. Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot. PR12782. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160230	2012-07-15 12:26:30 +00:00
Nadav Rotem	9466e81df6	AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector. This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions. llvm-svn: 160222	2012-07-14 22:26:05 +00:00
Nadav Rotem	018921002e	Add a dagcombine optimization to convert concat_vectors of undefs into a single undef. The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node. llvm-svn: 160221	2012-07-14 21:30:27 +00:00
Andrew Trick	653513b8dd	LSR Fix: check SCEV expression safety before expansion. All SCEV expressions used by LSR formulae must be safe to expand. i.e. they may not contain UDiv unless we can prove nonzero denominator. Fixes PR11356: LSR hoists UDiv. llvm-svn: 160205	2012-07-13 23:33:10 +00:00
Joel Jones	43cb87839c	This is one of the first steps at moving to replace target-dependent intrinsics with target-independent intrinsics. The first instruction(s) to be handled are the vector versions of count leading zeros (ctlz). The changes here are to clang so that it generates a target independent vector ctlz when it sees an ARM dependent vector ctlz. The changes in llvm are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM dependent vector ctlzs with target-independent ctlzs. There are also changes to an existing test case in llvm for ARM vector count instructions and a new test for the bitcode upgrade. <rdar://problem/11831778> There is deliberately no test for the change to clang, as so far as I know, no consensus has been reached regarding how to test neon instructions in clang; q.v. <rdar://problem/8762292> llvm-svn: 160200	2012-07-13 23:25:25 +00:00
Jack Carter	5ddcfda8ef	The Mips specific relocation R_MIPS_GOT_DISP is used in cases where global symbols are directly represented in the GOT and we use an offset into the global offset table. This patch adds direct object support for R_MIPS_GOT_DISP. llvm-svn: 160183	2012-07-13 19:15:47 +00:00
Jack Carter	2e3358a0f8	test case for revision 160084: Alignment filling between Mips function units llvm-svn: 160177	2012-07-13 18:14:01 +00:00
Duncan Sands	a9c373e49d	Restrict this to x86, hopefully fixing ARM buildbots. llvm-svn: 160163	2012-07-13 07:02:00 +00:00
Eric Christopher	bf57091f8b	The end of the prologue should be marked with is_stmt. Fixes PR13303. Patch by Paul Robinson! llvm-svn: 160148	2012-07-12 23:30:25 +00:00
Akira Hatanaka	a13cd0666e	Fix check strings in test/MC/Disassembler/Mips/* and run FileCheck. Patch by Vladimir Medic. llvm-svn: 160143	2012-07-12 21:19:32 +00:00
Benjamin Kramer	4d0916788d	Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it. I already had the necessary things in place for IR-level passes but missed the machine passes. llvm-svn: 160137	2012-07-12 18:14:57 +00:00
Nadav Rotem	fdce33a495	The LIT tests below do not specify the exact cpu model and fail on AVX2 machines, because we select different instructions such as vbroadcast, new shuffles, etc. Patch by Michael Liao. llvm-svn: 160129	2012-07-12 13:45:15 +00:00
NAKAMURA Takumi	f415fe70f3	llvm/test/CodeGen/X86/rdrand.ll: Relax expression corresponding to Win64 CC. llvm-svn: 160124	2012-07-12 10:22:57 +00:00
NAKAMURA Takumi	0b00f994a6	llvm/test/CMakeLists.txt: Add llvm-diff to deps. llvm-svn: 160123	2012-07-12 10:15:48 +00:00
Benjamin Kramer	cbac2f3bc9	Use %s instead of the explicit name, the latter doesn't work in out-of-tree builds. llvm-svn: 160120	2012-07-12 09:36:29 +00:00
Benjamin Kramer	0ab2794eda	Add intrinsics for Ivy Bridge's rdrand instruction. The rdrand/cmov sequence is the same that is emitted by both GCC and ICC. Fixes PR13284. llvm-svn: 160117	2012-07-12 09:31:43 +00:00
Duncan Sands	671cc2575d	The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of the input vector, it can be bigger (this is helpful for powerpc where <2 x i16> is a legal vector type but i16 isn't a legal type, IIRC). However this wasn't being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220. Lightly tweaked version of a patch by Michael Liao. llvm-svn: 160116	2012-07-12 09:01:35 +00:00
Craig Topper	f7755df776	Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren. llvm-svn: 160110	2012-07-12 06:52:41 +00:00
Evan Cheng	493eb32ff4	Instcombine was transforming: %shr = lshr i64 %key, 3 %0 = load i64* %val, align 8 %sub = add i64 %0, -1 %and = and i64 %sub, %shr ret i64 %and to: %shr = lshr i64 %key, 3 %0 = load i64* %val, align 8 %sub = add i64 %0, 2305843009213693951 %and = and i64 %sub, %shr ret i64 %and The demanded bit optimization is actually a pessimization because add -1 would be codegen'ed as a sub 1. Teach the demanded constant shrinking optimization to check for negated constant to make sure it is actually reducing the width of the constant. rdar://11793464 llvm-svn: 160101	2012-07-12 01:45:35 +00:00
Manman Ren	34cb93e192	ARM: Fix optimizeCompare to correctly check safe condition. It is safe if CPSR is killed or re-defined. When we are done with the basic block, check whether CPSR is live-out. Do not optimize away cmp if CPSR is live-out. llvm-svn: 160090	2012-07-11 22:51:44 +00:00
Stepan Dyatkovskiy	326edc579a	Fixed diff comparison. llvm-svn: 160076	2012-07-11 21:02:57 +00:00
Akira Hatanaka	20dced4dbb	Test case for r160036. llvm-svn: 160067	2012-07-11 19:50:46 +00:00
Manman Ren	1553ce0e81	X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair. When Movr0 is between sub and cmp, we move Movr0 before sub if it enables removal of Cmp. llvm-svn: 160066	2012-07-11 19:35:12 +00:00
Akira Hatanaka	24cf4e36e5	Implement MipsTargetLowering::LowerSELECT_CC to custom lower SELECT_CC. llvm-svn: 160064	2012-07-11 19:32:27 +00:00
Benjamin Kramer	3aab6a86a2	PR13326: Fix a subtle edge case in the udiv -> magic multiply generator. This caused 6 of 65k possible 8 bit udivs to be wrong. llvm-svn: 160058	2012-07-11 18:31:59 +00:00
Nadav Rotem	d2bdcebb14	When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers. llvm-svn: 160044	2012-07-11 13:27:05 +00:00
Akira Hatanaka	878ad8b28d	Lower RETURNADDR node in Mips backend. Patch by Sasa Stankovic. llvm-svn: 160031	2012-07-11 00:53:32 +00:00
Jack Carter	e8cb2fc616	Mips specific inline asm operand modifier 'L'. Low order register of a double word register operand. Operands are defined by the name of the variable they are marked with in the inline assembler code. This is a way to specify that the operand just refers to the low order register for that variable. It is the opposite of modifier 'D' which specifies the high order register. Example: main() { long long ll_input = 0x1111222233334444LL; long long ll_val = 3; int i_result = 0; __asm__ __volatile__( "or %0, %L1, %2" : "=r" (i_result) : "r" (ll_input), "r" (ll_val)); } Which results in: lui $2, %hi(_gp_disp) addiu $2, $2, %lo(_gp_disp) addiu $sp, $sp, -8 addu $2, $2, $25 sw $2, 0($sp) lui $2, 13107 ori $3, $2, 17476 <-- Low 32 bits of ll_input lui $2, 4369 ori $4, $2, 8738 <-- High 32 bits of ll_input addiu $5, $zero, 3 <-- Low 32 bits of ll_val addiu $2, $zero, 0 <-- High 32 bits of ll_val #APP or $3, $4, $5 <-- or i_result, high 32 ll_input, low 32 of ll_val #NO_APP addiu $sp, $sp, 8 jr $ra If no direction is given for the long long, for 32 bit variables this results in using the low 32 bits, as ll_val shows. There is an existing bug if 'L' or 'D' is used for the destination register for 32 bit long longs in that the target value will be updated incorrectly for the non-specified part unless explicitly set within the inline asm code. llvm-svn: 160028	2012-07-10 22:41:20 +00:00
Chad Rosier	3ee9a4c29e	Add newline. llvm-svn: 160006	2012-07-10 17:57:00 +00:00
Chad Rosier	579b1fee6b	Add test case accidentally omitted from r160002. llvm-svn: 160004	2012-07-10 17:49:39 +00:00
Chad Rosier	bdb08ac50a	Add support for dynamic stack realignment in the presence of dynamic allocas on X86. Basically, this is a reapplication of r158087 with a few fixes. Specifically, (1) the stack pointer is restored from the base pointer before popping callee-saved registers and (2) in obscure cases (see comments in patch) we must cache the value of the original stack adjustment in the prologue and apply it in the epilogue. rdar://11496434 llvm-svn: 160002	2012-07-10 17:45:53 +00:00
Nadav Rotem	d908ddc186	Improve the loading of load-anyext vectors by allowing the codegen to load multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the migration of bitcasts happened too late in the SelectionDAG process. llvm-svn: 159991	2012-07-10 13:25:08 +00:00
Richard Barton	1dc44dcedd	Fix instruction description of VMOV (between two ARM core registers and two single-precision registers) (and do it properly this time!) llvm-svn: 159989	2012-07-10 12:51:09 +00:00
Craig Topper	be41e2daa6	Reverse assembler/disassembler operand order for gather instructions. llvm-svn: 159983	2012-07-10 06:38:33 +00:00
Akira Hatanaka	efff7b763b	Make register Mips::RA allocatable if not in mips16 mode. llvm-svn: 159971	2012-07-10 00:19:06 +00:00
Chad Rosier	aeed158f75	Revert r159938 (and r159945) to appease the buildbots. llvm-svn: 159960	2012-07-09 20:43:34 +00:00
Owen Anderson	d4b841f8f9	Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values. Previously, this would become an integer extension operation, followed by a real integer->float conversion. llvm-svn: 159957	2012-07-09 20:31:12 +00:00
Manman Ren	5f6fa428fa	X86: implement functions to analyze & synthesize CMOV|SET|Jcc getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond No functional change intended. If we want to update the condition code of CMOV|SET|Jcc, we first analyze the opcode to get the condition code, then update the condition code, finally synthesize the new opcode from the new condition code. llvm-svn: 159955	2012-07-09 18:57:12 +00:00
Akira Hatanaka	9bf2b5677d	Reapply r158846. Access mips register classes via MCRegisterInfo's functions instead of via the TargetRegisterClasses defined in MipsGenRegisterInfo.inc. llvm-svn: 159953	2012-07-09 18:46:47 +00:00
Nuno Lopes	95cc4f3cb5	instcombine: merge the functions that remove dead allocas and dead mallocs/callocs/... This patch removes ~70 lines in InstCombineLoadStoreAlloca.cpp and makes both functions a bit more aggressive than before :) In theory, we can be more aggressive when removing an alloca than a malloc, because an alloca pointer should never escape, but we are not taking advantage of this anyway llvm-svn: 159952	2012-07-09 18:38:20 +00:00
Richard Barton	c9e1c94fae	Fix instruction description of VMOV (between two ARM core registers and two single-precision registers) llvm-svn: 159938	2012-07-09 16:41:33 +00:00
Richard Barton	35aceb86fe	Prevent ARM assembler from losing a right shift by #32 applied to a register llvm-svn: 159937	2012-07-09 16:31:14 +00:00
Richard Barton	a39625ecc6	Teach the assembler to use the narrow thumb encodings of various three-register dp instructions where permissible. llvm-svn: 159935	2012-07-09 16:12:24 +00:00
Manman Ren	bb36074047	X86: Fix optimizeCompare to correctly check safe condition. It is safe if EFLAGS is killed or re-defined. When we are done with the basic block, check whether EFLAGS is live-out. Do not optimize away cmp if EFLAGS is live-out. llvm-svn: 159888	2012-07-07 03:34:46 +00:00
Nuno Lopes	fa0dffccee	teach instcombine to remove allocated buffers even if there are stores, memcpy/memmove/memset, and objectsize users. This means we can do cheap DSE for heap memory. Nothing is done if the pointer escapes or has a load. The churn in the tests is mostly due to objectsize, since we want to make sure we don't delete the malloc call before evaluating the objectsize (otherwise it becomes -1/0) llvm-svn: 159876	2012-07-06 23:09:25 +00:00
Akira Hatanaka	b577ff116d	revert r159851. llvm-svn: 159854	2012-07-06 20:16:48 +00:00
Akira Hatanaka	cfa35fa0ff	Reapply r158846. Include file MipsGenRegisterInfo.inc. llvm-svn: 159851	2012-07-06 19:29:11 +00:00
Manman Ren	c965673707	X86: peephole optimization to remove cmp instruction For each Cmp, we check whether there is an earlier Sub which make Cmp redundant. We handle the case where SUB operates on the same source operands as Cmp, including the case where the two source operands are swapped. llvm-svn: 159838	2012-07-06 17:36:20 +00:00
Chad Rosier	88d53eae56	[fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic. llvm-svn: 159837	2012-07-06 17:33:39 +00:00
Duncan Sands	c65aa3f6ae	Attempt to fix windows buildbots. Patch by James Benton. llvm-svn: 159826	2012-07-06 14:43:16 +00:00
NAKAMURA Takumi	4f934676fb	test/CodeGen/X86/sext-setcc-self.ll: Mark it as XFAIL: cygwin,mingw32,win32. Investigating. llvm-svn: 159820	2012-07-06 12:12:39 +00:00
NAKAMURA Takumi	0246724cd6	Revert r159804, "[arm-fast-isel] Add support for vararg function calls." It broke LLVM :: CodeGen/Thumb2/large-call.ll on several hosts. llvm-svn: 159817	2012-07-06 11:12:44 +00:00
Alexey Samsonov	39602781f6	Fix PR13202 and a regtest. DwarfDebug class could generate the same (inlined) DIVariable twice: 1) when trying to find abstract debug variable for a concrete inlined instance. 2) when explicitly collecting info for variables that were optimized out. This change makes sure that this duplication won't happen and makes Clang pass "gdb.opt/inline-locals" test from gdb testsuite. Reviewed by Eric Christopher. llvm-svn: 159811	2012-07-06 08:45:08 +00:00
Jush Lu	5e6e6264f4	[arm-fast-isel] Add support for vararg function calls. llvm-svn: 159804	2012-07-06 03:02:37 +00:00
Jack Carter	b2af512cef	Mips specific inline asm operand modifier D. Print the second half of a double word operand. The include list was cleaned up a bit as well. Also the test case was modified to test for both big and little patterns. llvm-svn: 159787	2012-07-05 23:58:21 +00:00
Akira Hatanaka	bbf374c4c6	test case for r159770. llvm-svn: 159771	2012-07-05 19:29:31 +00:00
Duncan Sands	0552a2cad2	Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1 booleans. Patch by James Benton. llvm-svn: 159739	2012-07-05 09:32:46 +00:00
Jakob Stoklund Olesen	2dee812445	Ensure CopyToReg nodes are always glued to the call instruction. The CopyToReg nodes that set up the argument registers before a call must be glued to the call instruction. Otherwise, the scheduler may emit the physreg copies long before the call, causing long live ranges for the fixed registers. Besides disabling good register allocation, that can also expose problems when EmitInstrWithCustomInserter() splits a basic block during the live range of a physreg. llvm-svn: 159721	2012-07-04 19:28:31 +00:00
Rafael Espindola	1a7cf13215	Add a testcase for pr13209. It is not a great test, but it still fails if 159509 and 159479 are reverted. It would be really nice to be able to run just the coalescer :-( llvm-svn: 159715	2012-07-04 16:06:00 +00:00
Jakob Stoklund Olesen	49e4d4b3ef	Add early if-conversion support to X86. Implement the TII hooks needed by EarlyIfConversion to create cmov instructions and estimate their latency. Early if-conversion is still not enabled by default. llvm-svn: 159695	2012-07-04 00:09:58 +00:00
Nuno Lopes	1e8dffdf27	BoundsChecking: optimize out the check for offset < 0 if size is known to be >= 0 (signed). (LLVM optimizers cannot do this optimization by themselves) llvm-svn: 159668	2012-07-03 17:30:18 +00:00
Craig Topper	676dcd8c39	Add aliases for pblendvb, blendvpd, and blendvps instructions with the implicit xmm0 operand specified. Fixes PR13252. llvm-svn: 159644	2012-07-03 05:49:45 +00:00
NAKAMURA Takumi	2338556320	test/CodeGen/SPARC/private.ll: Fixup. Forgot to prune old RUN lines. llvm-svn: 159643	2012-07-03 04:29:20 +00:00
NAKAMURA Takumi	c2a5bd6822	test/CodeGen/SPARC/private.ll: FileCheck-ize. llvm-svn: 159642	2012-07-03 04:21:57 +00:00
NAKAMURA Takumi	2a4930c96a	llvm/test/lit.cfg: Retweak for Win32 to fix testing. - execute_external should be; - Not on Win32. - Using bash. In reverse, "execute_internal" should be (Win32 && !bash). - lit.getBashPath() behaves differently before and after tweaking $PATH. I will add a few explanations there later. llvm-svn: 159641	2012-07-03 03:59:34 +00:00
NAKAMURA Takumi	dff1a78321	test/CodeGen/X86/sincos.ll: FileCheck-ize. llvm-svn: 159639	2012-07-03 03:59:22 +00:00
NAKAMURA Takumi	10dc235746	test/CodeGen/X86/fabs.ll: FileCheck-ize. llvm-svn: 159638	2012-07-03 03:59:15 +00:00
NAKAMURA Takumi	ff680b1db6	test/CodeGen/X86/2007-09-05-InvalidAsm.ll: FileCheck-ize. llvm-svn: 159637	2012-07-03 03:59:08 +00:00
NAKAMURA Takumi	e5e19e4f7b	test/CodeGen/X86/2004-03-30-Select-Max.ll: FileCheck-ize. llvm-svn: 159636	2012-07-03 03:58:59 +00:00
Jack Carter	b353094f27	mips32 long long register inline asm constraint support. inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed. This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll) llvm-svn: 159625	2012-07-02 23:35:23 +00:00
Eric Christopher	dfc3e68c40	Revert " mips32 long long register inline asm constraint support." as it appears to be breaking the bots. This reverts commit 1b055ce320fa13f6f1ac81670d11b45e01f79876. llvm-svn: 159619	2012-07-02 23:22:25 +00:00
Eric Christopher	b65acc61a5	Revert "IntRange:" as it appears to be breaking self hosting. This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c. llvm-svn: 159618	2012-07-02 23:22:21 +00:00
Jack Carter	939236c2eb	deleted test/CodeGen/Mips/inlineasm-cnstrnt-bad-r-1.ll llvm-svn: 159617	2012-07-02 23:21:22 +00:00
Jack Carter	5c1a01a625	mips32 long long register inline asm constraint support. inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed. This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll) llvm-svn: 159610	2012-07-02 22:39:45 +00:00
Chandler Carruth	a7f1f35eb8	Extend the workaround from r159593 to cover a few explicit alias targets. llvm-svn: 159597	2012-07-02 21:45:22 +00:00
Chandler Carruth	aec961811b	Revert r159588, and apply a more principled fix. Place the fix for this in the abstraction for lit test suites so that the various other layers of abstraction pick up the same behavioral fix, and so that we still get a complete list of dependencies for the 'check-all' target. This should fix the follow-on issues of the same nature with various other build targets, including Clang targets. Sorry for the churn, and again thanks to Matt for testing and breaking this more thoroughly. llvm-svn: 159593	2012-07-02 21:31:03 +00:00
Chandler Carruth	6e80d5934d	Work around a really frustrating apparent CMake bug. No functionality changed here, except that the CMake installed by default on Ubuntu Lucid should actually work with the makefile generators now. Thanks to Matt for the report and head-desking required to figure out why it was failing. llvm-svn: 159588	2012-07-02 21:14:06 +00:00
Jack Carter	06de0fb083	Pass the correct ELFOSABI enumeration to the MipsELFObjectWriter constructor Contributor: Sasa Stankovic llvm-svn: 159574	2012-07-02 20:04:43 +00:00
Bob Wilson	cac3b90633	Extend TargetPassConfig to allow running only a subset of the normal passes. This is still a work in progress but I believe it is currently good enough to fix PR13122 "Need unit test driver for codegen IR passes". For example, you can run llc with -stop-after=loop-reduce to have it dump out the IR after running LSR. Serializing machine-level IR is not yet supported but we have some patches in progress for that. The plan is to serialize the IR to a YAML file, containing separate sections for the LLVM IR, machine-level IR, and whatever other info is needed. Chad suggested that we stash the stop-after pass in the YAML file and use that instead of the start-after option to figure out where to restart the compilation. I think that's a great idea, but since it's not implemented yet I put the -start-after option into this patch for testing purposes. llvm-svn: 159570	2012-07-02 19:48:45 +00:00
Chandler Carruth	ff123d5c63	Fix the remaining TCL-style quotes found in the testsuite. This is another mechanical change accomplished through the power of terrible Perl scripts. I have manually switched some "s to 's to make escaping simpler. While I started this to fix tests that aren't run in all configurations, the massive number of tests is due to a really frustrating fragility of our testing infrastructure: things like 'grep -v', 'not grep', and 'expected failures' can mask broken tests all too easily. Essentially, I'm deeply disturbed that I can change the testsuite so radically without causing any change in results for most platforms. =/ llvm-svn: 159547	2012-07-02 19:09:46 +00:00
Duncan Sands	e8ce94fcd7	GlobalOpt forgot to handle bitcast when analyzing globals. Found by inspection. llvm-svn: 159546	2012-07-02 18:55:39 +00:00
Chandler Carruth	5da53436d5	Convert the uses of '|&' to use '2>&1 |' instead, which works on old versions of Bash. In addition, I can back out the change to the lit built-in shell test runner to support this. This should fix the majority of fallout on Darwin, but I suspect there will be a few straggling issues. llvm-svn: 159544	2012-07-02 18:37:59 +00:00
Bob Wilson	2297221028	Do not attempt to use ROR for Thumb1. Patch by Matt Fischer! llvm-svn: 159538	2012-07-02 17:22:47 +00:00
Nuno Lopes	d0bcfe4d9d	fix the regression I introduced in r159385 (it's necessary to update PHI nodes in unwind BB) llvm-svn: 159534	2012-07-02 16:14:47 +00:00
Chandler Carruth	665c76bc52	The built-in shell test runner for some reason doesn't like the quoting and multi-line nature of this test. I don't really feel like bugging this kind of edge-case, so just put it on one line and use single quotes. With this, every test really passes with the built-in shell test runner. llvm-svn: 159530	2012-07-02 13:35:01 +00:00
Chandler Carruth	872ac7cfad	Fix the TCL-style quoting in one random test that somehow slipped through my perl nets. With this, the test suite passes even if I force it to run with the built-in shell test logic, except for a test which REQUIREs shell. llvm-svn: 159529	2012-07-02 13:29:47 +00:00
Stepan Dyatkovskiy	8b9ecca42d	IntRange: - Changed isSingleNumber method behaviour. Now this flag is calculated on demand. IntegersSubsetMapping - Optimized diff operation. - Replaced type of Items field from std::list with std::map. - Added new methods: bool isOverlapped(self &RHS) void add(self& RHS, SuccessorClass S) void detachCase(self& NewMapping, SuccessorClass Succ) void removeCase(SuccessorClass Succ) SuccessorClass findSuccessor(const IntTy& Val) const IntTy* getCaseSingleNumber(SuccessorClass *Succ) IntegersSubsetTest - DiffTest: Added checks for successors. SimplifyCFG Updated SwitchInst usage (now it is case-ranges compatible) for - SimplifyEqualityComparisonWithOnlyPredecessor - FoldValueComparisonIntoPredecessors llvm-svn: 159527	2012-07-02 13:02:18 +00:00
Chandler Carruth	a5a29f970e	Convert all tests using TCL-style quoting to use shell-style quoting. This was done through the aid of a terrible Perl creation. I will not paste any of the horrors here. Suffice to say, it required multiple staged rounds of replacements, state carried between, and a few nested-construct-parsing hacks that I'm not proud of. It happens, by luck, to be able to deal with all the TCL-quoting patterns in evidence in the LLVM test suite. If anyone is maintaining large out-of-tree test trees, feel free to poke me and I'll send you the steps I used to convert things, as well as answer any painful questions etc. IRC works best for this type of thing I find. Once converted, switch the LLVM lit config to use ShTests the same as Clang. In addition to being able to delete large amounts of Python code from 'lit', this will also simplify the entire test suite and some of lit's architecture. Finally, the test suite runs 33% faster on Linux now. ;] For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s llvm-svn: 159525	2012-07-02 12:47:22 +00:00
Chandler Carruth	0a4a261365	Make tests which first provide a negative assertion via 'not', then a pipeline, and then a positive assertion via grep, use two RUN lines instead. Supporting these complex ideas of 'success' and 'failure' across multiple stages of a pipeline is brittle in the shell world, and would block switching to ShTest format; it only worked due to contrivances introduced by the TclTest format. Writing this as two separate RUN lines seems clearer in any event. This is another step toward completely removing TclTests from lit. llvm-svn: 159524	2012-07-02 12:23:19 +00:00
Chandler Carruth	ae00a80869	Rewrite three tests that had truly egregious abuses of 'grep' in them to use FileCheck. Aside from removing a dependence on TCL-style quoting, this also makes the tests ... significantly more robust. =] It would be really, really great if the maintainer(s) of the CellSPU backend went through and systematically rewrote these tests to use FileCheck. There are a lot more that have nearly this bad of abuses. Another step along the path to a TclTest-free testsuite. llvm-svn: 159523	2012-07-02 12:20:14 +00:00
Chandler Carruth	8bdfe1ec92	Switch a bunch of Linker tests from using elaborate echo productions to just provide and reference separate input files from an Inputs subdirectory. This pattern works very well in the Clang tree and is easier to understand in my opinion. It also has fewer limitations and will remove one particularly annoying use of TCL-style {} quoting from the testsuite. Also teach the LLVM lit configuration to avoid recursing into 'Inputs' subdirectories. This wasn't required for the previous 'Inputs' subdirectories used due to fortuitous suffix patterns. This is the first step to completely removing support for TCL-style tests. llvm-svn: 159520	2012-07-02 10:18:06 +00:00
Alexey Samsonov	f4462fa3ca	This patch extends the libLLVMDebugInfo which contains a minimalistic DWARF parser: 1) DIContext is now able to return function name for a given instruction address (besides file/line info). 2) llvm-dwarfdump accepts flag --functions that prints the function name (if address is specified by --address flag). 3) test case that checks the basic functionality of llvm-dwarfdump added llvm-svn: 159512	2012-07-02 05:54:45 +00:00
Rafael Espindola	a77d31d7fd	Now that RegistersDefinedFromSameValue handles one instruction being an implicit_def, the other instruction can be anything, including instructions that define multiple values. Be careful about that and don't assume what operand 0 is. Fixes pr13249. llvm-svn: 159509	2012-07-01 17:08:01 +00:00
Elena Demikhovsky	9af899fa88	Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2. llvm-svn: 159504	2012-07-01 06:12:26 +00:00
Chandler Carruth	69ce6652b8	Hoist LLVM's lit testsuite infrastructure into module so that it can be re-used. Also, build in direct support for accumulating a set of lit parameters, arguments, and testsuites to run as part of a 'check-all' rule. This sinks 'check-all' from a Clang-specific construct to a generic construct of the project. llvm-svn: 159482	2012-06-30 10:14:14 +00:00
Jakob Stoklund Olesen	3e3cdecf98	Clear kill flags in InstrEmitter::EmitSubregNode(). When a local virtual register is made global, make sure to clear any existing kill flags. llvm-svn: 159461	2012-06-29 21:00:03 +00:00
Duncan Sands	369c6d270b	Fix a reassociate crash on sozefx when compiling with dragonegg+gcc-4.7 due to the optimizers producing a multiply expression with more multiplications than the original (!). llvm-svn: 159426	2012-06-29 13:25:06 +00:00
Rafael Espindola	efdfb1e6b2	In the initial exec mode we always do a load to find the address of a variable. Before this patch in pic 32 bit code we would add the global base register and not load from that address. This is a really old bug, but before the introduction of the tls attributes we would never select initial exec for pic code. llvm-svn: 159409	2012-06-29 04:22:35 +00:00
Manman Ren	98a5bf24a9	X86: add more GATHER intrinsics in LLVM Corrected type for index of llvm.x86.avx2.gather.d.pd.256 from 256-bit to 128-bit. Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256 from 256-bit to 128-bit. Support the following intrinsics: llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256 llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256 llvm-svn: 159402	2012-06-29 00:54:20 +00:00
Chandler Carruth	0cb6c4bdcc	Remove a completely unnecessary mkdir from the CMake build. Clang has been getting along fine without this for quite some time. llvm-svn: 159400	2012-06-29 00:45:57 +00:00
Nick Lewycky	474112d82c	If the step value is a constant zero, the loop isn't going to terminate. Fixes the assert reported in PR13228! llvm-svn: 159393	2012-06-28 23:44:57 +00:00
Nuno Lopes	2f49284f12	make the verifier accept @llvm.donothing as the only intrinsic that can be invoked While at it, merge 2 tests and FileCheckize them llvm-svn: 159388	2012-06-28 22:57:00 +00:00
Nuno Lopes	b97a4e8bc2	make simplifyCFG erase invokes to readonly/readnone functions llvm-svn: 159385	2012-06-28 22:32:27 +00:00
Nuno Lopes	9ac4661afa	make instcombine produce calls to llvm.donothing instead of a random intrinsic llvm-svn: 159384	2012-06-28 22:31:24 +00:00
Nuno Lopes	ec9653b363	add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it llvm-svn: 159383	2012-06-28 22:30:12 +00:00
Nuno Lopes	8650fb8e0e	make LazyValueInfo analyze the default case of switch statements (we know that in the default branch the value cannot be any of the switch cases) llvm-svn: 159353	2012-06-28 16:13:37 +00:00
Chandler Carruth	3511dd30c8	Move the setup for variables that are expanded in the lit.site.cfg into a dedicated helper function. This will enable re-using the same logic for Clang's lit setup, etc. llvm-svn: 159333	2012-06-28 06:36:24 +00:00
Hal Finkel	f2dcb9a9c4	Allow BBVectorize to form non-2^n-length vectors. The original algorithm only used recursive pair fusion of equal-length types. This is now extended to allow pairing of any types that share the same underlying scalar type. Because we would still generally prefer the 2^n-length types, those are formed first. Then a second set of iterations form the non-2^n-length types. Also, a call to SimplifyInstructionsInBlock has been added after each pairing iteration. This takes care of DCE (and a few other things) that make the following iterations execute somewhat faster. For the same reason, some of the simple shuffle-combination cases are now handled internally. There is some additional refactoring work to be done, but I've had many requests for this feature, so additional refactoring will come soon in future commits (as will additional test cases). llvm-svn: 159330	2012-06-28 05:42:42 +00:00
Jack Carter	6c0bc0b378	The Mips specific inline asm operand modifier 'z' has the following description in the gnu sources: Print $0 if operand is zero otherwise print the op normally. llvm-svn: 159324	2012-06-28 01:33:40 +00:00
Nuno Lopes	e6e049020b	make LVI::getEdgeValue() always intersect the constraints of the edge with the range of the block. Previously it was only performing the intersection for a few cases, thus losing precision llvm-svn: 159320	2012-06-28 01:16:18 +00:00
Chandler Carruth	bf2b400f3b	Remove 'site.exp' building from both CMake and configure+make. This is another vestige of the DejaGNU roots. There were FIXMEs in the lit setup to add a 'lit.site.cfg', which has been around for quite some time now, so I've properly switched the handling of the 4 things actually used in site.exp to go through lit.site.cfg now. No more parsing of the .exp file, one fewer configure-style generated file, etc., etc. llvm-svn: 159313	2012-06-28 00:16:51 +00:00
Chandler Carruth	fd3a5e33d5	Remove the last vestiges of the '-lit' and '-dg' test runner split by removing '-lit' qualifiers from make rules. I've left a legacy 'check-local-lit' rule in case build scripts have this encoded somewhere. llvm-svn: 159311	2012-06-28 00:03:15 +00:00
Chandler Carruth	256d3a9eaa	Rip out legacy DejaGNU support from our Makefiles. This hasn't been the default in forever, and hasn't even worked since most of the .exp files were removed. llvm-svn: 159307	2012-06-27 23:48:39 +00:00
Chandler Carruth	b5c1a2b87c	LLVM-GCC is dead. Really. I promise. ;] More importantly, these files don't even have the variable that these lines purport to substitute. llvm-svn: 159304	2012-06-27 23:34:25 +00:00
Jack Carter	ef40238a0e	This allows hello world to be compiled for Mips 64 direct object. It takes advantage of r159299 which introduces relocation support for N64. elf-dump needed to be upgraded to support N64 relocations as well. This passes make check. Jack llvm-svn: 159302	2012-06-27 23:13:42 +00:00
Jack Carter	b9f9de93df	This allows hello world to be compiled for Mips 64 direct object. It takes advantage of r159299 which introduces relocation support for N64. elf-dump needed to be upgraded to support N64 relocations as well. This passes make check. Jack llvm-svn: 159301	2012-06-27 22:48:25 +00:00
Matt Beaumont-Gay	a58862310c	Revert r159136 due to PR13124. Original commit message: If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it hidden. Being linkonce_odr guarantees that it is available in every dso that needs it. Being a constant/function with unnamed_addr guarantees that the copies don't have to be merged. llvm-svn: 159272	2012-06-27 17:10:33 +00:00
Duncan Sands	514db117bd	Some reassociate optimizations create new instructions, which they insert just before the expression root. Any existing operators that are changed to use one of them need to be moved between it and the expression root, and recursively for the operators using that one. When I rewrote RewriteExprTree I accidentally inverted the logic, resulting in the compacting going down from operators to operands rather than up from operands to the operators using them, oops. Fix this, resolving PR12963. llvm-svn: 159265	2012-06-27 14:19:00 +00:00
Richard Barton	57b7d16e34	Teach assembler to handle capitalised operation values for DSB instructions llvm-svn: 159259	2012-06-27 09:48:23 +00:00
Chandler Carruth	aa324c9078	Clean up the 'check' CMake build rule a bit, notably renaming it to 'check-llvm'. Don't worry! 'check' still works! =] To rationalize the names of targets used to run tests, the vague plan is the following: make check-llvm # run LLVM reg/unit tests (currently 'check') make check-clang # run Clang reg/unit tests (currently 'clang-test') make check-rt # run CompilerRT reg/unit tests make check-asan # run ASan reg/unit tests (subset of -rt) make check-tsan # run TSan reg/unit tests (subset of -rt) make check-all # run as much of the above as is available The last one respects what projects are checked out and built for a given tree. Personally, I would like to eventually make 'check' be an alias for 'check-all'. For now however, it is an alias for 'check-llvm', and thus no behavior has changed. While this patch and my plan only really apply to CMake, I think it might be good to similarly rationalize the naming scheme for the Make builds. llvm-svn: 159258	2012-06-27 09:44:16 +00:00
Akira Hatanaka	ad31cd9a01	Test case for r159240. llvm-svn: 159242	2012-06-27 00:40:34 +00:00
Evan Cheng	319be53a1f	Remove an instcombine transform that (no longer?) makes sense: // C - zext(bool) -> bool ? C - 1 : C if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1)) if (ZI->getSrcTy()->isIntegerTy(1)) return SelectInst::Create(ZI->getOperand(0), SubOne(C), C); This ends up forming sext i1 instructions that codegen to terrible code. e.g. int blah(_Bool x, _Bool y) { return (x - y) + 1; } => movzbl %dil, %eax movzbl %sil, %ecx shll $31, %ecx sarl $31, %ecx leal 1(%rax,%rcx), %eax ret Without the rule, llvm now generates: movzbl %sil, %ecx movzbl %dil, %eax incl %eax subl %ecx, %eax ret It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-). The transformation was done as part of Eli's r75531. He has given the ok to remove it. rdar://11748024 llvm-svn: 159230	2012-06-26 22:03:13 +00:00
Rafael Espindola	e0eaa043eb	Fix llc's -print-before=pass and -print-after=pass. llvm-svn: 159227	2012-06-26 21:33:36 +00:00
Manman Ren	a09820414a	X86: add GATHER intrinsics (AVX2) in LLVM Support the following intrinsics: llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256 llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256 Modified Disassembler to handle VSIB addressing mode. llvm-svn: 159221	2012-06-26 19:47:59 +00:00
Jack Carter	5e69cffed5	There are a number of generic inline asm operand modifiers that up to r158925 were handled as processor specific. Making them generic and putting tests for these modifiers in the CodeGen/Generic directory caused a number of targets to fail. This commit addresses that problem by having the targets call the generic routine for generic modifiers that they don't currently have explicit code for. For now only generic print operands 'c' and 'n' are supported. Affected files: test/CodeGen/Generic/asm-large-immediate.ll lib/Target/PowerPC/PPCAsmPrinter.cpp lib/Target/NVPTX/NVPTXAsmPrinter.cpp lib/Target/ARM/ARMAsmPrinter.cpp lib/Target/XCore/XCoreAsmPrinter.cpp lib/Target/X86/X86AsmPrinter.cpp lib/Target/Hexagon/HexagonAsmPrinter.cpp lib/Target/CellSPU/SPUAsmPrinter.cpp lib/Target/Sparc/SparcAsmPrinter.cpp lib/Target/MBlaze/MBlazeAsmPrinter.cpp lib/Target/Mips/MipsAsmPrinter.cpp MSP430 isn't represented because it did not even run with the long existing 'c' modifier and it was not apparent what needs to be done to get it inline asm ready. Contributor: Jack Carter llvm-svn: 159203	2012-06-26 13:49:27 +00:00
Duncan Sands	8bc764aeca	Replacing zero-sized alloca's with a null pointer is too aggressive, instead merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS conformance testsuite. What happened there was that a variable sized object was being allocated on the stack, "alloca i8, i32 %size". It was then being passed to another function, which tested that the address was not null (raising an exception if it was) then manipulated %size bytes in it (load and/or store). The optimizers cleverly managed to deduce that %size was zero (congratulations to them, as it isn't at all obvious), which made the alloca zero size, causing the optimizers to replace it with null, which then caused the check mentioned above to fail, and the exception to be raised, wrongly. Note that no loads and stores were actually being done to the alloca (the loop that does them is executed %size times, i.e. is not executed), only the not-null address check. llvm-svn: 159202	2012-06-26 13:39:21 +00:00
Elena Demikhovsky	26088d2e24	Shuffle optimization for AVX/AVX2. The current patch optimizes frequently used shuffle patterns and yields the following instruction sequence reduction. Before: vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3] vpermilps $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3] vextractf128 $1, %ymm1, %xmm1 vextractf128 $1, %ymm0, %xmm0 vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3] vpermilps $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3] vinsertf128 $1, %xmm0, %ymm2, %ymm0 After: vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4] vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4] vunpcklps %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5] llvm-svn: 159188	2012-06-26 08:04:10 +00:00
Craig Topper	94bf0f3855	Remove some duplicate instructions that exist only to give different mnemonics for the assembler. Use InstAlias instead. llvm-svn: 159184	2012-06-26 04:12:49 +00:00
Andrew Trick	fb2ba3e1cb	Enable the new LoopInfo algorithm by default. The primary advantage is that loop optimizations will be applied in a stable order. This helps debugging and unit test creation. It is also a better overall implementation without pathologically bad performance on deep functions. On large functions (llvm-stress --size=200000 | opt -loops) Before: 0.1263s After: 0.0225s On deep functions (after tweaking llvm-stress, thanks Nadav): Before: 0.2281s After: 0.0227s See r158790 for more comments. The loop tree is now consistently generated in forward order, but loop passes are applied in reverse order over the program. If we have a loop optimization that prefers forward order, that can easily be achieved by adding a different type of LoopPassManager. llvm-svn: 159183	2012-06-26 04:11:38 +00:00
Eli Friedman	bbcd09cc00	Make some ugly hacks for inline asm operands which name a specific register a bit more thorough. PR13196. llvm-svn: 159176	2012-06-25 23:42:33 +00:00
Nuno Lopes	31b54a5379	revert my previous commit (r159173), since as Eli pointed out, it's perfectly ok to mark realloc as noalias llvm-svn: 159175	2012-06-25 23:26:10 +00:00
Nuno Lopes	75eaa72de9	do not set realloc() as NotAlias, since it can return the same pointer. This whole thing should be upgraded to use the MemoryBuiltin interface anyway.. llvm-svn: 159173	2012-06-25 22:55:50 +00:00
Manman Ren	606953fbe7	ARM: update peephole optimization. More condition codes are included when deciding whether to remove cmp after a sub instruction. Specifically, we extend from GE|LT|GT|LE to GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we should be able to replace with "sub a, b; movls". rdar: 11725965 llvm-svn: 159166	2012-06-25 21:49:38 +00:00
Dan Gohman	5f725cd196	Fix the objc_autoreleasedReturnValue optimization code to locate the call correctly even in the case where it is an invoke. This fixes rdar://11714057. llvm-svn: 159157	2012-06-25 19:47:37 +00:00
Jakob Stoklund Olesen	a57fc12ec9	Enforce stricter liveness rules for PHIs. Verify that all paths from the entry block to a virtual register read pass through a def. Enable this check even when MRI->isSSA() is false. Verify that the live range of a virtual register is live out of all predecessor blocks, even for PHI-values. This requires that PHIElimination sometimes inserts IMPLICIT_DEF instruction in predecessor blocks. llvm-svn: 159150	2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen	eb49566447	Run ProcessImplicitDefs on SSA form where it can be much simpler. Implicitly defined virtual registers can simply have the <undef> bit set on all uses, and copies can be turned into implicit defs recursively. Physical registers are a bit trickier. We handle the common case where a physreg def is used by a nearby instruction in the same basic block. For more complicated cases, just leave the IMPLICIT_DEF instruction in. llvm-svn: 159149	2012-06-25 18:12:18 +00:00
Nuno Lopes	07594cba7c	improve optimization of invoke instructions: - simplifycfg: invoke undef/null -> unreachable - instcombine: invoke new -> invoke expect(0, 0) (an arbitrary NOOP intrinsic; only done if the allocated memory is unused, of course) - verifier: allow invoke of intrinsics (to make the previous step work) llvm-svn: 159146	2012-06-25 17:11:47 +00:00
Meador Inge	fc2fb711e8	PR13013: ELF Type identification fails for MSB type ELF files. Fix 'sys::IdentifyFileType' to work with big and little endian byte orderings when reading the ELF object file type. Initial patch by Stefan Hepp. llvm-svn: 159138	2012-06-25 14:48:43 +00:00
Rafael Espindola	540c3d23df	If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it hidden. Being linkonce_odr guarantees that it is available in every dso that needs it. Being a constant/function with unnamed_addr guarantees that the copies don't have to be merged. llvm-svn: 159136	2012-06-25 14:30:31 +00:00
Jakob Stoklund Olesen	2e22e6a361	%RCX is not a function live-out in eh.return functions. The function live-out registers must be live at all function returns, and %RCX is only used by eh.return. When a function also has a normal return, only %RAX holds a return value. This fixes PR13188. llvm-svn: 159116	2012-06-24 15:53:01 +00:00
Hal Finkel	3099ce9489	Allow controlling vectorization of boolean values separately from other integer types. These are used as the result of comparisons, and often handled differently from larger integer types. llvm-svn: 159111	2012-06-24 13:28:01 +00:00
Nick Lewycky	0a045bbe4e	Remove dyn_cast + dereference pattern by replacing it with a cast and changing the safety check to look for the same type we're going to actually cast to. Fixes PR13180! llvm-svn: 159110	2012-06-24 10:15:42 +00:00
Nick Lewycky	bfb07fb562	Remove a dangling reference to a deleted instruction. Fixes PR13185! llvm-svn: 159096	2012-06-24 01:44:08 +00:00
Pete Cooper	fe212e762f	DAG legalisation can now handle illegal fma vector types by scalarisation llvm-svn: 159092	2012-06-24 00:05:44 +00:00
Hal Finkel	4b06b1a0ee	Allow BBVectorize to fuse compare instructions. llvm-svn: 159088	2012-06-23 21:52:50 +00:00
Marshall Clow	78ade1dd08	Add relocation types for Hexagon processor; patch by Sidney Manning <sidneym@codeaurora.org> llvm-svn: 159081	2012-06-23 14:46:18 +00:00
Hans Wennborg	cbe34b4cc9	Extend the IL for selecting TLS models (PR9788) This allows the user/front-end to specify a model that is better than what LLVM would choose by default. For example, a variable might be declared as @x = thread_local(initialexec) global i32 42 if it will not be used in a shared library that is dlopen'ed. If the specified model isn't supported by the target, or if LLVM can make a better choice, a different model may be used. llvm-svn: 159077	2012-06-23 11:37:03 +00:00
Rafael Espindola	a3088f09b3	Handle aliases to tls variables in all architectures, not just x86. llvm-svn: 159058	2012-06-23 00:30:03 +00:00
Evan Cheng	68c2f9a9a7	(sub X, imm) gets canonicalized to (add X, -imm) There are patterns to handle immediates when they fit in the immediate field. e.g. %sub = add i32 %x, -123 => sub r0, r0, #123 Add patterns to catch immediates that do not fit but should be materialized with a single movw instruction rather than movw + movt pair. e.g. %sub = add i32 %x, -65535 => movw r1, #65535 sub r0, r0, r1 rdar://11726136 llvm-svn: 159057	2012-06-23 00:29:06 +00:00
Jim Grosbach	087affe2f3	ARM: Add a better diagnostic for some out of range immediates. As an example of how the custom DiagnosticType can be used to provide better operand-mismatch diagnostics, add a custom diagnostic for the imm0_15 operand class used for several system instructions. Update the tests to expect the improved diagnostic. rdar://8987109 llvm-svn: 159051	2012-06-22 23:56:48 +00:00
Hal Finkel	460e94d842	Add support for the PPC isel instruction. The isel (integer select) instruction is supported on the 440 and A2 embedded cores and on the POWER7. llvm-svn: 159045	2012-06-22 23:10:08 +00:00
Chad Rosier	1ce3805b23	FileCheckize tests. llvm-svn: 159044	2012-06-22 23:04:02 +00:00
Lang Hames	c98ebda325	Rename fp-op fusion option (yet again) for compatibility with GCC option. llvm-svn: 159042	2012-06-22 22:31:00 +00:00
Evan Cheng	f5bd6c6510	EmitZerofill should take a 64-bit size or else it's chopping off large zero-filled global. rdar://11729134 llvm-svn: 159023	2012-06-22 20:14:46 +00:00
Jakob Stoklund Olesen	c5c4e96f3e	Revert remaining part of r93200: "Disable folding sext(trunc(x)) -> x" This fixes PR5997. These transforms were disabled because codegen couldn't deal with other uses of trunc(x). This is now handled by the peephole pass. This causes no regressions on x86-64. llvm-svn: 159003	2012-06-22 16:36:43 +00:00
NAKAMURA Takumi	c384b95939	test/CodeGen/Generic/asm-large-immediate.ll: Mark it as XFAIL: powerpc, possibly due to r158939. llvm-svn: 158994	2012-06-22 13:41:00 +00:00
Jakob Stoklund Olesen	321d41a871	Functions calling __builtin_eh_return must have a frame pointer. The code in X86TargetLowering::LowerEH_RETURN() assumes that a frame pointer exists, but the frame pointer was forced by the presence of llvm.eh.unwind.init which isn't guaranteed. If llvm.eh.unwind.init is actually required in functions calling eh.return (is it?), we should diagnose that instead of emitting bad machine code. This should fix the dragonegg-x86_64-linux-gcc-4.6-test bot. llvm-svn: 158961	2012-06-22 03:04:27 +00:00
Andrew Trick	3ccb1b8cf9	ARM scheduling fix: compute predicated implicit use properly. Minor drive by fix to cleanup latency computation. Calling getOperandLatency with a deliberately incorrect operand index does not give you the latency you want. llvm-svn: 158959	2012-06-22 02:50:31 +00:00
Nick Lewycky	33da33676f	Emit relocations for DW_AT_location entries on systems which need it. This is a recommit of r127757. Fixes PR9493. Patch by Paul Robinson! llvm-svn: 158957	2012-06-22 01:25:12 +00:00
Lang Hames	b8650f106a	Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a boolean flag to an enum: { Fast, Standard, Strict } (default = Standard). This option controls the creation by optimizations of fused FP ops that store intermediate results in higher precision than IEEE allows (e.g. FMAs). The behavior of this option is intended to match the behavior specified by a soon-to-be-introduced frontend flag: '-ffuse-fp-ops'. Fast mode - allows formation of fused FP ops whenever they're profitable. Standard mode - allow fusion only for 'blessed' FP ops. At present the only blessed op is the fmuladd intrinsic. In the future more blessed ops may be added. Strict mode - allow fusion only if/when it can be proven that the excess precision won't affect the result. Note: This option only controls formation of fused ops by the optimizers. Fused operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic) will always be honored, regardless of the value of this option. Internally TargetOptions::AllowExcessFPPrecision has been replaced by TargetOptions::AllowFPOpFusion. llvm-svn: 158956	2012-06-22 01:09:09 +00:00
Nuno Lopes	771e7bd4ba	instcombine: disable optimization of 'invoke null/undef'. I'll move this functionality to SimplifyCFG (since we cannot make changes to the CFG here). Fixes the crashes with the attached test case llvm-svn: 158951	2012-06-21 23:52:14 +00:00
Evan Cheng	32c7cc8ec9	Look past zext to strength reduce an udiv. Patch by David Majnemer. rdar://11721329 llvm-svn: 158946	2012-06-21 22:52:49 +00:00
Jack Carter	c457f62033	The inline asm operand modifier 'n' is supposed to be generic across architectures. It has the following description in the gnu sources: Negate the immediate constant Several Architectures such as x86 have local implementations of operand modifier 'n' which go beyond the above description slightly. This won't affect them. Affected files: lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp Added 'n' to the switch cases. test/CodeGen/Generic/asm-large-immediate.ll Generic compiled test (x86 for me) test/CodeGen/Mips/asm-large-immediate.ll Mips compiled version of the generic one Contributor: Jack Carter llvm-svn: 158939	2012-06-21 21:37:54 +00:00
Nuno Lopes	dc6085e52d	Add support for invoke to the MemoryBuiltin analysis. Update comments accordingly. Make instcombine remove useless invokes to C++'s 'new' allocation function (test attached). llvm-svn: 158937	2012-06-21 21:25:05 +00:00
Akira Hatanaka	765c312314	1. fix null program output after some other changes 2. re-enable null.ll test 3. fix some minor style violations Patch by Reed Kotler. llvm-svn: 158935	2012-06-21 20:39:10 +00:00
Akira Hatanaka	fcf52c8304	Add Mips to the list of target architectures for the MCJIT tests. Patch by Reed Kotler. llvm-svn: 158933	2012-06-21 20:23:32 +00:00
Hal Finkel	a86b0f20dd	Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc stores on PPC. Thanks to Tobias von Koch for pointing out this problem. llvm-svn: 158932	2012-06-21 20:10:48 +00:00
Jack Carter	b2fd5f66b4	The inline asm operand modifier 'c' is supposed to be generic across architectures. It has the following description in the gnu sources: Substitute immediate value without immediate syntax Several Architectures such as x86 have local implementations of operand modifier 'c' which go beyond the above description slightly. To make use of the generic modifiers without overriding local implementation one can make a call to the base class method for AsmPrinter::PrintAsmOperand() in the locally derived method's "default" case in the switch statement. That way if it is already defined locally the generic version will never get called. This change is needed when test/CodeGen/generic/asm-large-immediate.ll failed on a native Mips board. The test was assuming a generic implementation was in place. Affected files: lib/Target/Mips/MipsAsmPrinter.cpp: Changed the default case to call the base method. lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp Added 'c' to the switch cases. test/CodeGen/Mips/asm-large-immediate.ll Mips compiled version of the generic one Contributor: Jack Carter llvm-svn: 158925	2012-06-21 17:14:46 +00:00
Nuno Lopes	a6aa3d3b5f	hopefully fix the buildbots: some tests have wrong definitions of malloc and were crashing this code on 64 bits machines llvm-svn: 158923	2012-06-21 16:47:58 +00:00
Nuno Lopes	0e967e0186	port the BoundsChecking patch to the new MemoryBuiltin API (i.e., remove most of the code from here). Remove the alloc_size.ll test until we settle on a metadata format that makes everyone happy.. llvm-svn: 158920	2012-06-21 15:59:53 +00:00
Nuno Lopes	55fff83422	refactor the MemoryBuiltin analysis: - provide more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc) - provide an API to compute the size and offset of an object pointed by Move a few clients (GVN, AA, instcombine, ...) to the new API. This implementation is a lot more aggressive than each of the custom implementations being replaced. Patch reviewed by Nick Lewycky and Chandler Carruth, thanks. llvm-svn: 158919	2012-06-21 15:45:28 +00:00
NAKAMURA Takumi	613663cfe2	Revert r158209, "test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911." It passes according to ppc changes. llvm-svn: 158917	2012-06-21 13:43:06 +00:00
Lang Hames	90b2a4cbad	Add a missing llvm.fma -> VFNMS pattern to the ARM backend. llvm-svn: 158902	2012-06-21 06:10:00 +00:00
Evan Cheng	8c2ad81238	Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and _umodsi3 libcalls if they have the same arguments. This optimization was apparently broken if one of the nodes was replaced in place. rdar://11714607 llvm-svn: 158900	2012-06-21 05:56:05 +00:00
Jakob Stoklund Olesen	51c63e64e3	Remove the -live-regunits command line option. Register allocators depend on it being permanently enabled now. llvm-svn: 158873	2012-06-20 23:31:34 +00:00
Akira Hatanaka	87505f46ac	Revert r158846. llvm-svn: 158855	2012-06-20 21:19:39 +00:00
Akira Hatanaka	da448fe0b1	In MipsDisassembler.cpp, instead of defining register class tables, use the ones that are generated by TableGen and are already available in MipsGenRegisterInfo.inc. Suggested by Jakob Stoklund Olesen. Also, fix bug in function DecodeAFGR64RegisterClass. Patch by Vladimir Medic. llvm-svn: 158846	2012-06-20 20:39:23 +00:00
Jakob Stoklund Olesen	833308d785	Only update regunit live ranges that have been precomputed. Regunit live ranges are computed on demand, so when mi-sched calls handleMove, some regunits may not have live ranges yet. That makes updating them easier: Just skip the non-existing ranges. They will be computed correctly from the rescheduled machine code when they are needed. llvm-svn: 158831	2012-06-20 18:00:57 +00:00
Hal Finkel	ca542beffe	Add support for generating reg+reg (indexed) pre-inc loads on PPC. llvm-svn: 158823	2012-06-20 15:43:03 +00:00
Craig Topper	b9e8e18949	Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases. llvm-svn: 158792	2012-06-20 05:39:26 +00:00
Lang Hames	39fb1d08dc	Add DAG-combines for aggressive FMA formation. This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when: (a) Either AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) OR UnsafeFPMath option (-enable-unsafe-fp-math) are set, and (b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and (c) The FMUL only has one user (the FADD/FSUB). If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. llvm-svn: 158757	2012-06-19 22:51:23 +00:00
Jakob Stoklund Olesen	77a0cfb19a	Add a triple. The test was failing on Linux because of asm syntax differences. llvm-svn: 158748	2012-06-19 21:46:25 +00:00
Jakob Stoklund Olesen	0f855e4263	Implement PPCInstrInfo::isCoalescableExtInstr(). The PPC::EXTSW instruction preserves the low 32 bits of its input, just like some of the x86 instructions. Use it to reduce register pressure when the low 32 bits have multiple uses. This requires a small change to PeepholeOptimizer since EXTSW takes a 64-bit input register. This is related to PR5997. llvm-svn: 158743	2012-06-19 21:14:34 +00:00
Jan Wen Voung	7f5d79f864	Have ARM ELF use correct reloc for "b" instr. The condition code didn't actually matter for arm "b" instructions, unlike "bl". It should just use the R_ARM_JUMP24 reloc. llvm-svn: 158722	2012-06-19 16:03:02 +00:00
Hal Finkel	1cc27e44a4	Add support for generating reg+reg preinc stores on PPC. PPC will now generate STWUX and friends. llvm-svn: 158698	2012-06-19 02:34:32 +00:00
Rafael Espindola	31567515ed	really add a triple :-( llvm-svn: 158696	2012-06-19 02:17:35 +00:00
Rafael Espindola	f2ae4075c8	Add a triple to the test. llvm-svn: 158695	2012-06-19 01:42:34 +00:00
Rafael Espindola	ca3e0ee8b3	Move the support for using .init_array from ARM to the generic TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM, on X86 it is not easy to find out if .init_array should be used or not, so the decision is made via TargetOptions and defaults to off. Add a command line option to llc that enables it. llvm-svn: 158692	2012-06-19 00:48:28 +00:00
Nuno Lopes	f9abcb7ba9	revert r158660, since Chris has some issues with this patch (namely using code to represent information only used by the compiler) Original commit msg: add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers. This metadata can be attached to any instruction returning a pointer llvm-svn: 158688	2012-06-18 23:34:26 +00:00
Manman Ren	6e1fd46fdf	ARM: use NEON loads and stores if possible when handling struct byval. This change is to be enabled in clang. rdar://9877866 llvm-svn: 158684	2012-06-18 22:23:48 +00:00
Jim Grosbach	cb540f5cff	ARM: Define generic HINT instruction. The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/ a different immediate value in bits [7,0]. Define a generic HINT instruction and refactor NOP, WFE, WFI, SEV and YIELD to be assembly aliases of that. rdar://11600518 llvm-svn: 158674	2012-06-18 19:45:50 +00:00
Nuno Lopes	b7c941bad9	add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers. This metadata can be attached to any instruction returning a pointer llvm-svn: 158660	2012-06-18 16:04:04 +00:00
Joel Jones	3237ce737e	This change handles another case for generating the bic instruction when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't need to be handled, as the 8 bit constants are encoded directly, thereby not needing a separate load instruction to form the constant into a register. <rdar://problem/11481151> llvm-svn: 158659	2012-06-18 14:51:32 +00:00
Chandler Carruth	a1da0bf5ef	Add a regression test for the bug exposed by r158087, which has been temporarily reverted. This test is annoyingly overspecified, but I don't know of another way to thoroughly test the saving and restoring of the registers. While this will have to be adjusted even with the issue fixed in order to re-apply r158087, those adjustments should very clearly indicate that it is still correct (%esp getting restored prior to pops), whereas without it, this case can easily slip under the radar. Still, any suggestions for improvements are very welcome. All credit to Matt Beaumont-Gay for reducing this out of an insane Address Sanitizer crash to a reasonably small seg-faulting C program when built with -mstackrealign. I just reduced it to IR, which was much simpler. =] llvm-svn: 158656	2012-06-18 09:15:04 +00:00
Chandler Carruth	2cc11fd8c7	Temporarily revert r158087. This patch causes problems when both dynamic stack realignment and dynamic allocas combine in the same function. With this patch, we no longer build the epilog correctly, and silently restore registers from the wrong position in the stack. Thanks to Matt for tracking this down, and getting at least an initial test case to Chad. I'm going to try to check a variation of that test case in so we can easily track the fixes required. llvm-svn: 158654	2012-06-18 07:03:12 +00:00
Pete Cooper	33ee6c9bf1	Now that SROA can form alloca's for dynamic vector accesses, further improve it to be able to replace operations on these vector alloca's with insert/extract element insts llvm-svn: 158623	2012-06-17 03:58:26 +00:00
Hal Finkel	6261c2dc28	Cleanup trip-count finding for PPC CTR loops (and some bug fixes). This cleans up the method used to find trip counts in order to form CTR loops on PPC. This refactoring allows the pass to find loops which have a constant trip count but also happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different classes of loops that are currently ignored. In addition, we now search through all potential induction operations instead of just the first. Also, we check the predicate code on the conditional branch and abort the transformation if the code is not EQ or NE, and we then make sure that the branch to be transformed matches the condition register defined by the comparison (multiple possible comparisons will be considered). llvm-svn: 158607	2012-06-16 20:34:07 +00:00
Hal Finkel	fa103d3fc7	Teach BBVectorize to combine, when possible, or discard metadata when fusing instructions. The present implementation handles only TBAA and FP metadata, discarding everything else. For debug metadata, the current behavior is maintained (the debug metadata associated with one of the instructions will be kept, discarding that attached to the other). This should address PR 13040. llvm-svn: 158606	2012-06-16 20:34:06 +00:00
Rafael Espindola	f70bea93e2	Implement irpc. Extracted from a patch by the PaX team. I just added the test. llvm-svn: 158604	2012-06-16 18:03:25 +00:00
Pete Cooper	818e9f4a26	Fix crash from r158529 on Bullet. Dynamic GEPs created by SROA needed to insert extra "i32 0" operands to index through structs and arrays to get to the vector being indexed. llvm-svn: 158590	2012-06-16 01:43:26 +00:00
Andrew Trick	e67a30c77f	Unit test for LSR kind=Special fix: r158536. llvm-svn: 158570	2012-06-15 22:46:31 +00:00
Kevin Enderby	6c7279ec2e	Fix the encoding of the armv7m (MClass) for MSR registers other than aspr, iaspr, espr and xpsr which also needed to have 0b10 in their mask encoding bits. llvm-svn: 158560	2012-06-15 22:14:44 +00:00
Manman Ren	e0763c7472	ARM: optimization for sub+abs. This patch will optimize abs(x-y) FROM sub, movs, rsbmi TO subs, rsbmi For abs, we will use cmp instead of movs. This is necessary because we already have an existing peephole pass which optimizes away cmp following sub. rdar: 11633193 llvm-svn: 158551	2012-06-15 21:32:12 +00:00
Pete Cooper	e24d6a19e3	Allow SROA to split up an array of vectors into multiple vectors, even when the vectors are dynamically indexed llvm-svn: 158529	2012-06-15 18:07:29 +00:00
Rafael Espindola	1821c6c3b0	Some optimizations done by globalopt are safe only for internal linkage, not linkonce linkage. For example, it is not valid to add unnamed_addr. This also fixes a crash in g++.dg/opt/static5.C. llvm-svn: 158528	2012-06-15 18:00:24 +00:00
Jakob Stoklund Olesen	a15a224db0	Preserve <undef> flags in ARMExpandPseudo. This probably mostly shows up in bugpoint-generated code. llvm-svn: 158527	2012-06-15 17:46:54 +00:00
Rafael Espindola	768b41c17a	Factor macro argument parsing into helper methods and add support for .irp. Patch extracted from a larger one by the PaX team. I added the testcases and tightened error handling a bit. llvm-svn: 158523	2012-06-15 14:02:34 +00:00
Duncan Sands	7838603ffc	Fix issues (infinite loop and/or crash) with self-referential instructions, for example degenerate phi nodes and binops that use themselves in unreachable code. Thanks to Charles Davis for the testcase that uncovered this can of worms. llvm-svn: 158508	2012-06-15 08:37:50 +00:00
Pete Cooper	1d1fa72837	Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct llvm-svn: 158479	2012-06-14 23:53:53 +00:00
Rafael Espindola	def1b09be2	Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and globaldce. Globaldce was already removing linkonce globals, but globalopt was not. llvm-svn: 158476	2012-06-14 22:48:13 +00:00
Akira Hatanaka	d8ab16b86f	1. introduce MipsPat in place of Pat in order to exclude those from being used by Mips16 or Micro Mips 2. clean up a few lines too long encountered Patch by Reed Kotler. llvm-svn: 158470	2012-06-14 21:03:23 +00:00
Akira Hatanaka	1b420ac4c8	Make machine verifier check the first instruction of the last bundle instead of the last instruction of a basic block. llvm-svn: 158468	2012-06-14 20:51:13 +00:00
Pete Cooper	5d19452f3f	Revert r158454: Allow SROA to look at a vector type... It's breaking the vectorise buildbot. This reverts commit 12c1f86ffa731e2952c80d2cc577000c96b8962c. llvm-svn: 158462	2012-06-14 18:32:52 +00:00
Pete Cooper	a7e6d58a87	Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct llvm-svn: 158454	2012-06-14 16:38:13 +00:00
Richard Barton	b0ec375b96	Replace assertion failure for badly formatted CPS instruction with error message. llvm-svn: 158445	2012-06-14 10:48:04 +00:00
Manman Ren	2764301a77	Revert: test/CodeGen/ARM/iabs.ll in r158441 Sorry that I accidentally checked in this file with my previous commit. llvm-svn: 158442	2012-06-14 06:04:02 +00:00
Manman Ren	c2bc2d106b	InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y). uno && ueq was converted to ueq, it should be converted to uno. llvm-svn: 158441	2012-06-14 05:57:42 +00:00
Akira Hatanaka	c6496e2cb6	Test case for MIPS long branch pass. llvm-svn: 158438	2012-06-14 02:12:21 +00:00
Akira Hatanaka	843aca9328	Fix test cases. llvm-svn: 158435	2012-06-14 01:21:00 +00:00
Akira Hatanaka	df5205ef3d	Implement a DAGCombine in MipsISelLowering.cpp which transforms the following pattern: (add v0, (add v1, abs_lo(tjt))) => (add (add v0, v1), abs_lo(tjt)) "tjt" is a TargetJumpTable node. llvm-svn: 158419	2012-06-13 20:33:18 +00:00
Akira Hatanaka	1daf8c2a16	Set a higher value for maxStoresPerMemcpy in MipsISelLowering.cpp. llvm-svn: 158414	2012-06-13 19:33:32 +00:00
Akira Hatanaka	f0273603f5	Implement fastcc calling convention for MIPS. llvm-svn: 158410	2012-06-13 18:06:00 +00:00
Richard Osborne	ab7d788eb5	Fix pattern for MKMSK instruction. llvm-svn: 158409	2012-06-13 17:59:12 +00:00
Pete Cooper	e2fe809772	Revert "Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access" This reverts commit 51786e0aaec76b973205066bd44f7f427b21969f. llvm-svn: 158408	2012-06-13 17:55:22 +00:00
Pete Cooper	e1d4e8b563	Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access llvm-svn: 158407	2012-06-13 17:30:34 +00:00
Duncan Sands	409d8ae165	It is possible for several constants which aren't individually absorbing to combine to the absorbing element. Thanks to nbjoerg on IRC for pointing this out. llvm-svn: 158399	2012-06-13 12:15:56 +00:00
Craig Topper	71dc02d659	Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them. llvm-svn: 158396	2012-06-13 07:18:53 +00:00
Manman Ren	d33f4efbfd	SimplifyCFG: fold unconditional branch to its predecessor if profitable. This patch extends FoldBranchToCommonDest to fold unconditional branches. For unconditional branches, we fold them if it is easy to update the phi nodes in the common successors. rdar://10554090 llvm-svn: 158392	2012-06-13 05:43:29 +00:00
Akira Hatanaka	5fa541231b	disable use of directive .set nomicromips until this directive is pushed in gas to open source fsf Patch by Reed Kotler. llvm-svn: 158381	2012-06-13 02:41:14 +00:00
Andrew Trick	344fb64fa3	sched: fix latency of memory dependence chain edges for consistency. For store->load dependencies that may alias, we should always use TrueMemOrderLatency, which may eventually become a subtarget hook. In effect, we should guarantee at least TrueMemOrderLatency on at least one DAG path from a store to a may-alias load. This should fix the standard mode as well as -enable-aa-sched-mi. llvm-svn: 158380	2012-06-13 02:39:03 +00:00
Duncan Sands	67cd591989	Use std::map rather than SmallMap because SmallMap assumes that the value has POD type, causing memory corruption when mapping to APInts with bitwidth > 64. Merge another crash testcase into crash.ll while there. llvm-svn: 158369	2012-06-12 20:16:51 +00:00
Chad Rosier	c6916f88a8	[arm-fast-isel] Add support for -arm-long-calls. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 158368	2012-06-12 19:25:13 +00:00
Duncan Sands	d7aeefebd6	Now that Reassociate's LinearizeExprTree can look through arbitrary expression topologies, it is quite possible for a leaf node to have huge multiplicity, for example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x raised to a vast power (the multiplicity, or weight, of x). This patch fixes the computation of weights by correctly computing them no matter how big they are, rather than just overflowing and getting a wrong value. It turns out that the weight for a value never needs more bits to represent than the value itself, so it is enough to represent weights as APInts of the same bitwidth and do the right overflow-avoiding dance steps when computing weights. As a side-effect it reduces the number of multiplies needed in some cases of large powers. While there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree static, pushing the rank computation out into users. This is progress towards fixing PR13021. llvm-svn: 158358	2012-06-12 14:33:56 +00:00
Jakob Stoklund Olesen	e782fa649f	Fix test that depends on register allocation. The test is really checking the prolog/epilog load/store multiple formation. llvm-svn: 158328	2012-06-11 21:14:28 +00:00
Jakob Stoklund Olesen	4e28777465	Fix test case to work on ARM. Patch by James Benton! llvm-svn: 158316	2012-06-11 16:01:14 +00:00
Bill Wendling	4b79647a6e	Re-enable the CMN instruction. We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302	2012-06-11 08:07:26 +00:00
Benjamin Kramer	8b8a76974f	InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare. This saves a cast, and zext is more expensive on platforms with subreg support than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750. On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the same performance now when not inlining either function. stupid_memchr: 323.0us bsd_memchr: 321.0us memchr: 479.0us where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time, I haven't fully understood the issue yet, something is grossly mangling the loop after inlining. llvm-svn: 158297	2012-06-10 20:35:00 +00:00
Hal Finkel	4e9f1a859f	Enable ILP scheduling for all nodes by default on PPC. Over the entire test-suite, this has an insignificantly negative average performance impact, but reduces some of the worst slowdowns from the anti-dep. change (r158294). Largest speedups: SingleSource/Benchmarks/Stanford/Quicksort - 28% SingleSource/Benchmarks/Stanford/Towers - 24% SingleSource/Benchmarks/Shootout-C++/matrix - 23% MultiSource/Benchmarks/SciMark2-C/scimark2 - 19% MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15% (matrix and automotive-bitcount were both in the top-5 slowdown list from the anti-dep. change) Largest slowdowns: MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28% MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26% MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21% SingleSource/Benchmarks/CoyoteBench/lpbench - 20% MultiSource/Applications/d/make_dparser - 16% llvm-svn: 158296	2012-06-10 19:32:29 +00:00
Nadav Rotem	17ee58a792	Add AutoUpgrade support for the SSE4 ptest intrinsics. Patch by Michael Kuperstein. llvm-svn: 158295	2012-06-10 18:42:51 +00:00
Hal Finkel	2edfbddcf0	Improve ext/trunc patterns on PPC64. The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that would leave self-moves in the final assembly. Replacing those patterns with ones based on the SUBREG builtins yields better-looking code. Thanks to Jakob and Owen for their suggestions in this matter. llvm-svn: 158283	2012-06-09 22:10:19 +00:00
Craig Topper	3352ba55b9	Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument. llvm-svn: 158278	2012-06-09 16:46:13 +00:00
Hal Finkel	eb50c2d4a4	Enable tail merging on PPC. Tail merging had been disabled on PPC because it would disturb bundling decisions made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions are made during post-RA scheduling, and tail merging is generally beneficial (the average test-suite speedup is insignificantly positive). Largest test-suite speedups: MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30% MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23% SingleSource/Benchmarks/Shootout-C++/ary - 21% SingleSource/Benchmarks/Stanford/Queens - 17% Largest slowdowns: MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24% MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22% MultiSource/Applications/JM/ldecod/ldecod - 14% MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9% This is improved by using full (instead of just critical) anti-dependency breaking, but doing so still causes miscompiles and so cannot yet be enabled by default. llvm-svn: 158259	2012-06-09 03:14:50 +00:00
Jakob Stoklund Olesen	33a1b416ac	Don't run RAFast in the optimizing regalloc pipeline. The fast register allocator is not supposed to work in the optimizing pipeline. It doesn't make sense to compute live intervals, run full copy coalescing, and then run RAFast. Fast register allocation in the optimizing pipeline is better done by RABasic. llvm-svn: 158242	2012-06-08 23:15:12 +00:00
Nuno Lopes	2710f1b049	canonicalize: -%a + 42 into 42 - %a previously we were emitting: -(%a + 42) This fixes the infinite loop in PR12338. The generated code is still not perfect, though. Will work on that next llvm-svn: 158237	2012-06-08 22:30:05 +00:00
Hal Finkel	c6b5debb40	Enable PPC CTR loop formation by default. Thanks to Jakob's help, this now causes no new test suite failures! Over the entire test suite, this gives an average 1% speedup. The largest speedups are: SingleSource/Benchmarks/Misc/pi - 108% SingleSource/Benchmarks/CoyoteBench/lpbench - 54% MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50% SingleSource/Benchmarks/Shootout/ary3 - 32% SingleSource/Benchmarks/Shootout-C++/matrix - 30% The largest slowdowns are: MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30% MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25% MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22% MultiSource/Applications/d/make_dparser - -14% SingleSource/Benchmarks/Shootout-C++/ary - -13% In light of these slowdowns, additional profiling work is obviously needed! llvm-svn: 158223	2012-06-08 19:19:53 +00:00
Manman Ren	bf86b295bb	Test case for r158160 llvm-svn: 158218	2012-06-08 18:42:37 +00:00

16812 Commits