llvm-project

Commit Graph

Author	SHA1	Message	Date
Duncan Sands	e6beec6765	Relax the restrictions on vector of pointer types, and vector getelementptr. Previously in a vector of pointers, the pointer couldn't be any pointer type, it had to be a pointer to an integer or floating point type. This is a hassle for dragonegg because the GCC vectorizer happily produces vectors of pointers where the pointer is a pointer to a struct or whatever. Vector getelementptr was restricted to just one index, but now that vectors of pointers can have any pointer type it is more natural to allow arbitrary vector getelementptrs. There is however the issue of struct GEPs, where if each lane chose different struct fields then from that point on each lane will be working down into unrelated types. This seems like too much pain for too little gain, so when you have a vector struct index all the elements are required to be the same. llvm-svn: 167828	2012-11-13 12:59:33 +00:00
Benjamin Kramer	3eb156306a	DependenceAnalysis: Print all dependency pairs when dumping. Update all testcases. Part of a patch by Preston Briggs. llvm-svn: 167827	2012-11-13 12:12:02 +00:00
Alexey Samsonov	cfd662f279	Figure out <size> argument of llvm.lifetime intrinsics at the moment they are created (during function inlining) llvm-svn: 167821	2012-11-13 07:15:32 +00:00
Meador Inge	193e035b9c	instcombine: Migrate math library call simplifications This patch migrates the math library call simplifications from the simplify-libcalls pass into the instcombine library call simplifier. I have typically migrated just one simplifier at a time, but the math simplifiers are interdependent because: 1. CosOpt, PowOpt, and Exp2Opt all depend on UnaryDoubleFPOpt. 2. CosOpt, PowOpt, Exp2Opt, and UnaryDoubleFPOpt all depend on the option -enable-double-float-shrink. These two factors made migrating each of these simplifiers individually more of a pain than it would be worth. So, I migrated them all together. llvm-svn: 167815	2012-11-13 04:16:17 +00:00
Hal Finkel	2a1df367d4	BBVectorize: Don't vectorize vector-manipulation chains. Don't choose a vectorization plan containing only shuffles and vector inserts/extracts. Due to imperfections in the cost model, these can lead to infinite recursion. llvm-svn: 167811	2012-11-13 03:12:40 +00:00
Bill Wendling	f454dfb6b5	Use the 'count' attribute instead of the 'upper_bound' attribute. If we have a type 'int a[1]' and a type 'int b[0]', the generated DWARF is the same for both of them because we use the 'upper_bound' attribute. Instead use the 'count' attribute, which gives the correct number of elements in the array. <rdar://problem/12566646> llvm-svn: 167806	2012-11-13 02:31:47 +00:00
Andrew Trick	edac22a9f3	Cleanup the main RegisterCoalescer loop. Block priorities still apply outside loops. llvm-svn: 167793	2012-11-13 00:34:44 +00:00
Shuxin Yang	c94c3bb5d0	revert r167740 llvm-svn: 167787	2012-11-13 00:08:49 +00:00
Hal Finkel	3b79f55c5f	BBVectorize: Only some insert element operand pairs are free. This fixes another infinite recursion case when using target costs. We can only replace insert element input chains that are pure (end with inserting into an undef). llvm-svn: 167784	2012-11-12 23:55:36 +00:00
Michael Liao	b193ed44ee	Fix test case added in patch fixing PR14314 llvm-svn: 167769	2012-11-12 22:33:18 +00:00
Chad Rosier	a458d88b21	Update test case for r167754/r167755. llvm-svn: 167760	2012-11-12 21:51:08 +00:00
Hal Finkel	9cf3372931	BBVectorize: Use a more sophisticated check for input cost The old checking code, which assumed that input shuffles and insert-elements could always be folded (and thus were free) is too simple. This can only happen in special circumstances. Using the simple check caused infinite recursion. llvm-svn: 167750	2012-11-12 21:21:02 +00:00
Hal Finkel	f8326b6052	BBVectorize: Check the types of compare instructions The pass would previously assert when trying to compute the cost of compare instructions with illegal vector types (like struct pointers). llvm-svn: 167743	2012-11-12 19:41:38 +00:00
Shuxin Yang	1c442f5ec6	This change is to fix rdar://12571717 which is about an assertion in the Reassociate pass. The assertion is triggered when the Reassociater tries to transform the expression ... + 2 * n * 3 + 2 * m + ... into: ... + 2 * (n * 3 + m). In the process of the transformation, a helper routine folds the constant 2 * 3 into 6, confusing the optimizer, which is trying to eliminate the common factor 2 and cannot find 2 any more. Review is pending. But I'd like to commit first in order to help those who are waiting for this fix. llvm-svn: 167740	2012-11-12 19:34:11 +00:00
Andrew Trick	f1ff84c64e	misched: Infrastructure for weak DAG edges. This adds support for weak DAG edges to the general scheduling infrastructure in preparation for MachineScheduler support for heuristics based on weak edges. llvm-svn: 167738	2012-11-12 19:28:57 +00:00
Hal Finkel	ef53df0f9f	BBVectorize: Check the input types of shuffles for legality This fixes a bug where shuffles were being fused such that the resulting input types were not legal on the target. This would occur only when both inputs and dependencies were also foldable operations (such as other shuffles) and there were other connected pairs in the same block. llvm-svn: 167731	2012-11-12 14:50:59 +00:00
Meador Inge	b3e91f6ae0	Normalize memcmp constant folding results. The library call simplifier folds memcmp calls with all constant arguments to a constant. For example: memcmp("foo", "foo", 3) -> 0 memcmp("hel", "foo", 3) -> 1 memcmp("foo", "hel", 3) -> -1 The folding is implemented in terms of the system memcmp that LLVM gets linked with. It currently just blindly uses the value returned from the system memcmp as the folded constant. This patch normalizes the values returned from the system memcmp to (-1, 0, 1) so that we get consistent results across multiple platforms. The test cases were adjusted accordingly. llvm-svn: 167726	2012-11-12 14:00:45 +00:00
Michael Liao	d39c0fb19f	Fix PR14314 - Fix operand order for atomic sub, where the minuend is the value loaded from memory and the subtrahend is the parameter specified. llvm-svn: 167718	2012-11-12 06:49:17 +00:00
Justin Holewinski	1812ee9a5b	[NVPTX] Add more precise PTX/SM target attributes Each SM and PTX version is modeled as a subtarget feature/CPU. Additionally, PTX 3.1 is added as the default PTX version to be out-of-the-box compatible with CUDA 5.0. Available CPUs for this target: sm_10 - Select the sm_10 processor. sm_11 - Select the sm_11 processor. sm_12 - Select the sm_12 processor. sm_13 - Select the sm_13 processor. sm_20 - Select the sm_20 processor. sm_21 - Select the sm_21 processor. sm_30 - Select the sm_30 processor. sm_35 - Select the sm_35 processor. Available features for this target: ptx30 - Use PTX version 3.0. ptx31 - Use PTX version 3.1. sm_10 - Target SM 1.0. sm_11 - Target SM 1.1. sm_12 - Target SM 1.2. sm_13 - Target SM 1.3. sm_20 - Target SM 2.0. sm_21 - Target SM 2.1. sm_30 - Target SM 3.0. sm_35 - Target SM 3.5. llvm-svn: 167699	2012-11-12 03:16:43 +00:00
Meador Inge	9493eb9bc4	Remove hard-coded constant in Transforms/InstCombine/memcmp-1.ll Transforms/InstCombine/memcmp-1.ll has a test case that looks like: @foo = constant [4 x i8] c"foo\00" @hel = constant [4 x i8] c"hel\00" ... %mem1 = getelementptr [4 x i8]* @hel, i32 0, i32 0 %mem2 = getelementptr [4 x i8]* @foo, i32 0, i32 0 %ret = call i32 @memcmp(i8* %mem1, i8* %mem2, i32 3) ret i32 %ret ; CHECK: ret i32 2 The folded return value (2 above) is computed using the system memcmp that the compiler is linked with. This can return different values on different systems. The test was originally written on an OS X 10.7.5 x86-64 box and passed. However, it failed on one of the x86-64 FreeBSD buildbots because the system memcmp on that machine returned a different value (1 instead of 2). I fixed the test by checking the folding constants with regexes. llvm-svn: 167691	2012-11-11 07:10:25 +00:00
Meador Inge	d4825780ed	instcombine: Migrate memset optimizations This patch migrates the memset optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167689	2012-11-11 06:49:03 +00:00
Meador Inge	9cf328b526	instcombine: Migrate memmove optimizations This patch migrates the memmove optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167687	2012-11-11 06:22:40 +00:00
Meador Inge	dd9234a10a	instcombine: Migrate memcpy optimizations This patch migrates the memcpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167686	2012-11-11 05:54:34 +00:00
Meador Inge	4d2827c10d	instcombine: Migrate memcmp optimizations This patch migrates the memcmp optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167683	2012-11-11 05:11:20 +00:00
Meador Inge	56edbc9323	instcombine: Migrate strstr optimizations This patch migrates the strstr optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167682	2012-11-11 03:51:48 +00:00
Meador Inge	bcd88ef764	instcombine: Migrate strcspn optimizations This patch migrates the strcspn optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167675	2012-11-10 15:16:48 +00:00
Evan Cheng	a5d363ec24	Convert an improper CodeGen test to a MC test. llvm-svn: 167663	2012-11-10 04:30:40 +00:00
Meador Inge	03be256db9	instcombine: Query target library information to gate libcall simplifications Several of the simplifiers migrated from the simplify-libcalls pass to the instcombine pass were not correctly checking the target library information to gate the simplifications. This patch ensures that the check is made. llvm-svn: 167660	2012-11-10 03:11:10 +00:00
Evan Cheng	a17fea1967	xfail a bad test. This is a MC test but it's dependent on a codegen optimization which is now disabled. llvm-svn: 167658	2012-11-10 02:34:36 +00:00
Evan Cheng	21b0348199	Disable the Thumb no-return call optimization: mov lr, pc; b.w _foo. The "mov" instruction doesn't set bit zero to one, so it puts an incorrect value in lr, which messes up backtraces. rdar://12663632 llvm-svn: 167657	2012-11-10 02:09:05 +00:00
Craig Topper	9268c94b15	Cleanup pcmp(e/i)str(m/i) instruction definitions and load folding support. llvm-svn: 167652	2012-11-10 01:23:36 +00:00
Justin Holewinski	2dc9d072e5	[NVPTX] Use ABI alignment for parameters when alignment is not specified. Affects SM 2.0+. Fixes bug 13324. llvm-svn: 167646	2012-11-09 23:50:24 +00:00
Jakob Stoklund Olesen	13d5562963	Fix assertions in updateRegMaskSlots(). The RegMaskSlots contains 'r' slots while NewIdx and OldIdx are 'B' slots. This broke the checks in the assertions. This fixes PR14302. llvm-svn: 167625	2012-11-09 19:18:49 +00:00
Dmitry Vyukov	0044e386e9	tsan: switch to new memory_order constants (ABI compatible) llvm-svn: 167615	2012-11-09 14:12:16 +00:00
Dmitry Vyukov	92b9e1dbfd	tsan: instrument all atomics (including fetch_add, exchange, cas, etc) llvm-svn: 167612	2012-11-09 12:55:36 +00:00
Nadav Rotem	1cfef3e9ee	Add support for memory runtime check. When we can, we calculate array bounds. If the arrays are found to be disjoint then we run the vectorized version of the loop. If they are not, we run the scalar code. llvm-svn: 167608	2012-11-09 07:09:44 +00:00
NAKAMURA Takumi	43ab4ef9ba	llvm/ConstantFolding.cpp: Make ReadDataFromGlobal() and FoldReinterpretLoadFromConstPtr() Big-endian-aware. llvm-svn: 167595	2012-11-08 20:34:25 +00:00
Amara Emerson	ec2cd56708	Recommit modified r167540. Improve ARM build attribute emission for architecture types. This also changes the default architecture emitted for a generic CPU to "v7". llvm-svn: 167574	2012-11-08 09:51:45 +00:00
Michael Liao	73cffddb95	Add support of RTM from TSX extension - Add RTM code generation support through 3 X86 intrinsics: xbegin()/xend() to start/end a transaction region, and xabort() to abort a transaction region llvm-svn: 167573	2012-11-08 07:28:54 +00:00
Meador Inge	489b5d645f	instcombine: Migrate strspn optimizations This patch migrates the strspn optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167568	2012-11-08 01:33:50 +00:00
Eric Christopher	7c678de861	Add a relocation visitor to lib object. This works via caching relocated values in a map that can be passed to consumers. Add a testcase that ensures this works for llvm-dwarfdump. llvm-svn: 167558	2012-11-07 23:22:07 +00:00
Hans Wennborg	c3c8d95c51	Only do switch-to-lookup table transformation when TargetTransformInfo is available. llvm-svn: 167552	2012-11-07 21:35:12 +00:00
Akira Hatanaka	28e02ec8c1	[mips] Custom-lower ISD::FRAME_TO_ARGS_OFFSET node. Patch by Sasa Stankovic. llvm-svn: 167548	2012-11-07 19:10:58 +00:00
Hans Wennborg	11d4ebe224	Fix bad test IR in switch_to_lookup_table.ll llvm-svn: 167543	2012-11-07 18:38:24 +00:00
Andrew Trick	3ca33acb95	misched: Heuristics based on the machine model. misched is disabled by default. With -enable-misched, these heuristics balance the schedule to simultaneously avoid saturating processor resources, expose ILP, and minimize register pressure. I've been analyzing the performance of these heuristics on everything in the llvm test suite in addition to a few other benchmarks. I would like each heuristic check to be verified by a unit test, but I'm still trying to figure out the best way to do that. The heuristics are still in considerable flux, but as they are refined we should be rigorous about unit testing the improvements. llvm-svn: 167527	2012-11-07 07:05:09 +00:00
Nadav Rotem	f036ca466e	CostModel: add another known vector trunc optimization. llvm-svn: 167488	2012-11-06 21:17:17 +00:00
Nadav Rotem	0914f0b262	Cost Model: add tables for some avx type-conversion hacks. llvm-svn: 167480	2012-11-06 19:33:53 +00:00
Nadav Rotem	c378a8067d	CostModel: Add tables for the common x86 compares. llvm-svn: 167421	2012-11-05 23:48:20 +00:00
Nadav Rotem	ae79765676	Cost Model: Improve the accuracy of the zext/sext/trunc vector cost estimation. llvm-svn: 167412	2012-11-05 22:20:53 +00:00
Kevin Enderby	27121c1543	Fix for PR14264 caused by commit r167237 which did not take into account a possible buffer change with a .macro directive. rdar://12637628 llvm-svn: 167408	2012-11-05 21:55:41 +00:00
Nadav Rotem	856ffa6677	Cost Model: Normalize the insert/extract index when splitting types llvm-svn: 167402	2012-11-05 21:12:13 +00:00
Nadav Rotem	020be9dc29	Cost Model: teach the cost model about expanding integers. llvm-svn: 167401	2012-11-05 21:11:10 +00:00
Ulrich Weigand	339d0597d3	On PowerPC64, integer return values (as well as arguments) are supposed to be extended to a full register. This is modeled in the IR by marking the return value (or argument) with a signext or zeroext attribute. However, while these attributes are respected for function arguments, they are currently ignored for function return values by the PowerPC back-end. This patch updates PPCCallingConv.td to ask for the promotion to i64, and fixes LowerReturn and LowerCallResult to implement it. The new test case verifies that both arguments and return values are properly extended when passing them; and also that the optimizers understand incoming argument and return values are in fact guaranteed by the ABI to be extended. The patch caused a spurious breakage in CodeGen/PowerPC/coalesce-ext.ll, since the test case used a "ret" instruction to create a use of an i32 value at the end of the function (to set up data flow as required for what the test is intended to test). Since there's now an implicit promotion to i64, that data flow no longer works as expected. To fix this, this patch now adds an extra "add" to ensure we have an appropriate use of the i32 value. llvm-svn: 167396	2012-11-05 19:39:45 +00:00
Nadav Rotem	7411623fd8	Implement the cost of abnormal x86 instruction lowering as a table. llvm-svn: 167395	2012-11-05 19:32:46 +00:00
Hal Finkel	4f24c621d9	Add support for the PowerPC-specific inline asm Z constraint and y modifier. The Z constraint specifies an r+r memory address, and the y modifier expands to the "r, r" in the asm string. For this initial implementation, the base register is forced to r0 (which has the special meaning of 0 for r+r addressing on PowerPC) and the full address is taken in the second register. In the future, this should be improved. llvm-svn: 167388	2012-11-05 18:18:42 +00:00
Adhemerval Zanella	c4182d1890	[PATCH] PowerPC: Expand load extend vector operations This patch expands the SEXTLOAD, ZEXTLOAD, and EXTLOAD operations for vector types when altivec is enabled. llvm-svn: 167386	2012-11-05 17:15:56 +00:00
Richard Osborne	a1fffcf73a	Don't infer whether a value is captured in the current function from the 'nocapture' attribute. The nocapture attribute only specifies that no copies are made that outlive the function. This isn't the same as there being no copies at all. This fixes PR14045. llvm-svn: 167381	2012-11-05 10:48:24 +00:00
Duncan Sands	a318ef6fa6	Generalize the transform that boosts GEP indices to the size of a pointer to also do it for vectors of pointers. llvm-svn: 167354	2012-11-03 11:44:17 +00:00
Akira Hatanaka	da1980f697	[mips] Set neverHasSideEffects flag on floating point conversion instructions. llvm-svn: 167348	2012-11-03 00:53:12 +00:00
Nadav Rotem	c2345cbe73	X86 CostModel: Add support for some of the common arithmetic instructions for SSE4, AVX and AVX2. llvm-svn: 167347	2012-11-03 00:39:56 +00:00
Akira Hatanaka	7828331329	[mips] Set isAsCheapAsAMove flag on instruction LUi. llvm-svn: 167345	2012-11-03 00:26:02 +00:00
Akira Hatanaka	5852e3b800	[mips] Stop reserving register AT and use register scavenger when a scratch register is needed. llvm-svn: 167341	2012-11-03 00:05:43 +00:00
Nadav Rotem	23848f8f1d	Add a stub for the x86 cost model impl. Implement a basic cost rule for inserting/extracting from XMM registers. llvm-svn: 167333	2012-11-02 23:27:16 +00:00
Nadav Rotem	13da94734c	CostModel: add support for Vector Insert and Extract. llvm-svn: 167329	2012-11-02 22:31:56 +00:00
Akira Hatanaka	d0836fd20a	[mips] Fix disassembler test cases. llvm-svn: 167326	2012-11-02 22:20:10 +00:00
Nadav Rotem	a6b91ac307	Add a cost model analysis that allows us to estimate the cost of IR-level instructions. llvm-svn: 167324	2012-11-02 21:48:17 +00:00
Akira Hatanaka	6dcf75897c	[mips] Fix bug in test case. Disable machine LICM to prevent instruction from being moved out of a basic block. llvm-svn: 167322	2012-11-02 21:46:42 +00:00
Quentin Colombet	8e1fe84c3c	Vext Lowering was missing opportunities llvm-svn: 167318	2012-11-02 21:32:17 +00:00
Akira Hatanaka	949f8d890d	[mips] Use register number instead of name to print register $AT. llvm-svn: 167315	2012-11-02 21:26:03 +00:00
Akira Hatanaka	0dfbf1262b	[mips] Delete MipsFunctionInfo::EmitNOAT. Unconditionally print directive ".set noat" so that the assembler doesn't issue warnings when register $AT is used. llvm-svn: 167310	2012-11-02 20:56:25 +00:00
Chandler Carruth	b62807a95c	Add a testcase to loop-idiom to cover PR14241 when we start handling strided loops again. llvm-svn: 167287	2012-11-02 08:40:24 +00:00
Chandler Carruth	099f5cb031	Revert the switch of loop-idiom to use the new dependence analysis. The new analysis is not yet ready for prime time. It has a critical flawed assumption, and some troubling shortages of testing. Until it's been hammered into better shape, let's stick with the working code. This should be easy to revert itself when the analysis is ready. Fixes PR14241, a miscompile of any memcpy-able loop which uses a pointer as the induction mechanism. If you have been seeing miscompiles in this revision range, you really want to test with this backed out. The results of this miscompile are a bit subtle as they can lead to downstream passes concluding things are impossible which are in fact possible. Thanks to David Blaikie for the majority of the reduction of this miscompile. I'll be checking in the test case in a non-revert commit. Revisions reverted here: r167045: LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove. r166877: LoopIdiom: Add checks to avoid turning memmove into an infinite loop. r166875: LoopIdiom: Recognize memmove loops. r166874: LoopIdiom: Replace custom dependence analysis with DependenceAnalysis. llvm-svn: 167286	2012-11-02 08:33:25 +00:00
Hal Finkel	376f82d5d3	BBVectorize: Commit the rest of the test-case change. llvm-svn: 167257	2012-11-01 21:57:27 +00:00
Hal Finkel	560545b85f	BBVectorize: Use target costs for incoming and outgoing values instead of the depth heuristic. When target cost information is available, compute explicit costs of inserting and extracting values from vectors. At this point, all costs are estimated using the target information, and the chain-depth heuristic is not needed. As a result, it is now, by default, disabled when using target costs. llvm-svn: 167256	2012-11-01 21:50:12 +00:00
Kevin Enderby	4eaf8ef5cb	Add support for generating dwarf debugging info with assembly files run through the 'C' preprocessor. That is pick up the file name and line numbers from the cpp hash file line comments for the dwarf file and line numbers tables. rdar://9275556 llvm-svn: 167237	2012-11-01 17:31:35 +00:00
NAKAMURA Takumi	da2afc9a70	llvm/test/lit.cfg: Don't use mcjit to ppc32 yet, not ready. Unsupported CPU type! UNREACHABLE executed at llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:553! llvm-svn: 167231	2012-11-01 14:28:51 +00:00
Kostya Serebryany	28d0694c27	[asan] don't instrument globals that we've created ourselves (reduces the binary size a bit) llvm-svn: 167230	2012-11-01 13:42:40 +00:00
Chandler Carruth	d5639ff80f	Add a test case for PR14233. llvm-svn: 167224	2012-11-01 10:26:36 +00:00
Chandler Carruth	7ec5085e01	Revert the series of commits starting with r166578 which introduced the getIntPtrType support for multiple address spaces via a pointer type, and also introduced a crasher bug in the constant folder reported in PR14233. These commits also contained several problems that should really be addressed before they are re-committed. I have avoided reverting various cleanups to the DataLayout APIs that are reasonable to have moving forward in order to reduce the amount of churn, and minimize the number of commits that were reverted. I've also manually updated merge conflicts and manually arranged for the getIntPtrType function to stay in DataLayout and to be defined in a plausible way after this revert. Thanks to Duncan for working through this exact strategy with me, and Nick Lewycky for tracking down the really annoying crasher this triggered. (Test case to follow in its own commit.) After discussing with Duncan extensively, and based on a note from Micah, I'm going to continue to back out some more of the more problematic patches in this series in order to ensure we go into the LLVM 3.2 branch with a reasonable story here. I'll send a note to llvmdev explaining what's going on and why. Summary of reverted revisions: r166634: Fix a compiler warning with an unused variable. r166607: Add some cleanup to the DataLayout changes requested by Chandler. r166596: Revert "Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this! r166591: Delete a directory that wasn't supposed to be checked in yet. r166578: Add in support for getIntPtrType to get the pointer type based on the address space. llvm-svn: 167221	2012-11-01 08:07:29 +00:00
NAKAMURA Takumi	e9b89b4fe5	[CMake] Add llvm-mcmarkup to check-llvm. llvm-svn: 167208	2012-11-01 02:13:50 +00:00
NAKAMURA Takumi	68d1700eae	test/CodeGen/X86/fp-fast.ll: Add +avx. llvm-svn: 167207	2012-11-01 02:13:45 +00:00
Owen Anderson	b351c8d692	Add a few more simple fast-math constant propagations and cancellations. llvm-svn: 167200	2012-11-01 02:00:53 +00:00
Jim Grosbach	acd8801e25	MC: Simple example parser for MC assembly markup. Nothing fancy, just a simple demonstration parser. llvm-svn: 167181	2012-10-31 23:24:13 +00:00
Shuxin Yang	01efdd6c28	(For X86) Enhancement to add-carry/sub-borrow (adc/sbb) optimization. The adc/sbb optimization is able to convert the following expressions into a single adc/sbb instruction: (ult) ... = x + 1 // where the ult is unsigned-less-than comparison (ult) ... = x - 1 This change is to flip the "x >u y" (i.e. ugt comparison) in order to expose the adc/sbb opportunity. llvm-svn: 167180	2012-10-31 23:11:48 +00:00
Nadav Rotem	4cb8cdab5e	LoopVectorize: Preserve NSW, NUW and IsExact flags. llvm-svn: 167174	2012-10-31 21:40:39 +00:00
Nadav Rotem	6d7d39783d	Fix a bug in the cost calculation of vector casts. Detect situations where bitcasts cost zero. llvm-svn: 167170	2012-10-31 20:52:26 +00:00
Akira Hatanaka	4f5ef21869	[mips] Set isAsCheapAsAMove flag on ADDiu and DADDiu, which enables re-materialization of immediate loads. llvm-svn: 167153	2012-10-31 18:37:55 +00:00
Akira Hatanaka	c096c88067	Test case for r167039. Check that tail-call optimization is disabled for mips16. llvm-svn: 167139	2012-10-31 17:25:23 +00:00
Hans Wennborg	b71f72aa82	Remove fixme about unreachable cases from SwitchToLookupTable SimplifyCFG will have removed those cases for us. llvm-svn: 167132	2012-10-31 16:15:25 +00:00
Hal Finkel	842ad0b621	BBVectorize: Choose pair ordering to minimize shuffles. BBVectorize would, except for loads and stores, always fuse instructions so that the first instruction (in the current source order) would always represent the low part of the input vectors and the second instruction would always represent the high part. This led to too many shuffles being produced because sometimes the opposite order produces fewer of them. With this change, BBVectorize tracks the kind of pair connections that form the DAG of candidate pairs, and uses that information to reorder the pairs to avoid excess shuffles. Using this information, a future commit will be able to add VTTI-based shuffle costs to the pair selection procedure. Importantly, the number of remaining shuffles can now be estimated during pair selection. There are some trivial instruction reorderings in the test cases, and one simple additional test where we certainly want to do a reordering to avoid an unnecessary shuffle. llvm-svn: 167122	2012-10-31 15:17:07 +00:00
Meador Inge	05a625a0ed	instcombine: Migrate strto* optimizations This patch migrates the strto* optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167119	2012-10-31 14:58:26 +00:00
Hans Wennborg	9e74dd97b8	Do simple constant propagation in lookup table formation for switches By propagating the value for the switch condition, LLVM can now build lookup tables for code such as: switch (x) { case 1: return 5; case 2: return 42; case 3: case 4: case 5: return x - 123; default: return 123; } Given that x is known for each case, "x - 123" becomes a constant for cases 3, 4, and 5. llvm-svn: 167115	2012-10-31 13:42:45 +00:00
Benjamin Kramer	8682ac1a77	LCSSA: Add a workaround for another nasty SCEV cache invalidation issue. I'm not entirely happy with this solution, but I don't see a smarter way currently. Fixes PR14214. llvm-svn: 167112	2012-10-31 10:01:29 +00:00
Benjamin Kramer	24c643b6de	DependenceAnalysis: Don't crash if there is no constant operand. This makes the code match the comments. Resolves a crash in loop idiom (PR14219). llvm-svn: 167110	2012-10-31 09:20:38 +00:00
Reed Kotler	27a7229c47	Implement ADJCALLSTACKUP and ADJCALLSTACKDOWN llvm-svn: 167107	2012-10-31 05:21:10 +00:00
Meador Inge	6f8e01121a	instcombine: Migrate strpbrk optimizations This patch migrates the strpbrk optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167105	2012-10-31 04:29:58 +00:00
Meador Inge	d589ac621b	instcombine: Migrate strlen optimizations This patch migrates the strlen optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167103	2012-10-31 03:33:06 +00:00
Meador Inge	067294b3ac	instcombine: Migrate strncpy optimizations This patch migrates the strncpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167102	2012-10-31 03:33:00 +00:00
Nadav Rotem	ce77ab0c24	LoopVectorize: Do not vectorize loops with tiny constant trip counts. llvm-svn: 167101	2012-10-31 03:31:07 +00:00
Bill Schmidt	9953cf294b	This patch addresses an ABI compatibility issue with empty aggregate parameters. Examples of these are: struct { } a; union { } b[256]; int a[0]; An empty aggregate has an address, although dereferencing that address is pointless. When passed as a parameter, an empty aggregate does not consume a protocol register, nor does it consume a doubleword in the parameter save area. Passing an empty aggregate by reference passes an address just as for any other aggregate. Returning an empty aggregate uses GPR3 as a hidden address of the return value location, just as for any other aggregate. The patch modifies PPCTargetLowering::LowerFormalArguments_64SVR4 and PPCTargetLowering::LowerCall_64SVR4 to properly skip empty aggregate parameters passed by value. The handling of return values and by-reference parameters was already correct. Built on powerpc64-unknown-linux-gnu and tested with no new regressions. A test case is included to test proper handling of empty aggregate parameters on both sides of the function call protocol. llvm-svn: 167090	2012-10-31 01:15:05 +00:00
Nadav Rotem	ff7889196b	Add support for loops that don't start with Zero. This is important for loops in the LAPACK test-suite. These loops start at 1 because they are auto-converted from fortran. llvm-svn: 167084	2012-10-31 00:45:26 +00:00
Meador Inge	9a6a190562	instcombine: Migrate stpcpy optimizations This patch migrates the stpcpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. Note that the __stpcpy_chk simplifications were migrated in a previous commit. llvm-svn: 167083	2012-10-31 00:20:56 +00:00
Meador Inge	cdb2ca54ae	instcombine: Split out the __stpcpy_chk simplifications from StrCpyChkOpt r166198 migrated the strcpy optimization to instcombine. The strcpy simplifier that was migrated from Transforms/Scalar/SimplifyLibCalls.cpp was also doing some __strcpy_chk simplifications. Those fortified simplifications were migrated as well, but introduced a bug in the __stpcpy_chk simplifier in the process. This happened because the __strcpy_chk and __stpcpy_chk simplifiers were both mapped to StrCpyChkOpt which was updated with simplifications that worked for __strcpy_chk, but not __stpcpy_chk. This patch fixes the problem by adding proper test coverage and creating a new simplifier for __stpcpy_chk (instead of sharing one with __strcpy_chk). llvm-svn: 167082	2012-10-31 00:20:51 +00:00
Manman Ren	6b223a4f06	X86 SSE: update rsqrtss and rcpss to use two source operands and the first source operand is tied to the destination operand. This is to accurately model the corresponding instructions where the upper bits are unmodified. rdar://12558838 PR14221 llvm-svn: 167064	2012-10-30 23:53:59 +00:00
Manman Ren	acb8becc73	X86 MMX: optimize transfer from mmx to i32 We used to generate a store (movq) + a load. Now we use movd. rdar://9946746 llvm-svn: 167056	2012-10-30 22:15:38 +00:00
Chandler Carruth	1296b59522	Fix PR14212: For some strange reason I treated vectors differently from integers in that the code to handle split alloca-wide integer loads or stores doesn't come first. It should, for the same reasons as with integers, and the PR attests to that. Also had to fix a busted assert that this test case also covers. llvm-svn: 167051	2012-10-30 20:52:40 +00:00
Akira Hatanaka	9c962c02e4	[mips] Allow tail-call optimization for vararg functions and functions which use the caller's stack. llvm-svn: 167048	2012-10-30 20:16:31 +00:00
Benjamin Kramer	48a6478242	LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove. Thanks to Preston Briggs for catching this! llvm-svn: 167045	2012-10-30 19:49:39 +00:00
Hal Finkel	2eaadd1a2d	BBVectorize: Fix a small bug introduced in r167042. We need to make sure that we take the correct load/store alignment when the inputs are flipped. llvm-svn: 167044	2012-10-30 19:47:37 +00:00
Nadav Rotem	bc21aceb19	LoopVectorize: Add support for write-only loops when the write destination is a single pointer. Speeds up SciMark by 1%. llvm-svn: 167035	2012-10-30 18:36:45 +00:00
Adhemerval Zanella	5c043aeb1b	PowerPC: Expand FSQRT for vector types This patch expands FSQRT for floating point vector types when altivec is used. llvm-svn: 167034	2012-10-30 18:29:42 +00:00
Nadav Rotem	b3e8e688da	LoopVectorize: Fix a bug in the initialization of reduction variables. AND needs to start at all-ones, while XOR and OR need to start at zero. llvm-svn: 167032	2012-10-30 18:12:36 +00:00
Ulrich Weigand	7db4429430	Set %defaultjit to use MCJIT for PowerPC targets. Update Transforms/LICM/2003-12-11-SinkingToPHI.ll test to use %defaultjit as well. llvm-svn: 167031	2012-10-30 18:07:58 +00:00
Quentin Colombet	5799e9f66c	Change ForceSizeOpt attribute into MinSize attribute llvm-svn: 167020	2012-10-30 16:32:52 +00:00
Hans Wennborg	e0cf14fa9d	switch_to_lookup_table.ll: Remove some unnecessary lines, comments, function attributes, etc. llvm-svn: 167016	2012-10-30 15:11:52 +00:00
Adhemerval Zanella	56775e0f13	PowerPC: More support for Altivec compare operations This patch adds more support for vector type comparisons using altivec. It adds correct support for v16i8, v8i16, v4i32, and v4f32 vector types for comparison operators ==, !=, >, >=, <, and <=. llvm-svn: 167015	2012-10-30 13:50:19 +00:00
Ulrich Weigand	6a9bb51a8d	Enable some additional constant folding for PPCDoubleDouble. This fixes Clang :: CodeGen/complex-builtints.c on PowerPC. llvm-svn: 167013	2012-10-30 12:33:18 +00:00
Hans Wennborg	f3254838e4	Use TargetTransformInfo to control switch-to-lookup table transformation When the switch-to-lookup tables transform landed in SimplifyCFG, it was pointed out that this could be inappropriate for some targets. Since there was no way at the time for the pass to know anything about the target, an awkward reverse-transform was added in CodeGenPrepare that turned lookup tables back into switches for some targets. This patch uses the new TargetTransformInfo to determine if a switch should be transformed, and removes CodeGenPrepare::ConvertLoadToSwitch. llvm-svn: 167011	2012-10-30 11:23:25 +00:00
Hal Finkel	d0b95b0961	Remove an invalid assert in TargetTransformImpl getCastInstrCost had an assert prohibiting scalar to vector casts. Such casts, however, are allowed. This should make the vectorizer buildbot happier. llvm-svn: 166998	2012-10-30 02:41:57 +00:00
Jim Grosbach	4739f2eb19	ARM: Better disassembly for pc-relative LDR. When the operand is a plain immediate rather than a label, print it as [pc, #imm] like we do for the Thumb2 wide encoding variant. rdar://12154503 llvm-svn: 166991	2012-10-30 01:04:51 +00:00
Reed Kotler	a811753716	Change mips16 delay slot jumps to non delay slot forms by default. We will make them delay slot forms if there is something that can be placed in the delay slot during a separate pass. Mips16 extended instructions cannot be placed in delay slots. llvm-svn: 166990	2012-10-30 00:54:49 +00:00
Jakub Staszak	a3d8e9974a	Re-commit r166971. I reverted it too quickly, when buildbots didn't have a chance to test it with chapni's fix (-mattr=+avx). llvm-svn: 166985	2012-10-30 00:01:57 +00:00
Kevin Enderby	6fd9624843	Fix ARM's b.w instruction for thumb 2 and the encoding T4. The branch target is 24 bits not 20 and the decoding needed to correctly handle converting the J1 and J2 bits to their I1 and I2 values to reconstruct the displacement. llvm-svn: 166982	2012-10-29 23:27:20 +00:00
Jakub Staszak	d74cb61d86	Revert r166971. It causes buildbot failure. To be investigated. llvm-svn: 166979	2012-10-29 23:13:50 +00:00
NAKAMURA Takumi	382df5eb18	llvm/test/CodeGen/X86/vec_shuffle-30.ll: Try to unbreak builds - assuming +avx. llvm-svn: 166974	2012-10-29 22:45:18 +00:00
Jakub Staszak	c8f4825ba6	Allow to fold vector load if there is more than one bitcast, so in the case: %0 = load <8 x i16>* %dest %1 = shufflevector <8 x i16> %0, <8 x i16> %in, <8 x i32> < i32 0, i32 1, i32 2, i32 3, i32 13, i32 undef, i32 14, i32 14> store <8 x i16> %1, <8 x i16>* %dest We get: vmovlpd (%eax), %xmm0, %xmm0 instead of: vmovaps (%eax), %xmm1 vmovsd %xmm1, %xmm0, %xmm0 No extra test-case is added. I just fixed the existing one (also it uses FileCheck now). llvm-svn: 166971	2012-10-29 21:56:35 +00:00
Bill Schmidt	bd4ac26973	This patch solves a problem with passing varargs parameters under the PPC64 ELF ABI. A varargs parameter consisting of a single-precision floating-point value, or of a single-element aggregate containing a single-precision floating-point value, must be passed in the low-order (rightmost) four bytes of the doubleword stack slot reserved for that parameter. If there are GPR protocol registers remaining, the parameter must also be mirrored in the low-order four bytes of the reserved GPR. Prior to this patch, such parameters were being passed in the high-order four bytes of the stack slot and the mirrored GPR. The patch adds a new test case to verify the correct code generation. llvm-svn: 166968	2012-10-29 21:18:16 +00:00
Reed Kotler	740981e35c	Implement patterns for extloadi8 and extloadi16 llvm-svn: 166960	2012-10-29 19:39:04 +00:00
Ulrich Weigand	3abb34389d	In various places throughout the code generator, there were special checks to avoid performing compile-time arithmetic on PPCDoubleDouble. Now that APFloat supports arithmetic on PPCDoubleDouble, those checks are no longer needed, and we can treat the type like any other. llvm-svn: 166958	2012-10-29 18:35:49 +00:00
Chad Rosier	466c1c6870	Remove redundant test case from r166949, per Eli's suggestion. llvm-svn: 166953	2012-10-29 18:18:26 +00:00
Chad Rosier	1bbaa449ad	[ms-inline asm] Add support for the [] operator. Essentially, [expr1][expr2] is equivalent to [expr1 + expr2]. See test cases for more examples. rdar://12470392 llvm-svn: 166949	2012-10-29 18:01:54 +00:00
Michael Liao	ad0b69fe3e	Fix PR14204 - Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled. llvm-svn: 166947	2012-10-29 17:57:12 +00:00
Jakob Stoklund Olesen	9a06696a77	Completely disallow partial copies in adjustCopiesBackFrom(). Partial copies can show up even when CoalescerPair.isPartial() returns false. For example: %vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31 Such a partial-partial copy is not good enough for the transformation adjustCopiesBackFrom() needs to do. llvm-svn: 166944	2012-10-29 17:51:52 +00:00
Ulrich Weigand	0de4a1e4ae	Allow i32/i64 for 'f' constraint on PowerPC. This fixes PR12757. llvm-svn: 166943	2012-10-29 17:49:34 +00:00
Reed Kotler	aebb8b034c	Expand all atomic ops for mips16. llvm-svn: 166935	2012-10-29 16:16:54 +00:00
Preston Gurd	52dacca977	This patch addresses a problem with the Post RA scheduler generating an incorrect instruction sequence due to it not being aware that an inline assembly instruction may reference memory. This patch fixes the problem by causing the scheduler to always assume that any inline assembly code instruction could access memory. This is necessary because the internal representation of the inline instruction does not include any information about memory accesses. This should fix PR13504. llvm-svn: 166929	2012-10-29 15:01:23 +00:00
Bill Schmidt	bbc661e572	This patch adds alignment information for long double to the 64-bit PowerPC ELF subtarget. The existing logic is used as a fallback to avoid any changes to the Darwin ABI. PPC64 ELF now has two possible data layout strings: one for FreeBSD, which requires 8-byte alignment, and a default string that requires 16-byte alignment. I've added a test for PPC64 Linux to verify the 16-byte alignment. If somebody wants to add a separate test for FreeBSD, that would be great. Note that there is a companion patch to update the alignment information in Clang, which I am committing now as well. llvm-svn: 166928	2012-10-29 14:59:36 +00:00
Tim Northover	3643a8f8eb	Align the data section correctly when loading an ELF file. Patch by Amara Emerson. llvm-svn: 166920	2012-10-29 10:47:07 +00:00
Tim Northover	94bc73d3d1	Make use of common-symbol alignment info in ELF loader. Patch by Amara Emerson. llvm-svn: 166919	2012-10-29 10:47:04 +00:00
Rafael Espindola	7043858a5b	Add -alias and -ralias options to match what we have for functions and globals. llvm-svn: 166909	2012-10-29 02:23:07 +00:00
Rafael Espindola	56183fbe78	llvm-extract changes linkages so that functions on both sides of the split module can see each other. If it is keeping a symbol that already has a non local linkage, it doesn't need to change it. llvm-svn: 166908	2012-10-29 01:59:03 +00:00
Rafael Espindola	9d30d0fc67	llvm-extract was unable to handle aliases. It would leave a copy on the output of both llvm-extract foo.ll -func=bar and llvm-extract foo.ll -func=bar -delete so the two new files could not be linked together anymore. With this change aliases are handled almost like functions and global variables. Almost because with aliases we cannot just clear the initializer/body, we have to create a new declaration and replace the alias with it. The net result is that now the output of the above commands can be linked even if foo.ll has aliases. llvm-svn: 166907	2012-10-29 00:27:55 +00:00
Reed Kotler	e6c31579be	Implement brind operator for mips16. llvm-svn: 166903	2012-10-28 23:08:07 +00:00
Reed Kotler	3589dd74ac	This patch is for the implementation of mips16 complex pattern addr16. Previously mips16 was sharing the pattern addr which is used for mips32 and mips64. This had a number of problems: 1) Storing and loading byte and halfword quantities for mips16 has particular problems due to the primarily non mips16 nature of SP. When we must load/store byte/halfword stack objects in a function, we must create a mips16 alias register for SP. This functionality is tested in stchar.ll. 2) We need to have an FP register under certain conditions (such as dynamically sized alloca). We use mips16 register S0 for this purpose. In this case, we also use this register when accessing frame objects so this issue also affects the complex pattern addr16. This functionality is tested in alloca16.ll. The Mips16InstrInfo.td has been updated to use addr16 instead of addr. The complex pattern C++ function for addr has been copied to addr16 and updated to reflect the above issues. llvm-svn: 166897	2012-10-28 06:02:37 +00:00
Jakob Stoklund Olesen	57143f7e78	Never attempt to join an early-clobber def with a regular kill. This fixes PR14194. llvm-svn: 166880	2012-10-27 17:41:27 +00:00
Benjamin Kramer	8d2ee55a0c	LoopIdiom: Add checks to avoid turning memmove into an infinite loop. I don't think this is possible with the current implementation but that may change eventually. llvm-svn: 166877	2012-10-27 15:18:28 +00:00
Benjamin Kramer	1c9e5186c0	LoopIdiom: Recognize memmove loops. This turns loops like for (unsigned i = 0; i != n; ++i) p[i] = p[i+1]; into memmove, which has a highly optimized implementation in most libcs. This was really easy with the new DependenceAnalysis :) llvm-svn: 166875	2012-10-27 14:25:51 +00:00
Benjamin Kramer	d5c9be8247	LoopIdiom: Replace custom dependence analysis with DependenceAnalysis. Requires a lot less code and complexity on loop-idiom's side and the more precise analysis can catch more cases, like the one I included as a test case. This also fixes the edge-case miscompilation from PR9481. Compile time performance seems to be slightly worse, but this is mostly due to an extra LCSSA run scheduled by the PassManager and should be fixed there. llvm-svn: 166874	2012-10-27 14:25:44 +00:00
Nadav Rotem	859366f93f	1. Fix a bug in getTypeConversion. When a simple type is split, we need to return the type of the split result. 2. Change the maximum vectorization width from 4 to 8. 3. A test for both. llvm-svn: 166864	2012-10-27 04:11:32 +00:00
Quentin Colombet	3ee56a3bf5	[code size][ARM] Emit regular call instructions instead of the move, branch sequence llvm-svn: 166854	2012-10-27 01:10:17 +00:00
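
The LoopIdiom commit 1c9e5186c0 above describes turning the loop `for (unsigned i = 0; i != n; ++i) p[i] = p[i+1];` into memmove. A minimal sketch of that source-level equivalence follows, assuming an `int` element type and a buffer that holds at least n + 1 elements. The function names are hypothetical and do not come from LLVM; the sketch shows only what the two forms compute, not how the pass recognizes or rewrites the loop.

```c
#include <assert.h>
#include <string.h>

/* The loop from the commit message: each element takes the value of its
   right-hand neighbour. Reads p[1..n], so p must hold n + 1 elements. */
void shift_left_loop(int *p, unsigned n) {
    for (unsigned i = 0; i != n; ++i)
        p[i] = p[i + 1];
}

/* Hand-written equivalent. Source and destination overlap, so this must
   be memmove rather than memcpy. */
void shift_left_memmove(int *p, unsigned n) {
    memmove(p, p + 1, n * sizeof *p);
}

/* Runs both forms on identical buffers and checks that they agree. */
void check_shift_forms(void) {
    int a[6] = {10, 20, 30, 40, 50, 60};
    int b[6] = {10, 20, 30, 40, 50, 60};
    shift_left_loop(a, 5);
    shift_left_memmove(b, 5);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Calling `check_shift_forms()` exercises both versions on a six-element buffer with n = 5; the last element is left untouched by either form.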

17624 Commits