llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrew Trick	07aeb629ec	Use %% for literals in RUN lines. llvm-svn: 138647	2011-08-26 20:09:48 +00:00
Eli Friedman	02e737b08e	Move "atomic" and "volatile" designations on instructions after the opcode of the instruction. Note that this change affects the existing non-atomic load and store instructions; the parser now accepts both forms, and the change is noted in the release notes. llvm-svn: 137527	2011-08-12 22:50:01 +00:00
Nick Lewycky	15e2d90746	Finish adding support for lifetime intrinsics to SROA. Fixes PR10121! llvm-svn: 136008	2011-07-25 23:14:22 +00:00
Dan Gohman	e106aee6f5	Fix MergeInVectorType to check for vector types with the same alloc size but different element types, so that it filters out the cases that CreateShuffleVectorCast doesn't handle. This fixes rdar://9786827. llvm-svn: 135721	2011-07-21 23:30:09 +00:00
Chris Lattner	b1ed91f397	Land the long talked about "type system rewrite" patch. This patch brings numerous advantages to LLVM. One way to look at it is through diffstat: 109 files changed, 3005 insertions(+), 5906 deletions(-) Removing almost 3K lines of code is a good thing. Other advantages include: 1. Value::getType() is a simple load that can be CSE'd, not a mutating union-find operation. 2. Types a uniqued and never move once created, defining away PATypeHolder. 3. Structs can be "named" now, and their name is part of the identity that uniques them. This means that the compiler doesn't merge them structurally which makes the IR much less confusing. 4. Now that there is no way to get a cycle in a type graph without a named struct type, "upreferences" go away. 5. Type refinement is completely gone, which should make LTO much MUCH faster in some common cases with C++ code. 6. Types are now generally immutable, so we can use "Type " instead "const Type " everywhere. Downsides of this patch are that it removes some functions from the C API, so people using those will have to upgrade to (not yet added) new API. "LLVM 3.0" is the right time to do this. There are still some cleanups pending after this, this patch is large enough as-is. llvm-svn: 134829	2011-07-09 17:41:24 +00:00
Nick Lewycky	a61df3f843	Teach one piece of scalarrepl to handle lifetime markers. When transforming an alloca that only holds a copy of a global and we're going to replace the users of the alloca with that global, just nuke the lifetime intrinsics. Part of PR10121. llvm-svn: 133905	2011-06-27 05:40:02 +00:00
Chris Lattner	8936d2bfbc	Remove support for parsing the "type i32" syntax for defining a numbered top level type without a specified number. This syntax isn't documented and blocks forward progress. llvm-svn: 133371	2011-06-19 00:03:46 +00:00
Cameron Zwarich	9601ddb2f3	When scalar replacement returns a vector type, only accept it if the vector type's bitwidth matches the (allocated) size of the alloca. This severely pessimizes vector scalar replacement when the only vector type being used is something like <3 x float> on x86 or ARM whose allocated size matches a <4 x float>. I hope to fix some of the flawed assumptions about allocated size throughout scalar replacement and reenable this in most cases. llvm-svn: 133338	2011-06-18 06:17:51 +00:00
Chris Lattner	80ed9dc9e5	rip out a ton of intrinsic modernization logic from AutoUpgrade.cpp, which is for pre-2.9 bitcode files. We keep x86 unaligned loads, movnt, crc32, and the target indep prefetch change. As usual, updating the testsuite is a PITA. llvm-svn: 133337	2011-06-18 06:05:24 +00:00
Cameron Zwarich	2a26100c87	Fix an invalid bitcast crash that occurs when doing a partial memset of a vector alloca. Fixes part of <rdar://problem/9580800>. llvm-svn: 133336	2011-06-18 05:47:49 +00:00
Chris Lattner	b90ed2233c	manually upgrade a bunch of tests to modern syntax, and remove some that are either unreduced or only test old syntax. llvm-svn: 133228	2011-06-17 03:14:27 +00:00
Cameron Zwarich	77a699a829	Fix PR10104 by adding a bounds check on a vector element access check. It was assuming that all offsets are legal vector accesses, and thus trying to access the float member of { <2 x float>, float } as the 3rd element of the first member. llvm-svn: 132766	2011-06-09 01:45:33 +00:00
Cameron Zwarich	c3b1cc9aca	Fix an assymmetry between ConvertScalar_ExtractValue and ConvertScalar_InsertValue. The former was using the size of the entire alloca, whereas the latter was correctly using the allocated size of the immediate type being converted (which may differ from the size of the alloca). This fixes PR10082. llvm-svn: 132759	2011-06-08 22:08:31 +00:00
Cameron Zwarich	d7707fc911	Fix "make check" in Release by removing debug-only options from an 'opt' invocation. llvm-svn: 131972	2011-05-24 18:26:09 +00:00
Cameron Zwarich	843bc7d673	Make LoadAndStorePromoter preserve debug info and create llvm.dbg.values when promoting allocas to SSA variables. Fixes <rdar://problem/9479036>. llvm-svn: 131953	2011-05-24 03:10:43 +00:00
Duncan Sands	a071c82900	Fix PR9820: a read-only call differs from a load in that a load doesn't return the pointer being dereferenced, it returns the pointee, but a call might return the pointer itself. llvm-svn: 130979	2011-05-06 10:30:37 +00:00
Chris Lattner	029afe4787	make a couple of changes to the standard pass pipeline: 1. Only run the early (in the module pass pipe) instcombine/simplifycfg if the "unit at a time" passes they are cleaning up after runs. 2. Move the "clean up after the unroller" pass to the very end of the function-level pass pipeline. Loop unroll uses instsimplify now, so it doesn't create a ton of trash. Moving instcombine later allows it to clean up after opportunities are exposed by GVN, DSE, etc. 3. Introduce some phase ordering tests for things that are specifically intended to be simplified by the full optimizer as a whole. This resolves PR2338, and is progress towards PR6627, which will be generating code that looks similar to test2. llvm-svn: 130241	2011-04-26 20:45:33 +00:00
Cameron Zwarich	ca4c633489	Fix another case of <rdar://problem/9184212> that only occurs with code generated by llvm-gcc, since llvm-gcc uses 2 i64s for passing a 4 x float vector on ARM rather than an i64 array like Clang. llvm-svn: 129878	2011-04-20 21:48:38 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Mon P Wang	2e5528f0b2	Vectors with different number of elements of the same element type can have the same allocation size but different primitive sizes(e.g., <3xi32> and <4xi32>). When ScalarRepl promotes them, it can't use a bit cast but should use a shuffle vector instead. llvm-svn: 129472	2011-04-13 21:40:02 +00:00
Cameron Zwarich	ff811cc475	Do some simple copy propagation through integer loads and stores when promoting vector types. This helps a lot with inlined functions when using the ARM soft float ABI. Fixes <rdar://problem/9184212>. llvm-svn: 128453	2011-03-29 05:19:52 +00:00
Cameron Zwarich	d4174ee43e	Fix a typo and add a test. llvm-svn: 128331	2011-03-26 04:58:50 +00:00
Cameron Zwarich	10ebc189ee	Fix PR9464 by correcting some math that just happened to be right in most cases that were hit in practice. llvm-svn: 128146	2011-03-23 05:25:55 +00:00
Cameron Zwarich	0454253d7a	Only convert allocas to scalars if it is profitable. The profitability metric I chose is having a non-memcpy/memset use and being larger than any native integer type. Originally I chose having an access of a size smaller than the total size of the alloca, but this caused some minor issues on the spirit benchmark where SRoA runs again after some inlining. This fixes <rdar://problem/8613163>. llvm-svn: 127718	2011-03-16 00:13:44 +00:00
Cameron Zwarich	7b0f3c6a1a	Add native integer type TargetData to some existing tests. llvm-svn: 127717	2011-03-16 00:13:40 +00:00
Cameron Zwarich	718918b07a	Add a test case for r127320. llvm-svn: 127321	2011-03-09 08:11:02 +00:00
Cameron Zwarich	3b649f4d01	Add support to scalar replacement for partial vector accesses of an alloca, e.g. a union of a float, <2 x float>, and <4 x float>. This mostly comes up with the use of vector intrinsics, especially in NEON when programmers know the layout of the register file. This enables codegen to eliminate a lot of the subregister traffic it would otherwise generate. This commit only enables this for a small number of floating-point cases, but a lot more integer cases. I assume this is okay for all ports, but I did not do extensive testing of the quality of code involving i512 vectors and the like. If there is a use case where this generates worse code than before, let me know and we can scale it back. This fixes <rdar://problem/9036264>. llvm-svn: 127317	2011-03-09 05:43:05 +00:00
Chris Lattner	2bcec1297e	merge all the "crash tests" into crash.ll llvm-svn: 124101	2011-01-24 03:37:34 +00:00
Chris Lattner	b4017769ae	fix PR9017, a bug where we'd assert when promoting in unreachable code. llvm-svn: 124100	2011-01-24 03:29:07 +00:00
Chris Lattner	d83e7b0ff6	enhance SRoA to promote allocas that are used by PHI nodes. This often occurs because instcombine sinks loads and inserts phis. This kicks in on such apps as 175.vpr, eon, 403.gcc, xalancbmk and a bunch of times in spec2006 in some app that uses std::deque. This resolves the last of rdar://7339113. llvm-svn: 124090	2011-01-24 01:07:11 +00:00
Chris Lattner	a960725d18	Enhance SRoA to promote allocas that are used by selects in some common cases. This triggers a surprising number of times in SPEC2K6 because min/max idioms end up doing this. For example, code from the STL ends up looking like this to SRoA: %202 = load i64* %__old_size, align 8, !tbaa !3 %203 = load i64* %__old_size, align 8, !tbaa !3 %204 = load i64* %__n, align 8, !tbaa !3 %205 = icmp ult i64 %203, %204 %storemerge.i = select i1 %205, i64* %__n, i64* %__old_size %206 = load i64* %storemerge.i, align 8, !tbaa !3 We can now promote both the __n and the __old_size allocas. This addresses another chunk of rdar://7339113, poor codegen on stringswitch. llvm-svn: 124088	2011-01-23 22:04:55 +00:00
Chris Lattner	9491dee24e	Enhance SRoA to be more aggressive about scalarization of aggregate allocas that have PHI or select uses of their element pointers. This can often happen when instcombine sinks two loads into a successor, inserting a phi or select. With this patch, we can scalarize the alloca, but the pinned elements are not yet promoted. This is still a win for large aggregates where only one element is used. This fixes rdar://8904039 and part of rdar://7339113 (poor codegen on stringswitch). llvm-svn: 124070	2011-01-23 08:27:54 +00:00
Chris Lattner	a587ab7b94	remove an old hack that avoided creating MMX datatypes. The X86 backend has been fixed. llvm-svn: 124064	2011-01-23 06:40:33 +00:00
Chris Lattner	6fab2e9418	if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}. The testcase now is optimized to: define i64 @test2(i64 %X) { br label %L2 L2: ; preds = %0 ret i64 %X } before we generated: define i64 @test2(i64 %X) { %sroa.store.elt = lshr i64 %X, 56 %1 = trunc i64 %sroa.store.elt to i8 %sroa.store.elt8 = lshr i64 %X, 48 %2 = trunc i64 %sroa.store.elt8 to i8 %sroa.store.elt9 = lshr i64 %X, 40 %3 = trunc i64 %sroa.store.elt9 to i8 %sroa.store.elt10 = lshr i64 %X, 32 %4 = trunc i64 %sroa.store.elt10 to i8 %sroa.store.elt11 = lshr i64 %X, 24 %5 = trunc i64 %sroa.store.elt11 to i8 %sroa.store.elt12 = lshr i64 %X, 16 %6 = trunc i64 %sroa.store.elt12 to i8 %sroa.store.elt13 = lshr i64 %X, 8 %7 = trunc i64 %sroa.store.elt13 to i8 %8 = trunc i64 %X to i8 br label %L2 L2: ; preds = %0 %9 = zext i8 %1 to i64 %10 = shl i64 %9, 56 %11 = zext i8 %2 to i64 %12 = shl i64 %11, 48 %13 = or i64 %12, %10 %14 = zext i8 %3 to i64 %15 = shl i64 %14, 40 %16 = or i64 %15, %13 %17 = zext i8 %4 to i64 %18 = shl i64 %17, 32 %19 = or i64 %18, %16 %20 = zext i8 %5 to i64 %21 = shl i64 %20, 24 %22 = or i64 %21, %19 %23 = zext i8 %6 to i64 %24 = shl i64 %23, 16 %25 = or i64 %24, %22 %26 = zext i8 %7 to i64 %27 = shl i64 %26, 8 %28 = or i64 %27, %25 %29 = zext i8 %8 to i64 %30 = or i64 %29, %28 ret i64 %30 } In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place. llvm-svn: 123571	2011-01-16 06:18:28 +00:00
Bob Wilson	08713d3c5f	Extend SROA to handle arrays accessed as homogeneous structs and vice versa. This is a minor extension of SROA to handle a special case that is important for some ARM NEON operations. Some of the NEON intrinsics return multiple values, which are handled as struct types containing multiple elements of the same vector type. The corresponding return types declared in the arm_neon.h header have equivalent arrays. We need SROA to recognize that it can split up those arrays and structs into separate vectors, even though they are not always accessed with the same type. SROA already handles loads and stores of an entire alloca by using insertvalue/extractvalue to access the individual pieces, and that code works the same regardless of whether the type is a struct or an array. So, all that needs to be done is to check for compatible arrays and homogeneous structs. llvm-svn: 123381	2011-01-13 17:45:11 +00:00
Bob Wilson	12eec40c83	Make SROA more aggressive with allocas containing padding. SROA only split up structs and arrays one level at a time, so padding can only cause trouble if it is located in between the struct or array elements. llvm-svn: 123380	2011-01-13 17:45:08 +00:00
Nick Lewycky	b8de00ee07	Treat a call of function pointer like a load of the pointer when considering whether the pointer can be replaced with the global variable it is a copy of. Fixes PR8680. llvm-svn: 120126	2010-11-24 22:04:20 +00:00
Chris Lattner	ac5701319b	allow eliminating an alloca that is just copied from an constant global if it is passed as a byval argument. The byval argument will just be a read, so it is safe to read from the original global instead. This allows us to promote away the %agg.tmp alloca in PR8582 llvm-svn: 119686	2010-11-18 06:41:51 +00:00
Chris Lattner	f183d5c4be	enhance the "alloca is just a memcpy from constant global" to ignore calls that obviously can't modify the alloca because they are readonly/readnone. llvm-svn: 119683	2010-11-18 06:26:49 +00:00
Chris Lattner	7aeae25c78	fix a small oversight in the "eliminate memcpy from constant global" optimization. If the alloca that is "memcpy'd from constant" also has a memcpy from it, ignore it: it is a load. We now optimize the testcase to: define void @test2() { %B = alloca %T %a = bitcast %T* @G to i8* %b = bitcast %T* %B to i8* call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false) call void @bar(i8* %b) ret void } previously we would generate: define void @test() { %B = alloca %T %b = bitcast %T* %B to i8* %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0 %tmp3 = load i8* %G.0, align 4 %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1 %G.15 = bitcast [123 x i8]* %G.1 to i8* %1 = bitcast [123 x i8]* %G.1 to i984* %srcval = load i984* %1, align 1 %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0 store i8 %tmp3, i8* %B.0, align 4 %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1 %B.12 = bitcast [123 x i8]* %B.1 to i8* %2 = bitcast [123 x i8]* %B.1 to i984* store i984 %srcval, i984* %2, align 1 call void @bar(i8* %b) ret void } llvm-svn: 119682	2010-11-18 06:20:47 +00:00
Chris Lattner	9434184142	filecheckize llvm-svn: 119681	2010-11-18 06:16:43 +00:00
Chris Lattner	8af45a889d	deepen my MMX/SRoA hack to avoid hurting non-x86 codegen. llvm-svn: 112763	2010-09-01 23:09:27 +00:00
Chris Lattner	34e5361eb5	add a gross hack to work around a problem that Argiris reported on llvmdev: SRoA is introducing MMX datatypes like <1 x i64>, which then cause random problems because the X86 backend is producing mmx stuff without inserting proper emms calls. In the short term, force off MMX datatypes. In the long term, the X86 backend should not select generic vector types to MMX registers. This is being worked on, but won't be done in time for 2.8. rdar://8380055 llvm-svn: 112696	2010-09-01 05:14:33 +00:00
Chris Lattner	b9ed4f252f	filecheckize llvm-svn: 112695	2010-09-01 05:10:14 +00:00
Chris Lattner	efa3c824cc	Fix the second half of PR7437: scalarrepl wasn't preserving address spaces when SRoA'ing memcpy's. llvm-svn: 107846	2010-07-08 00:27:05 +00:00
Rafael Espindola	29dda21e96	Remove arm_apcscc from the test files. It is the default and doing this matches what llvm-gcc and clang now produce. llvm-svn: 106221	2010-06-17 15:18:27 +00:00
Rafael Espindola	5a24a56e1e	Remove the arm_aapcscc marker from the tests. It is the default for the linux targets. llvm-svn: 106029	2010-06-15 19:04:29 +00:00
Chris Lattner	393e08536d	move comment. llvm-svn: 101433	2010-04-16 01:05:52 +00:00
Chris Lattner	1146d326a7	fix PR6832: we were using the alignment of a pointer when we wanted the alignment of the pointee. llvm-svn: 101432	2010-04-16 01:05:38 +00:00
Devang Patel	aaecdaeb5d	Remove tests that checks @llvm.dbg.stoppoint handling. llvm-svn: 97493	2010-03-01 20:33:48 +00:00

1 2 3

125 Commits