llvm-project

Author	SHA1	Message	Date
Chris Lattner	44c2b90556	Fix x86-64 byval passing to specify the alignment even when the code generator will give it something sufficient. This is important because the mid-level optimizer doesn't know what alignment is required otherwise. llvm-svn: 131879	2011-05-22 23:21:23 +00:00
John McCall	e0fda7377e	The 0.98 revision of the x86-64 ABI clarified a lot of things, some of which break strict compatibility with previous compilers. Implement one of them and then immediately opt out on Darwin. llvm-svn: 129899	2011-04-21 01:20:55 +00:00
Chris Lattner	69e683fb35	vector of long and ulong are also classified as INTEGER in x86-64 abi, this fixes rdar://8358475 a failure of the gcc.dg/compat/vector_1 abi test. llvm-svn: 112205	2010-08-26 18:13:50 +00:00
Chris Lattner	46830f2fd6	1 x ulonglong needs to be classified as INTEGER, just like 1 x longlong, this fixes a miscompilation on the included testcase, rdar://8359248 llvm-svn: 112201	2010-08-26 18:03:20 +00:00
Chris Lattner	51e1cc2fe2	tame an assertion, fixing rdar://8357396 llvm-svn: 112174	2010-08-26 06:28:35 +00:00
Chris Lattner	9f8b451876	Finally pass "two floats in a 64-bit unit" as a <2 x float> instead of as a double in the x86-64 ABI. This allows us to generate much better code for certain things, e.g.: _Complex float f32(_Complex float A, _Complex float B) { return A+B; } Used to compile into (look at the integer silliness!): _f32: ## @f32 ## BB#0: ## %entry movd %xmm1, %rax movd %eax, %xmm1 movd %xmm0, %rcx movd %ecx, %xmm0 addss %xmm1, %xmm0 movd %xmm0, %edx shrq $32, %rax movd %eax, %xmm0 shrq $32, %rcx movd %ecx, %xmm1 addss %xmm0, %xmm1 movd %xmm1, %eax shlq $32, %rax addq %rdx, %rax movd %rax, %xmm0 ret Now we get: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $16, %xmm2, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm1 movdqa %xmm2, %xmm0 unpcklps %xmm1, %xmm0 ret and compile stuff like: extern float _Complex ccoshf( float _Complex ) ; float _Complex ccosf ( float _Complex z ) { float _Complex iz; (__real__ iz) = -(__imag__ z); (__imag__ iz) = (__real__ z); return ccoshf(iz); } into: _ccosf: ## @ccosf ## BB#0: ## %entry pshufd $1, %xmm0, %xmm1 xorps LCPI4_0(%rip), %xmm1 unpcklps %xmm0, %xmm1 movaps %xmm1, %xmm0 jmp _ccoshf ## TAILCALL instead of: _ccosf: ## @ccosf ## BB#0: ## %entry movd %xmm0, %rax movq %rax, %rcx shlq $32, %rcx shrq $32, %rax xorl $-2147483648, %eax ## imm = 0xFFFFFFFF80000000 addq %rcx, %rax movd %rax, %xmm0 jmp _ccoshf ## TAILCALL There is still "stuff to be done" here for the struct case, but this resolves rdar://6379669 - [x86-64 ABI] Pass and return _Complex float / double efficiently llvm-svn: 112111	2010-08-25 23:39:14 +00:00
Chris Lattner	7f4b81af7a	fix rdar://8251384, another case where we could access beyond the end of a struct. This improves the case when the struct being passed contains 3 floats, either due to a struct or array of 3 things. Before we'd generate this IR for the testcase: define float @bar(double %X.coerce0, double %X.coerce1) nounwind { entry: %X = alloca %struct.foof, align 8 ; <%struct.foof> [#uses=2] %0 = bitcast %struct.foof %X to %1* ; <%1> [#uses=2] %1 = getelementptr %1 %0, i32 0, i32 0 ; <double> [#uses=1] store double %X.coerce0, double %1 %2 = getelementptr %1* %0, i32 0, i32 1 ; <double> [#uses=1] store double %X.coerce1, double %2 %tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float> [#uses=1] %tmp1 = load float %tmp ; <float> [#uses=1] ret float %tmp1 } which compiled (with optimization) to: _bar: ## @bar ## BB#0: ## %entry movd %xmm1, %rax movd %eax, %xmm0 ret Now we produce: define float @bar(double %X.coerce0, float %X.coerce1) nounwind { entry: %X = alloca %struct.foof, align 8 ; <%struct.foof> [#uses=2] %0 = bitcast %struct.foof %X to %0* ; <%0> [#uses=2] %1 = getelementptr %0 %0, i32 0, i32 0 ; <double> [#uses=1] store double %X.coerce0, double %1 %2 = getelementptr %0* %0, i32 0, i32 1 ; <float> [#uses=1] store float %X.coerce1, float %2 %tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float> [#uses=1] %tmp1 = load float %tmp ; <float> [#uses=1] ret float %tmp1 } and: _bar: ## @bar ## BB#0: ## %entry movaps %xmm1, %xmm0 ret llvm-svn: 109776	2010-07-29 18:13:09 +00:00
Chris Lattner	3f76342cfc	handle a case where we could access off the end of a function that Eli pointed out, rdar://8249586 llvm-svn: 109762	2010-07-29 17:34:39 +00:00
Chris Lattner	44f9c3b3f1	in release mode, irbuilder doesn't add names to instructions, this will hopefully fix the osuosl clang-i686-darwin10 builder. llvm-svn: 109760	2010-07-29 17:14:05 +00:00
Chris Lattner	98076a25ce	This is a little bit far, but optimize cases like: struct a { struct c { double x; int y; } x[1]; }; void foo(struct a A) { } into: define void @foo(double %A.coerce0, i32 %A.coerce1) nounwind { entry: %A = alloca %struct.a, align 8 ; <%struct.a> [#uses=1] %0 = bitcast %struct.a %A to %struct.c* ; <%struct.c> [#uses=2] %1 = getelementptr %struct.c %0, i32 0, i32 0 ; <double> [#uses=1] store double %A.coerce0, double %1 %2 = getelementptr %struct.c* %0, i32 0, i32 1 ; <i32> [#uses=1] store i32 %A.coerce1, i32 %2 instead of: define void @foo(double %A.coerce0, i64 %A.coerce1) nounwind { entry: %A = alloca %struct.a, align 8 ; <%struct.a> [#uses=1] %0 = bitcast %struct.a %A to %0* ; <%0> [#uses=2] %1 = getelementptr %0 %0, i32 0, i32 0 ; <double> [#uses=1] store double %A.coerce0, double %1 %2 = getelementptr %0* %0, i32 0, i32 1 ; <i64> [#uses=1] store i64 %A.coerce1, i64 %2 I only do this now because I never want to look at this code again :) llvm-svn: 109738	2010-07-29 07:43:55 +00:00
Chris Lattner	c8b7b53a1e	implement a todo: pass a eight-byte that consists of a small integer + padding as that small integer. On code like: struct c { double x; int y; }; void bar(struct c C) { } This means that we compile to: define void @bar(double %C.coerce0, i32 %C.coerce1) nounwind { entry: %C = alloca %struct.c, align 8 ; <%struct.c> [#uses=2] %0 = getelementptr %struct.c %C, i32 0, i32 0 ; <double> [#uses=1] store double %C.coerce0, double %0 %1 = getelementptr %struct.c* %C, i32 0, i32 1 ; <i32> [#uses=1] store i32 %C.coerce1, i32 %1 instead of: define void @bar(double %C.coerce0, i64 %C.coerce1) nounwind { entry: %C = alloca %struct.c, align 8 ; <%struct.c> [#uses=3] %0 = bitcast %struct.c %C to %0* ; <%0> [#uses=2] %1 = getelementptr %0 %0, i32 0, i32 0 ; <double> [#uses=1] store double %C.coerce0, double %1 %2 = getelementptr %0* %0, i32 0, i32 1 ; <i64> [#uses=1] store i64 %C.coerce1, i64 %2 which gives SRoA heartburn. This implements rdar://5711709, a nice low number :) llvm-svn: 109737	2010-07-29 07:30:00 +00:00
Chris Lattner	fe34c1d53e	Kill off the 'coerce' ABI passing form. Now 'direct' and 'extend' always have a "coerce to" type which often matches the default lowering of Clang type to LLVM IR type, but the coerce case can be handled by making them not be the same. This simplifies things and fixes issues where X86-64 abi lowering would return coerce after making preferred types exactly match up. This caused us to compile: typedef float v4f32 __attribute__((__vector_size__(16))); v4f32 foo(v4f32 X) { return X+X; } into this code at -O0: define <4 x float> @foo(<4 x float> %X.coerce) nounwind { entry: %retval = alloca <4 x float>, align 16 ; <<4 x float>> [#uses=2] %coerce = alloca <4 x float>, align 16 ; <<4 x float>> [#uses=2] %X.addr = alloca <4 x float>, align 16 ; <<4 x float>> [#uses=3] store <4 x float> %X.coerce, <4 x float> %coerce %X = load <4 x float>* %coerce ; <<4 x float>> [#uses=1] store <4 x float> %X, <4 x float>* %X.addr %tmp = load <4 x float>* %X.addr ; <<4 x float>> [#uses=1] %tmp1 = load <4 x float>* %X.addr ; <<4 x float>> [#uses=1] %add = fadd <4 x float> %tmp, %tmp1 ; <<4 x float>> [#uses=1] store <4 x float> %add, <4 x float>* %retval %0 = load <4 x float>* %retval ; <<4 x float>> [#uses=1] ret <4 x float> %0 } Now we get: define <4 x float> @foo(<4 x float> %X) nounwind { entry: %X.addr = alloca <4 x float>, align 16 ; <<4 x float>> [#uses=3] store <4 x float> %X, <4 x float> %X.addr %tmp = load <4 x float>* %X.addr ; <<4 x float>> [#uses=1] %tmp1 = load <4 x float>* %X.addr ; <<4 x float>> [#uses=1] %add = fadd <4 x float> %tmp, %tmp1 ; <<4 x float>> [#uses=1] ret <4 x float> %add } This implements rdar://8248065 llvm-svn: 109733	2010-07-29 06:26:06 +00:00
Chris Lattner	9fa15c3608	ignore structs that wrap vectors in IR, the abstraction shouldn't add penalty. Before we'd compile the example into something like: %coerce.dive2 = getelementptr %struct.v4f32wrapper* %retval, i32 0, i32 0 ; <<4 x float>> [#uses=1] %1 = bitcast <4 x float> %coerce.dive2 to <2 x double>* ; <<2 x double>> [#uses=1] %2 = load <2 x double> %1, align 1 ; <<2 x double>> [#uses=1] ret <2 x double> %2 Now we produce: %coerce.dive2 = getelementptr %struct.v4f32wrapper* %retval, i32 0, i32 0 ; <<4 x float>> [#uses=1] %0 = load <4 x float> %coerce.dive2, align 1 ; <<4 x float>> [#uses=1] ret <4 x float> %0 llvm-svn: 109732	2010-07-29 05:02:29 +00:00
Chris Lattner	4200fe4e50	move the 'pretty 16-byte vector' inferring code up to be shared with return values, improving stuff that returns __m128 etc. llvm-svn: 109731	2010-07-29 04:56:46 +00:00
Chris Lattner	3a44c7e55d	now that we have CGT around, we can start using preferred types for return values too. Instead of compiling something like: struct foo { int X; float Y; }; struct foo test(struct foo P) { return P; } to: %1 = type { i64, i64 } define %1 @test(%struct.foo* %P) nounwind { entry: %retval = alloca %struct.foo, align 8 ; <%struct.foo> [#uses=2] %P.addr = alloca %struct.foo, align 8 ; <%struct.foo*> [#uses=2] store %struct.foo %P, %struct.foo %P.addr %tmp = load %struct.foo %P.addr ; <%struct.foo> [#uses=1] %tmp1 = bitcast %struct.foo %retval to i8* ; <i8> [#uses=1] %tmp2 = bitcast %struct.foo %tmp to i8* ; <i8> [#uses=1] call void @llvm.memcpy.p0i8.p0i8.i64(i8 %tmp1, i8* %tmp2, i64 16, i32 8, i1 false) %0 = bitcast %struct.foo* %retval to %1* ; <%1> [#uses=1] %1 = load %1 %0, align 1 ; <%1> [#uses=1] ret %1 %1 } We now get the result more type safe, with: define %struct.foo @test(%struct.foo* %P) nounwind { entry: %retval = alloca %struct.foo, align 8 ; <%struct.foo> [#uses=2] %P.addr = alloca %struct.foo, align 8 ; <%struct.foo*> [#uses=2] store %struct.foo %P, %struct.foo %P.addr %tmp = load %struct.foo %P.addr ; <%struct.foo> [#uses=1] %tmp1 = bitcast %struct.foo %retval to i8* ; <i8> [#uses=1] %tmp2 = bitcast %struct.foo %tmp to i8* ; <i8> [#uses=1] call void @llvm.memcpy.p0i8.p0i8.i64(i8 %tmp1, i8* %tmp2, i64 16, i32 8, i1 false) %0 = load %struct.foo* %retval ; <%struct.foo> [#uses=1] ret %struct.foo %0 } That memcpy is completely terrible, but I don't know how to fix it. llvm-svn: 109729	2010-07-29 04:46:19 +00:00
Chris Lattner	f4ba08aeaf	pass argument vectors in a type that corresponds to the user type if possible. This improves the example to pass <4 x float> instead of <2 x double> but we still get awful code, and still don't get the return value right. llvm-svn: 109700	2010-07-28 23:47:21 +00:00
Chris Lattner	31faff5d58	use Get8ByteTypeAtOffset for the return value path as well so we don't get errors similar to PR7714 on the return path. llvm-svn: 109689	2010-07-28 23:06:14 +00:00
Chris Lattner	4c1e484f39	fix PR7714 by not referencing off the end of a struct when passed by value in x86-64 abi. This also improves codegen as well. Some refactoring is needed of this code. llvm-svn: 109681	2010-07-28 22:15:08 +00:00
Chris Lattner	c401de9998	in the "coerce" case, the ABI handling code ends up making the alloca for an argument. Make sure the argument gets the proper decl alignment, which may be different than the type alignment. This fixes PR7567 llvm-svn: 107627	2010-07-05 20:21:00 +00:00
Chris Lattner	22a931e3bb	Change X86_64ABIInfo to have ASTContext and TargetData ivars to avoid passing ASTContext down through all the methods it has. When classifying an argument, or argument piece, as INTEGER, check to see if we have a pointer at exactly the same offset in the preferred type. If so, use that pointer type instead of i64. This allows us to compile A function taking a stringref into something like this: define i8* @foo(i64 %D.coerce0, i8* %D.coerce1) nounwind ssp { entry: %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup> [#uses=4] %0 = getelementptr %struct.DeclGroup %D, i32 0, i32 0 ; <i64> [#uses=1] store i64 %D.coerce0, i64 %0 %1 = getelementptr %struct.DeclGroup* %D, i32 0, i32 1 ; <i8*> [#uses=1] store i8 %D.coerce1, i8** %1 %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64> [#uses=1] %tmp1 = load i64 %tmp ; <i64> [#uses=1] %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8> [#uses=1] %tmp3 = load i8 %tmp2 ; <i8> [#uses=1] %add.ptr = getelementptr inbounds i8 %tmp3, i64 %tmp1 ; <i8> [#uses=1] ret i8 %add.ptr } instead of this: define i8* @foo(i64 %D.coerce0, i64 %D.coerce1) nounwind ssp { entry: %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup> [#uses=3] %0 = insertvalue %0 undef, i64 %D.coerce0, 0 ; <%0> [#uses=1] %1 = insertvalue %0 %0, i64 %D.coerce1, 1 ; <%0> [#uses=1] %2 = bitcast %struct.DeclGroup %D to %0* ; <%0> [#uses=1] store %0 %1, %0 %2, align 1 %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64> [#uses=1] %tmp1 = load i64 %tmp ; <i64> [#uses=1] %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8> [#uses=1] %tmp3 = load i8 %tmp2 ; <i8> [#uses=1] %add.ptr = getelementptr inbounds i8 %tmp3, i64 %tmp1 ; <i8> [#uses=1] ret i8 %add.ptr } This implements rdar://7375902 - [codegen quality] clang x86-64 ABI lowering code punishing StringRef llvm-svn: 107123	2010-06-29 06:01:59 +00:00
Chris Lattner	9e748e9d6e	add IR names to coerced arguments. llvm-svn: 107105	2010-06-29 00:14:52 +00:00
Chris Lattner	3dd716c3c3	Change CGCall to handle the "coerce" case where the coerce-to type is a FCA to pass each of the elements as individual scalars. This produces code fast isel is less likely to reject and is easier on the optimizers. For example, before we would compile: struct DeclGroup { long NumDecls; char * Y; }; char * foo(DeclGroup D) { return D.NumDecls+D.Y; } to: %struct.DeclGroup = type { i64, i64 } define i64 @_Z3foo9DeclGroup(%struct.DeclGroup) nounwind { entry: %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup> [#uses=3] store %struct.DeclGroup %0, %struct.DeclGroup %D, align 1 %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64> [#uses=1] %tmp1 = load i64 %tmp ; <i64> [#uses=1] %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i64> [#uses=1] %tmp3 = load i64 %tmp2 ; <i64> [#uses=1] %add = add nsw i64 %tmp1, %tmp3 ; <i64> [#uses=1] ret i64 %add } Now we get: %0 = type { i64, i64 } %struct.DeclGroup = type { i64, i8* } define i8* @_Z3foo9DeclGroup(i64, i64) nounwind { entry: %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup> [#uses=3] %2 = insertvalue %0 undef, i64 %0, 0 ; <%0> [#uses=1] %3 = insertvalue %0 %2, i64 %1, 1 ; <%0> [#uses=1] %4 = bitcast %struct.DeclGroup %D to %0* ; <%0> [#uses=1] store %0 %3, %0 %4, align 1 %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64> [#uses=1] %tmp1 = load i64 %tmp ; <i64> [#uses=1] %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8> [#uses=1] %tmp3 = load i8 %tmp2 ; <i8> [#uses=1] %add.ptr = getelementptr inbounds i8 %tmp3, i64 %tmp1 ; <i8> [#uses=1] ret i8 %add.ptr } Elimination of the FCA inside the function is still-to-come. llvm-svn: 107099	2010-06-28 23:44:11 +00:00
Chris Lattner	a7d81ab7f3	X86-64: pass/return structs of float/int as float/i32 instead of double/i64 to make the code generated for ABI cleaner. Passing in the low part of a double is the same as passing in a float. For example, we now compile: struct DeclGroup { float NumDecls; }; float foo(DeclGroup D); void bar(DeclGroup D) { foo(D); } into: %struct.DeclGroup = type { float } define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind { entry: %D.addr = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup> [#uses=2] %agg.tmp = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup> [#uses=2] store %struct.DeclGroup* %D, %struct.DeclGroup %D.addr %tmp = load %struct.DeclGroup %D.addr ; <%struct.DeclGroup> [#uses=1] %tmp1 = bitcast %struct.DeclGroup %agg.tmp to i8* ; <i8> [#uses=1] %tmp2 = bitcast %struct.DeclGroup %tmp to i8* ; <i8> [#uses=1] call void @llvm.memcpy.p0i8.p0i8.i64(i8 %tmp1, i8* %tmp2, i64 4, i32 4, i1 false) %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <float> [#uses=1] %0 = load float %coerce.dive, align 1 ; <float> [#uses=1] %call = call float @_Z3foo9DeclGroup(float %0) ; <float> [#uses=0] ret void } instead of: %struct.DeclGroup = type { float } define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind { entry: %D.addr = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup> [#uses=2] %agg.tmp = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup> [#uses=2] %tmp3 = alloca double ; <double> [#uses=2] store %struct.DeclGroup %D, %struct.DeclGroup %D.addr %tmp = load %struct.DeclGroup %D.addr ; <%struct.DeclGroup> [#uses=1] %tmp1 = bitcast %struct.DeclGroup %agg.tmp to i8* ; <i8> [#uses=1] %tmp2 = bitcast %struct.DeclGroup %tmp to i8* ; <i8> [#uses=1] call void @llvm.memcpy.p0i8.p0i8.i64(i8 %tmp1, i8* %tmp2, i64 4, i32 4, i1 false) %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <float> [#uses=1] %0 = bitcast double %tmp3 to float* ; <float> [#uses=1] %1 = load float %coerce.dive ; <float> [#uses=1] store float %1, float* %0, align 1 %2 = load double* %tmp3 ; <double> [#uses=1] %call = call float @_Z3foo9DeclGroup(double %2) ; <float> [#uses=0] ret void } which is this machine code (at -O0): __Z3barP9DeclGroup: subq $24, %rsp movq %rdi, 16(%rsp) movq 16(%rsp), %rdi leaq 8(%rsp), %rax movl (%rdi), %ecx movl %ecx, (%rax) movss 8(%rsp), %xmm0 callq __Z3foo9DeclGroup addq $24, %rsp ret vs this: __Z3barP9DeclGroup: subq $24, %rsp movq %rdi, 16(%rsp) movq 16(%rsp), %rdi leaq 8(%rsp), %rax movl (%rdi), %ecx movl %ecx, (%rax) movss 8(%rsp), %xmm0 movss %xmm0, (%rsp) movsd (%rsp), %xmm0 callq __Z3foo9DeclGroup addq $24, %rsp ret At -O3, it is the difference between this now: __Z3barP9DeclGroup: movss (%rdi), %xmm0 jmp __Z3foo9DeclGroup # TAILCALL vs this before: __Z3barP9DeclGroup: movl (%rdi), %eax movd %rax, %xmm0 jmp __Z3foo9DeclGroup # TAILCALL llvm-svn: 107048	2010-06-28 19:56:59 +00:00
Chris Lattner	055097f024	If coercing something from int or pointer type to int or pointer type (potentially after unwrapping it from a struct) do it without going through memory. We now compile: struct DeclGroup { unsigned NumDecls; }; int foo(DeclGroup D) { return D.NumDecls; } into: %struct.DeclGroup = type { i32 } define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone { entry: %D = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup> [#uses=2] %coerce.dive = getelementptr %struct.DeclGroup %D, i32 0, i32 0 ; <i32> [#uses=1] %coerce.val.ii = trunc i64 %0 to i32 ; <i32> [#uses=1] store i32 %coerce.val.ii, i32 %coerce.dive %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32> [#uses=1] %tmp1 = load i32 %tmp ; <i32> [#uses=1] ret i32 %tmp1 } instead of: %struct.DeclGroup = type { i32 } define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone { entry: %D = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup> [#uses=2] %tmp = alloca i64 ; <i64> [#uses=2] %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32> [#uses=1] store i64 %0, i64 %tmp %1 = bitcast i64* %tmp to i32* ; <i32> [#uses=1] %2 = load i32 %1, align 1 ; <i32> [#uses=1] store i32 %2, i32* %coerce.dive %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32> [#uses=1] %tmp2 = load i32 %tmp1 ; <i32> [#uses=1] ret i32 %tmp2 } ... which is quite a bit less terrifying. llvm-svn: 106975	2010-06-27 06:26:04 +00:00
Chris Lattner	895c52ba8b	Same patch as the previous on the store side. Before we compiled this: struct DeclGroup { unsigned NumDecls; }; int foo(DeclGroup D) { return D.NumDecls; } to: %struct.DeclGroup = type { i32 } define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone { entry: %D = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup> [#uses=2] %tmp = alloca i64 ; <i64> [#uses=2] store i64 %0, i64* %tmp %1 = bitcast i64* %tmp to %struct.DeclGroup* ; <%struct.DeclGroup> [#uses=1] %2 = load %struct.DeclGroup %1, align 1 ; <%struct.DeclGroup> [#uses=1] store %struct.DeclGroup %2, %struct.DeclGroup* %D %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32> [#uses=1] %tmp2 = load i32 %tmp1 ; <i32> [#uses=1] ret i32 %tmp2 } which caused fast isel bailouts due to the FCA load/store of %2. Now we generate this just blissful code: %struct.DeclGroup = type { i32 } define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone { entry: %D = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup> [#uses=2] %tmp = alloca i64 ; <i64> [#uses=2] %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32> [#uses=1] store i64 %0, i64 %tmp %1 = bitcast i64* %tmp to i32* ; <i32> [#uses=1] %2 = load i32 %1, align 1 ; <i32> [#uses=1] store i32 %2, i32* %coerce.dive %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32> [#uses=1] %tmp2 = load i32 %tmp1 ; <i32> [#uses=1] ret i32 %tmp2 } This avoids fastisel bailing out and is groundwork for future patch. This reduces bailouts on CGStmt.ll to 911 from 935. llvm-svn: 106974	2010-06-27 06:04:18 +00:00
Daniel Dunbar	53fac692fa	ABI/x86-32 & x86-64: Alignment on 'byval' must be set when the alignment exceeds the minimum ABI alignment. llvm-svn: 102019	2010-04-21 19:49:55 +00:00
Daniel Dunbar	14ec60024c	Convert test to FileCheck. llvm-svn: 102016	2010-04-21 19:10:54 +00:00
Chris Lattner	9cffdf1331	don't slap noalias attribute on stret result arguments. This mirrors Dan's patch for llvm-gcc in r97989, and fixes the miscompilation in PR6525. There is some contention over whether this is the right thing to do, but it is the conservative answer and demonstrably fixes a miscompilation. llvm-svn: 101877	2010-04-20 05:44:43 +00:00
Daniel Dunbar	8fbe78f6fc	Update tests to use %clang_cc1 instead of 'clang-cc' or 'clang -cc1'. - This is designed to make it obvious that %clang_cc1 is a "test variable" which is substituted. It is '%clang_cc1' instead of '%clang -cc1' because it can be useful to redefine what gets run as 'clang -cc1' (for example, to set a default target). llvm-svn: 91446	2009-12-15 20:14:24 +00:00
Daniel Dunbar	34546ce43d	Remove RUN: true lines. llvm-svn: 86432	2009-11-08 01:47:25 +00:00
Daniel Dunbar	8b57697954	Eliminate &&s in tests. - 'for i in $(find . -type f); do sed -e 's#$RUN:.[^ ]$ && *$#\1#g' $i \| FileUpdate $i; done', for the curious. llvm-svn: 86430	2009-11-08 01:45:36 +00:00
Daniel Dunbar	87db734400	Fix a few tests to be -Asserts agnostic. - Ugh. llvm-svn: 79860	2009-08-23 19:28:59 +00:00
Daniel Dunbar	e5515289fa	Update test llvm-svn: 78877	2009-08-13 01:27:45 +00:00
Mike Stump	5e7869f63e	Prep for new warning. llvm-svn: 76638	2009-07-21 20:52:43 +00:00
Daniel Dunbar	4be99ff767	ABI handling: Fix nasty thinko where IRgen could generate an out-of-bounds read when generating a coercion for ABI handling purposes. - This may only manifest itself when building at -O0, but the practical effect is that other arguments may get clobbered. - <rdar://problem/6930451> [irgen] ABI coercion clobbers other arguments llvm-svn: 72932	2009-06-05 07:58:54 +00:00
Daniel Dunbar	1518b64ddc	When trying to pass an argument on the stack, assume LLVM will do the right thing for non-aggregate types. - Otherwise we unnecessarily pin values to the stack and currently end up triggering a backend bug in one case. - This loose cooperation with LLVM to implement the ABI is pretty ugly. - <rdar://problem/6918722> [irgen] clang miscompile of many pointer varargs on x86-64 llvm-svn: 72419	2009-05-26 16:37:37 +00:00
Daniel Dunbar	499f3f9c5d	x86_64 ABI: Account for sret parameters consuming an integer register. - PR4242. llvm-svn: 72268	2009-05-22 17:33:44 +00:00
Daniel Dunbar	ffdb8439d7	ABI handling: Fix invalid assertion, it is possible for a valid coercion to be specified which truncates padding bits. It would be nice to still have the assert, but we don't have any API call for the unpadding size of a type yet. llvm-svn: 71695	2009-05-13 18:54:26 +00:00
Daniel Dunbar	203e2e8dd8	x86-64 ABI: clang incorrectly passes union { long double, float } in register. - Merge algorithm was returning MEMORY as it should. llvm-svn: 71556	2009-05-12 15:22:40 +00:00
Daniel Dunbar	b997f3bcc3	x86_64 ABI: Ignore padding bit-fields during classification. - {return-types,single-args}-{32,64} pass the first 1k ABI tests with bit-fields enabled. llvm-svn: 71272	2009-05-08 22:26:44 +00:00
Daniel Dunbar	a45cf5b6b0	Rename clang to clang-cc. Tests and drivers updated, still need to shuffle dirs. llvm-svn: 67602	2009-03-24 02:24:46 +00:00
Daniel Dunbar	94911a91d5	x86_64 ABI: Handle long double in union when upper eightbyte results in a lone X87 class. - PR3735. llvm-svn: 66277	2009-03-06 17:50:25 +00:00
Mike Stump	92000e54ac	Add end of line at end. llvm-svn: 65557	2009-02-26 19:00:14 +00:00
Anders Carlsson	43e8a2b36e	Add test for enum types llvm-svn: 65540	2009-02-26 17:38:19 +00:00
Daniel Dunbar	019ef0bbfe	x86_64 ABI: Pass simple types directly when possible. This is important for both keeping the generated LLVM simple and for ensuring that integer types are passed/promoted correctly. llvm-svn: 64529	2009-02-14 02:09:24 +00:00

45 Commits