Commit Graph

1046 Commits

Author SHA1 Message Date
Cameron Esfahani 70004ec456 Fix pointer-signext.c test case: it was relying on value names, which don't appear in the non-assert build. Switch to using CHECK-NEXT as well.
llvm-svn: 113964
2010-09-15 10:52:02 +00:00
Cameron Esfahani eb85650e67 Fix Windows64 target info so pointer arithmetic is done correctly, and no sign extension code is emitted: PtrDiffType needs to be a signed long long. Add a corresponding test case.
llvm-svn: 113910
2010-09-15 00:28:12 +00:00
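A hedged C sketch of the added test's shape (function name illustrative, not the actual test): with PtrDiffType as signed long long, Win64 pointer subtraction lowers to a plain 64-bit subtract.

/* On Win64, long is 32 bits but pointers are 64 bits, so ptrdiff_t
   must be a signed 64-bit type for this to compile correctly. */
long long byte_distance(char *a, char *b) {
  return a - b;   /* with the fix: one 64-bit sub, no sign-extension code */
}
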
Argyrios Kyrtzidis 9efa1ce145 Fix VLA miscompilation.
llvm.stacksave/llvm.stackrestore weren't emitted for VLAs in inner scopes.
Fixes rdar://8403108.

llvm-svn: 113822
2010-09-14 00:42:34 +00:00
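A minimal sketch of the inner-scope VLA pattern described above (assumed shape, not the original reproducer):

void f(int n) {
  for (int i = 0; i < n; ++i) {
    int vla[n];   /* without llvm.stacksave/llvm.stackrestore around this
                     scope, each iteration's allocation piles up on the stack */
    vla[0] = i;
  }
}
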
Jakob Stoklund Olesen 54481e5948 Clean up in buildbot directories.
This test created a statements.ll file until about a month ago. Some buildbots
still have this file in their source dir. This is the easiest way to remove the
file on all bots. Then I'll revert.

llvm-svn: 113814
2010-09-13 23:26:28 +00:00
Eric Christopher 26c045d9ff Try to get this to stop leaving a temporary file on linux.
llvm-svn: 113793
2010-09-13 21:51:42 +00:00
Abramo Bagnara 3aabb4b452 Congruent diagnostic for void* arithmetic.
llvm-svn: 113740
2010-09-13 06:50:07 +00:00
Fariborz Jahanian 56603ef7b2 Have Sema check for validity of CFString literal
instead of asserting in IRGen. Fixes radar 8390459.

llvm-svn: 113253
2010-09-07 19:38:13 +00:00
Dale Johannesen 2002e1f1bf Adjust a test that's expecting optimizations to be done
on MMX palignr; we don't do this for the intrinsics.

llvm-svn: 113234
2010-09-07 18:11:53 +00:00
Chris Lattner 03483613c2 Due to asmparser improvements, this error message is now better
llvm-svn: 113177
2010-09-06 22:09:27 +00:00
Chris Lattner 52bcf96384 move the hackaround for PR6537 to catch unions as well,
fixing the ICE in PR7151

llvm-svn: 113130
2010-09-06 00:13:11 +00:00
Eli Friedman 0b1fbd1394 PR7242: Make sure to use a different context for evaluating constant
initializers, so the result of the evaluation doesn't leak through
inconsistently.  Also, don't evaluate references to variables with
initializers with side-effects.

llvm-svn: 113128
2010-09-06 00:10:32 +00:00
John McCall 56f57589af A constant initializer never matches the type of the variable it's
initializing;  it at best matches the element type of the variable
it's initializing.  Fixes PR8073.

llvm-svn: 112992
2010-09-03 18:58:50 +00:00
Daniel Dunbar 2f8df98c92 IRgen: Fix silly thinko in r112021, which was generating code for the same expr
twice. This showed up as an assert on the odd test case because we generated the
decl map entry twice.

llvm-svn: 112943
2010-09-03 02:07:00 +00:00
Chris Lattner 369721a16e stop looking for #uses comments.
llvm-svn: 112898
2010-09-02 22:48:26 +00:00
Chris Lattner 60c160ff4d remove some tests that aren't adding any value: the check lines don't
make it clear what they're testing so there is no way to know it's right
or to update it.

llvm-svn: 112897
2010-09-02 22:43:55 +00:00
Bill Wendling e6fd79bc1c Newline at end of file.
llvm-svn: 112871
2010-09-02 22:07:07 +00:00
Duncan Sands 7f1982731e Correct this test for the fact that the number of uses is now printed
in a comment.

llvm-svn: 112813
2010-09-02 08:52:56 +00:00
Chris Lattner a48fbe8c53 Fix PR8029, an x86-32 ABI regression introduced in r112211
llvm-svn: 112537
2010-08-30 22:03:23 +00:00
Chris Lattner 07b71c4eb1 add radar #
llvm-svn: 112212
2010-08-26 20:05:48 +00:00
Chris Lattner d774ae9ed1 fix 2xi16 to pass as i32 instead of <2 x i16>. The former passes in
memory (as required); the latter passed in an xmm register.  This
fixes gcc.dg/compat/vector_1 on x86-32.

llvm-svn: 112211
2010-08-26 20:05:13 +00:00
Chris Lattner 69e683fb35 vector of long and ulong are also classified as INTEGER in x86-64 abi,
this fixes rdar://8358475 a failure of the gcc.dg/compat/vector_1 abi
test.

llvm-svn: 112205
2010-08-26 18:13:50 +00:00
Chris Lattner 46830f2fd6 1 x ulonglong needs to be classified as INTEGER, just like 1 x longlong,
this fixes a miscompilation on the included testcase, rdar://8359248

llvm-svn: 112201
2010-08-26 18:03:20 +00:00
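A hedged reduction of the miscompiled case: a one-element unsigned long long vector must be classified INTEGER and passed in a general-purpose register, the same as the signed version.

typedef unsigned long long v1ull __attribute__((vector_size(8)));

v1ull pass_through(v1ull x) {
  return x;   /* INTEGER class: travels in a GPR, not an SSE register */
}
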
Chris Lattner 51e1cc2fe2 tame an assertion, fixing rdar://8357396
llvm-svn: 112174
2010-08-26 06:28:35 +00:00
Argyrios Kyrtzidis 1f5cfb6446 Revert r112043, static volatiles are removed by the optimizer. Thanks Chris!
llvm-svn: 112112
2010-08-25 23:42:51 +00:00
Chris Lattner 9f8b451876 Finally pass "two floats in a 64-bit unit" as a <2 x float> instead of
as a double in the x86-64 ABI.  This allows us to generate much better
code for certain things, e.g.:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

Used to compile into (look at the integer silliness!):

_f32:                                   ## @f32
## BB#0:                                ## %entry
	movd	%xmm1, %rax
	movd	%eax, %xmm1
	movd	%xmm0, %rcx
	movd	%ecx, %xmm0
	addss	%xmm1, %xmm0
	movd	%xmm0, %edx
	shrq	$32, %rax
	movd	%eax, %xmm0
	shrq	$32, %rcx
	movd	%ecx, %xmm1
	addss	%xmm0, %xmm1
	movd	%xmm1, %eax
	shlq	$32, %rax
	addq	%rdx, %rax
	movd	%rax, %xmm0
	ret

Now we get:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

and compile stuff like:

extern float _Complex ccoshf( float _Complex ) ;
float _Complex ccosf ( float _Complex z ) {
 float _Complex iz;
 (__real__ iz) = -(__imag__ z);
 (__imag__ iz) = (__real__ z);
 return ccoshf(iz);
}

into:

_ccosf:                                 ## @ccosf
## BB#0:                                ## %entry
	pshufd	$1, %xmm0, %xmm1
	xorps	LCPI4_0(%rip), %xmm1
	unpcklps	%xmm0, %xmm1
	movaps	%xmm1, %xmm0
	jmp	_ccoshf                 ## TAILCALL

instead of:

_ccosf:                                 ## @ccosf
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	movq	%rax, %rcx
	shlq	$32, %rcx
	shrq	$32, %rax
	xorl	$-2147483648, %eax      ## imm = 0xFFFFFFFF80000000
	addq	%rcx, %rax
	movd	%rax, %xmm0
	jmp	_ccoshf                 ## TAILCALL


There is still "stuff to be done" here for the struct case,
but this resolves rdar://6379669 - [x86-64 ABI] Pass and return 
_Complex float / double efficiently

llvm-svn: 112111
2010-08-25 23:39:14 +00:00
Argyrios Kyrtzidis b50a088122 Make sure volatile variables are emitted even if static. Fixes rdar://8315219
llvm-svn: 112043
2010-08-25 10:15:24 +00:00
Daniel Dunbar ead6824c3c IRgen: Fix a horrible bug in pointer to bool conversion, which we were treating
as a truncation rather than a comparison to null.

llvm-svn: 112021
2010-08-25 03:32:38 +00:00
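In concrete terms (a hedged sketch): truncating a pointer to i1 keeps only its low bit, so any non-null pointer at an even address converted to _Bool came out false; the fix compares against null instead.

_Bool is_set(int *p) {
  return p;   /* must lower to 'p != null', not a trunc-to-1-bit */
}
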
Devang Patel 356e3e0c6a Fix 'for' loop variables' scope.
llvm-svn: 112002
2010-08-25 00:28:56 +00:00
Dale Johannesen 46742a4771 Add some missing X86-specific asm constraint letters, and fix
some bugs in setting allowsRegister on the ones there.
Radar 8348447.

llvm-svn: 111980
2010-08-24 22:33:12 +00:00
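A hedged example of the kind of inline asm these constraint letters govern; 'x' (any SSE register) is one X86-specific letter, though whether it is among the ones this commit adds is an assumption.

float add_scalar(float a, float b) {
  __asm__("addss %1, %0" : "+x"(a) : "x"(b));   /* 'x' = SSE register */
  return a;
}
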
Devang Patel 41c2097058 Emit debug info for enum constants.
llvm-svn: 111852
2010-08-23 22:07:25 +00:00
John McCall 614dbdcd55 Go back to asking CodeGenTypes whether a type is zero-initializable.
Make CGT defer to the ABI on all member pointer types.
This requires giving CGT a handle to the ABI.
It's way easier to make that work if we avoid lazily creating the ABI.
Make it so.

llvm-svn: 111786
2010-08-22 21:01:12 +00:00
Benjamin Kramer 1e0cb91249 Avoid including mm_malloc.h in a cc1 test, it pulls in system headers.
llvm-svn: 111738
2010-08-21 13:39:38 +00:00
John McCall fed68df76c This test needs a triple: it's checking the alignment of a pointer in bytes.
llvm-svn: 111727
2010-08-21 04:58:16 +00:00
Daniel Dunbar 5c816378f8 IRgen: Set the alignment correctly when creating an LValue for a decl.
 - Fixes PR5598.
 - Review appreciated.

llvm-svn: 111726
2010-08-21 04:20:22 +00:00
Daniel Dunbar 30eb5fa3ba Improve test coverage.
llvm-svn: 111712
2010-08-21 02:46:28 +00:00
Chris Lattner 9052c35479 fix some vector extractions to return properly zero extended values
(instead of sign extending) to match ICC.  GCC is changing this in 
a series of their own PRs (e.g. 41323).

llvm-svn: 111637
2010-08-20 16:08:33 +00:00
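A hedged illustration, assuming _mm_extract_epi16 is among the affected extractions:

#include <emmintrin.h>

int third_u16(__m128i v) {
  /* matches ICC: the 16-bit element is zero-extended to int,
     yielding 0..65535, never a negative sign-extended value */
  return _mm_extract_epi16(v, 3);
}
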
Anton Yartsev 583a1cf7b5 support for predicates with bool/pixel arguments
llvm-svn: 111515
2010-08-19 11:57:49 +00:00
Anton Yartsev fc83c60755 support for the rest of AltiVec functions with bool/pixel arguments and return values (except predicates)
llvm-svn: 111511
2010-08-19 03:21:36 +00:00
Anton Yartsev 9e96898032 support for vec_perm and all dependent functions (vec_mergeh, vec_mergel, vec_pack, vec_sld, vec_splat) with bool/pixel arguments and return values
llvm-svn: 111509
2010-08-19 03:00:09 +00:00
Anton Yartsev 2cc136d4e3 support for vec_add, vec_adds, vec_and, vec_andc with bool arguments
llvm-svn: 111141
2010-08-16 16:22:12 +00:00
Fariborz Jahanian f7f020bb2a Make use of __func__ in a block actually refer to
the block's helper function. Fixes radar 7860965.

llvm-svn: 110988
2010-08-13 00:19:55 +00:00
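A small sketch of the new behavior (requires -fblocks; the helper name shown is compiler-generated and only an assumption):

#include <stdio.h>

int main(void) {
  void (^b)(void) = ^{
    printf("%s\n", __func__);   /* e.g. "__main_block_invoke", not "main" */
  };
  b();
  return 0;
}
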
Devang Patel a3025fcd45 update test to reflect r110876 change.
llvm-svn: 110884
2010-08-12 00:00:41 +00:00
John McCall 5996699834 Revise r110163: don't mark weak functions nounwind, because the optimizer
treats that as a contract to be fulfilled by any replacements.

llvm-svn: 110864
2010-08-11 22:38:33 +00:00
Bruno Cardoso Lopes 762e401911 Remove rsqrtps_nr256 and sqrtps_nr256 builtins, at least until we need them
llvm-svn: 110844
2010-08-11 19:18:36 +00:00
Daniel Dunbar 9034aa36c7 ARM: Recognize single precision float register names.
- We don't recognize double or NEON register names yet -- we don't have the
   infrastructure to generate the right clobbers for them.

llvm-svn: 110775
2010-08-11 02:17:20 +00:00
Daniel Dunbar 256e1f3ad0 ARM: Swap which registers we consider real / aliases to match LLVM and llvm-gcc.
llvm-svn: 110774
2010-08-11 02:17:11 +00:00
Bruno Cardoso Lopes 65954ffc69 Remove 256-bit cast built-ins and make the AVX intrinsic call llvm __builtin_shufflevector with the appropriate arguments
llvm-svn: 110771
2010-08-11 02:14:38 +00:00
Bruno Cardoso Lopes a4f1930b75 Remove 256-bit unpack built-ins and make the AVX intrinsic call llvm __builtin_shufflevector with the appropriate arguments
llvm-svn: 110768
2010-08-11 01:43:24 +00:00
Bruno Cardoso Lopes e712a135b7 Remove 256-bit shuffle built-ins and make the AVX intrinsic call llvm __builtin_shufflevector with the appropriate arguments
llvm-svn: 110766
2010-08-11 01:17:34 +00:00
John Thompson 307c2729fd Something's wrong with this test on other platforms. I'll probably need to simplify it later. For now revert.
llvm-svn: 110738
2010-08-10 22:04:00 +00:00
John Thompson a5c7d706b8 Slightly revised handling of mult-alt constraints, to avoid an assert, until we have the full fix.
llvm-svn: 110706
2010-08-10 19:20:14 +00:00
Devang Patel 76e3b53541 Do not use DIGlobalVariable to emit debugging information for enums.
llvm-svn: 110697
2010-08-10 18:27:15 +00:00
Devang Patel e03edfd3e7 Even if a constant's evaluated value is used, emit debug info for the constant variable.
llvm-svn: 110660
2010-08-10 07:24:25 +00:00
Bruno Cardoso Lopes 3d3fc1d075 Make replicate intrinsics use shufflevector instead of dup builtins, also remove the dup builtins
llvm-svn: 110646
2010-08-10 02:23:54 +00:00
Devang Patel 2210aa2eca There is no need to publish a file-static variable's name. Do not rely on this code gen bug to check whether debug info is generated for such variables or not.
llvm-svn: 110640
2010-08-10 01:36:24 +00:00
Eric Christopher 6ff7161d51 Thread local variables aren't considered common linkage.
llvm-svn: 110530
2010-08-08 01:37:14 +00:00
Chris Lattner 8139c98cf9 Correct -ftrapv to trap on errors, instead of calling the
__overflow_handler entrypoint that David Chisnall made up.
Calling __overflow_handler is not part of the contract of
-ftrapv provided by GCC, and should never have been checked
in in the first place.

According to:
http://permalink.gmane.org/gmane.comp.compilers.clang.devel/8699

David is using this for some arbitrary-precision integer work
or something, which is not an appropriate thing to implement on
top of this.

llvm-svn: 110490
2010-08-07 00:20:46 +00:00
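What the restored -ftrapv contract means for ordinary code, sketched:

int checked_add(int a, int b) {
  /* under -ftrapv, signed overflow here executes a trap instruction;
     no call to any __overflow_handler entry point is emitted */
  return a + b;
}
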
Chandler Carruth 66ce9651f1 Prevent these tests from dirtying the tree with output files that aren't even
used for the test.

llvm-svn: 110431
2010-08-06 05:29:57 +00:00
Bruno Cardoso Lopes e2538c4ecf We don't want to support built-ins which aren't needed by the intrinsics. Remove them
llvm-svn: 110399
2010-08-05 23:47:43 +00:00
John McCall a9731a4179 Fix a major bug with -ftrapv and ++/--. Patch by David Keaton!
llvm-svn: 110347
2010-08-05 17:39:44 +00:00
Eli Friedman d986fc8b48 Tests for #pragma GCC visibility.
llvm-svn: 110316
2010-08-05 07:00:53 +00:00
Bruno Cardoso Lopes 6586724f71 Add more AVX 256-bit intrinsics and test cases for them
llvm-svn: 110178
2010-08-04 01:11:26 +00:00
John McCall f8280e723d Fix a warning on a test.
llvm-svn: 110165
2010-08-03 22:49:45 +00:00
John McCall 8601a75118 Do a very simple pass over every function we emit to infer whether we can
mark it nounwind based on whether it contains any non-nounwind calls.
<rdar://problem/8087431>

llvm-svn: 110163
2010-08-03 22:46:07 +00:00
Bruno Cardoso Lopes 1f927ccaa2 Support built-ins for the x86 AVX 256-bit instructions. Right now we support all of
them, but as soon as we properly codegen the simple vector operations, we'll remove
the unnecessary built-ins/intrinsics from clang and llvm. Also add tests for the
new built-ins.

llvm-svn: 110096
2010-08-03 01:57:18 +00:00
John McCall a95172baa0 Only run the jump-checker if there's a branch-protected scope *and* there's
a switch or goto somewhere in the function.  Indirect gotos trigger the
jump-checker regardless, because the conditions there are slightly more
elaborate and it's too marginal a case to be worth optimizing.

Turns off the jump-checker in a lot of cases in C++.  rdar://problem/7702918

llvm-svn: 109962
2010-08-01 00:26:45 +00:00
Daniel Dunbar b8cba97cde There is no reason for this test to invoke 'llc'.
llvm-svn: 109847
2010-07-30 03:30:55 +00:00
Chris Lattner 7f4b81af7a fix rdar://8251384, another case where we could access beyond the
end of a struct.  This improves the case when the struct being passed
contains 3 floats, either due to a struct or array of 3 things.  Before
we'd generate this IR for the testcase:

define float @bar(double %X.coerce0, double %X.coerce1) nounwind {
entry:
  %X = alloca %struct.foof, align 8               ; <%struct.foof*> [#uses=2]
  %0 = bitcast %struct.foof* %X to %1*            ; <%1*> [#uses=2]
  %1 = getelementptr %1* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %X.coerce0, double* %1
  %2 = getelementptr %1* %0, i32 0, i32 1         ; <double*> [#uses=1]
  store double %X.coerce1, double* %2
  %tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float*> [#uses=1]
  %tmp1 = load float* %tmp                        ; <float> [#uses=1]
  ret float %tmp1
}

which compiled (with optimization) to:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movd	%xmm1, %rax
	movd	%eax, %xmm0
	ret

Now we produce:

define float @bar(double %X.coerce0, float %X.coerce1) nounwind {
entry:
  %X = alloca %struct.foof, align 8               ; <%struct.foof*> [#uses=2]
  %0 = bitcast %struct.foof* %X to %0*            ; <%0*> [#uses=2]
  %1 = getelementptr %0* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %X.coerce0, double* %1
  %2 = getelementptr %0* %0, i32 0, i32 1         ; <float*> [#uses=1]
  store float %X.coerce1, float* %2
  %tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float*> [#uses=1]
  %tmp1 = load float* %tmp                        ; <float> [#uses=1]
  ret float %tmp1
}

and:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movaps	%xmm1, %xmm0
	ret

llvm-svn: 109776
2010-07-29 18:13:09 +00:00
Chris Lattner 3f76342cfc handle a case where we could access off the end of a function
that Eli pointed out, rdar://8249586

llvm-svn: 109762
2010-07-29 17:34:39 +00:00
Chris Lattner 44f9c3b3f1 in release mode, irbuilder doesn't add names to instructions;
this will hopefully fix the osuosl clang-i686-darwin10 builder.

llvm-svn: 109760
2010-07-29 17:14:05 +00:00
Chris Lattner 98076a25ce This goes a little bit far, but optimize cases like:
struct a {
  struct c {
    double x;
    int y;
  } x[1];
};

void foo(struct a A) {
}

into:

define void @foo(double %A.coerce0, i32 %A.coerce1) nounwind {
entry:
  %A = alloca %struct.a, align 8                  ; <%struct.a*> [#uses=1]
  %0 = bitcast %struct.a* %A to %struct.c*        ; <%struct.c*> [#uses=2]
  %1 = getelementptr %struct.c* %0, i32 0, i32 0  ; <double*> [#uses=1]
  store double %A.coerce0, double* %1
  %2 = getelementptr %struct.c* %0, i32 0, i32 1  ; <i32*> [#uses=1]
  store i32 %A.coerce1, i32* %2

instead of:

define void @foo(double %A.coerce0, i64 %A.coerce1) nounwind {
entry:
  %A = alloca %struct.a, align 8                  ; <%struct.a*> [#uses=1]
  %0 = bitcast %struct.a* %A to %0*               ; <%0*> [#uses=2]
  %1 = getelementptr %0* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %A.coerce0, double* %1
  %2 = getelementptr %0* %0, i32 0, i32 1         ; <i64*> [#uses=1]
  store i64 %A.coerce1, i64* %2

I only do this now because I never want to look at this code again :)
 

llvm-svn: 109738
2010-07-29 07:43:55 +00:00
Chris Lattner c8b7b53a1e implement a todo: pass an eight-byte that consists of a
small integer + padding as that small integer.  On code
like:

struct c { double x; int y; };
void bar(struct c C) { }

This means that we compile to:

define void @bar(double %C.coerce0, i32 %C.coerce1) nounwind {
entry:
  %C = alloca %struct.c, align 8                  ; <%struct.c*> [#uses=2]
  %0 = getelementptr %struct.c* %C, i32 0, i32 0  ; <double*> [#uses=1]
  store double %C.coerce0, double* %0
  %1 = getelementptr %struct.c* %C, i32 0, i32 1  ; <i32*> [#uses=1]
  store i32 %C.coerce1, i32* %1

instead of:

define void @bar(double %C.coerce0, i64 %C.coerce1) nounwind {
entry:
  %C = alloca %struct.c, align 8                  ; <%struct.c*> [#uses=3]
  %0 = bitcast %struct.c* %C to %0*               ; <%0*> [#uses=2]
  %1 = getelementptr %0* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %C.coerce0, double* %1
  %2 = getelementptr %0* %0, i32 0, i32 1         ; <i64*> [#uses=1]
  store i64 %C.coerce1, i64* %2

which gives SRoA heartburn.

This implements rdar://5711709, a nice low number :)

llvm-svn: 109737
2010-07-29 07:30:00 +00:00
Chris Lattner fe34c1d53e Kill off the 'coerce' ABI passing form. Now 'direct' and 'extend' always
have a "coerce to" type which often matches the default lowering of Clang
type to LLVM IR type, but the coerce case can be handled by making them
not be the same.

This simplifies things and fixes issues where X86-64 abi lowering would 
return coerce after making preferred types exactly match up.  This caused
us to compile:

typedef float v4f32 __attribute__((__vector_size__(16)));
v4f32 foo(v4f32 X) {
  return X+X;
}

into this code at -O0:

define <4 x float> @foo(<4 x float> %X.coerce) nounwind {
entry:
  %retval = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=2]
  %coerce = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=2]
  %X.addr = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=3]
  store <4 x float> %X.coerce, <4 x float>* %coerce
  %X = load <4 x float>* %coerce                  ; <<4 x float>> [#uses=1]
  store <4 x float> %X, <4 x float>* %X.addr
  %tmp = load <4 x float>* %X.addr                ; <<4 x float>> [#uses=1]
  %tmp1 = load <4 x float>* %X.addr               ; <<4 x float>> [#uses=1]
  %add = fadd <4 x float> %tmp, %tmp1             ; <<4 x float>> [#uses=1]
  store <4 x float> %add, <4 x float>* %retval
  %0 = load <4 x float>* %retval                  ; <<4 x float>> [#uses=1]
  ret <4 x float> %0
}

Now we get:

define <4 x float> @foo(<4 x float> %X) nounwind {
entry:
  %X.addr = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=3]
  store <4 x float> %X, <4 x float>* %X.addr
  %tmp = load <4 x float>* %X.addr                ; <<4 x float>> [#uses=1]
  %tmp1 = load <4 x float>* %X.addr               ; <<4 x float>> [#uses=1]
  %add = fadd <4 x float> %tmp, %tmp1             ; <<4 x float>> [#uses=1]
  ret <4 x float> %add
}

This implements rdar://8248065

llvm-svn: 109733
2010-07-29 06:26:06 +00:00
Chris Lattner 9fa15c3608 ignore structs that wrap vectors in IR; the abstraction shouldn't add a penalty.
Before we'd compile the example into something like:

  %coerce.dive2 = getelementptr %struct.v4f32wrapper* %retval, i32 0, i32 0 ; <<4 x float>*> [#uses=1]
  %1 = bitcast <4 x float>* %coerce.dive2 to <2 x double>* ; <<2 x double>*> [#uses=1]
  %2 = load <2 x double>* %1, align 1             ; <<2 x double>> [#uses=1]
  ret <2 x double> %2

Now we produce:

  %coerce.dive2 = getelementptr %struct.v4f32wrapper* %retval, i32 0, i32 0 ; <<4 x float>*> [#uses=1]
  %0 = load <4 x float>* %coerce.dive2, align 1   ; <<4 x float>> [#uses=1]
  ret <4 x float> %0

llvm-svn: 109732
2010-07-29 05:02:29 +00:00
Chris Lattner 4200fe4e50 move the 'pretty 16-byte vector' inferring code up to be shared
with return values, improving stuff that returns __m128 etc.

llvm-svn: 109731
2010-07-29 04:56:46 +00:00
Chris Lattner 3a44c7e55d now that we have CGT around, we can start using preferred types
for return values too.  Instead of compiling something like:

struct foo {
  int *X;
  float *Y;
};

struct foo test(struct foo *P) { return *P; }

to:

%1 = type { i64, i64 }

define %1 @test(%struct.foo* %P) nounwind {
entry:
  %retval = alloca %struct.foo, align 8           ; <%struct.foo*> [#uses=2]
  %P.addr = alloca %struct.foo*, align 8          ; <%struct.foo**> [#uses=2]
  store %struct.foo* %P, %struct.foo** %P.addr
  %tmp = load %struct.foo** %P.addr               ; <%struct.foo*> [#uses=1]
  %tmp1 = bitcast %struct.foo* %retval to i8*     ; <i8*> [#uses=1]
  %tmp2 = bitcast %struct.foo* %tmp to i8*        ; <i8*> [#uses=1]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 16, i32 8, i1 false)
  %0 = bitcast %struct.foo* %retval to %1*        ; <%1*> [#uses=1]
  %1 = load %1* %0, align 1                       ; <%1> [#uses=1]
  ret %1 %1
}

We now get the result more type safe, with:

define %struct.foo @test(%struct.foo* %P) nounwind {
entry:
  %retval = alloca %struct.foo, align 8           ; <%struct.foo*> [#uses=2]
  %P.addr = alloca %struct.foo*, align 8          ; <%struct.foo**> [#uses=2]
  store %struct.foo* %P, %struct.foo** %P.addr
  %tmp = load %struct.foo** %P.addr               ; <%struct.foo*> [#uses=1]
  %tmp1 = bitcast %struct.foo* %retval to i8*     ; <i8*> [#uses=1]
  %tmp2 = bitcast %struct.foo* %tmp to i8*        ; <i8*> [#uses=1]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 16, i32 8, i1 false)
  %0 = load %struct.foo* %retval                  ; <%struct.foo> [#uses=1]
  ret %struct.foo %0
}

That memcpy is completely terrible, but I don't know how to fix it.

llvm-svn: 109729
2010-07-29 04:46:19 +00:00
Chris Lattner f4ba08aeaf pass argument vectors in a type that corresponds to the user type if
possible.  This improves the example to pass <4 x float> instead of
<2 x double> but we still get awful code, and still don't get the
return value right.

llvm-svn: 109700
2010-07-28 23:47:21 +00:00
Chris Lattner 31faff5d58 use Get8ByteTypeAtOffset for the return value path as well so we
don't get errors similar to PR7714 on the return path.

llvm-svn: 109689
2010-07-28 23:06:14 +00:00
Chris Lattner 4c1e484f39 fix PR7714 by not referencing off the end of a struct when passed by value in
x86-64 abi.  This also improves codegen.  Some refactoring of this code
is needed.

llvm-svn: 109681
2010-07-28 22:15:08 +00:00
Fariborz Jahanian d5010898ab Fix flags in the global block descriptor when
a block returns structs. Fixes radar 8241648.
Executable test added to llvm test suite.

llvm-svn: 109620
2010-07-28 19:07:18 +00:00
Fariborz Jahanian 0ebca28f1d The 2nd argument of __builtin_expect must be evaluated
if it has side effects, to match gcc's behaviour.
Addresses radar 8172109.

llvm-svn: 109467
2010-07-26 23:11:03 +00:00
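A sketch of the gcc behavior being matched: the hint argument's side effect must still happen.

int g;

int f(int x) {
  if (__builtin_expect(x, g++))   /* g must be incremented exactly once */
    return 1;
  return 0;
}
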
John McCall a464ff9d15 Switch some random local-decl cleanups over to using lazy cleanups. Turn on
the block-release unwind cleanup:  we're never going to test it if we don't turn
it on.

llvm-svn: 108992
2010-07-21 06:13:08 +00:00
Chandler Carruth 3973af797a Fix a goof in my previous patch -- not all of the builtins return a value; some
have fixed return types.

llvm-svn: 108657
2010-07-18 20:54:12 +00:00
Chandler Carruth bc8cab16c5 Improve the representation of the atomic builtins in a few ways. First, we make
their call expressions synthetically have the "deduced" types based on their
first argument. We only insert conversions in the AST for arguments whose
values require conversion to match the value type expected. This keeps PR7600
closed by maintaining the return type, but avoids assertions due to unexpected
implicit casts making the type unsigned (test case added from Daniel).

The magic is moved into the codegen for the atomic builtin which inserts the
casts as needed at the IR level to raise the type to an integer suitable for
the LLVM intrinsic. This shouldn't cause any real change in functionality, but
now we can make the builtin be more truly polymorphic.

llvm-svn: 108638
2010-07-18 07:23:17 +00:00
Eli Friedman eca55afea3 Fix for PR3800: make sure not to evaluate the expression for a read-write
asm operand twice.

llvm-svn: 108489
2010-07-16 00:55:21 +00:00
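The PR3800 shape, sketched: the expression behind a read-write ("+") operand must be evaluated once, not once as input and again as output.

int a[10];

void bump(int i) {
  __asm__("incl %0" : "+m"(a[i++]));   /* i++ must run exactly once */
}
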
Daniel Dunbar 999daa57c7 Builtins/ARM: __clear_cache doesn't seem to have a consistent prototype; declare
the builtin as void __clear_cache(...) to work around this, which appears to
match what GCC does.

llvm-svn: 108487
2010-07-16 00:31:23 +00:00
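Given the declaration the commit describes -- void __clear_cache(...) -- a typical call site looks like this (hedged sketch; the begin/end convention is GCC's):

void finish_codegen(char *begin, char *end) {
  __clear_cache(begin, end);   /* flush the icache over freshly written code */
}
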
Daniel Dunbar 3348e2d175 IRgen: Support user defined attributes on block runtime functions.
- The issue here is that /usr/include/Blocks.h wants to define some of the
   block runtime globals as weak, depending on the target. This doesn't work in
   Clang because we aren't using the AST decl for these globals.

 - The fix is a pretty gross hack which just watches all the decls for the
   specific blocks globals we need to know about; if we see one we use it,
   otherwise we use the hand coded type.

   In time, I would like to clean this up by changing IRgen to ask Sema/AST for
   the decl, which would then be lazily loaded from the builtin table if
   necessary. This could be used in a whole host of places in IRgen and would
   get rid of a lot of grotty hand coding of LLVM IR; however, we need some
   extra Sema support for this as well as support for builtin global variables.

llvm-svn: 108482
2010-07-16 00:00:19 +00:00
Douglas Gregor c5dded5f99 Improve test case. Thanks Eli
llvm-svn: 108470
2010-07-15 23:04:05 +00:00
Douglas Gregor 8997690ff1 Don't suppress the emission of available_externally functions marked
with always_inline attribute. Thanks to Howard for the tip.

llvm-svn: 108469
2010-07-15 22:58:18 +00:00
Douglas Gregor 603d81bf8d When forming a function call or message send expression, be sure to
strip cv-qualifiers from the expression's type when the language calls
for it: in C, that's all the time, while C++ only does it for
non-class types. 

Centralized the computation of the call expression type in
QualType::getCallResultType() and some helper functions in other nodes
(FunctionDecl, ObjCMethodDecl, FunctionType), and updated all relevant
callers of getResultType() to getCallResultType().

Fixes PR7598 and PR7463, along with a bunch of getResultType() call
sites that weren't stripping references off the result type (nothing
stripped cv-qualifiers properly before this change).

llvm-svn: 108234
2010-07-13 08:18:22 +00:00
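The C-side rule in miniature: a call expression's type is the unqualified return type, so f() below has type int even though f is declared to return const int.

const int f(void);

int use(void) {
  return f();   /* in C, the call's type is int, not const int */
}
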
Douglas Gregor a700f68828 Reinstate the optimization suppressing available_externally functions
at -O0. The only change from the previous patch is that we don't try
to generate virtual method thunks for an available_externally
function.

llvm-svn: 108230
2010-07-13 06:02:28 +00:00
Douglas Gregor 553f3a9b30 Speculatively revert r108156; it appears to be breaking self-host.
llvm-svn: 108194
2010-07-12 21:08:32 +00:00
Douglas Gregor dbb2806a7b Do not generate LLVM IR for available_externally function bodies at
-O0, since we won't be using the definitions for anything anyway. For
lib/System/Path.o when built in Debug+Asserts mode, this leads to a 4%
improvement in compile time (and suppresses 440 function bodies).

<rdar://problem/7987644>

llvm-svn: 108156
2010-07-12 17:24:55 +00:00
Chris Lattner 33919e7450 fix PR7280 by making the warning on code like this:
int test1() {
  return;
}

default to an error.

llvm-svn: 108108
2010-07-11 23:34:02 +00:00
Chris Lattner 06801d7371 allow this to pass on 32-bit hosts.
llvm-svn: 107845
2010-07-08 00:23:21 +00:00
Chris Lattner cb7696cf35 fix the clang side of PR7437: EmitAggregateCopy
was not producing a memcpy with the right address
spaces because of two places in it doing casts of
the arguments to i8, one of which didn't
preserve the address space.

There is also an optimizer bug here.

llvm-svn: 107842
2010-07-08 00:07:45 +00:00
Chris Lattner 26b1a19842 filecheckize this test.
llvm-svn: 107841
2010-07-08 00:05:45 +00:00
John McCall 11086fcb65 Don't consider casted non-global pointers to be evaluatable.
Fixes rdar://problem/8154689

llvm-svn: 107755
2010-07-07 05:08:32 +00:00
Chris Lattner c401de9998 in the "coerce" case, the ABI handling code ends up making the
alloca for an argument.  Make sure the argument gets the proper
decl alignment, which may be different than the type alignment.

This fixes PR7567

llvm-svn: 107627
2010-07-05 20:21:00 +00:00
Chris Lattner 53b479ff6a fix PR7564, a case where the bitfield struct init code
wasn't handling array padding elements right.

llvm-svn: 107621
2010-07-05 18:03:30 +00:00
Chris Lattner 0e7929f30c fix rdar://8147692 - yet another crash due to my abi work.
llvm-svn: 107387
2010-07-01 06:20:47 +00:00
Daniel Dunbar bb7ac52e02 Driver/IRgen: Add support for -momit-leaf-frame-pointer.
llvm-svn: 107367
2010-07-01 01:31:45 +00:00
Chris Lattner 5c740f1523 Reapply:
r107173, "fix PR7519: after thrashing around and remembering how all this stuff"
r107216, "fix PR7523, which was caused by the ABI code calling ConvertType instead"

This includes a fix to make ConvertTypeForMem handle the "recursive" case, and call
it as such when lowering function types which have an indirect result.

llvm-svn: 107310
2010-06-30 19:14:05 +00:00
Daniel Dunbar e422266926 Revert r107173, "fix PR7519: after thrashing around and remembering how all this stuff", it broke bootstrap.
llvm-svn: 107232
2010-06-30 00:22:35 +00:00
Daniel Dunbar c85ea8e175 IRgen: Assignment to Objective-C properties shouldn't reload the value (which
would trigger an extra method call).
 - While in the area, I also changed Clang to not emit an unnecessary load from
   'x' in cases like 'y = (x = 1)'.

llvm-svn: 107210
2010-06-29 22:00:45 +00:00
Daniel Dunbar 99e13101b2 tests: Fix test to not depend on instruction names.
llvm-svn: 107186
2010-06-29 18:34:40 +00:00
Chris Lattner ab1e65e2ea fix PR7519: after thrashing around and remembering how all this stuff
works, the fix is quite simple: just make sure to call ConvertTypeRecursive
when the function type being lowered is in the midst of ConvertType.

llvm-svn: 107173
2010-06-29 17:56:33 +00:00
Chris Lattner 22a931e3bb Change X86_64ABIInfo to have ASTContext and TargetData ivars to
avoid passing ASTContext down through all the methods it has.

When classifying an argument, or argument piece, as INTEGER, check
to see if we have a pointer at exactly the same offset in the 
preferred type.  If so, use that pointer type instead of i64.  This
allows us to compile a function taking a StringRef into something
like this:

define i8* @foo(i64 %D.coerce0, i8* %D.coerce1) nounwind ssp {
entry:
  %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=4]
  %0 = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
  store i64 %D.coerce0, i64* %0
  %1 = getelementptr %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
  store i8* %D.coerce1, i8** %1
  %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
  %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
  %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
  %tmp3 = load i8** %tmp2                         ; <i8*> [#uses=1]
  %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1 ; <i8*> [#uses=1]
  ret i8* %add.ptr
}

instead of this:

define i8* @foo(i64 %D.coerce0, i64 %D.coerce1) nounwind ssp {
entry:
  %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=3]
  %0 = insertvalue %0 undef, i64 %D.coerce0, 0    ; <%0> [#uses=1]
  %1 = insertvalue %0 %0, i64 %D.coerce1, 1       ; <%0> [#uses=1]
  %2 = bitcast %struct.DeclGroup* %D to %0*       ; <%0*> [#uses=1]
  store %0 %1, %0* %2, align 1
  %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
  %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
  %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
  %tmp3 = load i8** %tmp2                         ; <i8*> [#uses=1]
  %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1 ; <i8*> [#uses=1]
  ret i8* %add.ptr
}

This implements rdar://7375902 - [codegen quality] clang x86-64 ABI lowering code punishing StringRef

llvm-svn: 107123
2010-06-29 06:01:59 +00:00
Chris Lattner 9e748e9d6e add IR names to coerced arguments.
llvm-svn: 107105
2010-06-29 00:14:52 +00:00
Chris Lattner 3dd716c3c3 Change CGCall to handle the "coerce" case where the coerce-to type
is a FCA to pass each of the elements as individual scalars.  This
produces code fast isel is less likely to reject and is easier on
the optimizers.

For example, before we would compile:
struct DeclGroup { long NumDecls; char * Y; };
char * foo(DeclGroup D) {
  return D.NumDecls+D.Y;
}

to:
%struct.DeclGroup = type { i64, i64 }

define i64 @_Z3foo9DeclGroup(%struct.DeclGroup) nounwind {
entry:
  %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=3]
  store %struct.DeclGroup %0, %struct.DeclGroup* %D, align 1
  %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
  %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
  %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i64*> [#uses=1]
  %tmp3 = load i64* %tmp2                         ; <i64> [#uses=1]
  %add = add nsw i64 %tmp1, %tmp3                 ; <i64> [#uses=1]
  ret i64 %add
}

Now we get:

%0 = type { i64, i64 }
%struct.DeclGroup = type { i64, i8* }

define i8* @_Z3foo9DeclGroup(i64, i64) nounwind {
entry:
  %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=3]
  %2 = insertvalue %0 undef, i64 %0, 0            ; <%0> [#uses=1]
  %3 = insertvalue %0 %2, i64 %1, 1               ; <%0> [#uses=1]
  %4 = bitcast %struct.DeclGroup* %D to %0*       ; <%0*> [#uses=1]
  store %0 %3, %0* %4, align 1
  %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
  %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
  %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
  %tmp3 = load i8** %tmp2                         ; <i8*> [#uses=1]
  %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1 ; <i8*> [#uses=1]
  ret i8* %add.ptr
}

Elimination of the FCA inside the function is still-to-come.

llvm-svn: 107099
2010-06-28 23:44:11 +00:00
Chris Lattner a7d81ab7f3 X86-64:
pass/return structs of float/int as float/i32 instead of double/i64
to make the code generated for the ABI cleaner.  Passing in the low part
of a double is the same as passing in a float.

For example, we now compile:

struct DeclGroup { float NumDecls; };
float foo(DeclGroup D);
void bar(DeclGroup *D) {
 foo(*D);
}

into:

%struct.DeclGroup = type { float }

define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind {
entry:
  %D.addr = alloca %struct.DeclGroup*, align 8    ; <%struct.DeclGroup**> [#uses=2]
  %agg.tmp = alloca %struct.DeclGroup, align 4    ; <%struct.DeclGroup*> [#uses=2]
  store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
  %tmp = load %struct.DeclGroup** %D.addr         ; <%struct.DeclGroup*> [#uses=1]
  %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
  %tmp2 = bitcast %struct.DeclGroup* %tmp to i8*  ; <i8*> [#uses=1]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
  %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <float*> [#uses=1]
  %0 = load float* %coerce.dive, align 1          ; <float> [#uses=1]
  %call = call float @_Z3foo9DeclGroup(float %0)  ; <float> [#uses=0]
  ret void
}

instead of:

%struct.DeclGroup = type { float }

define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind {
entry:
  %D.addr = alloca %struct.DeclGroup*, align 8    ; <%struct.DeclGroup**> [#uses=2]
  %agg.tmp = alloca %struct.DeclGroup, align 4    ; <%struct.DeclGroup*> [#uses=2]
  %tmp3 = alloca double                           ; <double*> [#uses=2]
  store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
  %tmp = load %struct.DeclGroup** %D.addr         ; <%struct.DeclGroup*> [#uses=1]
  %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
  %tmp2 = bitcast %struct.DeclGroup* %tmp to i8*  ; <i8*> [#uses=1]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
  %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <float*> [#uses=1]
  %0 = bitcast double* %tmp3 to float*            ; <float*> [#uses=1]
  %1 = load float* %coerce.dive                   ; <float> [#uses=1]
  store float %1, float* %0, align 1
  %2 = load double* %tmp3                         ; <double> [#uses=1]
  %call = call float @_Z3foo9DeclGroup(double %2) ; <float> [#uses=0]
  ret void
}

which is this machine code (at -O0):

__Z3barP9DeclGroup:
	subq	$24, %rsp
	movq	%rdi, 16(%rsp)
	movq	16(%rsp), %rdi
	leaq	8(%rsp), %rax
	movl	(%rdi), %ecx
	movl	%ecx, (%rax)
	movss	8(%rsp), %xmm0
	callq	__Z3foo9DeclGroup
	addq	$24, %rsp
	ret

vs this:

__Z3barP9DeclGroup:
	subq	$24, %rsp
	movq	%rdi, 16(%rsp)
	movq	16(%rsp), %rdi
	leaq	8(%rsp), %rax
	movl	(%rdi), %ecx
	movl	%ecx, (%rax)
	movss	8(%rsp), %xmm0
	movss	%xmm0, (%rsp)
	movsd	(%rsp), %xmm0
	callq	__Z3foo9DeclGroup
	addq	$24, %rsp
	ret

At -O3, it is the difference between this now:

__Z3barP9DeclGroup:
	movss	(%rdi), %xmm0
	jmp	__Z3foo9DeclGroup  # TAILCALL

vs this before:

__Z3barP9DeclGroup:
	movl	(%rdi), %eax
	movd	%rax, %xmm0
	jmp	__Z3foo9DeclGroup  # TAILCALL

llvm-svn: 107048
2010-06-28 19:56:59 +00:00
Fariborz Jahanian 36ad0e99d5 Have __func__ and siblings point to the block's implementation function
name. Fixes radar 7860965.

llvm-svn: 107044
2010-06-28 18:58:34 +00:00
Chris Lattner d250b8e9a8 tweak test to pass on windows
llvm-svn: 107040
2010-06-28 18:29:14 +00:00
Chris Lattner c1028f689e Fix UnitTests/2004-02-02-NegativeZero.c, which regressed when
I broke negate of FP values.

llvm-svn: 107019
2010-06-28 17:12:37 +00:00
Chris Lattner 055097f024 If coercing something from int or pointer type to int or pointer type
(potentially after unwrapping it from a struct), do it without going through
memory.  We now compile:

struct DeclGroup {
  unsigned NumDecls;
};

int foo(DeclGroup D) {
  return D.NumDecls;
}

into:

%struct.DeclGroup = type { i32 }

define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
entry:
  %D = alloca %struct.DeclGroup, align 4          ; <%struct.DeclGroup*> [#uses=2]
  %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
  %coerce.val.ii = trunc i64 %0 to i32            ; <i32> [#uses=1]
  store i32 %coerce.val.ii, i32* %coerce.dive
  %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
  %tmp1 = load i32* %tmp                          ; <i32> [#uses=1]
  ret i32 %tmp1
}

instead of:

%struct.DeclGroup = type { i32 }

define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
entry:
  %D = alloca %struct.DeclGroup, align 4          ; <%struct.DeclGroup*> [#uses=2]
  %tmp = alloca i64                               ; <i64*> [#uses=2]
  %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
  store i64 %0, i64* %tmp
  %1 = bitcast i64* %tmp to i32*                  ; <i32*> [#uses=1]
  %2 = load i32* %1, align 1                      ; <i32> [#uses=1]
  store i32 %2, i32* %coerce.dive
  %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
  %tmp2 = load i32* %tmp1                         ; <i32> [#uses=1]
  ret i32 %tmp2
}

... which is quite a bit less terrifying.

llvm-svn: 106975
2010-06-27 06:26:04 +00:00
Chris Lattner 895c52ba8b Same patch as the previous on the store side. Before we compiled this:
struct DeclGroup {
  unsigned NumDecls;
};

int foo(DeclGroup D) {
  return D.NumDecls;
}

to:

%struct.DeclGroup = type { i32 }

define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
entry:
  %D = alloca %struct.DeclGroup, align 4          ; <%struct.DeclGroup*> [#uses=2]
  %tmp = alloca i64                               ; <i64*> [#uses=2]
  store i64 %0, i64* %tmp
  %1 = bitcast i64* %tmp to %struct.DeclGroup*    ; <%struct.DeclGroup*> [#uses=1]
  %2 = load %struct.DeclGroup* %1, align 1        ; <%struct.DeclGroup> [#uses=1]
  store %struct.DeclGroup %2, %struct.DeclGroup* %D
  %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
  %tmp2 = load i32* %tmp1                         ; <i32> [#uses=1]
  ret i32 %tmp2
}

which caused fast isel bailouts due to the FCA load/store of %2.  Now
we generate just this blissful code:

%struct.DeclGroup = type { i32 }

define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
entry:
  %D = alloca %struct.DeclGroup, align 4          ; <%struct.DeclGroup*> [#uses=2]
  %tmp = alloca i64                               ; <i64*> [#uses=2]
  %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
  store i64 %0, i64* %tmp
  %1 = bitcast i64* %tmp to i32*                  ; <i32*> [#uses=1]
  %2 = load i32* %1, align 1                      ; <i32> [#uses=1]
  store i32 %2, i32* %coerce.dive
  %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
  %tmp2 = load i32* %tmp1                         ; <i32> [#uses=1]
  ret i32 %tmp2
}

This avoids fastisel bailing out and is groundwork for a future patch.
This reduces bailouts on CGStmt.ll to 911 from 935.

llvm-svn: 106974
2010-06-27 06:04:18 +00:00
Chris Lattner e01d966ce2 merge two tests.
llvm-svn: 106971
2010-06-27 01:08:03 +00:00
Chris Lattner 3fcc790cd8 Change IR generation for return (in the simple case) to avoid doing silly
load/store nonsense in the epilog.  For example, for:

int foo(int X) {
  int A[100];
  return A[X];
}

we used to generate:

  %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
  %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]
  store i32 %tmp1, i32* %retval
  %0 = load i32* %retval                          ; <i32> [#uses=1]
  ret i32 %0
}

which codegen'd to this code:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	subq	$408, %rsp              ## imm = 0x198
	movl	%edi, 400(%rsp)
	movl	400(%rsp), %edi
	movslq	%edi, %rax
	movl	(%rsp,%rax,4), %edi
	movl	%edi, 404(%rsp)
	movl	404(%rsp), %eax
	addq	$408, %rsp              ## imm = 0x198
	ret

Now we generate:

  %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
  %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]
  ret i32 %tmp1
}

and:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	subq	$408, %rsp              ## imm = 0x198
	movl	%edi, 404(%rsp)
	movl	404(%rsp), %edi
	movslq	%edi, %rax
	movl	(%rsp,%rax,4), %eax
	addq	$408, %rsp              ## imm = 0x198
	ret

This actually does matter, cutting out 2000 lines of IR from CGStmt.ll 
for example.

Another interesting effect is that altivec.h functions which are dead
now get dce'd by the inliner.  Hence all the changes to 
builtins-ppc-altivec.c to ensure the calls aren't dead.

llvm-svn: 106970
2010-06-27 01:06:27 +00:00
Chris Lattner 6c5abe88bf Implement rdar://7530813 - collapse multiple GEP instructions in IRgen
This avoids generating two gep's for common array operations.  Before
we would generate something like:

  %tmp = load i32* %X.addr                        ; <i32> [#uses=1]
  %arraydecay = getelementptr inbounds [100 x i32]* %A, i32 0, i32 0 ; <i32*> [#uses=1]
  %arrayidx = getelementptr inbounds i32* %arraydecay, i32 %tmp ; <i32*> [#uses=1]
  %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]

Now we generate:

  %tmp = load i32* %X.addr                        ; <i32> [#uses=1]
  %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i32 %tmp ; <i32*> [#uses=1]
  %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]

Less IR is better at -O0.

llvm-svn: 106966
2010-06-26 23:03:20 +00:00
Chris Lattner 431bef4409 fix inc/dec to honor -fwrapv and -ftrapv, implementing PR7426.
llvm-svn: 106962
2010-06-26 22:18:28 +00:00
Chris Lattner 0bf27620f0 Fix unary minus to trap on overflow with -ftrapv, refactoring binop
code so we can use it from VisitUnaryMinus.

llvm-svn: 106957
2010-06-26 21:48:21 +00:00
Chris Lattner 51924e517b Implement support for -fwrapv, rdar://7221421
As part of this, pull together trapv handling into the same enum.

This also adds support for NSW multiplies.

This also makes PCH disagreement on overflow behavior silent, since it
really doesn't matter except for warnings and codegen (no macros get 
defined etc).

llvm-svn: 106956
2010-06-26 21:25:03 +00:00
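What -fwrapv guarantees, sketched: signed arithmetic wraps modulo 2^N, so additions are emitted without the nsw flag.

int next(int a) {
  return a + 1;   /* -fwrapv: INT_MAX + 1 is well defined and wraps to INT_MIN */
}
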
Chris Lattner 217e056e40 implement rdar://7432000 - signed negate should codegen as NSW.
While I'm in there, adjust pointer to member adjustments as well.

llvm-svn: 106955
2010-06-26 20:27:24 +00:00
Benjamin Kramer 9aa0d39443 A bug I've introduced in STDIN handling surfaced a few broken tests; fix them.
Lexer/hexfloat.cpp is now XFAIL'd; I'd appreciate it if someone could look into it.

llvm-svn: 106840
2010-06-25 12:48:07 +00:00
Chris Lattner 3c77a355e0 implement support for -finstrument-functions, patch by Nelson
Elhage!

llvm-svn: 106507
2010-06-22 00:03:40 +00:00
Anton Korobeynikov cc50b7d7d5 More AltiVec support.
Patch by Anton Yartsev!

llvm-svn: 106387
2010-06-19 09:47:18 +00:00
Douglas Gregor 77e274fbc6 Merge the "regparm" attribute from a previous declaration of a
function to redeclarations of that function. Fixes PR7025.

llvm-svn: 106317
2010-06-18 21:30:25 +00:00
Rafael Espindola 23a8a06554 Change the test for which ABI/CC to use on ARM to be based on the environment
(the last argument of the triple).

llvm-svn: 106131
2010-06-16 19:01:17 +00:00
Rafael Espindola ad64acde72 Add a new test for my previous patch.
llvm-svn: 106120
2010-06-16 18:02:31 +00:00
Rafael Espindola b35e7b8659 Fix tests that I missed from my previous commit.
llvm-svn: 106118
2010-06-16 17:49:52 +00:00
Benjamin Kramer c0b8f3bc53 Enable basic testing of __builtin_fpclassify.
llvm-svn: 105937
2010-06-14 10:41:45 +00:00
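A usage sketch of the builtin under test: the first five arguments are the values returned for each classification, the sixth is the operand.

#include <math.h>

int classify(double x) {
  return __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL,
                              FP_SUBNORMAL, FP_ZERO, x);
}
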
John McCall 875679eea0 Fix the constant evaluator for AltiVec-style vector literals so that the
vector is filled with the given constant;  we were just initializing the
first element.

llvm-svn: 105824
2010-06-11 17:54:15 +00:00
Rafael Espindola 2569885963 Add a test to the previous commit.
llvm-svn: 105596
2010-06-08 03:59:28 +00:00
Rafael Espindola e971b9a260 Correctly align large arrays in x86-64. This fixes PR5599.
llvm-svn: 105500
2010-06-04 23:15:27 +00:00
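The rule PR5599 exercises (as I read the x86-64 psABI): array objects of 16 bytes or more must be 16-byte aligned.

char tiny[8];    /* element alignment suffices */
char large[32];  /* must get at least 16-byte alignment on x86-64 */
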
John McCall 8e346702b6 Preserve more information from a block's original function declarator, if one
was given.  Remove some unnecessary accounting from BlockScopeInfo.  Handle
typedef'ed function types until such time as we decide not to.

llvm-svn: 105478
2010-06-04 19:02:56 +00:00
Fariborz Jahanian 6e81492151 An empty enum in C is now an error, to match gcc's behavior.
(radar 8040068).

llvm-svn: 105011
2010-05-28 22:23:22 +00:00
Fariborz Jahanian 93bef10131 Fix a miscompile of wchar pascal strings.
(radar 8020384)

llvm-svn: 104996
2010-05-28 19:40:48 +00:00
John McCall 02269a66b3 Enable the implementation of __builtin_setjmp and __builtin_longjmp. Not all
LLVM backends support these yet.

llvm-svn: 104867
2010-05-27 18:47:06 +00:00
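A hedged usage sketch: the builtin forms take a five-word buffer, and __builtin_longjmp's second argument must be the constant 1.

static void *jb[5];

void bail(void) {
  __builtin_longjmp(jb, 1);   /* second argument must be the constant 1 */
}

int guarded(void) {
  if (__builtin_setjmp(jb))
    return -1;                /* reached via bail() */
  bail();
  return 0;
}
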
Douglas Gregor aab11ede6e Fix testsuite for blocks mangling change
llvm-svn: 104618
2010-05-25 17:46:21 +00:00
Benjamin Kramer fdb61d78e9 Implement codegen for __builtin_isnormal.
llvm-svn: 104118
2010-05-19 11:24:26 +00:00
Douglas Gregor 162b419a02 Add missing test case, provided by Steven Watanabe.
llvm-svn: 104037
2010-05-18 17:43:51 +00:00
Douglas Gregor a941dcae16 Add support for Microsoft's __thiscall, from Steven Watanabe!
llvm-svn: 104026
2010-05-18 16:57:00 +00:00
Eli Friedman b41ad0fbea PR7117: Make sure we don't lose the calling convention for K&R-style
definitions.
 

llvm-svn: 103932
2010-05-17 02:50:18 +00:00
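A hedged reduction of the PR7117 shape (the stdcall attribute is illustrative; the PR may involve a different convention):

void __attribute__((stdcall)) f(int x);

void f(x) int x;   /* K&R-style definition: must keep the stdcall convention */
{
}
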
John McCall b1fb0d3610 The FP constant evaluator was missing a few cases of unary operators that return floats
but whose operand isn't a float:  specifically, __real__ and __imag__.  Instead
of filtering these out, just implement them.

Fixes <rdar://problem/7958272>.

llvm-svn: 103307
2010-05-07 22:08:54 +00:00
Chris Lattner dbff4bf5f4 implement codegen support for __builtin_isfinite, part of PR6083
llvm-svn: 103168
2010-05-06 06:04:13 +00:00
Chris Lattner 68784efaf6 optimize builtin_isnan/isinf to not do an extraneous extension from
float -> double (which happens because they are modelled as int(...)
functions), and add a testcase for isinf.

llvm-svn: 103167
2010-05-06 05:50:07 +00:00
John McCall 4a39ab8078 Emit the globals, metadata, etc. associated with static variables even when
they're unreachable.  This matters because (if they're POD, or if this is C)
the scope containing the variable might be reachable even if the variable
isn't.  Fixes PR7044.

llvm-svn: 103052
2010-05-04 20:45:42 +00:00
Devang Patel dfcd0661a1 Use clang::VarDecl name instead of llvm::GlobalVariable name.
llvm::GlobalVariable name may not match the user-visible name for function-static variables.

llvm-svn: 102644
2010-04-29 17:48:37 +00:00
Mon P Wang 75c645c6d7 A not-equal comparison on an unordered relation should return true, as specified in IEEE-754; e.g.,
NAN != NAN ? 1 : 0 should return 1.  Also fix the case for complex.

llvm-svn: 102598
2010-04-29 05:53:29 +00:00
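The IEEE-754 point in one line: ordered comparisons are false on unordered operands, but != is true.

#include <math.h>

int unordered_ne(void) {
  return NAN != NAN ? 1 : 0;   /* yields 1: '!=' is true when unordered */
}
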
John McCall d06fb865eb Properly pass the address of a lazily-generated function declaration with
incomplete type.  Fixes PR6911.

llvm-svn: 102473
2010-04-28 00:00:30 +00:00