Commit Graph

86 Commits

Author SHA1 Message Date
Daniel Neilson c8bdc8db73 Change memcpy/memove/memset to have dest and source alignment attributes.
Summary:
  This change is step three in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. Steps:

Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment()
and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

Reviewers: rjmccall

Subscribers: jyknight, nemanjai, nhaehnle, javed.absar, sbc100, aheejin, kbarton, fedor.sergeev, cfe-commits

Differential Revision: https://reviews.llvm.org/D41677

llvm-svn: 323617
2018-01-28 17:27:45 +00:00
Daniel Neilson 6e938effaa Change memcpy/memove/memset to have dest and source alignment attributes (Step 1).
Summary:
  Upstream LLVM is changing the the prototypes of the @llvm.memcpy/memmove/memset
intrinsics. This change updates the Clang tests for this change.

  The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change removes the alignment argument in favour of placing the alignment
attribute on the source and destination pointers of the memory intrinsic call.

 For example, code which used to read:
   call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
   call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 At this time the source and destination alignments must be the same (Step 1).
Step 2 of the change, to be landed shortly, will relax that contraint and allow
the source and destination to have different alignments.

llvm-svn: 322964
2018-01-19 17:12:54 +00:00
Yaxun Liu 84744c152a CodeGen: Cast temporary variable to proper address space
In C++ all variables are in default address space. Previously change has been
made to cast automatic variables to default address space. However that is
not sufficient since all temporary variables need to be casted to default
address space.

This patch casts all temporary variables to default address space except those
for passing indirect arguments since they are only used for load/store.

This patch only affects target having non-zero alloca address space.

Differential Revision: https://reviews.llvm.org/D33706

llvm-svn: 305711
2017-06-19 17:03:41 +00:00
David Majnemer b439dfe6ba [CodeGen] Ignore unnamed bitfields before handling vector fields
We processed unnamed bitfields after our logic for non-vector field
elements in records larger than 128 bits.  The vector logic would
determine that the bit-field disqualifies the record from occupying a
register despite the unnamed bit-field not participating in the record
size nor its alignment.

N.B. This behavior matches GCC and ICC.

llvm-svn: 278656
2016-08-15 07:20:40 +00:00
David Majnemer b229cb0a43 [CodeGen] Correctly implement the AVX512 psABI rules
An __m512 vector type wrapped in a structure should be passed in a
vector register.

Our prior implementation was based on a draft version of the psABI.

This fixes PR28975.

N.B. The update to the ABI was made here:
https://github.com/hjl-tools/x86-psABI/commit/30f9c9

llvm-svn: 278655
2016-08-15 06:39:18 +00:00
David Majnemer e2ae228c76 [X86] Pass __m64 types via SSE registers for GCC compatibility
For compatibility with GCC, classify __m64 as SSE.
However, clang is a platform compiler for certain targets; retain our
old behavior on those targets: classify __m64 as integer.

This fixes PR26832.

llvm-svn: 262688
2016-03-04 05:26:16 +00:00
Petar Jovanovic 402257b84e [PowerPC] Fix calculating address of arguments on stack for variadic func
Fix calculating address of arguments larger than 32 bit on stack for
variadic functions (rounding up address to alignment) on ppc32 architecture.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D14871

llvm-svn: 254670
2015-12-04 00:26:47 +00:00
Pete Cooper 3b39e88ae0 Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253512.

This likely broke the bots in:
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253542
2015-11-19 05:55:59 +00:00
Pete Cooper 7bfd5cb7be Change memcpy/memset/memmove to have dest and source alignments.
This is a follow on from a similar LLVM commit: r253511.

Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

The only code change to clang is hidden in CGBuilder.h which now passes
both dest and source alignment to IRBuilder, instead of taking the minimum of
dest and source alignments.

Reviewed by Hal Finkel.

llvm-svn: 253512
2015-11-18 22:18:45 +00:00
Ahmed Bougacha 0b938284da [CodeGen] Teach X86_64ABIInfo about AVX512.
As specified in the SysV AVX512 ABI drafts. It follows the same scheme
as AVX2: 

    Arguments of type __m512 are split into eight eightbyte chunks.
    The least significant one belongs to class SSE and all the others
    to class SSEUP.

This also means we change the OpenMP SIMD default alignment on AVX512.

Based on r240337.
Differential Revision: http://reviews.llvm.org/D9894

llvm-svn: 240338
2015-06-22 21:31:43 +00:00
Ahmed Bougacha 71185aca7e [CodeGen] Check x86_64-arguments.c tests on AVX as well. NFC.
We used to only check the differing tests on AVX, but we might
as well check all of them.

llvm-svn: 237818
2015-05-20 18:39:16 +00:00
David Blaikie d6c88ece21 [opaque pointer types] Explicit non-pointer type for call expressions
(migration for recent LLVM change to textual IR for calls)

llvm-svn: 235147
2015-04-16 23:25:00 +00:00
David Blaikie a953f2825b Update Clang tests to handle explicitly typed load changes in LLVM.
llvm-svn: 230795
2015-02-27 21:19:58 +00:00
David Blaikie 218b783192 Update Clang tests to handle explicitly typed gep changes in LLVM.
llvm-svn: 230783
2015-02-27 19:18:17 +00:00
Sanjay Patel eb2af4e8b1 x86-64 ABI: unwrap single element structs / arrays of 256-bit vectors to pass and return in registers
This is a patch for PR22563 ( http://llvm.org/bugs/show_bug.cgi?id=22563 ).

We were not correctly unwrapping a single 256-bit AVX vector that was defined as an array of 1 inside a struct.

We would generate a <4 x float> param/return value instead of <8 x float> and lose half of the vector.

Differential Revision: http://reviews.llvm.org/D7614

llvm-svn: 229408
2015-02-16 17:26:51 +00:00
Stephen Lin 4362261b00 CHECK-LABEL-ify some code gen tests to improve diagnostic experience when tests fail.
llvm-svn: 188447
2013-08-15 06:47:53 +00:00
Eli Friedman 96fd264cc0 Make va_arg and argument passing to varargs functions work correctly with
AVX vectors when AVX is turned on.

Fixes <rdar://problem/10513611>.

llvm-svn: 183813
2013-06-12 00:13:45 +00:00
Eli Friedman 2761350730 Fix a very silly mistake in r183590.
llvm-svn: 183720
2013-06-11 01:59:28 +00:00
Eli Friedman c11c169530 Fix va_arg on x86-64 for a struct containing a single int128_t. PR16248
llvm-svn: 183590
2013-06-07 23:20:55 +00:00
Bill Wendling 48939ced20 Update testcases due to Attribute sorting improvements.
llvm-svn: 175253
2013-02-15 05:25:49 +00:00
Bill Wendling 85ab57ac5d Update the tests.
This update coincides with r174110. That change ordered the attributes
alphabetically.

llvm-svn: 174111
2013-01-31 23:17:12 +00:00
Bill Wendling 9806806f39 Modify the tests for the (sorted) order that the attributes come out as now.
llvm-svn: 173762
2013-01-29 03:21:00 +00:00
John McCall c818bbb8b2 Fix the required args count for variadic blocks.
We were emitting calls to blocks as if all arguments were
required --- i.e. with signature (A,B,C,D,...) rather than
(A,B,...).  This patch fixes that and accounts for the
implicit block-context argument as a required argument.
In addition, this patch changes the function type under which
we call unprototyped functions on platforms like x86-64 that
guarantee compatibility of variadic functions with unprototyped
function types;  previously we would always call such functions
under the LLVM type T (...)*, but now we will call them under
the type T (A,B,C,D,...)*.  This last change should have no
material effect except for making the type conventions more
explicit;  it was a side-effect of the most convenient implementation.

llvm-svn: 169588
2012-12-07 07:03:17 +00:00
Manman Ren 836a93bdb3 ABI: comments from Eli on r168820.
rdar://12723368

llvm-svn: 168821
2012-11-28 22:29:41 +00:00
Manman Ren 84b921f805 ABI: modify CreateCoercedLoad and CreateCoercedStore to not use load or store of
the original parameter or return type.

Since we do not accurately represent the data fields of a union, we should not
directly load or store a union type.

As an exmple, if we have i8,i8, i32, i32 as one field type and i32,i32 as
another field type, the first field type will be chosen to represent the union.
If we load with the union's type, the 3rd byte and the 4th byte will be skipped.

rdar://12723368

llvm-svn: 168820
2012-11-28 22:08:52 +00:00
Daniel Dunbar f07b5ec0dc IRgen/ABI/x86_64: Avoid passing small structs using byval sometimes.
- We do this when it is easy to determine that the backend will pass them on
   the stack properly by itself.

Currently LLVM codegen is really bad in some cases with byval, for example, on
the test case here (which is derived from Sema code, which likes to pass
SourceLocations around)::

  struct s47 { unsigned a; };
  void f47(int,int,int,int,int,int,struct s47);
  void test47(int a, struct s47 b) { f47(a, a, a, a, a, a, b); }

we used to emit code like this::

  ...
  movl	%esi, -8(%rbp)
  movl	-8(%rbp), %ecx
  movl	%ecx, (%rsp)
  ...

to handle moving the struct onto the stack, which is just appalling.

Now we generate::

  movl	%esi, (%rsp)

which seems better, no?

llvm-svn: 152462
2012-03-10 01:03:58 +00:00
Eli Friedman bfd5addf4c When we're passing a vector with an illegal type through memory on x86-64, use byval so we're sure the backend does the right thing. Fixes va_arg with illegal vectors and an obscure ABI mismatch with __m64 vectors.
llvm-svn: 145652
2011-12-02 00:11:43 +00:00
Eli Friedman f37bd2f2f1 Don't use a varargs convention for calls unprototyped functions where one of the arguments is an AVX vector.
llvm-svn: 145574
2011-12-01 04:53:19 +00:00
Tanya Lattner 71f1b2dcd4 Correct the code generation for function arguments of vec3 types on x86_64 when they are greater than 128 bits. This was incorrectly coercing things like long3 into a double2.
Add test case.

llvm-svn: 145312
2011-11-28 23:18:11 +00:00
Eli Friedman a1748564b4 Make va_arg on x86-64 compute alignment the same way as argument passing.
Fixes <rdar://problem/10463281>.

llvm-svn: 144966
2011-11-18 02:44:19 +00:00
John McCall a5efa7386a Track whether an AggValueSlot is potentially aliased, and do not
emit call results into potentially aliased slots.  This allows us
to properly mark indirect return slots as noalias, at the cost
of requiring an extra memcpy when assigning an aggregate call
result into a l-value.  It also brings us into compliance with
the x86-64 ABI.

llvm-svn: 138599
2011-08-25 23:04:34 +00:00
Bruno Cardoso Lopes 98154a76fd Reapply r134946 with fixes. Tested on Benjamin testcase and other test-suite failures.
llvm-svn: 135091
2011-07-13 21:58:55 +00:00
Bruno Cardoso Lopes 0aadf83f80 Revert r134946
llvm-svn: 135004
2011-07-12 22:30:58 +00:00
Chris Lattner 73e3004e75 fix an unintended behavior change in the type system rewrite, which caused us to compile
stuff like this:

typedef struct {
 int x, y, z; 
} foo_t;

foo_t g;

into:
%"struct.<anonymous>" = type { i32, i32, i32 }
we now get:
%struct.foo_t = type { i32, i32, i32 }

This doesn't change the behavior of the compiler, but makes the IR much easier to read.

llvm-svn: 134969
2011-07-12 05:53:08 +00:00
Bruno Cardoso Lopes 75541d00e0 Do the same as r134946 for arrays. Add more testcases for avx x86_64 arg
passing.

llvm-svn: 134951
2011-07-12 01:27:38 +00:00
Bruno Cardoso Lopes 7a26681092 Fix one x86_64 abi issue and the test to actually look for the right thing,
which is: { <4 x float>, <4 x float> } should continue to go through memory.

llvm-svn: 134946
2011-07-12 00:30:27 +00:00
Bruno Cardoso Lopes 21a41bb5ec Reapply r134754, which turns out to be working correctly and also
add one more testcase.

llvm-svn: 134934
2011-07-11 22:41:29 +00:00
Chris Lattner a5f58b05e8 clang side to match the LLVM IR type system rewrite patch.
llvm-svn: 134831
2011-07-09 17:41:47 +00:00
Bruno Cardoso Lopes 129b4cc9ec Revert x86_64 ABI changes until I have time to check the items raised by Eli.
llvm-svn: 134765
2011-07-08 22:57:35 +00:00
Bruno Cardoso Lopes 308d7423a9 Add support for AVX 256-bit in the x86_64 ABI (as in the 0.99.5 draft)
llvm-svn: 134754
2011-07-08 22:18:40 +00:00
Eli Friedman 1310c68bb0 Don't use x86_mmx where it isn't necessary.
The start of some work on getting -mno-mmx working the way we want it to.

llvm-svn: 134300
2011-07-02 00:57:27 +00:00
Chris Lattner 44c2b90556 Fix x86-64 byval passing to specify the alignment even when the code
generator will give it something sufficient.  This is important because
the mid-level optimizer doesn't know what alignment is required otherwise.

llvm-svn: 131879
2011-05-22 23:21:23 +00:00
John McCall e0fda7377e The 0.98 revision of the x86-64 ABI clarified a lot of things, some
of which break strict compatibility with previous compilers.  Implement
one of them and then immediately opt out on Darwin.

llvm-svn: 129899
2011-04-21 01:20:55 +00:00
Chris Lattner 69e683fb35 vector of long and ulong are also classified as INTEGER in x86-64 abi,
this fixes rdar://8358475 a failure of the gcc.dg/compat/vector_1 abi
test.

llvm-svn: 112205
2010-08-26 18:13:50 +00:00
Chris Lattner 46830f2fd6 1 x ulonglong needs to be classified as INTEGER, just like 1 x longlong,
this fixes a miscompilation on the included testcase, rdar://8359248

llvm-svn: 112201
2010-08-26 18:03:20 +00:00
Chris Lattner 51e1cc2fe2 tame an assertion, fixing rdar://8357396
llvm-svn: 112174
2010-08-26 06:28:35 +00:00
Chris Lattner 9f8b451876 Finally pass "two floats in a 64-bit unit" as a <2 x float> instead of
as a double in the x86-64 ABI.  This allows us to generate much better
code for certain things, e.g.:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

Used to compile into (look at the integer silliness!):

_f32:                                   ## @f32
## BB#0:                                ## %entry
	movd	%xmm1, %rax
	movd	%eax, %xmm1
	movd	%xmm0, %rcx
	movd	%ecx, %xmm0
	addss	%xmm1, %xmm0
	movd	%xmm0, %edx
	shrq	$32, %rax
	movd	%eax, %xmm0
	shrq	$32, %rcx
	movd	%ecx, %xmm1
	addss	%xmm0, %xmm1
	movd	%xmm1, %eax
	shlq	$32, %rax
	addq	%rdx, %rax
	movd	%rax, %xmm0
	ret

Now we get:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

and compile stuff like:

extern float _Complex ccoshf( float _Complex ) ;
float _Complex ccosf ( float _Complex z ) {
 float _Complex iz;
 (__real__ iz) = -(__imag__ z);
 (__imag__ iz) = (__real__ z);
 return ccoshf(iz);
}

into:

_ccosf:                                 ## @ccosf
## BB#0:                                ## %entry
	pshufd	$1, %xmm0, %xmm1
	xorps	LCPI4_0(%rip), %xmm1
	unpcklps	%xmm0, %xmm1
	movaps	%xmm1, %xmm0
	jmp	_ccoshf                 ## TAILCALL

instead of:

_ccosf:                                 ## @ccosf
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	movq	%rax, %rcx
	shlq	$32, %rcx
	shrq	$32, %rax
	xorl	$-2147483648, %eax      ## imm = 0xFFFFFFFF80000000
	addq	%rcx, %rax
	movd	%rax, %xmm0
	jmp	_ccoshf                 ## TAILCALL


There is still "stuff to be done" here for the struct case,
but this resolves rdar://6379669 - [x86-64 ABI] Pass and return 
_Complex float / double efficiently

llvm-svn: 112111
2010-08-25 23:39:14 +00:00
Chris Lattner 7f4b81af7a fix rdar://8251384, another case where we could access beyond the
end of a struct.  This improves the case when the struct being passed
contains 3 floats, either due to a struct or array of 3 things.  Before
we'd generate this IR for the testcase:

define float @bar(double %X.coerce0, double %X.coerce1) nounwind {
entry:
  %X = alloca %struct.foof, align 8               ; <%struct.foof*> [#uses=2]
  %0 = bitcast %struct.foof* %X to %1*            ; <%1*> [#uses=2]
  %1 = getelementptr %1* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %X.coerce0, double* %1
  %2 = getelementptr %1* %0, i32 0, i32 1         ; <double*> [#uses=1]
  store double %X.coerce1, double* %2
  %tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float*> [#uses=1]
  %tmp1 = load float* %tmp                        ; <float> [#uses=1]
  ret float %tmp1
}

which compiled (with optimization) to:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movd	%xmm1, %rax
	movd	%eax, %xmm0
	ret

Now we produce:

define float @bar(double %X.coerce0, float %X.coerce1) nounwind {
entry:
  %X = alloca %struct.foof, align 8               ; <%struct.foof*> [#uses=2]
  %0 = bitcast %struct.foof* %X to %0*            ; <%0*> [#uses=2]
  %1 = getelementptr %0* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %X.coerce0, double* %1
  %2 = getelementptr %0* %0, i32 0, i32 1         ; <float*> [#uses=1]
  store float %X.coerce1, float* %2
  %tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float*> [#uses=1]
  %tmp1 = load float* %tmp                        ; <float> [#uses=1]
  ret float %tmp1
}

and:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movaps	%xmm1, %xmm0
	ret

llvm-svn: 109776
2010-07-29 18:13:09 +00:00
Chris Lattner 3f76342cfc handle a case where we could access off the end of a function
that Eli pointed out, rdar://8249586

llvm-svn: 109762
2010-07-29 17:34:39 +00:00
Chris Lattner 44f9c3b3f1 in release mode, irbuilder doesn't add names to instructions,
this will hopefully fix the osuosl clang-i686-darwin10 builder.

llvm-svn: 109760
2010-07-29 17:14:05 +00:00