Commit Graph

2618 Commits

Author SHA1 Message Date
Adam Nemet 2278fcbf0c [AVX512] Add FMA intrinsics
Part of <rdar://problem/17688758>

llvm-svn: 215666
2014-08-14 17:17:57 +00:00
Justin Bogner 085c4b294b Revert "CodeGen: When bitfields fall on natural boundaries, split them up"
It fits better with LLVM's memory model to try to do this in the
backend. Specifically, narrowing wide loads in the backends should be
relatively straightforward and is generally valuable, whereas widening
loads tends to be very constrained.

Discussion here:

  http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20140811/112581.html

This reverts commit r215614.

llvm-svn: 215648
2014-08-14 15:44:29 +00:00
Rafael Espindola 764837431a Delete support for AuroraUX.
auroraux.org is not resolving.

llvm-svn: 215644
2014-08-14 15:14:51 +00:00
Pekka Jaaskelainen ab751a8f71 Fix a crash when compiling blocks in OpenCL with multiple
address spaces.

llvm-svn: 215629
2014-08-14 09:37:50 +00:00
Justin Bogner caf1c6e3dd CodeGen: When bitfields fall on natural boundaries, split them up
Currently when laying out bitfields that don't need any padding, we
represent them as a wide enough int to contain all of the bits. This
can be hard on the backend since we'll do things like represent stores
to a few bits as loading an i144, masking it with a large constant,
and storing it back.

This turns up in less pathological cases where we load and mask 64 bit
word on a 32 bit platform when we actually only need to access 32 bits.
This leads to bad code being generated in most of our 32 bit backends.

In practice, there are often natural breaks in bitfields, and it's a
fairly simple and effective heuristic to split these fields into legal
integer sized chunks when it will be equivalent (ie, it won't force us
to add any extra padding).

llvm-svn: 215614
2014-08-14 02:42:10 +00:00
Yi Kong 45a09319bf ARM: Add mappings for ACLE prefetch intrinsics
Implement __pld, __pldx, __pli and __plix builtin intrinsics as specified in
ARM ACLE 2.0.

llvm-svn: 215599
2014-08-13 23:20:15 +00:00
Justin Bogner 5ea05aed15 test/CodeGen: Don't rely on a value's number in check lines
The tests in r215568 hard code a value as %0 in their checks. This
isn't correct in asserts builds.

llvm-svn: 215585
2014-08-13 21:54:06 +00:00
Yi Kong a5548431a5 AArch64: Prefetch intrinsic
llvm-svn: 215569
2014-08-13 19:18:20 +00:00
Yi Kong 26d104a9ec ARM: Prefetch intrinsics
llvm-svn: 215568
2014-08-13 19:18:14 +00:00
Adam Nemet 4abc07cb75 [AVX512] Add intrinsics for FP scalar broadcasts
Similar approach to the set1 intrinsics is used: implement in terms of vector
initializers and then ensure with an LLVM test that a broadcast is generated
at the end.

Part of <rdar://problem/17688758>

llvm-svn: 215486
2014-08-13 00:29:01 +00:00
Alexey Samsonov de443c5002 [UBSan] Add returns-nonnull sanitizer.
Summary:
This patch adds a runtime check verifying that functions
annotated with "returns_nonnull" attribute do in fact return nonnull pointers.
It is based on suggestion by Jakub Jelinek:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223693.html.

Test Plan: regression test suite

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D4849

llvm-svn: 215485
2014-08-13 00:26:40 +00:00
David Blaikie 77bbb5fd0b DebugInfo: Blocks: Do not depend on LLVM argument numbering when choosing the debug info argument numbering.
Due to the possible presence of return-by-out parameters, using the LLVM
argument number count when numbering debug info arguments can end up
off-by-one. This could produce two arguments with the same number, which
would in turn cause LLVM to emit only one of those arguments (whichever
it found last) or assert (r215157).

llvm-svn: 215227
2014-08-08 17:10:14 +00:00
Adam Nemet 5bf7baa938 [AVX512] Add intrinsic for valignd/q
Note that similar to palingr, we could further optimize these to emit
shufflevector when the shift count is <=64.  This however does not
change the overall design that unlike palignr we would still need the LLVM
intrinsic corresponding to this intruction to handle the >64 cases.  (palignr
uses the psrldq intrinsic in this case.)

llvm-svn: 214891
2014-08-05 17:28:23 +00:00
David Majnemer c017d3613e MS ABI: Aligned tentative definitions don't have CommonLinkage
int __declspec(align(16)) foo; is a tentative definition but the storage
for that variable should not have CommonLinkage.

llvm-svn: 214828
2014-08-05 00:01:13 +00:00
Bill Schmidt ccbe0a8022 [PPC64LE] Fix wrong IR for vec_sld and vec_vsldoi
My original LE implementation of the vsldoi instruction, with its
altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect
shufflevector operations in the LLVM IR.  Correct code is generated
because the back end handles the incorrect shufflevector in a
consistent manner.

This patch and a companion patch for LLVM correct this problem by
removing the fixup from altivec.h and the corresponding fixup from the
PowerPC back end.  Several test cases are also modified to reflect the
now-correct LLVM IR.

The vec_sums and vec_vsumsws interfaces in altivec.h are also fixed,
because they used vec_perm calls intended to be recognized as vsldoi
instructions.  These vec_perm calls are now replaced with code that
more clearly shows the intent of the transformation.

llvm-svn: 214801
2014-08-04 23:21:26 +00:00
Joerg Sonnenberger 466a31eb65 vcfsx and dss instructions require immediates, variables are not valid.
llvm-svn: 214635
2014-08-02 15:07:21 +00:00
Alexey Samsonov d9ad5cec0c [ASan] Use metadata to pass source-level information from Clang to ASan.
Instead of creating global variables for source locations and global names,
just create metadata nodes and strings. They will be transformed into actual
globals in the instrumentation pass (if necessary). This approach is more
flexible:
1) we don't have to ensure that our custom globals survive all the optimizations
2) if globals are discarded for some reason, we will simply ignore metadata for them
   and won't have to erase corresponding globals
3) metadata for source locations can be reused for other purposes: e.g. we may
   attach source location metadata to alloca instructions and provide better descriptions
   for stack variables in ASan error reports.

No functionality change.

llvm-svn: 214604
2014-08-02 00:35:50 +00:00
Reid Kleckner e2d6429493 MS inline asm: Tests for r214550
These tests seem like an exception to the rule against assembly emitting
tests in clang.  I made an LLVM side change that can only be tested by
setting up the inline assembly machinery that is only implemented by
Clang.

llvm-svn: 214552
2014-08-01 20:23:29 +00:00
Daniel Sanders 2ef3cdd3d5 Revert r214497: [mips] Defer va_arg expansion to the backend.
It appears that the backend does not handle all cases that were handled by clang.
In particular, it does not handle structs as used in
SingleSource/UnitTests/2003-05-07-VarArgs.

llvm-svn: 214512
2014-08-01 13:26:28 +00:00
Daniel Sanders cd8ba86990 [mips] Defer va_arg expansion to the backend.
Summary:
This patch causes clang to emit va_arg instructions to the backend instead of
expanding them into an implementation itself. The backend already implements
va_arg since this is necessary for NaCl so this patch is removing redundant
code.

Together with the llvm patch (D4556) that accounts for the effect of endianness
on the expansion of va_arg, this fixes PR19612.

Depends on D4556

Reviewers: sstankovic, dsanders

Reviewed By: dsanders

Subscribers: rnk, cfe-commits

Differential Revision: http://reviews.llvm.org/D4742

llvm-svn: 214497
2014-08-01 10:29:21 +00:00
Hans Wennborg f51dc3b5d4 Local extern redeclarations of dllimport variables stay dllimport even if they don't specify the attribute
llvm-svn: 214425
2014-07-31 19:29:39 +00:00
Ehsan Akhgari 9f507382dd ms-inline-asm: Add a test to ensure that call doesn't clobber eax.
Note that it's not clear whether this is the right behavior, please see
the review for the discussion.

Reviewers: rnk

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D4577

llvm-svn: 214401
2014-07-31 13:43:17 +00:00
Richard Smith 77be48ac47 PR18097: Support initializing an _Atomic(T) from an object of C++ class type T
or a class derived from T. We already supported this when initializing
_Atomic(T) from T for most (and maybe all) other reasonable values of T.

llvm-svn: 214390
2014-07-31 06:31:19 +00:00
Adam Nemet da82bcc4dd [AVX512] Add unaligned FP load intrinsics
Part of <rdar://problem/17688758>

llvm-svn: 214380
2014-07-31 04:00:39 +00:00
Rafael Espindola 42affc2db6 Update for llvm change.
llvm-svn: 214356
2014-07-30 22:52:16 +00:00
Adam Nemet 2db1d2fb32 [AVX512] Add intrinsic for knot
Part of <rdar://problem/17688758>

llvm-svn: 214316
2014-07-30 16:51:27 +00:00
Adam Nemet c871ff95f3 [AVX512] Add some of the FP cast intrinsics
Part of <rdar://problem/17688758>

llvm-svn: 214315
2014-07-30 16:51:24 +00:00
Adam Nemet f42e7a274a [AVX512] Add set1 intrinsics
(Dropped the byte and word variants from the patch.  Turns out these are not
part of AVX512F but only AVX512BW/VL.)

Part of <rdar://problem/17688758>

llvm-svn: 214314
2014-07-30 16:51:22 +00:00
Richard Smith 00db2f13f8 PR20473: Don't "deduplicate" string literals with the same value but different
lengths! In passing, simplify string literal deduplication by relying on LLVM
to deduplicate the underlying constant values.

llvm-svn: 214222
2014-07-29 21:20:12 +00:00
Tobias Grosser b3af390087 Revert "Emit column debug information for loads"
This broke the following gdb tests:

gdb.base__annota1.exp
gdb.base__consecutive.exp
gdb.python__py-symtab.exp
gdb.reverse__consecutive-precsave.exp
gdb.reverse__consecutive-reverse.exp

I will look into this.

This reverts commit 214162.

llvm-svn: 214163
2014-07-29 06:53:14 +00:00
Tobias Grosser 01b923d55b Emit column debug information for loads
This allows us to give more precise diagnostics.

Diego kindly tested the impact on debug info size: "The increase on average
debug sizes is 0.1%. The total file size increase is ~0%."

llvm-svn: 214162
2014-07-29 06:10:47 +00:00
Adam Nemet fce1ad0b99 [AVX512] Add non-masking FP store intrinsics
Part of <rdar://problem/17688758>

llvm-svn: 214099
2014-07-28 17:14:45 +00:00
Adam Nemet a3ebe6214b [AVX512] Add FP add/sub/mul intrinsics
Part of <rdar://problem/17688758>

llvm-svn: 214098
2014-07-28 17:14:42 +00:00
Adam Nemet 062ba618f5 [AVX512] Add CHECK-LABELs to test/CodeGen/avx512f-builtins.c
llvm-svn: 214095
2014-07-28 17:14:36 +00:00
Ulrich Weigand 8afad61a93 [PowerPC] Support ELFv1/ELFv2 ABI selection via -mabi= option
While Clang now supports both ELFv1 and ELFv2 ABIs, their use is currently
hard-coded via the target triple: powerpc64-linux is always ELFv1, while
powerpc64le-linux is always ELFv2.

These are of course the most common scenarios, but in principle it is
possible to support the ELFv2 ABI on big-endian or the ELFv1 ABI on
little-endian systems (and GCC does support that), and there are some
special use cases for that (e.g. certain Linux kernel versions could
only be built using ELFv1 on LE).

This patch implements the Clang side of supporting this, based on the
LLVM commit 214072.  The command line options -mabi=elfv1 or -mabi=elfv2
select the desired ABI if present.  (If not, Clang uses the same default
rules as now.)

Specifically, the patch implements the following changes based on the
presence of the -mabi= option:

In the driver:
- Pass the appropiate -target-abi flag to the back-end
- Select the correct dynamic loader version (/lib64/ld64.so.[12])

In the preprocessor:
- Define _CALL_ELF to the appropriate value (1 or 2)

In the compiler back-end:
- Select the correct ABI in TargetInfo.cpp
- Select the desired ABI for LLVM via feature (elfv1/elfv2)

llvm-svn: 214074
2014-07-28 13:17:52 +00:00
Ehsan Akhgari 755597c83d Fix test/CodeGen/ms-inline-asm.c from r213916.
llvm-svn: 213919
2014-07-25 02:39:33 +00:00
Ehsan Akhgari fa2d9aa798 Fix test/CodeGen/ms-inline-asm.cpp from r213916.
llvm-svn: 213918
2014-07-25 02:35:50 +00:00
Ehsan Akhgari 2f93b448a8 clang-cl: Merge adjacent single-line __asm blocks
Summary:
This patch extends the __asm parser to make it keep parsing input tokens
as inline assembly if a single-line __asm line is followed by another line
starting with __asm too.  It also makes sure that we correctly keep
matching braces in such situations by separating the notions of how many
braces we are matching and whether we are in single-line asm block mode.

Reviewers: rnk

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D4598

llvm-svn: 213916
2014-07-25 02:27:14 +00:00
Mark Heffernan c888e41c0c Add support for #pragma nounroll.
llvm-svn: 213885
2014-07-24 18:09:38 +00:00
Mark Heffernan 44ca416a64 Rename metadata in test which was missed when renaming loop unroll metadata in r213771.
llvm-svn: 213775
2014-07-23 17:59:07 +00:00
Mark Heffernan 450c23843e In unroll pragma syntax and loop hint metadata, change "enable" forms to a new form using the string "full".
llvm-svn: 213771
2014-07-23 17:31:31 +00:00
Tim Northover 18b7512faa AArch64: use aarch64_be instead of arm64_be in all tests.
arm64_be doesn't really exist; it was useful for testing while AArch64 and
ARM64 were separate, but now the only real way to refer to the system is
aarch64_be.

llvm-svn: 213747
2014-07-23 12:57:31 +00:00
Robert Lytton 26149def6e remove hardcoded metadata numbers from tests
llvm-svn: 213659
2014-07-22 14:47:42 +00:00
Elena Demikhovsky fcc6df310d AVX-512: Added intrinsics to clang.
The set is small, that what I have right now.
Everybody is welcome to add more.

llvm-svn: 213641
2014-07-22 11:31:39 +00:00
Mark Heffernan 34735af3cb Rename metadata llvm.loop.vectorize.unroll to llvm.loop.vectorize.interleave.
llvm-svn: 213587
2014-07-21 23:10:56 +00:00
Mark Heffernan bd26f5ea4d Add support for '#pragma unroll'.
llvm-svn: 213574
2014-07-21 18:08:34 +00:00
Ulrich Weigand 601957fa23 [PowerPC] Optimize passing certain aggregates by value
In addition to enabling ELFv2 homogeneous aggregate handling,
LLVM support to pass array types directly also enables a performance
enhancement.  We can now pass (non-homogeneous) aggregates that fit
fully in registers as direct integer arrays, using an element type
to encode the alignment requirement (that would otherwise go to the
"byval align" field).

This is preferable since "byval" forces the back-end to write the
aggregate out to the stack, even if it could be passed fully in
registers.  This is particularly annoying on ELFv2, if there is
no parameter save area available, since we then need to allocate
space on the callee's stack just to hold those aggregates.

Note that to implement this optimization, this patch does not attempt
to fully anticipate register allocation rules as (defined in the
ABI and) implemented in the back-end.  Instead, the patch is simply
passing *any* aggregate passed by value using the array mechanism
if its size is up to 64 bytes.   This means that some of those will
end up being passed in stack slots anyway, but the generated code
shouldn't be any worse either.  (*Large* aggregates remain passed
using "byval" to enable optimized copying via memcpy etc.)

llvm-svn: 213495
2014-07-21 00:56:36 +00:00
Ulrich Weigand b712237da6 [PowerPC] Support the ELFv2 ABI
This patch implements clang support for the PowerPC ELFv2 ABI.
Together with a series of companion patches in LLVM, this makes
clang/LLVM fully usable on powerpc64le-linux.

Most of the ELFv2 ABI changes are fully implemented on the LLVM side.
On the clang side, we only need to implement some changes in how
aggregate types are passed by value.   Specifically, we need to:
- pass (and return) "homogeneous" floating-point or vector aggregates in
  FPRs and VRs (this is similar to the ARM homogeneous aggregate ABI)
- return aggregates of up to 16 bytes in one or two GPRs

The second piece is trivial to implement in any case.  To implement
the first piece, this patch makes use of infrastructure recently
enabled in the LLVM PowerPC back-end to support passing array types
directly, where the array element type encodes properties needed to
handle homogeneous aggregates correctly.

Specifically, the array element type encodes:
- whether the parameter should be passed in FPRs, VRs, or just
  GPRs/stack slots  (for float / vector / integer element types,
  respectively)
- what the alignment requirements of the parameter are when passed in
  GPRs/stack slots  (8 for float / 16 for vector / the element type
  size for integer element types) -- this corresponds to the
  "byval align" field

With this support in place, the clang part simply needs to *detect*
whether an aggregate type implements a float / vector homogeneous
aggregate as defined by the ELFv2 ABI, and if so, pass/return it
as array type using the appropriate float / vector element type.

llvm-svn: 213494
2014-07-21 00:48:09 +00:00
Hal Finkel 48d53e2c4c Use the dereferenceable attribute on C99 array parameters with static
In C99, an array parameter declarator might have the form:
  direct-declarator '[' 'static' type-qual-list[opt] assign-expr ']'

where the static keyword indicates that the caller will always provide a
pointer to the beginning of an array with at least the number of elements
specified by the assignment expression. For constant sizes, we can use the
new dereferenceable attribute to pass this information to the optimizer. For
VLAs, we don't know the size, but (for addrspace(0)) do know that the pointer
must be nonnull (and so we can use the nonnull attribute).

llvm-svn: 213444
2014-07-19 01:41:07 +00:00
Oliver Stannard e022851f3b [ARM] Fix AAPCS regression caused by r211898
r211898 introduced a regression where a large struct, which would
normally be passed ByVal, was causing padding to be inserted to
prevent the backend from using some GPRs, in order to follow the
AAPCS. However, the type of the argument was not being set correctly,
so the backend cannot align 8-byte aligned struct types on the stack.

The fix is to not insert the padding arguments when the argument is
being passed ByVal.

llvm-svn: 213359
2014-07-18 09:09:31 +00:00