It fits better with LLVM's memory model to try to do this in the
backend. Specifically, narrowing wide loads in the backends should be
relatively straightforward and is generally valuable, whereas widening
loads tends to be very constrained.
Discussion here:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20140811/112581.html
This reverts commit r215614.
llvm-svn: 215648
Currently when laying out bitfields that don't need any padding, we
represent them as a wide enough int to contain all of the bits. This
can be hard on the backend since we'll do things like represent stores
to a few bits as loading an i144, masking it with a large constant,
and storing it back.
This turns up in less pathological cases where we load and mask 64 bit
word on a 32 bit platform when we actually only need to access 32 bits.
This leads to bad code being generated in most of our 32 bit backends.
In practice, there are often natural breaks in bitfields, and it's a
fairly simple and effective heuristic to split these fields into legal
integer sized chunks when it will be equivalent (ie, it won't force us
to add any extra padding).
llvm-svn: 215614
Similar approach to the set1 intrinsics is used: implement in terms of vector
initializers and then ensure with an LLVM test that a broadcast is generated
at the end.
Part of <rdar://problem/17688758>
llvm-svn: 215486
Summary:
This patch adds a runtime check verifying that functions
annotated with "returns_nonnull" attribute do in fact return nonnull pointers.
It is based on suggestion by Jakub Jelinek:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223693.html.
Test Plan: regression test suite
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4849
llvm-svn: 215485
Due to the possible presence of return-by-out parameters, using the LLVM
argument number count when numbering debug info arguments can end up
off-by-one. This could produce two arguments with the same number, which
would in turn cause LLVM to emit only one of those arguments (whichever
it found last) or assert (r215157).
llvm-svn: 215227
Note that similar to palingr, we could further optimize these to emit
shufflevector when the shift count is <=64. This however does not
change the overall design that unlike palignr we would still need the LLVM
intrinsic corresponding to this intruction to handle the >64 cases. (palignr
uses the psrldq intrinsic in this case.)
llvm-svn: 214891
My original LE implementation of the vsldoi instruction, with its
altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect
shufflevector operations in the LLVM IR. Correct code is generated
because the back end handles the incorrect shufflevector in a
consistent manner.
This patch and a companion patch for LLVM correct this problem by
removing the fixup from altivec.h and the corresponding fixup from the
PowerPC back end. Several test cases are also modified to reflect the
now-correct LLVM IR.
The vec_sums and vec_vsumsws interfaces in altivec.h are also fixed,
because they used vec_perm calls intended to be recognized as vsldoi
instructions. These vec_perm calls are now replaced with code that
more clearly shows the intent of the transformation.
llvm-svn: 214801
Instead of creating global variables for source locations and global names,
just create metadata nodes and strings. They will be transformed into actual
globals in the instrumentation pass (if necessary). This approach is more
flexible:
1) we don't have to ensure that our custom globals survive all the optimizations
2) if globals are discarded for some reason, we will simply ignore metadata for them
and won't have to erase corresponding globals
3) metadata for source locations can be reused for other purposes: e.g. we may
attach source location metadata to alloca instructions and provide better descriptions
for stack variables in ASan error reports.
No functionality change.
llvm-svn: 214604
These tests seem like an exception to the rule against assembly emitting
tests in clang. I made an LLVM side change that can only be tested by
setting up the inline assembly machinery that is only implemented by
Clang.
llvm-svn: 214552
It appears that the backend does not handle all cases that were handled by clang.
In particular, it does not handle structs as used in
SingleSource/UnitTests/2003-05-07-VarArgs.
llvm-svn: 214512
Summary:
This patch causes clang to emit va_arg instructions to the backend instead of
expanding them into an implementation itself. The backend already implements
va_arg since this is necessary for NaCl so this patch is removing redundant
code.
Together with the llvm patch (D4556) that accounts for the effect of endianness
on the expansion of va_arg, this fixes PR19612.
Depends on D4556
Reviewers: sstankovic, dsanders
Reviewed By: dsanders
Subscribers: rnk, cfe-commits
Differential Revision: http://reviews.llvm.org/D4742
llvm-svn: 214497
Note that it's not clear whether this is the right behavior, please see
the review for the discussion.
Reviewers: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4577
llvm-svn: 214401
or a class derived from T. We already supported this when initializing
_Atomic(T) from T for most (and maybe all) other reasonable values of T.
llvm-svn: 214390
(Dropped the byte and word variants from the patch. Turns out these are not
part of AVX512F but only AVX512BW/VL.)
Part of <rdar://problem/17688758>
llvm-svn: 214314
This broke the following gdb tests:
gdb.base__annota1.exp
gdb.base__consecutive.exp
gdb.python__py-symtab.exp
gdb.reverse__consecutive-precsave.exp
gdb.reverse__consecutive-reverse.exp
I will look into this.
This reverts commit 214162.
llvm-svn: 214163
This allows us to give more precise diagnostics.
Diego kindly tested the impact on debug info size: "The increase on average
debug sizes is 0.1%. The total file size increase is ~0%."
llvm-svn: 214162
While Clang now supports both ELFv1 and ELFv2 ABIs, their use is currently
hard-coded via the target triple: powerpc64-linux is always ELFv1, while
powerpc64le-linux is always ELFv2.
These are of course the most common scenarios, but in principle it is
possible to support the ELFv2 ABI on big-endian or the ELFv1 ABI on
little-endian systems (and GCC does support that), and there are some
special use cases for that (e.g. certain Linux kernel versions could
only be built using ELFv1 on LE).
This patch implements the Clang side of supporting this, based on the
LLVM commit 214072. The command line options -mabi=elfv1 or -mabi=elfv2
select the desired ABI if present. (If not, Clang uses the same default
rules as now.)
Specifically, the patch implements the following changes based on the
presence of the -mabi= option:
In the driver:
- Pass the appropiate -target-abi flag to the back-end
- Select the correct dynamic loader version (/lib64/ld64.so.[12])
In the preprocessor:
- Define _CALL_ELF to the appropriate value (1 or 2)
In the compiler back-end:
- Select the correct ABI in TargetInfo.cpp
- Select the desired ABI for LLVM via feature (elfv1/elfv2)
llvm-svn: 214074
Summary:
This patch extends the __asm parser to make it keep parsing input tokens
as inline assembly if a single-line __asm line is followed by another line
starting with __asm too. It also makes sure that we correctly keep
matching braces in such situations by separating the notions of how many
braces we are matching and whether we are in single-line asm block mode.
Reviewers: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4598
llvm-svn: 213916
arm64_be doesn't really exist; it was useful for testing while AArch64 and
ARM64 were separate, but now the only real way to refer to the system is
aarch64_be.
llvm-svn: 213747
In addition to enabling ELFv2 homogeneous aggregate handling,
LLVM support to pass array types directly also enables a performance
enhancement. We can now pass (non-homogeneous) aggregates that fit
fully in registers as direct integer arrays, using an element type
to encode the alignment requirement (that would otherwise go to the
"byval align" field).
This is preferable since "byval" forces the back-end to write the
aggregate out to the stack, even if it could be passed fully in
registers. This is particularly annoying on ELFv2, if there is
no parameter save area available, since we then need to allocate
space on the callee's stack just to hold those aggregates.
Note that to implement this optimization, this patch does not attempt
to fully anticipate register allocation rules as (defined in the
ABI and) implemented in the back-end. Instead, the patch is simply
passing *any* aggregate passed by value using the array mechanism
if its size is up to 64 bytes. This means that some of those will
end up being passed in stack slots anyway, but the generated code
shouldn't be any worse either. (*Large* aggregates remain passed
using "byval" to enable optimized copying via memcpy etc.)
llvm-svn: 213495
This patch implements clang support for the PowerPC ELFv2 ABI.
Together with a series of companion patches in LLVM, this makes
clang/LLVM fully usable on powerpc64le-linux.
Most of the ELFv2 ABI changes are fully implemented on the LLVM side.
On the clang side, we only need to implement some changes in how
aggregate types are passed by value. Specifically, we need to:
- pass (and return) "homogeneous" floating-point or vector aggregates in
FPRs and VRs (this is similar to the ARM homogeneous aggregate ABI)
- return aggregates of up to 16 bytes in one or two GPRs
The second piece is trivial to implement in any case. To implement
the first piece, this patch makes use of infrastructure recently
enabled in the LLVM PowerPC back-end to support passing array types
directly, where the array element type encodes properties needed to
handle homogeneous aggregates correctly.
Specifically, the array element type encodes:
- whether the parameter should be passed in FPRs, VRs, or just
GPRs/stack slots (for float / vector / integer element types,
respectively)
- what the alignment requirements of the parameter are when passed in
GPRs/stack slots (8 for float / 16 for vector / the element type
size for integer element types) -- this corresponds to the
"byval align" field
With this support in place, the clang part simply needs to *detect*
whether an aggregate type implements a float / vector homogeneous
aggregate as defined by the ELFv2 ABI, and if so, pass/return it
as array type using the appropriate float / vector element type.
llvm-svn: 213494
In C99, an array parameter declarator might have the form:
direct-declarator '[' 'static' type-qual-list[opt] assign-expr ']'
where the static keyword indicates that the caller will always provide a
pointer to the beginning of an array with at least the number of elements
specified by the assignment expression. For constant sizes, we can use the
new dereferenceable attribute to pass this information to the optimizer. For
VLAs, we don't know the size, but (for addrspace(0)) do know that the pointer
must be nonnull (and so we can use the nonnull attribute).
llvm-svn: 213444
r211898 introduced a regression where a large struct, which would
normally be passed ByVal, was causing padding to be inserted to
prevent the backend from using some GPRs, in order to follow the
AAPCS. However, the type of the argument was not being set correctly,
so the backend cannot align 8-byte aligned struct types on the stack.
The fix is to not insert the padding arguments when the argument is
being passed ByVal.
llvm-svn: 213359