Add support for the `-fdebug-prefix-map=` option as in GCC. The syntax is
`-fdebug-prefix-map=OLD=NEW`. When compiling files from a path beginning with
OLD, change the debug info to indicate the path as start with NEW. This is
particularly helpful if you are preprocessing in one path and compiling in
another (e.g. for a build cluster with distcc).
Note that the linearity of the implementation is not as terrible as it may seem.
This is normally done once per file with an expectation that the map will be
small (1-2) entries, making this roughly linear in the number of input paths.
Addresses PR24619.
llvm-svn: 250094
This fixes a bug where one can take the address of a conditionally
enabled function to drop its enable_if guards. For example:
int foo(int a) __attribute__((enable_if(a > 0, "")));
int (*p)(int) = &foo;
int result = p(-1); // compilation succeeds; calls foo(-1)
Overloading logic has been updated to reflect this change, as well.
Functions with enable_if attributes that are always true are still
allowed to have their address taken.
Differential Revision: http://reviews.llvm.org/D13607
llvm-svn: 250090
Rationale :
// sse3
__m128d test_mm_addsub_pd(__m128d A, __m128d B) {
return _mm_addsub_pd(A, B);
}
// mmx
void shift(__m64 a, __m64 b, int c) {
_mm_slli_pi16(a, c);
_mm_slli_pi32(a, c);
_mm_slli_si64(a, c);
_mm_srli_pi16(a, c);
_mm_srli_pi32(a, c);
_mm_srli_si64(a, c);
_mm_srai_pi16(a, c);
_mm_srai_pi32(a, c);
}
clang -msse3 -mno-mmx file.c -c
For this code we should be able to explicitly turn off MMX
without affecting the compilation of the SSE3 function and then
diagnose and error on compiling the MMX function.
This is a preparatory patch to the actual diagnosis code which is
coming in a future patch. This sets us up to have the correct information
where we need it and verifies that it's being emitted for the backend
to handle.
llvm-svn: 249733
that we can build up an accurate set of features rather than relying on
TargetInfo initialization via handleTargetFeatures to munge the list
of features.
llvm-svn: 249732
Enums without an explicit, fixed, underlying type are implicitly given a
fixed 'int' type for ABI compatibility with MSVC. However, we can
enforce the standard-mandated rules on these types as-if we didn't know
this fact if the tag is not part of a definition.
llvm-svn: 249667
These test updates almost exclusively around the change in behavior
around enum: enums without a definition are considered incomplete except
when targeting MSVC ABIs. Since these tests are interested in the
'incomplete-enum' behavior, restrict them to %itanium_abi_triple.
llvm-svn: 249660
No ABI for C++ currently makes it possible to implement the standard
100% perfectly. We wrongly hid some of our compatible behavior behind
-fms-compatibility instead of tying it to the compiler ABI.
llvm-svn: 249656
With this change, most 'g' options are rejected by CompilerInvocation.
They remain only as Driver options. The new way to request debug info
from cc1 is with "-debug-info-kind={line-tables-only|limited|standalone}"
and "-dwarf-version={2|3|4}". In the absence of a command-line option
to specify Dwarf version, the Toolchain decides it, rather than placing
Toolchain-specific logic in CompilerInvocation.
Also fix a bug in the Windows compatibility argument parsing
in which the "rightmost argument wins" principle failed.
Differential Revision: http://reviews.llvm.org/D13221
llvm-svn: 249655
Currently FastISel doesn't know how to select vector bitcasts.
During instruction selection, fast-isel always falls back to SelectionDAG
every time it encounters a vector bitcast.
As a consequence of this, all the 'packed vector shift by immedate count'
test cases in avx2-builtins.c are optimized by the DAGCombiner.
In particular, the DAGCombiner would always fold trivial stack loads of
constant shift counts into the operands of packed shift builtins.
This behavior would start changing as soon as I reapply revision 249121.
That revision would teach x86 fast-isel how to select bitcasts between vector
types of the same size.
As a consequence of that change, fast-isel would less often fall back to
SelectionDAG. More importantly, DAGCombiner would no longer be able to
simplify the code by folding the stack reload of a constant.
No functional change.
llvm-svn: 249142
test that our intrinsics behave the same under -fsigned-char and
-funsigned-char.
This further testing uncovered that AVX-2 has a broken cmpgt for 8-bit
elements, and has for a long time. This is fixed in the same way as
SSE4 handles the case.
The other ISA extensions currently work correctly because they use
specific instruction intrinsics. As soon as they are rewritten in terms
of generic IR, they will need to add these special casts. I've added the
necessary testing to catch this however, so we shouldn't have to chase
it down again.
I considered changing the core typedef to be signed, but that seems like
a bad idea. Notably, it would be an ABI break if anyone is reaching into
the innards of the intrinsic headers and passing __v16qi on an API
boundary. I can't be completely confident that this wouldn't happen due
to a macro expanding in a lambda, etc., so it seems much better to leave
it alone. It also matches GCC's behavior exactly.
A fun side note is that for both GCC and Clang, -funsigned-char really
does change the semantics of __v16qi. To observe this, consider:
% cat x.cc
#include <smmintrin.h>
#include <iostream>
int main() {
__v16qi a = { 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
__v16qi b = _mm_set1_epi8(-1);
std::cout << (int)(a / b)[0] << ", " << (int)(a / b)[1] << '\n';
}
% clang++ -o x x.cc && ./x
-1, 1
% clang++ -funsigned-char -o x x.cc && ./x
0, 1
However, while this may be surprising, both Clang and GCC agree.
Differential Revision: http://reviews.llvm.org/D13324
llvm-svn: 249097
recently when we started using direct conversion to model sign
extension. The __v16qi type we use for SSE v16i8 vectors is defined in
terms of 'char' which may or may not be signed! This causes us to
generate pmovsx and pmovzx depending on the setting of -funsigned-char.
This patch just forms an explicitly signed type and uses that to
formulate the sign extension. While this gets the correct behavior
(which we now verify with the enhanced test) this is just the tip of the
ice berg. Now that I know what to look for, I have found errors of this
sort *throughout* our vector code. Fortunately, this is the only
specific place where I know of users actively having their code
miscompiled by Clang due to this, so I'm keeping the fix for those users
minimal and targeted.
I'll be sending a proper email for discussion of how to fix these
systematically, what the implications are, and just how widely broken
this is... From what I can tell, we have never shipped a correct set of
builtin headers for x86 when users rely on -funsigned-char. Oops.
llvm-svn: 248980
Summary: __nvvm_atom_cas_* returns the old value instead of whether the swap succeeds.
Reviewers: eliben, tra
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D13306
llvm-svn: 248951
This is the clang commit associated with llvm r248887.
This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234],
vst[234]lane ARM neon intrinsics and associates an address space with the
pointer that these intrinsics take. This changes, e.g.,
<2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32)
to
<2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32)
This change ensures that address spaces are fully taken into account in the ARM
target during lowering of interleaved loads and stores.
Differential Revision: http://reviews.llvm.org/D13127
llvm-svn: 248888
This patch corresponds to review:
http://reviews.llvm.org/D13190
Implemented the following interfaces to conform to ELF V2 ABI version 1.1.
vector signed __int128 vec_adde (vector signed __int128, vector signed __int128, vector signed __int128);
vector unsigned __int128 vec_adde (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128);
vector signed __int128 vec_addec (vector signed __int128, vector signed __int128, vector signed __int128);
vector unsigned __int128 vec_addec (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128);
vector signed int vec_addc(vector signed int __a, vector signed int __b);
vector bool char vec_cmpge (vector signed char __a, vector signed char __b);
vector bool char vec_cmpge (vector unsigned char __a, vector unsigned char __b);
vector bool short vec_cmpge (vector signed short __a, vector signed short __b);
vector bool short vec_cmpge (vector unsigned short __a, vector unsigned short __b);
vector bool int vec_cmpge (vector signed int __a, vector signed int __b);
vector bool int vec_cmpge (vector unsigned int __a, vector unsigned int __b);
vector bool char vec_cmple (vector signed char __a, vector signed char __b);
vector bool char vec_cmple (vector unsigned char __a, vector unsigned char __b);
vector bool short vec_cmple (vector signed short __a, vector signed short __b);
vector bool short vec_cmple (vector unsigned short __a, vector unsigned short __b);
vector bool int vec_cmple (vector signed int __a, vector signed int __b);
vector bool int vec_cmple (vector unsigned int __a, vector unsigned int __b);
vector double vec_double (vector signed long long __a);
vector double vec_double (vector unsigned long long __a);
vector bool char vec_eqv(vector bool char __a, vector bool char __b);
vector bool short vec_eqv(vector bool short __a, vector bool short __b);
vector bool int vec_eqv(vector bool int __a, vector bool int __b);
vector bool long long vec_eqv(vector bool long long __a, vector bool long long __b);
vector signed short vec_madd(vector signed short __a, vector signed short __b, vector signed short __c);
vector signed short vec_madd(vector signed short __a, vector unsigned short __b, vector unsigned short __c);
vector signed short vec_madd(vector unsigned short __a, vector signed short __b, vector signed short __c);
vector unsigned short vec_madd(vector unsigned short __a, vector unsigned short __b, vector unsigned short __c);
vector bool long long vec_mergeh(vector bool long long __a, vector bool long long __b);
vector bool long long vec_mergel(vector bool long long __a, vector bool long long __b);
vector bool char vec_nand(vector bool char __a, vector bool char __b);
vector bool short vec_nand(vector bool short __a, vector bool short __b);
vector bool int vec_nand(vector bool int __a, vector bool int __b);
vector bool long long vec_nand(vector bool long long __a, vector bool long long __b);
vector bool char vec_orc(vector bool char __a, vector bool char __b);
vector bool short vec_orc(vector bool short __a, vector bool short __b);
vector bool int vec_orc(vector bool int __a, vector bool int __b);
vector bool long long vec_orc(vector bool long long __a, vector bool long long __b);
vector signed long long vec_sub(vector signed long long __a, vector signed long long __b);
vector signed long long vec_sub(vector bool long long __a, vector signed long long __b);
vector signed long long vec_sub(vector signed long long __a, vector bool long long __b);
vector unsigned long long vec_sub(vector unsigned long long __a, vector unsigned long long __b);
vector unsigned long long vec_sub(vector bool long long __a, vector unsigned long long __b);
vector unsigned long long vec_sub(vector unsigned long long __V2 ABI V1.1
http://ror float vec_sub(vector float __a, vector float __b);
unsigned char vec_extract(vector bool char __a, int __b);
signed short vec_extract(vector signed short __a, int __b);
unsigned short vec_extract(vector bool short __a, int __b);
signed int vec_extract(vector signed int __a, int __b);
unsigned int vec_extract(vector bool int __a, int __b);
signed long long vec_extract(vector signed long long __a, int __b);
unsigned long long vec_extract(vector unsigned long long __a, int __b);
unsigned long long vec_extract(vector bool long long __a, int __b);
double vec_extract(vector double __a, int __b);
vector bool char vec_insert(unsigned char __a, vector bool char __b, int __c);
vector signed short vec_insert(signed short __a, vector signed short __b, int __c);
vector bool short vec_insert(unsigned short __a, vector bool short __b, int __c);
vector signed int vec_insert(signed int __a, vector signed int __b, int __c);
vector bool int vec_insert(unsigned int __a, vector bool int __b, int __c);
vector signed long long vec_insert(signed long long __a, vector signed long long __b, int __c);
vector unsigned long long vec_insert(unsigned long long __a, vector unsigned long long __b, int __c);
vector bool long long vec_insert(unsigned long long __a, vector bool long long __b, int __c);
vector double vec_insert(double __a, vector double __b, int __c);
vector signed long long vec_splats(signed long long __a);
vector unsigned long long vec_splats(unsigned long long __a);
vector signed __int128 vec_splats(signed __int128 __a);
vector unsigned __int128 vec_splats(unsigned __int128 __a);
vector double vec_splats(double __a);
int vec_all_eq(vector double __a, vector double __b);
int vec_all_ge(vector double __a, vector double __b);
int vec_all_gt(vector double __a, vector double __b);
int vec_all_le(vector double __a, vector double __b);
int vec_all_lt(vector double __a, vector double __b);
int vec_all_nan(vector double __a);
int vec_all_ne(vector double __a, vector double __b);
int vec_all_nge(vector double __a, vector double __b);
int vec_all_ngt(vector double __a, vector double __b);
int vec_any_eq(vector double __a, vector double __b);
int vec_any_ge(vector double __a, vector double __b);
int vec_any_gt(vector double __a, vector double __b);
int vec_any_le(vector double __a, vector double __b);
int vec_any_lt(vector double __a, vector double __b);
int vec_any_ne(vector double __a, vector double __b);
vector unsigned char vec_sbox_be (vector unsigned char);
vector unsigned char vec_cipher_be (vector unsigned char, vector unsigned char);
vector unsigned char vec_cipherlast_be (vector unsigned char, vector unsigned char);
vector unsigned char vec_ncipher_be (vector unsigned char, vector unsigned char);
vector unsigned char vec_ncipherlast_be (vector unsigned char, vector unsigned char);
vector unsigned int vec_shasigma_be (vector unsigned int, const int, const int);
vector unsigned long long vec_shasigma_be (vector unsigned long long, const int, const int);
vector unsigned short vec_pmsum_be (vector unsigned char, vector unsigned char);
vector unsigned int vec_pmsum_be (vector unsigned short, vector unsigned short);
vector unsigned long long vec_pmsum_be (vector unsigned int, vector unsigned int);
vector unsigned __int128 vec_pmsum_be (vector unsigned long long, vector unsigned long long);
vector unsigned char vec_gb (vector unsigned char);
vector unsigned long long vec_bperm (vector unsigned __int128 __a, vector unsigned char __b);
Removed the folowing interfaces either because their signatures have changed
in version 1.1 of the ABI or because they were implemented for ELF V2 ABI but
have actually been deprecated in version 1.1.
vector signed char vec_eqv(vector bool char __a, vector signed char __b);
vector signed char vec_eqv(vector signed char __a, vector bool char __b);
vector unsigned char vec_eqv(vector bool char __a, vector unsigned char __b);
vector unsigned char vec_eqv(vector unsigned char __a, vector bool char __b);
vector signed short vec_eqv(vector bool short __a, vector signed short __b);
vector signed short vec_eqv(vector signed short __a, vector bool short __b);
vector unsigned short vec_eqv(vector bool short __a, vector unsigned short __b);
vector unsigned short vec_eqv(vector unsigned short __a, vector bool short __b);
vector signed int vec_eqv(vector bool int __a, vector signed int __b);
vector signed int vec_eqv(vector signed int __a, vector bool int __b);
vector unsigned int vec_eqv(vector bool int __a, vector unsigned int __b);
vector unsigned int vec_eqv(vector unsigned int __a, vector bool int __b);
vector signed long long vec_eqv(vector bool long long __a, vector signed long long __b);
vector signed long long vec_eqv(vector signed long long __a, vector bool long long __b);
vector unsigned long long vec_eqv(vector bool long long __a, vector unsigned long long __b);
vector unsigned long long vec_eqv(vector unsigned long long __a, vector bool long long __b);
vector float vec_eqv(vector bool int __a, vector float __b);
vector float vec_eqv(vector float __a, vector bool int __b);
vector double vec_eqv(vector bool long long __a, vector double __b);
vector double vec_eqv(vector double __a, vector bool long long __b);
vector unsigned short vec_nand(vector bool short __a, vector unsigned short __b);
llvm-svn: 248813
Currently it's 64-bit which will lead to mismatch between host and
device code if we compile for i386.
Differential Revision: http://reviews.llvm.org/D13181
llvm-svn: 248753
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in
a hand-rolled tricky condition block in lib/Basic/Targets.cpp, with a FIXME:
attached.
http://reviews.llvm.org/D12937 moved the handling of the DSP feature over to
ARMTargetParser.def in LLVM, to be in line with other architecture extensions.
This is the corresponding patch to clang, to clear the FIXME: and update
the tests.
Differential Revision: http://reviews.llvm.org/D12938
llvm-svn: 248521
Summary:
Strictly speaking, the MIPS*R2 ISA's should not permit -mnan=2008 since this
feature was added in MIPS*R3. However, other toolchains permit this and we
should do the same.
Reviewers: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D13057
llvm-svn: 248481
This commit fixes an assert that is triggered when optnone is being
added to an IR function that is already marked with minsize and optsize.
rdar://problem/22723716
Differential Revision: http://reviews.llvm.org/D13004
llvm-svn: 248191
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in
a hand-rolled tricky condition block in lib/Basic/Targets.cpp, with a FIXME:
attached.
http://reviews.llvm.org/D12937 moved the handling of +t2dsp over to
ARMTargetParser.def in LLVM, to be in line with other architecture extensions.
This is the corresponding patch to clang, to clear the FIXME: and update
the tests.
Differential Revision: http://reviews.llvm.org/D12938
llvm-svn: 248154
128-bit vector integer sign extensions correctly lower to the pmovsx instructions even for debug builds.
This patch removes the builtins and reimplements the _mm_cvtepi*_epi* intrinsics __using builtin_shufflevector (to extract the bottom most subvector) and __builtin_convertvector (to actually perform the sign extension).
Differential Revision: http://reviews.llvm.org/D12835
llvm-svn: 248092
Summary:
This change adds support for `__builtin_ms_va_list`, a GCC extension for
variadic `ms_abi` functions. The existing `__builtin_va_list` support is
inadequate for this because `va_list` is defined differently in the Win64
ABI vs. the System V/AMD64 ABI.
Depends on D1622.
Reviewers: rsmith, rnk, rjmccall
CC: cfe-commits
Differential Revision: http://reviews.llvm.org/D1623
llvm-svn: 247941
Mingw generally wraps an old copy of msvcrt.dll which has these
personalities, so things should work out, or so I hear. I haven't tested
it.
llvm-svn: 247902
convert i64 to FP and vice versa
reduceps & reducepd
rangeps & rangepd
all in their 512bit versions
Differential Revision: http://reviews.llvm.org/D11716
llvm-svn: 247881
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.
Libc++ relies on the compiler never emitting such undefined reference.
With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
-- or, depending on the linkage --
2b. A declaration of F.
The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).
This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.
This patch is based on ideas by Chandler Carruth and Richard Smith.
llvm-svn: 247494
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.
Libc++ relies on the compiler never emitting such undefined reference.
With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
-- or, depending on the linkage --
2b. A declaration of F.
The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).
This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.
This patch is based on ideas by Chandler Carruth and Richard Smith.
llvm-svn: 247465
-force-align-stack.
Also, make changes to the driver so that -mno-stack-realign is no longer
an option exposed to the end-user that disallows stack realignment in
the backend.
Differential Revision: http://reviews.llvm.org/D11815
llvm-svn: 247451
It seems that there is small bug, and we can't generate assume loads
when some virtual functions have internal visibiliy
This reverts commit 982bb7d966947812d216489b3c519c9825cacbf2.
llvm-svn: 247332
This flag causes the compiler to emit bit set entries for functions as well
as runtime bitset checks at indirect call sites. Depends on the new function
bitset mechanism.
Differential Revision: http://reviews.llvm.org/D11857
llvm-svn: 247238
Generating call assume(icmp %vtable, %global_vtable) after constructor
call for devirtualization purposes.
For more info go to:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html
Edit:
Fixed version because of PR24479.
After this patch got reverted because of ScalarEvolution bug (D12719)
Merged after John McCall big patch (Added Address).
http://reviews.llvm.org/D11859
llvm-svn: 247199
The tests in test/CodeGen/arm-target-features.c are currently
passing but warning messages are suppressed. These tests are now
synchronized with the corresponding changes in Target Parser.
This patch will fix the regressions in clang caused by r247136
Differential Revision: http://reviews.llvm.org/D12722
llvm-svn: 247138
Summary:
Currently clang provides no general way to generate nontemporal loads/stores.
There are some architecture specific builtins for doing so (e.g. in x86), but
there is no way to generate non-temporal store on, e.g. AArch64. This patch adds
generic builtins which are expanded to a simple store with '!nontemporal'
attribute in IR.
Differential Revision: http://reviews.llvm.org/D12313
llvm-svn: 247104
instruction used the ReturnValue as pointer operand or value operand. This
led to wrong code gen - in later stages (load-store elision code) the found
store and its operand would be erased, causing ReturnValue to become a <badref>.
The patch adds a check that makes sure that ReturnValue is a pointer operand of
store instruction. Regression test is also added.
This fixes PR24386.
Differential Revision: http://reviews.llvm.org/D12400
llvm-svn: 247003
Introduce an Address type to bundle a pointer value with an
alignment. Introduce APIs on CGBuilderTy to work with Address
values. Change core APIs on CGF/CGM to traffic in Address where
appropriate. Require alignments to be non-zero. Update a ton
of code to compute and propagate alignment information.
As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.
The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned. Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay. I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.
Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do a alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.
We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment. In particular,
field access now uses alignmentAtOffset instead of min.
Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs. For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint. That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.
ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments. In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments. That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.
I partially punted on applying this work to CGBuiltin. Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.
llvm-svn: 246985
Apparently there are many cast kinds that may cause implicit pointer
arithmetic to happen. In light of this, the cast ignoring logic
introduced in r246877 has been changed to only ignore a small set of
cast kinds, and a test for this behavior has been added.
Thanks to Richard for catching this before it became a bug report. :)
llvm-svn: 246890
Improvements:
- For all types, we would give up in a case such as:
__builtin_object_size((char*)&foo, N);
even if we could provide an answer to
__builtin_object_size(&foo, N);
We now provide the same answer for both of the above examples in all
cases.
- For type=1|3, we now support subobjects with unknown bases, as long
as the designator is valid.
Thanks to Richard Smith for the review + design planning.
Review: http://reviews.llvm.org/D12169
llvm-svn: 246877
This implements basic support for compiling (though not yet assembling
or linking) for a WebAssembly target. Note that ABI details are not yet
finalized, and may change.
Differential Revision: http://reviews.llvm.org/D12002
llvm-svn: 246814
The ACLE (ARM C Language Extensions) 2.0 allows the __fp16 type to be
used as a functon argument or return type (ACLE 1.1 did not).
The current public release of the AAPCS (2.09) states that __fp16 values
should be converted to single-precision before being passed or returned,
but AAPCS 2.10 (to be released shortly) changes this, so that they are
passed in the least-significant 16 bits of either a GPR (for base AAPCS)
or a single-precision register (for AAPCS-VFP). This does not change how
arguments are passed if they get passed on the stack.
This patch brings clang up to compliance with the latest versions of
both of these specs.
We can now set the __ARM_FP16_ARGS ACLE predefine, and we have always
been able to set the __ARM_FP16_FORMAT_IEEE predefine (we do not support
the alternative format).
llvm-svn: 246764
Original commit message:
[ARM] Allow passing/returning of __fp16 arguments
The ACLE (ARM C Language Extensions) 2.0 allows the __fp16 type to be
used as a functon argument or return type (ACLE 1.1 did not).
The current public release of the AAPCS (2.09) states that __fp16 values
should be converted to single-precision before being passed or returned,
but AAPCS 2.10 (to be released shortly) changes this, so that they are
passed in the least-significant 16 bits of either a GPR (for base AAPCS)
or a single-precision register (for AAPCS-VFP). This does not change how
arguments are passed if they get passed on the stack.
This patch brings clang up to compliance with the latest versions of
both of these specs.
We can now set the __ARM_FP16_ARGS ACLE predefine, and we have always
been able to set the __ARM_FP16_FORMAT_IEEE predefine (we do not support
the alternative format).
llvm-svn: 246760
The ACLE (ARM C Language Extensions) 2.0 allows the __fp16 type to be
used as a functon argument or return type (ACLE 1.1 did not).
The current public release of the AAPCS (2.09) states that __fp16 values
should be converted to single-precision before being passed or returned,
but AAPCS 2.10 (to be released shortly) changes this, so that they are
passed in the least-significant 16 bits of either a GPR (for base AAPCS)
or a single-precision register (for AAPCS-VFP). This does not change how
arguments are passed if they get passed on the stack.
This patch brings clang up to compliance with the latest versions of
both of these specs.
We can now set the __ARM_FP16_ARGS ACLE predefine, and we have always
been able to set the __ARM_FP16_FORMAT_IEEE predefine (we do not support
the alternative format).
llvm-svn: 246755
This patch depends on r246688 (D12341).
The goal is to make LLVM generate different code for these functions for a target that
has cheap branches (see PR23827 for more details):
int foo();
int normal(int x, int y, int z) {
if (x != 0 && y != 0) return foo();
return 1;
}
int crazy(int x, int y) {
if (__builtin_unpredictable(x != 0 && y != 0)) return foo();
return 1;
}
Differential Revision: http://reviews.llvm.org/D12458
llvm-svn: 246699
GCC 4.8+ has a PowerPC-specific intrinsic, __builtin_ppc_get_timebase, to do
what Clang's __builtin_readcyclecounter does. For compatibility with code that
uses GCC's spelling (including glibc), support it as well.
Partially fixes PR23681.
llvm-svn: 246510
In release builds labels are numbers. Matching just the number may result
in false matches where the label is contained in other numbers, such as
14 inside [114 x i8]. A stricter match requiring start of line or > character
before the label avoids these false matches.
llvm-svn: 246385
Without this, 64-byte vector types (__m512), specified to be 64-byte
aligned in the AVX512 draft SysV ABI, will only be 32-byte aligned.
This is analoguous to AVX, for which we accept 32-byte max alignment.
Differential Revision: http://reviews.llvm.org/D10724
llvm-svn: 246230
There's no point in using a larger alignment if we have no instructions
that would benefit from it.
Differential Revision: http://reviews.llvm.org/D12389
llvm-svn: 246229
A couple of changes here:
a) Do less work in the case where we don't have a target attribute on the
function. We've already canonicalized the attributes for the function -
no need to do more work.
b) Use the newer canonicalized feature adding functions from TargetInfo
to do the work when we do have a target attribute. This enables us to diagnose
some warnings in the case of conflicting written attributes (only ppc does
this today) and also make sure to get all of the features for a cpu that's
listed rather than just change the cpu.
Updated all testcases accordingly and added a new testcase to verify that we'll
error out on ppc if we have some incompatible options using the existing diagnosis
framework there.
llvm-svn: 246195
We agreed for r245605 that, as long as we don't affect -O0 codegen
too much, it's OK to use native constructs rather than intrinsics.
Let's test that, starting with AVX2 here.
See PR24580.
Differential Revision: http://reviews.llvm.org/D12212
llvm-svn: 245987
As discussed in PR23648 - the intrinsics _m_from_int, _m_to_int and _m_prefetch are defined in mmintrin.h and prfchwintrin.h so we don't need to in Intrin.h
Added tests for _m_from_int and _m_to_int
D11338 already added a test for _m_prefetch
Differential Revision: http://reviews.llvm.org/D12272
llvm-svn: 245975
_rotl, _rotwl and _lrotl (and their right-shift counterparts) are official x86
intrinsics, and should be supported regardless of environment. This is in contrast
to _rotl8, _rotl16, and _rotl64 which are MS-specific.
Note that the MS documentation for _lrotl is different from the Intel
documentation. Intel explicitly documents it as a 64-bit rotate, while for MS,
since sizeof(unsigned long) for MSVC is always 4, a 32-bit rotate is implied.
Differential Revision: http://reviews.llvm.org/D12271
llvm-svn: 245923
This is important in the case that the LLVM-inferred llvm-struct
alignment is not the same as the clang-known C-struct alignment.
Differential Revision: http://reviews.llvm.org/D12243
llvm-svn: 245719
This lets us optimize them better. We agreed to remove the intrinsics,
instead of combining them later, as, at -O0, we generate the expected
instructions. Plus, it's a nice cleanup.
Differential Revision: http://reviews.llvm.org/D10556
llvm-svn: 245605
alignment is ignored, and they always allocate a complete
storage unit.
Also, change the dumping of AST record layouts: use the more
readable C++-style dumping even in C, include bitfield offset
information in the dump, and don't print sizeof/alignof
information for fields of record type, since we don't do so
for bases or other kinds of field.
rdar://22275433
llvm-svn: 245514
__builtin_object_size would return incorrect answers for many uses where
type=3. This fixes the inaccuracy by making us emit 0 instead of LLVM's
objectsize intrinsic.
Additionally, there are many cases where we would emit suboptimal (but
correct) answers, such as when arrays are involved. This patch fixes
some of these cases (please see new tests in test/CodeGen/object-size.c
for specifics on which cases are improved)
Resubmit of r245323 with PR24493 fixed.
Patch mostly by Richard Smith.
Differential Revision: http://reviews.llvm.org/D12000
This fixes PR15212.
llvm-svn: 245403
__builtin_object_size would return incorrect answers for many uses where
type=3. This fixes the inaccuracy by making us emit 0 instead of LLVM's
objectsize intrinsic.
Additionally, there are many cases where we would emit suboptimal (but
correct) answers, such as when arrays are involved. This patch fixes
some of these cases (please see new tests in test/CodeGen/object-size.c
for specifics on which cases are improved)
Patch mostly by Richard Smith.
Differential Revision: http://reviews.llvm.org/D12000
This fixes PR15212.
llvm-svn: 245323
The fix for this is in LLVM but it depends on how clang handles the alias
attribute, so add a test to the clang tests to make sure everything works
together as expected.
Differential Revision: http://reviews.llvm.org/D11980
llvm-svn: 244756
Summary:
float_cast_overflow is the only UBSan check without a source location attached.
This patch propagates SourceLocations where necessary to get them to the
EmitCheck() call.
Reviewers: rsmith, ABataev, rjmccall
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D11757
llvm-svn: 244568
Summary:
NaCl is a platform where long double is the same as double.
Its mangling is spelled with "long double" but its ABI lowering is the same
as double.
Reviewers: rnk, chh
Subscribers: jfb, cfe-commits, dschuff
Differential Revision: http://reviews.llvm.org/D11922
llvm-svn: 244541
A test was recently (r244468) added to cover long double calling convention
codegen, distinguishing between Android and GNU conventions (where long doubles
are fp128 and x86_fp80, respectively). Native Client is a target where long
doubles are the same as doubles. This change augments the test to cover
that case.
Also rename the test to test/codeGen/X86_64-longdouble.c
Differential Revision: http://reviews.llvm.org/D11921
llvm-svn: 244524
When clang is built with -DLLVM_ENABLE_ASSERTIONS=Off,
it does not create names for IR values.
Differential Revision: http://reviews.llvm.org/D11437
llvm-svn: 244502
These changes are for Android x86_64 targets to be compatible
with current Android g++ and conform to AMD64 ABI.
https://llvm.org/bugs/show_bug.cgi?id=23897
* Return type of long double (fp128) should be fp128, not x86_fp80.
* Vararg of long double (fp128) could be in register and overflowed to memory.
https://llvm.org/bugs/show_bug.cgi?id=24111
* Return value of long double (fp128) _Complex should be in memory like a structure of {fp128,fp128}.
Differential Revision: http://reviews.llvm.org/D11437
llvm-svn: 244468
Function types without prototypes can arise when mangling a function type
within an overloadable function in C. We mangle these as the absence of
any parameter types (not even an empty parameter list).
Differential Revision: http://reviews.llvm.org/D11848
llvm-svn: 244374
Summary:
By default, 'clang' emits dwarf and 'clang-cl' emits codeview. You can
force emission of one or both by passing -gcodeview and -gdwarf to
either driver.
Reviewers: dblaikie, hans
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D11742
llvm-svn: 244097
Support for emitting libcalls for __atomic_fetch_nand and
__atomic_{add,sub,and,or,xor,nand}_fetch was missing; add it, and some
test cases.
Differential Revision: http://reviews.llvm.org/D10847
llvm-svn: 244063
Update testcases after LLVM change r243774.
Most of these had no need to check `tag:` field, but did so as a way of
getting to the `name:` field. In a few cases I've converted the `tag:`
checks to `arg:` or `CHECK-NOT: arg:`.
llvm-svn: 243775
This patch adds support for the System Z vector built-in functions.
The API-defined header file has the name vecintrin.h.
The user-level functions are defined in the same style as the clang
version of altivec.h, making heavy use of the __overloadable__ and
__always_inline__ attributes. Where possible the functions expand to
generic operations rather than specific built-in functions, in the hope
that that form can be optimised better.
Where a built-in routine is specified to require an immediate integer
argument, the __enable_if__ attribute is used to verify the argument is
in fact constant and in the appropriate range.
Based on a patch by Richard Sandiford.
llvm-svn: 243643
The z13 vector facility has an associated language extension,
closely modeled on AltiVec/VSX. The main differences are:
- vector long, vector float and vector pixel are not supported
- vector long long and vector double are supported (like VSX)
- comparison operators return a vector rather than a scalar integer
- shift operators behave like the OpenCL shift operators
- vector bool is only supported as argument to certain operators;
some operators allow mixing a bool with a non-bool vector
This patch adds clang support for the extension. It is closely modelled
on the AltiVec support. Similarly to the -faltivec option, there's a
new -fzvector option to enable the extensions (as well as an -mzvector
alias for compatibility with GCC). There's also a separate LangOpt.
The extension as implemented here is intended to be compatible with
the -mzvector extension recently implemented by GCC.
Based on a patch by Richard Sandiford.
Differential Revision: http://reviews.llvm.org/D11001
llvm-svn: 243642
This will be used for old targets like Android that do not
support ELF TLS models.
Differential Revision: http://reviews.llvm.org/D10524
llvm-svn: 243441
The 3DNOW/PRFCHW cpu targets define both the PREFETCHW (set cache line modified) and PREFETCH (set cache line exclusive) instructions but only the _m_prefetchw (PREFETCHW) intrinsic is included in the header. This patch adds the missing _m_prefetch intrinsic.
I'm basing this off AMD documentation - the intel docs on the support for PREFETCHW isn't clear whether Silvermont/Broadwell properly support PREFETCH but given that the intrinsic implementation is a default __builtin_prefetch call, it is safe whatever.
Fix for PR23648
Differential Revision: http://reviews.llvm.org/D11338
llvm-svn: 243305
__builtin_frame_address requires its argument to be a constant
expression which already implies that it cannot have undefined behavior.
However, we used EmitScalarExpr to emit the argument causing UBSan to
try to check for overflow.
Instead, use the constant expression emission system.
This fixes PR24256.
llvm-svn: 243206
This is the PS4 counterpart to r229376, which quotes the library name if the
name contains space. It was discovered that if a library name contains both
double-quote and space characters, quoting the name might produce unexpected
results, but we are mostly concerned with a Windows host environment, which
does not allow double-quote or slashes in file/folder names.
Differential Revision: http://reviews.llvm.org/D11275
llvm-svn: 242689
Currently, -save-temp will cause ObjCARC optimization to be dropped,
sanitizer pass to run early in the pipeline, and profiling
instrumentation to run twice.
Fix the issue by properly disable all passes in the optimization
pipeline when generating bitcode output and parse some of the Language
Options even when the input is bitcode so the passes can be setup
correctly.
llvm-svn: 242565
"-arm-use-movt=0".
This change is needed since backend options do not make it to the backend
when doing LTO and are not capable of changing the behavior of code-gen
passes on a per-function basis.
rdar://problem/21529937
Differential Revision: http://reviews.llvm.org/D11025
llvm-svn: 242368
Revision 224297 modified the behavior of vec_sld for little endian so
that LLVM will generate the correct corresponding vsldoi instruction.
I neglected to update the existing tests, which continued to pass
because they were not specific enough. This patch adds enough
specificity to the tests to make them useful for BE and LE testing of
vec_sld.
llvm-svn: 242313
This patch corresponds to review:
http://reviews.llvm.org/D11184
A number of new interfaces for altivec.h (as mandated by the ABI):
vector float vec_cpsgn(vector float, vector float)
vector double vec_cpsgn(vector double, vector double)
vector double vec_or(vector bool long long, vector double)
vector double vec_or(vector double, vector bool long long)
vector double vec_re(vector double)
vector signed char vec_cntlz(vector signed char)
vector unsigned char vec_cntlz(vector unsigned char)
vector short vec_cntlz(vector short)
vector unsigned short vec_cntlz(vector unsigned short)
vector int vec_cntlz(vector int)
vector unsigned int vec_cntlz(vector unsigned int)
vector signed long long vec_cntlz(vector signed long long)
vector unsigned long long vec_cntlz(vector unsigned long long)
vector signed char vec_nand(vector bool signed char, vector signed char)
vector signed char vec_nand(vector signed char, vector bool signed char)
vector signed char vec_nand(vector signed char, vector signed char)
vector unsigned char vec_nand(vector bool unsigned char, vector unsigned char)
vector unsigned char vec_nand(vector unsigned char, vector bool unsigned char)
vector unsigned char vec_nand(vector unsigned char, vector unsigned char)
vector short vec_nand(vector bool short, vector short)
vector short vec_nand(vector short, vector bool short)
vector short vec_nand(vector short, vector short)
vector unsigned short vec_nand(vector bool unsigned short, vector unsigned short)
vector unsigned short vec_nand(vector unsigned short, vector bool unsigned short)
vector unsigned short vec_nand(vector unsigned short, vector unsigned short)
vector int vec_nand(vector bool int, vector int)
vector int vec_nand(vector int, vector bool int)
vector int vec_nand(vector int, vector int)
vector unsigned int vec_nand(vector bool unsigned int, vector unsigned int)
vector unsigned int vec_nand(vector unsigned int, vector bool unsigned int)
vector unsigned int vec_nand(vector unsigned int, vector unsigned int)
vector signed long long vec_nand(vector bool long long, vector signed long long)
vector signed long long vec_nand(vector signed long long, vector bool long long)
vector signed long long vec_nand(vector signed long long, vector signed long long)
vector unsigned long long vec_nand(vector bool long long, vector unsigned long long)
vector unsigned long long vec_nand(vector unsigned long long, vector bool long long)
vector unsigned long long vec_nand(vector unsigned long long, vector unsigned long long)
vector signed char vec_orc(vector bool signed char, vector signed char)
vector signed char vec_orc(vector signed char, vector bool signed char)
vector signed char vec_orc(vector signed char, vector signed char)
vector unsigned char vec_orc(vector bool unsigned char, vector unsigned char)
vector unsigned char vec_orc(vector unsigned char, vector bool unsigned char)
vector unsigned char vec_orc(vector unsigned char, vector unsigned char)
vector short vec_orc(vector bool short, vector short)
vector short vec_orc(vector short, vector bool short)
vector short vec_orc(vector short, vector short)
vector unsigned short vec_orc(vector bool unsigned short, vector unsigned short)
vector unsigned short vec_orc(vector unsigned short, vector bool unsigned short)
vector unsigned short vec_orc(vector unsigned short, vector unsigned short)
vector int vec_orc(vector bool int, vector int)
vector int vec_orc(vector int, vector bool int)
vector int vec_orc(vector int, vector int)
vector unsigned int vec_orc(vector bool unsigned int, vector unsigned int)
vector unsigned int vec_orc(vector unsigned int, vector bool unsigned int)
vector unsigned int vec_orc(vector unsigned int, vector unsigned int)
vector signed long long vec_orc(vector bool long long, vector signed long long)
vector signed long long vec_orc(vector signed long long, vector bool long long)
vector signed long long vec_orc(vector signed long long, vector signed long long)
vector unsigned long long vec_orc(vector bool long long, vector unsigned long long)
vector unsigned long long vec_orc(vector unsigned long long, vector bool long long)
vector unsigned long long vec_orc(vector unsigned long long, vector unsigned long long)
vector signed char vec_div(vector signed char, vector signed char)
vector unsigned char vec_div(vector unsigned char, vector unsigned char)
vector signed short vec_div(vector signed short, vector signed short)
vector unsigned short vec_div(vector unsigned short, vector unsigned short)
vector signed int vec_div(vector signed int, vector signed int)
vector unsigned int vec_div(vector unsigned int, vector unsigned int)
vector signed long long vec_div(vector signed long long, vector signed long long)
vector unsigned long long vec_div(vector unsigned long long, vector unsigned long long)
vector unsigned char vec_mul(vector unsigned char, vector unsigned char)
vector unsigned int vec_mul(vector unsigned int, vector unsigned int)
vector unsigned long long vec_mul(vector unsigned long long, vector unsigned long long)
vector unsigned short vec_mul(vector unsigned short, vector unsigned short)
vector signed char vec_mul(vector signed char, vector signed char)
vector signed int vec_mul(vector signed int, vector signed int)
vector signed long long vec_mul(vector signed long long, vector signed long long)
vector signed short vec_mul(vector signed short, vector signed short)
vector signed long long vec_mergeh(vector signed long long, vector signed long long)
vector signed long long vec_mergeh(vector signed long long, vector bool long long)
vector signed long long vec_mergeh(vector bool long long, vector signed long long)
vector unsigned long long vec_mergeh(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_mergeh(vector unsigned long long, vector bool long long)
vector unsigned long long vec_mergeh(vector bool long long, vector unsigned long long)
vector double vec_mergeh(vector double, vector double)
vector double vec_mergeh(vector double, vector bool long long)
vector double vec_mergeh(vector bool long long, vector double)
vector signed long long vec_mergel(vector signed long long, vector signed long long)
vector signed long long vec_mergel(vector signed long long, vector bool long long)
vector signed long long vec_mergel(vector bool long long, vector signed long long)
vector unsigned long long vec_mergel(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_mergel(vector unsigned long long, vector bool long long)
vector unsigned long long vec_mergel(vector bool long long, vector unsigned long long)
vector double vec_mergel(vector double, vector double)
vector double vec_mergel(vector double, vector bool long long)
vector double vec_mergel(vector bool long long, vector double)
vector signed int vec_pack(vector signed long long, vector signed long long)
vector unsigned int vec_pack(vector unsigned long long, vector unsigned long long)
vector bool int vec_pack(vector bool long long, vector bool long long)
llvm-svn: 242171
add 2 bit to ObjCOrBuiltinID (changed from 11bits to 13bits), see discussion in
Add new intrinsics support that already covered by the BE.
All the intrinsics are covered by tests
Differential Revision: http://reviews.llvm.org/D10893
llvm-svn: 242144
Previously, clang/llvm treated inline-asm instructions conservatively,
choosing not to eliminate the instructions or hoisting them out of a loop
even when it was safe to do so. This commit makes changes to attach a
readonly or readnone attribute to an inline-asm instruction, which enables
passes such as LICM and EarlyCSE to move or optimize away the instruction.
rdar://problem/11358192
Differential Revision: http://reviews.llvm.org/D10546
llvm-svn: 241930
tools/clang/test/CodeGen/packed-nest-unpacked.c contains this test:
struct XBitfield {
unsigned b1 : 10;
unsigned b2 : 12;
unsigned b3 : 10;
};
struct YBitfield {
char x;
struct XBitfield y;
} __attribute((packed));
struct YBitfield gbitfield;
unsigned test7() {
// CHECK: @test7
// CHECK: load i32, i32* getelementptr inbounds (%struct.YBitfield, %struct.YBitfield* @gbitfield, i32 0, i32 1, i32 0), align 4
return gbitfield.y.b2;
}
The "align 4" is actually wrong. Accessing all of "gbitfield.y" as a single
i32 is of course possible, but that still doesn't make it 4-byte aligned as
it remains packed at offset 1 in the surrounding gbitfield object.
This alignment was changed by commit r169489, which also introduced changes
to bitfield access code in CGExpr.cpp. Code before that change used to take
into account *both* the alignment of the field to be accessed within the
current struct, *and* the alignment of that outer struct itself; this logic
was removed by the above commit.
Neglecting to consider both values can cause incorrect code to be generated
(I've seen an unaligned access crash on SystemZ due to this bug).
In order to always use the best known alignment value, this patch removes
the CGBitFieldInfo::StorageAlignment member and replaces it with a
StorageOffset member specifying the offset from the start of the surrounding
struct to the bitfield's underlying storage. This offset can then be combined
with the best-known alignment for a bitfield access lvalue to determine the
alignment to use when accessing the bitfield's storage.
Differential Revision: http://reviews.llvm.org/D11034
llvm-svn: 241916
This patch corresponds to review:
http://reviews.llvm.org/D10972
Fix for the handling of dependent features that are enabled by default
on some CPU's (such as -mvsx, -mpower8-vector).
Also provides a number of new interfaces or fixes existing ones in
altivec.h.
Changed signatures to conform to ABI:
vector short vec_perm(vector signed short, vector signed short, vector unsigned char)
vector int vec_perm(vector signed int, vector signed int, vector unsigned char)
vector long long vec_perm(vector signed long long, vector signed long long, vector unsigned char)
vector signed char vec_sld(vector signed char, vector signed char, const int)
vector unsigned char vec_sld(vector unsigned char, vector unsigned char, const int)
vector bool char vec_sld(vector bool char, vector bool char, const int)
vector unsigned short vec_sld(vector unsigned short, vector unsigned short, const int)
vector signed short vec_sld(vector signed short, vector signed short, const int)
vector signed int vec_sld(vector signed int, vector signed int, const int)
vector unsigned int vec_sld(vector unsigned int, vector unsigned int, const int)
vector float vec_sld(vector float, vector float, const int)
vector signed char vec_splat(vector signed char, const int)
vector unsigned char vec_splat(vector unsigned char, const int)
vector bool char vec_splat(vector bool char, const int)
vector signed short vec_splat(vector signed short, const int)
vector unsigned short vec_splat(vector unsigned short, const int)
vector bool short vec_splat(vector bool short, const int)
vector pixel vec_splat(vector pixel, const int)
vector signed int vec_splat(vector signed int, const int)
vector unsigned int vec_splat(vector unsigned int, const int)
vector bool int vec_splat(vector bool int, const int)
vector float vec_splat(vector float, const int)
Added a VSX path to:
vector float vec_round(vector float)
Added interfaces:
vector signed char vec_eqv(vector signed char, vector signed char)
vector signed char vec_eqv(vector bool char, vector signed char)
vector signed char vec_eqv(vector signed char, vector bool char)
vector unsigned char vec_eqv(vector unsigned char, vector unsigned char)
vector unsigned char vec_eqv(vector bool char, vector unsigned char)
vector unsigned char vec_eqv(vector unsigned char, vector bool char)
vector signed short vec_eqv(vector signed short, vector signed short)
vector signed short vec_eqv(vector bool short, vector signed short)
vector signed short vec_eqv(vector signed short, vector bool short)
vector unsigned short vec_eqv(vector unsigned short, vector unsigned short)
vector unsigned short vec_eqv(vector bool short, vector unsigned short)
vector unsigned short vec_eqv(vector unsigned short, vector bool short)
vector signed int vec_eqv(vector signed int, vector signed int)
vector signed int vec_eqv(vector bool int, vector signed int)
vector signed int vec_eqv(vector signed int, vector bool int)
vector unsigned int vec_eqv(vector unsigned int, vector unsigned int)
vector unsigned int vec_eqv(vector bool int, vector unsigned int)
vector unsigned int vec_eqv(vector unsigned int, vector bool int)
vector signed long long vec_eqv(vector signed long long, vector signed long long)
vector signed long long vec_eqv(vector bool long long, vector signed long long)
vector signed long long vec_eqv(vector signed long long, vector bool long long)
vector unsigned long long vec_eqv(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_eqv(vector bool long long, vector unsigned long long)
vector unsigned long long vec_eqv(vector unsigned long long, vector bool long long)
vector float vec_eqv(vector float, vector float)
vector float vec_eqv(vector bool int, vector float)
vector float vec_eqv(vector float, vector bool int)
vector double vec_eqv(vector double, vector double)
vector double vec_eqv(vector bool long long, vector double)
vector double vec_eqv(vector double, vector bool long long)
vector bool long long vec_perm(vector bool long long, vector bool long long, vector unsigned char)
vector double vec_round(vector double)
vector double vec_splat(vector double, const int)
vector bool long long vec_splat(vector bool long long, const int)
vector signed long long vec_splat(vector signed long long, const int)
vector unsigned long long vec_splat(vector unsigned long long,
vector bool int vec_sld(vector bool int, vector bool int, const int)
vector bool short vec_sld(vector bool short, vector bool short, const int)
llvm-svn: 241904
Code in CGCall.cpp that loads up function arguments that need to be
coerced to a different type may in some cases ignore the fact that
the source of the argument is not naturally aligned. This may cause
incorrect code to be generated. In some places in CreateCoercedLoad,
we already have setAlignment calls to address this, but I ran into one
where it was missing, causing wrong code generation on SystemZ.
However, in that location, we do not actually know what alignment of
the source location we can rely on; the callers do not pass anything
to this routine. This is already an issue in other places in
CreateCoercedLoad; and the same problem exists for CreateCoercedStore.
To avoid pessimising code, and to fix the FIXMEs already in place,
this patch also adds an alignment argument to the CreateCoerced*
routines and uses it instead of forcing an alignment of 1. The
callers are changed to pass in the best information they have.
This actually requires changes in a number of existing test cases
since we now get better alignment in many places.
Differential Revision: http://reviews.llvm.org/D11033
llvm-svn: 241898
Move the diagnostic back to codegen so that we can compile ATL on the
self-host bot. We don't actually end up emitting code for the __try, so
the diagnostic won't be hit.
llvm-svn: 241761
This patch adds ObjectFilePCHContainerOperations uses the LLVM backend
to put the contents of a PCH into a __clangast section inside a COFF, ELF,
or Mach-O object file container.
This is done to facilitate module debugging by makeing it possible to
store the debug info for the types defined by a module alongside the AST.
rdar://problem/20091852
llvm-svn: 241620
"-arm-long-calls".
This change allows using -mlong-calls/-mno-long-calls for LTO and enabling or
disabling long call on a per-function basis.
rdar://problem/21529937
Differential Revision: http://reviews.llvm.org/D9414
llvm-svn: 241565
different function signatures. (Previously clang would emit all block
pointer types with the type of the first block pointer in the compile
unit.)
rdar://problem/21602473
llvm-svn: 241534
This reverts commit r241244, but restricts SEH support to Win64.
This way, Chromium builds will still fall back on TUs with SEH, and
Clang developers can work on this incrementally upstream while patching
this small predicate locally. It'll also make it easier to review small
fixes.
llvm-svn: 241533
The patch is the same except for the addition of a new test for the
issue that required reverting the dependent llvm commit.
--Original Commit Message--
Pass down the -flto option to the -cc1 job, and from there into the
CodeGenOptions and onto the PassManagerBuilder. This enables gating
the new EliminateAvailableExternally module pass on whether we are
preparing for LTO.
If we are preparing for LTO (e.g. a -flto -c compile), the new pass is not
included as we want to preserve available externally functions for possible
link time inlining.
llvm-svn: 241467
instructions introduced in POWER8.
These are the Clang-related changes for http://reviews.llvm.org/D10704
All builtins are added in altivec.h and guarded with the POWER8_VECTOR macro.
Phabricator review: http://reviews.llvm.org/D10736
llvm-svn: 241293
32-bit finally funclets are intended to be called both directly from the
parent function and indirectly from the EH runtime. Because we aren't
contorting LLVM's X86 prologue to match MSVC's, calling the finally
block directly passes in a different value of EBP than the one that the
runtime provides. We need an adapter thunk to adjust EBP to the expected
value. However, WinEHPrepare already has to solve this problem when
cleanups are not pre-outlined, so we can go ahead and rely on it rather
than duplicating work.
Now we only do the llvm.x86.seh.recoverfp dance for 32-bit SEH filter
functions.
llvm-svn: 241187
This re-lands r236052 and adds support for __exception_code().
In 32-bit SEH, the exception code is not available in eax. It is only
available in the filter function, and now we arrange to load it and
store it into an escaped variable in the parent frame.
As a consequence, we have to disable the "catch i8* null" optimization
on 32-bit and always generate a filter function. We can re-enable the
optimization if we detect an __except block that doesn't use the
exception code, but this probably isn't worth optimizing.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D10852
llvm-svn: 241171
This reinstates part of the hack removed in r233223, by special
casing sse4 as part of the feature additions. The notable change
here is that we consider it only as part of setting the SSE level
and not as part of the actual target features set which handles
setting the rest of the masks.
llvm-svn: 241130
using a string map to canonicalize. Fix up a couple of testcases
that needed changing since we are no longer simply appending features
to the list, but all of their mask dependencies as well.
llvm-svn: 241129
Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64)
These were previously declared in Intrin.h for MSVC compatibility, but now
that we have them implemented, these declarations can be removed.
llvm-svn: 241053
isTriviallyRecursive is a hack used to bridge a gap between the
expectations that source code assumes and the semantics that LLVM IR can
provide. Specifically, asm labels on functions are treated as an
explicit name for a GlobalObject in Clang but treated like an
output-processing step in GCC. Tweak this hack a little further to emit
calls to library functions instead of emitting an incorrect definition.
The definition in question would have available_externally linkage (this
is OK) but result in a call to itself which will either result in an
infinite loop or stack overflow.
This fixes PR23964.
llvm-svn: 241043
This matches the implementation of the gcc support for the same
feature, including checking the values set up by libgcc at runtime.
The structure looks like this:
unsigned int __cpu_vendor;
unsigned int __cpu_type;
unsigned int __cpu_subtype;
unsigned int __cpu_features[1];
with a set of enums to match various fields that are field out after
parsing the output of the cpuid instruction.
This also adds a set of errors checking for valid input (and cpu).
compiler-rt support for this and the other builtins in this family
(__builtin_cpu_init and __builtin_cpu_is) are forthcoming.
llvm-svn: 240994
We failed to see that we should have deferred the creation of a type
which references a type currently under construction because of atomic
sugar.
This fixes PR23985.
llvm-svn: 240989
Several tests wouldn't pass when executed on an armv7a_pc_linux triple
due to the non-default arm_aapcs calling convention produced on the
function definitions in the IR output. Account for this with the
application of a little regex.
Patch by Ying Yi.
llvm-svn: 240971
This patch corresponds to review:
http://reviews.llvm.org/D10637
This is the first round of additions of missing builtins listed in the ABI document. More to come (this builds onto what seurer already addes). This patch adds:
vector signed long long vec_abs(vector signed long long)
vector double vec_abs(vector double)
vector signed long long vec_add(vector signed long long, vector signed long long)
vector unsigned long long vec_add(vector unsigned long long, vector unsigned long long)
vector double vec_add(vector double, vector double)
vector double vec_and(vector bool long long, vector double)
vector double vec_and(vector double, vector bool long long)
vector double vec_and(vector double, vector double)
vector signed long long vec_and(vector signed long long, vector signed long long)
vector double vec_andc(vector bool long long, vector double)
vector double vec_andc(vector double, vector bool long long)
vector double vec_andc(vector double, vector double)
vector signed long long vec_andc(vector signed long long, vector signed long long)
vector double vec_ceil(vector double)
vector bool long long vec_cmpeq(vector double, vector double)
vector bool long long vec_cmpge(vector double, vector double)
vector bool long long vec_cmpge(vector signed long long, vector signed long long)
vector bool long long vec_cmpge(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmpgt(vector double, vector double)
vector bool long long vec_cmple(vector double, vector double)
vector bool long long vec_cmple(vector signed long long, vector signed long long)
vector bool long long vec_cmple(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmplt(vector double, vector double)
vector bool long long vec_cmplt(vector signed long long, vector signed long long)
vector bool long long vec_cmplt(vector unsigned long long, vector unsigned long long)
llvm-svn: 240821
Attribute 'nodebug' means no llvm.dbg.* intrinsics, no !dbg
annotations, and no DISubprogram for the function.
Differential Revision: http://reviews.llvm.org/D10747
llvm-svn: 240747
Integer variants are implemented as atomicrmw or cmpxchg instructions.
Atomic add for floating point (__nvvm_atom_add_gen_f()) is implemented
as a call to an overloaded @llvm.nvvm.atomic.load.add.f32.* LVVM
intrinsic.
Differential Revision: http://reviews.llvm.org/D10666
llvm-svn: 240669
The Microsoft-extension _MoveToCoprocessor and _MoveToCoprocessor2
builtins take the register value to be moved as the first argument,
but the corresponding mcr and mcr2 LLVM intrinsics expect that value
to be the third argument. Handle this as a special case, while still
leaving those intrinsics as generic MSBuiltins. I considered the
alternative of handling these in EmitARMBuiltinExpr, but that does
not work well for the follow-up change that I'm going to make to improve
the error handling for PR22560 -- we need the GetBuiltinType() checks
for ICEArguments, and the ARM version of that code is only used for
Neon intrinsics where the last argument is special and not
checked in the normal way.
llvm-svn: 240462
As specified in the SysV AVX512 ABI drafts. It follows the same scheme
as AVX2:
Arguments of type __m512 are split into eight eightbyte chunks.
The least significant one belongs to class SSE and all the others
to class SSEUP.
This also means we change the OpenMP SIMD default alignment on AVX512.
Based on r240337.
Differential Revision: http://reviews.llvm.org/D9894
llvm-svn: 240338
This patch adds initial support for the -fsanitize=kernel-address flag to Clang.
Right now it's quite restricted: only out-of-line instrumentation is supported, globals are not instrumented, some GCC kasan flags are not supported.
Using this patch I am able to build and boot the KASan tree with LLVMLinux patches from github.com/ramosian-glider/kasan/tree/kasan_llvmlinux.
To disable KASan instrumentation for a certain function attribute((no_sanitize("kernel-address"))) can be used.
llvm-svn: 240131
Base type of attribute((mode)) can actually be a vector type.
The patch is to distinguish between base type and base element type.
This fixes http://llvm.org/PR17453.
Differential Revision: http://reviews.llvm.org/D10058
llvm-svn: 240125
This flag controls whether a given sanitizer traps upon detecting
an error. It currently only supports UBSan. The existing flag
-fsanitize-undefined-trap-on-error has been made an alias of
-fsanitize-trap=undefined.
This change also cleans up some awkward behavior around the combination
of -fsanitize-trap=undefined and -fsanitize=undefined. Previously we
would reject command lines containing the combination of these two flags,
as -fsanitize=vptr is not compatible with trapping. This required the
creation of -fsanitize=undefined-trap, which excluded -fsanitize=vptr
(and -fsanitize=function, but this seems like an oversight).
Now, -fsanitize=undefined is an alias for -fsanitize=undefined-trap,
and if -fsanitize-trap=undefined is specified, we treat -fsanitize=vptr
as an "unsupported" flag, which means that we error out if the flag is
specified explicitly, but implicitly disable it if the flag was implied
by -fsanitize=undefined.
Differential Revision: http://reviews.llvm.org/D10464
llvm-svn: 240105
This patch adds the -fsanitize=safe-stack command line argument for clang,
which enables the Safe Stack protection (see http://reviews.llvm.org/D6094
for the detailed description of the Safe Stack).
This patch is our implementation of the safe stack on top of Clang. The
patches make the following changes:
- Add -fsanitize=safe-stack and -fno-sanitize=safe-stack options to clang
to control safe stack usage (the safe stack is disabled by default).
- Add __attribute__((no_sanitize("safe-stack"))) attribute to clang that can be
used to disable the safe stack for individual functions even when enabled
globally.
Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.
Differential Revision: http://reviews.llvm.org/D6095
llvm-svn: 239762
in section 10.1, __arm_{w,r}sr{,p,64}.
This includes arm_acle.h definitions with builtins and codegen to support
these, the intrinsics are implemented by generating read/write_register calls
which get appropriately lowered in the backend based on the register string
provided. SemaChecking is also implemented to fault invalid parameters.
Differential Revision: http://reviews.llvm.org/D9697
llvm-svn: 239737
Instead, just EvaluateAsInt().
Follow-up to r239549: rsmith points out that isICE() is expensive;
seems like it's not the right concept anyway, as it fails on
`static const' in C, and will actually trigger the assert below on:
test/Sema/inline-asm-validate-x86.c
llvm-svn: 239651
Summary:
In addition to easier syntax, IRBuilder makes sure to set correct
debug locations for newly added instructions (bitcast and
llvm.lifetime itself). This restores the original behavior, which
was modified by r234581 (reapplied as r235553).
Extend one of the tests to check for debug locations.
Test Plan: regression test suite
Reviewers: aadg, dblaikie
Subscribers: cfe-commits, majnemer
Differential Revision: http://reviews.llvm.org/D10418
llvm-svn: 239643
Right now we're ignoring the fpmath attribute since there's no
backend support for a feature like this and to do so would require
checking the validity of the strings and doing general subtarget
feature parsing of valid and invalid features with the target
attribute feature.
llvm-svn: 239582
Modeled after the gcc attribute of the same name, this feature
allows source level annotations to correspond to backend code
generation. In llvm particular parlance, this allows the adding
of subtarget features and changing the cpu for a particular function
based on source level hints.
This has been added into the existing support for function level
attributes without particular verification for any target outside
of whether or not the backend will support the features/cpu given
(similar to section, etc).
llvm-svn: 239579
For inline assembly immediate constraints, we currently always use
EmitScalarExpr, instead of directly emitting the constant. When the
overflow sanitizer is enabled, this generates overflow intrinsics
instead of constants.
Instead, emit a constant for constraints that either require an
immediate (e.g. 'I' on X86), or only accepts constants (immediate
or symbolic; i.e., don't accept registers or memory).
Fixes PR19763.
Differential Revision: http://reviews.llvm.org/D10255
llvm-svn: 239549
This patch corresponds to review:
http://reviews.llvm.org/D10095
This is for just two instructions and related builtins:
vbpermq
vgbbd
llvm-svn: 239506