Currently block is translated to a structure equivalent to
struct Block {
void *isa;
int flags;
int reserved;
void *invoke;
void *descriptor;
};
Except invoke, which is the pointer to the block invoke function,
all other fields are useless for OpenCL, which clutter the IR and
also waste memory since the block struct is passed to the block
invoke function as argument.
On the other hand, the size and alignment of the block struct is
not stored in the struct, which causes difficulty to implement
__enqueue_kernel as library function, since the library function
needs to know the size and alignment of the argument which needs
to be passed to the kernel.
This patch removes the useless fields from the block struct and adds
size and align fields. The equivalent block struct will become
struct Block {
int size;
int align;
generic void *invoke;
/* custom fields */
};
It also changes the pointer to the invoke function to be
a generic pointer since the address space of a function
may not be private on certain targets.
Differential Revision: https://reviews.llvm.org/D37822
llvm-svn: 314932
Allow the proper recognition of Enum values and global variables inside ms inline-asm memory / immediate expressions, as they require some additional overhead and treated incorrect if doesn't early recognized.
supersedes D33278, D35774
Differential Revision: https://reviews.llvm.org/D37413
llvm-svn: 314494
Allow the proper recognition of Enum values and global variables inside ms inline-asm memory / immediate expressions, as they require some additional overhead and treated incorrect if doesn't early recognized.
supersedes D33277, D35775
Corrsponds with D37412, D37413
llvm-svn: 314300
This patch fixes clang to decorate reference accesses as pointers
and not as "omnipotent chars".
Differential Revision: https://reviews.llvm.org/D38074
llvm-svn: 314209
Summary:
This is the follow-up patch to D37924.
This change refactors clang to use the the newly added section headers
in SpecialCaseList to specify which sanitizers blacklists entries
should apply to, like so:
[cfi-vcall]
fun:*bad_vcall*
[cfi-derived-cast|cfi-unrelated-cast]
fun:*bad_cast*
The SanitizerSpecialCaseList class has been added to allow querying by
SanitizerMask, and SanitizerBlacklist and its downstream users have been
updated to provide that information. Old blacklists not using sections
will continue to function identically since the blacklist entries will
be placed into a '[*]' section by default matching against all
sanitizers.
Reviewers: pcc, kcc, eugenis, vsk
Reviewed By: eugenis
Subscribers: dberris, cfe-commits, mgorny
Differential Revision: https://reviews.llvm.org/D37925
llvm-svn: 314171
Clang regression tests that depend on the optimizer can break
when there are changes to LLVM...as in:
https://reviews.llvm.org/rL314117
llvm-svn: 314144
This commit fixes a bug in the handling of storage-only __fp16 vectors
where clang didn't promote __fp16 vector operands to float vectors.
Conceptually, it performs the following transformation on the AST in
CreateBuiltinBinOp and CreateBuiltinUnaryOp:
(Before)
typedef __fp16 half4 __attribute__ ((vector_size (8)));
typedef float float4 __attribute__ ((vector_size (16)));
half4 hv0, hv1, hv2, hv3;
hv0 = hv1 + hv2 + hv3;
(After)
float4 t0 = (float4)hv1 + (float4)hv2;
float4 t1 = t0 + (float4)hv3;
hv0 = (half4)t1;
Note that this commit fixes the bug for targets that set
HalfArgsAndReturns to true (ARM and ARM64). Targets using intrinsics
such as llvm.convert.to.fp16 to handle __fp16 are still broken.
rdar://problem/20625184
Differential Revision: https://reviews.llvm.org/D32520
llvm-svn: 314056
body of global block invoke functions.
This commit fixes an infinite loop in IRGen that occurs when compiling
the following code:
void FUNC2() {
static void (^const block1)(int) = ^(int a){
if (a--)
block1(a);
};
}
This is how IRGen gets stuck in the infinite loop:
1. GenerateBlockFunction is called to emit the body of "block1".
2. GetAddrOfGlobalBlock is called to get the address of "block1". The
function calls getAddrOfGlobalBlockIfEmitted to check whether the
global block has been emitted. If it hasn't been emitted, it then
tries to emit the body of the block function by calling
GenerateBlockFunction, which goes back to step 1.
This commit prevents the inifinite loop by building the global block in
GenerateBlockFunction before emitting the body of the block function.
rdar://problem/34541684
Differential Revision: https://reviews.llvm.org/D38118
llvm-svn: 314029
The recently behavior in the code that these tests were meant to be checking will be ammended as soon as a suitable change can be properly reviewed.
llvm-svn: 313684
Summary:
Restore the `__builtin_wasm_rethrow` builtin deleted in D37931. On second
thought, it appears it can be used to implement `__cxa_rethrow`.
Reviewers: dschuff, sunfish
Reviewed By: dschuff
Subscribers: jfb, sbc100, jgravelle-google
Differential Revision: https://reviews.llvm.org/D37942
llvm-svn: 313430
This patch replaces the perm2f128 intrinsics with native shuffle vectors.
This uses a pretty simple approach to allocate source 0 to the lower half input and source 1 to the upper half input. Then its just a matter of filling in the indices to use either the lower or upper half of that specific source. This can result in the same source being used by both operands. InstCombine or SelectionDAGBuilder should be able to clean that up.
Differential Revision: https://reviews.llvm.org/D37892
llvm-svn: 313418
Summary:
Remove `__builtin_wasm_rethrow` builtin. I thought it was required to implement
`__cxa_rethrow` function in libcxxabi, but it turned out it will be using
`__builtin_wasm_throw` instead.
Reviewers: dschuff, jgravelle-google
Reviewed By: jgravelle-google
Subscribers: jfb, sbc100, jgravelle-google
Differential Revision: https://reviews.llvm.org/D37931
llvm-svn: 313402
The __builtin_ia32_pbroadcastq512_mem_mask we were previously trying to use in 32-bit mode is not implemented in the x86 backend and causes isel to fail in release builds. In debug builds it fails even earlier during legalization with an llvm_unreachable.
While there add the missing test case for this intrinsic for this for 64-bit mode.
This fixes PR34631. D37668 should be able to recover this for 32-bit mode soon. But I wanted to fix the crash ahead of that.
llvm-svn: 313392
Summary:
As the attributed statements are considered simple statements no
stoppoint was generated before emitting attributed do/while/for/range-
statement. This lead to faulty debug locations.
Reviewers: echristo, aaron.ballman, dblaikie
Reviewed By: dblaikie
Subscribers: bjope, aprantl, cfe-commits
Differential Revision: https://reviews.llvm.org/D37428
llvm-svn: 312623
Based off the Intel Intrinsics guide, we should expect a void const* argument.
Prevents 'passing 'const void *' to parameter of type 'void *' discards qualifiers' warnings.
Differential Revision: https://reviews.llvm.org/D37449
llvm-svn: 312523
Because it is common to treat vector types as an array of their elements, or
even some other type that's not the element type, and thus index into them, we
can't use struct-path TBAA for these accesses. Even though we already treat all
vector types as equivalent to 'char', we were using field-offset information
for them with TBAA, and this renders undefined the intra-value indexing we
intend to allow. Note that, although 'char' is universally aliasing, with path
TBAA, we can still differentiate between access to s.a and s.b in
struct { char a, b; } s;. We can't use this capability as-is for vector types.
Fixes PR33967.
llvm-svn: 312447
Tests fail on ARM targets due to ABI name between define and void. Added reg ex to skip.
Patch by Glenn Howe (and expanded on by Douglas Yung)!
Differential Revision: https://reviews.llvm.org/D33410
llvm-svn: 312181
This patch implements the broadcastf32x2/broadcasti32x2 intrinsics using __builtin_shufflevector.
Differential Revision: https://reviews.llvm.org/D37287
llvm-svn: 312135
Summary:
An implementation of ubsan runtime library suitable for use in production.
Minimal attack surface.
* No stack traces.
* Definitely no C++ demangling.
* No UBSAN_OPTIONS=log_file=/path (very suid-unfriendly). And no UBSAN_OPTIONS in general.
* as simple as possible
Minimal CPU and RAM overhead.
* Source locations unnecessary in the presence of (split) debug info.
* Values and types (as in A+B overflows T) can be reconstructed from register/stack dumps, once you know what type of error you are looking at.
* above two items save 3% binary size.
When UBSan is used with -ftrap-function=abort, sometimes it is hard to reason about failures. This library replaces abort with a slightly more informative message without much extra overhead. Since ubsan interface in not stable, this code must reside in compiler-rt.
Reviewers: pcc, kcc
Subscribers: srhines, mgorny, aprantl, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D36810
llvm-svn: 312029
This adds builtin_cpu_init which will emit a call to cpu_indicator_init in libgcc or compiler-rt.
This is needed to support builtin_cpu_supports/builtin_cpu_is in an ifunc resolver.
Differential Revision: https://reviews.llvm.org/D36336
llvm-svn: 311874
Summary: With accurate sample profile, we can do more aggressive size optimization. For some size-critical application, this can reduce the text size by 20%
Reviewers: davidxl, rsmith
Reviewed By: davidxl, rsmith
Subscribers: mehdi_amini, eraman, sanjoy, cfe-commits
Differential Revision: https://reviews.llvm.org/D37091
llvm-svn: 311707
This patch is intended to enable the use of basic double letter constraints used in GCC extended inline asm {Yi Y2 Yz Y0 Ym Yt}.
Supersedes D35205
llvm counterpart: D36369
Differential Revision: https://reviews.llvm.org/D36371
llvm-svn: 311643