match the feature set of the function that they're being called from.
This ensures that we can effectively diagnose some[1] code that would
instead ICE in the backend with a failure to select message.
Example:
__m128d foo(__m128d a, __m128d b) {
return __builtin_ia32_addsubps(b, a);
}
compiled for normal x86_64 via:
clang -target x86_64-linux-gnu -c
would fail to compile in the back end because the normal subtarget
features for x86_64 only include sse2 and the builtin requires sse3.
[1] We're still not erroring on:
__m128i bar(__m128i const *p) { return _mm_lddqu_si128(p); }
where we should fail and error on an always_inline function being
inlined into a function that doesn't support the subtarget features
required.
llvm-svn: 250473
This recommits r250398 with fixes to the tests for bot failures.
Add "-target x86_64-unknown-linux" to the clang invocations that
check for the gold plugin.
llvm-svn: 250455
Rolling this back for now since there are a couple of bot failures on
the new tests I added, and I won't have a chance to look at them in detail
until later this afternoon. I think the new tests need some restrictions on
having the gold plugin available.
This reverts commit r250398.
llvm-svn: 250402
Summary:
Add clang support for -flto=thin option, which is used to set the
EmitFunctionSummary code gen option on compiles.
Add -flto=full as an alias to the existing -flto.
Add tests to check for proper overriding of -flto variants on the
command line, and convert grep tests to FileCheck.
Reviewers: dexonsmith, joker.eph
Subscribers: davidxl, cfe-commits
Differential Revision: http://reviews.llvm.org/D11908
llvm-svn: 250398
Add support for the `-fdebug-prefix-map=` option as in GCC. The syntax is
`-fdebug-prefix-map=OLD=NEW`. When compiling files from a path beginning with
OLD, change the debug info to indicate the path as start with NEW. This is
particularly helpful if you are preprocessing in one path and compiling in
another (e.g. for a build cluster with distcc).
Note that the linearity of the implementation is not as terrible as it may seem.
This is normally done once per file with an expectation that the map will be
small (1-2) entries, making this roughly linear in the number of input paths.
Addresses PR24619.
llvm-svn: 250094
CGBlocks.cpp.
This commit fixes a bug in clang's code-gen where it creates the
following functions but doesn't attach function attributes to them:
__copy_helper_block_
__destroy_helper_block_
__Block_byref_object_copy_
__Block_byref_object_dispose_
rdar://problem/20828324
Differential Revision: http://reviews.llvm.org/D13525
llvm-svn: 249735
early.
This is needed in a patch I plan to commit later, in which a null Decl
pointer is passed to SetLLVMFunctionAttributesForDefinition.
Relevant discussion is in http://reviews.llvm.org/D13525.
llvm-svn: 249722
No ABI for C++ currently makes it possible to implement the standard
100% perfectly. We wrongly hid some of our compatible behavior behind
-fms-compatibility instead of tying it to the compiler ABI.
llvm-svn: 249656
The backend restores the stack pointer after recovering from an
exception. This is similar to r245879, but it doesn't try to use the
normal cleanup mechanism, so hopefully it won't cause the same breakage.
llvm-svn: 249640
We don't have a good place to put them. Our previous spot was causing us
to optimize loads from the exception object to undef, because it was
after the catchpad instruction that models the write to the catch
object.
llvm-svn: 249616
Currently codegen crashes trying to emit casting to bool &. It happens because bool type is converted to i1 and later then lvalue for reference is converted to i1*. But when codegen tries to load this lvalue it crashes trying to load value from this i1*.
Differential Revision: http://reviews.llvm.org/D13325
llvm-svn: 249534
include/clang/CodeGenABITypes.h is in meant to be included by external
users, but using a unique_ptr on the private CodeGenModule introduces a
dependency on the type definition that prevents such a use.
NFC
llvm-svn: 249328
OpenCL v1.1 s6.2.2: for the boolean value true, every bit in the result vector should be set.
This change treats the i1 value as signed for the purposes of performing the cast to integer,
and therefore sign extend into the result.
Patch by Neil Hickey!
http://reviews.llvm.org/D13349
llvm-svn: 249301
I randomly came across this difference between AArch64 and other targets:
on the latter, we don't emit nil checks for known non-nil class method
calls thanks to r247350, but we still do for AArch64 stret calls.
They use different code paths, because those are special, as they go
through the regular msgSend, not the msgSend*_stret variants.
llvm-svn: 249205
Ensure that the vptr store in the most-derived constructor is not behind
an invariant group barrier. Previously, the base-most vptr store would
be the one behind no barrier, and that could result in the creator of
the object thinking it had the base-most vtable.
This bug caused clang call pure virtual functions when called from
constructor body.
http://reviews.llvm.org/D13373
llvm-svn: 249197
This patch implements the outlining for offloading functions for code
annotated with the OpenMP target directive. It uses a temporary naming
of the outlined functions that will have to be updated later on once
target side codegen and registration of offloading libraries is
implemented - the naming needs to be made unique in the produced
library.
llvm-svn: 249148
This reverts commit r248982 as it was breaking the ARM buildbots and the fix didn't work.
This reverts commit r248984, the fix that didn't work.
llvm-svn: 249005
Summary: __nvvm_atom_cas_* returns the old value instead of whether the swap succeeds.
Reviewers: eliben, tra
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D13306
llvm-svn: 248951
- Remove virtual SC_OpenCLWorkGroupLocal storage type specifier
as it conflicts with static local variables now and prevents
diagnosing static local address space variables correctly.
- Allow static local and global variables (OpenCL2.0 s6.8 and s6.5.1).
- Improve diagnostics of allowed ASes for variables in different scopes:
(i) Global or static local variables have to be in global
or constant ASes (OpenCL1.2 s6.5, OpenCL2.0 s6.5.1);
(ii) Non-kernel function variables can't be declared in local
or constant ASes (OpenCL1.1 s6.5.2 and s6.5.3).
http://reviews.llvm.org/D13105
llvm-svn: 248906
This is the clang commit associated with llvm r248887.
This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234],
vst[234]lane ARM neon intrinsics and associates an address space with the
pointer that these intrinsics take. This changes, e.g.,
<2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32)
to
<2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32)
This change ensures that address spaces are fully taken into account in the ARM
target during lowering of interleaved loads and stores.
Differential Revision: http://reviews.llvm.org/D13127
llvm-svn: 248888
Description.
If the simd clause is specified, the ordered regions encountered by any thread will use only a single SIMD lane to execute the ordered regions in the order of the loop iterations.
Restrictions.
An ordered construct with the simd clause is the only OpenMP construct that can appear in the simd region.
An ordered directive with ‘simd’ clause is generated as an outlined function and corresponding function call to prevent this part of code from vectorization later in backend.
llvm-svn: 248772
Parsing and sema analysis for 'simd' clause in 'ordered' directive.
Description
If the simd clause is specified, the ordered regions encountered by any thread will use only a single SIMD lane to execute the ordered
regions in the order of the loop iterations.
Restrictions
An ordered construct with the simd clause is the only OpenMP construct that can appear in the simd region
llvm-svn: 248696
OpenMP 4.1 extends format of '#pragma omp ordered'. It adds 3 additional clauses: 'threads', 'simd' and 'depend'.
If no clause is specified, the ordered construct behaves as if the threads clause had been specified. If the threads clause is specified, the threads in the team executing the loop region execute ordered regions sequentially in the order of the loop iterations.
The loop region to which an ordered region without any clause or with a threads clause binds must have an ordered clause without the parameter specified on the corresponding loop directive.
llvm-svn: 248569
when building a module. Clang already records the module signature when
building a skeleton CU to reference a clang module.
Matching the id in the skeleton with the one in the module allows a DWARF
consumer to verify that they found the correct version of the module
without them needing to know about the clang module format.
llvm-svn: 248345
* adds -aux-triple option to specify target triple
* propagates aux target info to AST context and Preprocessor
* pulls in target specific preprocessor macros.
* pulls in target-specific builtins from aux target.
* sets appropriate host or device attribute on builtins.
Differential Revision: http://reviews.llvm.org/D12917
llvm-svn: 248299
This commit fixes an assert that is triggered when optnone is being
added to an IR function that is already marked with minsize and optsize.
rdar://problem/22723716
Differential Revision: http://reviews.llvm.org/D13004
llvm-svn: 248191
The signature may not have been computed at the time the module reference
is generated (e.g.: in the future while emitting debug info for a clang
module). Using the full module name is safe because each clang module may
only have a single definition.
NFC.
llvm-svn: 248037
Summary:
This change adds support for `__builtin_ms_va_list`, a GCC extension for
variadic `ms_abi` functions. The existing `__builtin_va_list` support is
inadequate for this because `va_list` is defined differently in the Win64
ABI vs. the System V/AMD64 ABI.
Depends on D1622.
Reviewers: rsmith, rnk, rjmccall
CC: cfe-commits
Differential Revision: http://reviews.llvm.org/D1623
llvm-svn: 247941
Mingw generally wraps an old copy of msvcrt.dll which has these
personalities, so things should work out, or so I hear. I haven't tested
it.
llvm-svn: 247902
This avoids building a fake LLVM IR global variable just to ferry an i32
down into LLVM codegen. It also puts a nail in the coffin of using MS
ABI C++ EH with landingpads, since now we'll assert in the lpad code
when flags are present.
llvm-svn: 247843
ptr in dtor.
Summary:
After destruction, invocation of virtual functions prevented
by poisoning vtable pointer.
Reviewers: eugenis, kcc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D12712
Fixed testing callback emission order to account for vptr.
Poison vtable in either complete or base dtor, depending on
if virtual bases exist. If virtual bases exist, poison in
complete dtor. Otherwise, poison in base.
Remove commented-out block.
llvm-svn: 247762
It is dangerous to do LTO on code with strict-vtable-pointers, because
one module has invariant.group.barriers, and the other one not.
In the future I want to just strip all invariant.group metadata from
vptrs loads/stores and get rid of invariant.group.barrier calls.
http://reviews.llvm.org/D12580
llvm-svn: 247724
Patch improves codegen for OpenMP constructs. If the OpenMP region does not have internal 'cancel' construct, a call to 'void __kmpc_barrier()' runtime function is generated for all implicit/explicit barriers. If the region has inner 'cancel' directive, then
```
if (__kmpc_cancel_barrier())
exit from outer construct;
```
code is generated.
Also, the code for 'canellation point' directive is not generated if parent directive does not have 'cancel' directive.
llvm-svn: 247681
These are a few cleanups I happened to have from trying to go in a
different direction recently, so just flushing them out while I have
them.
llvm-svn: 247593
This was the wrong direction to take anyway (because ultimately the
GlobalValue needed the pointee type again and /it/ used
PointerType::getElementType eventually anyway)... let's go a different way.
This reverts commit r236161.
llvm-svn: 247586
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.
Libc++ relies on the compiler never emitting such undefined reference.
With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
-- or, depending on the linkage --
2b. A declaration of F.
The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).
This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.
This patch is based on ideas by Chandler Carruth and Richard Smith.
llvm-svn: 247494
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.
Libc++ relies on the compiler never emitting such undefined reference.
With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
-- or, depending on the linkage --
2b. A declaration of F.
The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).
This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.
This patch is based on ideas by Chandler Carruth and Richard Smith.
llvm-svn: 247465
-force-align-stack.
Also, make changes to the driver so that -mno-stack-realign is no longer
an option exposed to the end-user that disallows stack realignment in
the backend.
Differential Revision: http://reviews.llvm.org/D11815
llvm-svn: 247451
clang modules, if -dwarf-ext-refs (DebugTypesExtRefs) is specified.
This reimplements r247369 in about a third of the amount of code.
Thanks to David Blaikie pointing this out in post-commit review!
llvm-svn: 247432
When uses of personality functions were moved from LandingPadInst to
Function, we forgot to update SimplifyPersonality(). This patch corrects
that.
Note: SimplifyPersonality() is an optimization which replaces
personality functions with the default C++ personality when possible.
Without this update, some ObjC++ projects fail to link against C++
libraries (seeing as the exception ABI had effectively changed).
rdar://problem/22155434
llvm-svn: 247421
The type of a member pointer is incomplete if it has no inheritance
model. This lets us reuse more general logic already embedded in clang.
llvm-svn: 247346
It seems that there is small bug, and we can't generate assume loads
when some virtual functions have internal visibiliy
This reverts commit 982bb7d966947812d216489b3c519c9825cacbf2.
llvm-svn: 247332
Currently all variables used in OpenMP regions are captured into a record and passed to outlined functions in this record. It may result in some poor performance because of too complex analysis later in optimization passes. Patch makes to emit outlined functions for parallel-based regions with a list of captured variables. It reduces code for 2*n GEPs, stores and loads at least.
Codegen for task-based regions remains unchanged because runtime requires that all captured variables are passed in captured record.
llvm-svn: 247251
This flag causes the compiler to emit bit set entries for functions as well
as runtime bitset checks at indirect call sites. Depends on the new function
bitset mechanism.
Differential Revision: http://reviews.llvm.org/D11857
llvm-svn: 247238
Seems it broke the Polly build.
From http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/11687/steps/compile/logs/stdio:
In file included from /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/lib/TableGen/Record.cpp:14:0:
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:369:3: error: looser throw specifier for 'virtual llvm::TypedInit::~TypedInit()'
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:270:11: error: overriding 'virtual llvm::Init::~Init() noexcept (true)'
llvm-svn: 247222
Generating call assume(icmp %vtable, %global_vtable) after constructor
call for devirtualization purposes.
For more info go to:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html
Edit:
Fixed version because of PR24479.
After this patch got reverted because of ScalarEvolution bug (D12719)
Merged after John McCall big patch (Added Address).
http://reviews.llvm.org/D11859
llvm-svn: 247199
We know that a reference can always be dereferenced. However, we don't
always know the number of bytes if the reference's pointee type is
incomplete. This case was correctly handled but we didn't consider the
case where the type is complete but we cannot calculate its size for ABI
specific reasons. In this specific case, a member pointer's size is
available only under certain conditions.
This fixes PR24703.
llvm-svn: 247188
Summary:
Currently clang provides no general way to generate nontemporal loads/stores.
There are some architecture specific builtins for doing so (e.g. in x86), but
there is no way to generate non-temporal store on, e.g. AArch64. This patch adds
generic builtins which are expanded to a simple store with '!nontemporal'
attribute in IR.
Differential Revision: http://reviews.llvm.org/D12313
llvm-svn: 247104
This function can be used to create a metadata identifier for a specific
type. No functionality change, but this will be used by D11857 and D12026.
Differential Revision: http://reviews.llvm.org/D12038
llvm-svn: 247098
When -fmodule-format is set to "obj", emit debug info for all types
declared in a module or referenced by a declaration into the module's
object file container.
This patch adds support for Objective-C types and methods.
llvm-svn: 247068
When -fmodule-format is set to "obj", emit debug info for all types
declared in a module or referenced by a declaration into the module's
object file container.
This patch adds support for C and C++ types.
llvm-svn: 247049
instruction used the ReturnValue as pointer operand or value operand. This
led to wrong code gen - in later stages (load-store elision code) the found
store and its operand would be erased, causing ReturnValue to become a <badref>.
The patch adds a check that makes sure that ReturnValue is a pointer operand of
store instruction. Regression test is also added.
This fixes PR24386.
Differential Revision: http://reviews.llvm.org/D12400
llvm-svn: 247003
separately from building the instruction so that it's
preserved even in -Asserts builds.
Employ C++'s mystical "comment" feature to discourage
breaking this in the future.
llvm-svn: 246991
Introduce an Address type to bundle a pointer value with an
alignment. Introduce APIs on CGBuilderTy to work with Address
values. Change core APIs on CGF/CGM to traffic in Address where
appropriate. Require alignments to be non-zero. Update a ton
of code to compute and propagate alignment information.
As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.
The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned. Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay. I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.
Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do a alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.
We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment. In particular,
field access now uses alignmentAtOffset instead of min.
Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs. For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint. That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.
ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments. In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments. That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.
I partially punted on applying this work to CGBuiltin. Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.
llvm-svn: 246985
We were crashing in CodeGen given input like this:
int self_alias(void) __attribute__((weak, alias("self_alias")));
such a self-alias is invalid, but instead of diagnosing the situation, we'd
proceed to produce IR for both the function declaration and the alias. Because
we already had a function named 'self_alias', the alias could not be named the
same thing, and so LLVM would pick a different name ('self_alias1' for example)
for that value. When we later called CodeGenModule::checkAliases, we'd look up
the IR value corresponding to the alias name, find the function declaration
instead, and then assert in a cast to llvm::GlobalAlias. The easiest way to prevent
this is simply to avoid creating the wrongly-named alias value in the first
place and issue the diagnostic there (instead of in checkAliases). We detect a
related cycle case in CodeGenModule::EmitAliasDefinition already, so this just
adds a second such check.
Even though the other test cases for this 'alias definition is part of a cycle'
diagnostic are in test/Sema/attr-alias-elf.c, I've added a separate regression
test for this case. This is because I can't add this check to
test/Sema/attr-alias-elf.c without disturbing the other test cases in that
file. In order to avoid construction of the bad IR values, this diagnostic
is emitted from within CodeGenModule::EmitAliasDefinition (and the relevant
declaration is not added to the Aliases vector). The other cycle checks are
done within the CodeGenModule::checkAliases function based on the Aliases
vector, called from CodeGenModule::Release. However, if there have been errors
earlier, HandleTranslationUnit does not call Release, and so checkAliases is
never called, and so none of the other diagnostics would be produced.
Fixes PR23509.
llvm-svn: 246882
This fixes an issue raised in D12412, where we generated invalid IR.
Thanks to Vedant Kumar for coming up with the initial work around.
Differential Revision: http://reviews.llvm.org/D12412
llvm-svn: 246880
Fix processing of shared variables with reference types in OpenMP constructs. Previously, if the variable was not marked in one of the private clauses, the reference to this variable was emitted incorrectly and caused an assertion later.
llvm-svn: 246846
Summary:
Dtor sanitization handled amidst other dtor cleanups,
between cleaning bases and fields. Sanitizer call pushed onto
stack of cleanup operations.
Reviewers: eugenis, kcc
Differential Revision: http://reviews.llvm.org/D12022
Refactoring dtor sanitizing emission order.
- Support multiple inheritance by poisoning after
member destructors are invoked, and before base
class destructors are invoked.
- Poison for virtual destructor and virtual bases.
- Repress dtor aliasing when sanitizing in dtor.
- CFE test for dtor aliasing, and repression of aliasing in dtor
code generation.
- Poison members on field-by-field basis, with collective poisoning
of trivial members when possible.
- Check msan flags and existence of fields, before dtor sanitizing,
and when determining if aliasing is allowed.
- Testing sanitizing bit fields.
llvm-svn: 246815
This implements basic support for compiling (though not yet assembling
or linking) for a WebAssembly target. Note that ABI details are not yet
finalized, and may change.
Differential Revision: http://reviews.llvm.org/D12002
llvm-svn: 246814