Remove the restriction which forbade forming pointers to member
functions which had parameter types or return types which were not
convertible.
llvm-svn: 239499
CodeGenOptions and onto the PassManagerBuilder. This enables gating
the new EliminateAvailableExternally module pass on whether we are
preparing for LTO.
If we are preparing for LTO (e.g. a -flto -c compile), the new pass is not
included as we want to preserve available externally functions for possible
link time inlining.
llvm-svn: 239481
Based on previous discussion on the mailing list, clang currently lacks support
for C99 partial re-initialization behavior:
Reference: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-April/029188.html
Reference: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_253.htm
This patch attempts to fix this problem.
Given the following code snippet,
struct P1 { char x[6]; };
struct LP1 { struct P1 p1; };
struct LP1 l = { .p1 = { "foo" }, .p1.x[2] = 'x' };
// this example is adapted from the example for "struct fred x[]" in DR-253;
// currently clang produces in l: { "\0\0x" },
// whereas gcc 4.8 produces { "fox" };
// with this fix, clang will also produce: { "fox" };
Differential Review: http://reviews.llvm.org/D5789
llvm-svn: 239446
This commit adds back the code that seems to have been dropped unintentionally
in r176985.
rdar://problem/13752163
Differential Revision: http://reviews.llvm.org/D10100
llvm-svn: 239426
On ARM/AArch64, we currently always use EmitScalarExpr for the immediate
builtin arguments, instead of directly emitting the constant. When the
overflow sanitizer is enabled, this generates overflow intrinsics
instead of constants, breaking assumptions in various places.
Instead, use the knowledge of "immediates" to directly emit a constant:
- teach the tablegen backend to emit the "immediate" modifiers
- use those modifiers in the NEON CodeGen, on ARM and AArch64.
Fixes PR23517.
Differential Revision: http://reviews.llvm.org/D10045
llvm-svn: 239002
This patch fixes an assertion failure in method
'X86_64ABIInfo::GetByteVectorType'.
Method 'GetByteVectorType' (in TargetInfo.cpp) is responsible
for mapping a QualType 'Ty' (for an argument or return value) to an LLVM IR
type that, according to the ABI, must be passed in a XMM/YMM vector register.
When selecting the IR vector type, method 'GetByteVectorType' always tries to
choose the "best" IR vector type for the 'Ty' in input. In particular, if Ty
is a wrapper structure, it keeps unwrapping it until it finds a vector type VTy.
That VTy is the "preferred IR type".
However, function 'isSingleElementStructure' (used to unwrap structures) does
not know how to look through union types. So, before this patch, if Ty was in
a nest of wrapper structures with at least two union types, we would have
triggered an assertion failure (added at revision 230971).
With this patch, if method 'GetByteVectorType' fails to find the preferred
vector type, we just return a valid (although potentially 'less friendly')
vector type based on the type size. So, rather than asserting on an 'unexpected'
'Ty' in input, we conservatively return vector type <2 x double> if Ty is 16
bytes, or <4 x double> if Ty is 32 bytes.
Differential Revision: http://reviews.llvm.org/D10190
llvm-svn: 238861
The first named data member is the field used to default initialize the
union. An IndirectFieldDecl can introduce the first named data member
of a union.
llvm-svn: 238649
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.
Call sites were found with the ASTMatcher + some semi-automated cleanup.
memberCallExpr(
argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
hasArgument(0, bindTemporaryExpr(
hasType(recordDecl(hasNonTrivialDestructor())),
has(constructExpr()))),
unless(isInTemplateInstantiation()))
No functional change intended.
llvm-svn: 238601
This is a follow-up to r238266. It turned out structors are codegened through a different path,
and didn't get the storage class set in EmitGlobalFunctionDefinition.
llvm-svn: 238443
The representation of a pointer-to-member in the MS ABI is governed by
the layout of the relevant class or if a model has been explicitly
specified. If no model is specified, then an appropriate
"worst-case-scenario" model is implicitly chosen if, and only, if the
pointer-to-member type's representation was needed.
Debug info cannot force a pointer-to-member type to have a
representation so do not try to query the size of such a type unless we
know it is safe to do so.
llvm-svn: 238259
Types can be classified as being zero-initializable or
non-zero-initializable. We used to classify array types by giving them
the classification of their base element type. However, incomplete
array types are never initialized directly and thus are always
zero-initializable.
llvm-svn: 238256
Re-land the change r238200, but with modifications in the tests that should
prevent new failures in some environments as reported with the original
change on the mailing list.
llvm-svn: 238253
On MIPS unsigned int type should not be zero extended but sign-extended.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D9198
llvm-svn: 238200
We already have the ABI, we don't need a "HasAVX" flag.
This will also makes it easier to add an AVX512 ABI.
No functional change intended.
llvm-svn: 237989
If loop control variable in a worksharing construct is marked as lastprivate, we should copy last calculated value of private counter back to original variable.
llvm-svn: 237879
-fprofile-instr-generate does not emit counter increment intrinsics
for Dtor_Deleting and Dtor_Complete destructors with assigned
counters. This causes unnecessary [-Wprofile-instr-out-of-date]
warnings during profile-use runs even if the source has never been
modified since profile collection.
Patch by Betul Buyukkurt. Thanks!
llvm-svn: 237804
Patch fixes codegen for aggregate copying of VLAs. Currently method CodeGenFunction::EmitAggregateCopy() does not support copying of VLAs. Patch checks if the size of the type is 0, then checks if the type is actually a variable-length array. Then it calculates total length for this array and calculates total size of the array in bytes:
<total number of elements in array> * aligned_sizeof(ElementType) (if copy assignment is requested).
If simple copying is requested, size is calculated like:
<total number of elements in array> * aligned_sizeof(ElementType) - aligned_sizeof(ElementType) + sizeof(ElementType).
memcpy() is used with this calculated size of the VLA.
Differential Revision: http://reviews.llvm.org/D9851
llvm-svn: 237768
The implicit conversion was causing issues for a helper being added that
would take an llvm::Function rather than an llvm::Value to make the
CallInst. Since we'll eventually need to specify the type of the call
explicitly anyway, fix these up to avoid the future ambiguity.
llvm-svn: 237729
This modification generates proper copyin/initialization sequences for array variables/parameters. Before they were considered as pointers, not arrays.
llvm-svn: 237691
Also add trivial handling of transparent unions.
PPC32, MSP430, and XCore apparently all rely on DefaultABIInfo. This
should worry you, because DefaultABIInfo is not implementing the rules
of any particular ABI.
Fixes PR23097, patch by Andy Gibbs.
llvm-svn: 237630
Internal task structure must be generated like
typedef struct kmp_task {
void * shareds;
kmp_routine_entry_t routine;
kmp_int32 part_id;
kmp_routine_entry_t destructors;
} kmp_task_t;
struct kmp_task_t_with_privates {
kmp_task_t task_data;
.kmp_private. privates;
};
to avoid possible additional alignment bytes in first fields (shareds, routine, part_id and destructors). Runtime library is not aware of such kind additional alignment bytes.
llvm-svn: 237561
With this change, enabling -fmodules-local-submodule-visibility results in name
visibility rules being applied to submodules of the current module in addition
to imported modules (that is, names no longer "leak" between submodules of the
same top-level module). This also makes it much safer to textually include a
non-modular library into a module: each submodule that textually includes that
library will get its own "copy" of that library, and so the library becomes
visible no matter which including submodule you import.
llvm-svn: 237473
The issue I was trying to solve in r236547 was about built-in macros,
but I disabled coverage in all system macros. This is actually a bit
of overkill, and makes the display of coverage around system macros
degrade unnecessarily. Instead, limit this to builtins specifically.
llvm-svn: 237397
Summary:
Space on stack allocated for unused structures returned by functions was unused
even when it's lifetime didn't intersect with lifetime of any other objects that
could use the same space.
The test added also checks for named and auto objects. It seems to make sense
to have this all in one place.
Reviewers: aadg, rsmith, rjmccall, rnk
Reviewed By: rnk
Subscribers: asl, cfe-commits
Differential Revision: http://reviews.llvm.org/D9743
llvm-svn: 237385
GetOutputStream() owns the stream it returns pointer to and the
pointer should never be freed by us. When we fail to load and exit
early, unique_ptr still holds the pointer and frees it which leads to
compiler crash when CompilerInstance attempts to free it again.
Added regression test for failed bitcode linking.
Differential Revision: http://reviews.llvm.org/D9625
llvm-svn: 237159
'schedule' clause for combined directives requires additional processing. Special helper variable is generated, that is captured in the outlined parallel region for 'parallel for' region. This captured variable is used to store chunk expression from the 'schedule' clause in this 'parallel for' region.
llvm-svn: 237100
We didn't supporting taking the address of virtual member functions
which overrode a method in a virtual base. We simply need to encode the
virtual base index in the member pointer.
This fixes PR23452.
N.B. There is no data member pointer side to this change because taking
the address of a virtual bases' data member gives you a member pointer
whose type is derived from the virtual bases' type, not the most derived
type.
llvm-svn: 236962
MSVC 2015 renamed the symbol found by name lookup for 'std::terminate'
so we cannot rely on using '?terminate@@YAXXZ'. Furthermore, it seems
that 2015 will be the first release of MSVC which permits inlining a
function which is noexcept into a function which isn't. This is
implemented by creating a cleanup for the invoker which jumps to
__std_terminate. Clang's implementation of this aspect of the MSVC
scheme is slightly less efficient in this respect because we use a
catch handler configured as a catch-all handler instead.
llvm-svn: 236961
Patch from Geoff Berry <gberry@codeaurora.org>
Fix BackendConsumer::EmitOptimizationMessage() to check if the
DiagnosticInfoOptimizationBase object has a valid location before
calling getLocation() to avoid dereferencing a null pointer inside
getLocation() when no debug info is present.
llvm-svn: 236898
Functions with available_externally linkage will not be emitted to object
files (they will just be undefined symbols), so it does not make sense to
put them in comdats.
Creates a second overload of maybeSetTrivialComdat that uses the GlobalObject
instead of the Decl, and uses that in several places that had the faulty
logic.
Differential Revision: http://reviews.llvm.org/D9580
llvm-svn: 236879
Summary:
Possible coverage levels are:
* -fsanitize-coverage=func - function-level coverage
* -fsanitize-coverage=bb - basic-block-level coverage
* -fsanitize-coverage=edge - edge-level coverage
Extra features are:
* -fsanitize-coverage=indirect-calls - coverage for indirect calls
* -fsanitize-coverage=trace-bb - tracing for basic blocks
* -fsanitize-coverage=trace-cmp - tracing for cmp instructions
* -fsanitize-coverage=8bit-counters - frequency counters
Levels and features can be combined in comma-separated list, and
can be disabled by subsequent -fno-sanitize-coverage= flags, e.g.:
-fsanitize-coverage=bb,trace-bb,8bit-counters -fno-sanitize-coverage=trace-bb
is equivalient to:
-fsanitize-coverage=bb,8bit-counters
Original semantics of -fsanitize-coverage flag is preserved:
* -fsanitize-coverage=0 disables the coverage
* -fsanitize-coverage=1 is a synonym for -fsanitize-coverage=func
* -fsanitize-coverage=2 is a synonym for -fsanitize-coverage=bb
* -fsanitize-coverage=3 is a synonym for -fsanitize-coverage=edge
* -fsanitize-coverage=4 is a synonym for -fsanitize-coverage=edge,indirect-calls
Driver tries to diagnose invalid flag usage, in particular:
* At most one level (func,bb,edge) must be specified.
* "trace-bb" and "8bit-counters" features require some level to be specified.
See test case for more examples.
Test Plan: regression test suite
Reviewers: kcc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D9577
llvm-svn: 236790
- added -fcuda-include-gpubinary option to incorporate results of
device-side compilation into host-side one.
- generate code to register GPU binaries and associated kernels
with CUDA runtime and clean-up on exit.
- added test case for init/deinit code generation.
Differential Revision: http://reviews.llvm.org/D9507
llvm-svn: 236765
Summary:
The next step is to add user-friendly control over these options
to driver via -fsanitize-coverage= option.
Test Plan: regression test suite
Reviewers: kcc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D9545
llvm-svn: 236756
Fix for codegen of static variables declared inside of captured statements. Captured statements are actually a transparent DeclContexts, so we have to skip them when trying to get a mangled name for statics.
Differential Revision: http://reviews.llvm.org/D9522
llvm-svn: 236701
The MSVC 2015 ABI utilizes a rather straightforward adaptation of the
algorithm found in the appendix of N2382. While we are here, implement
support for emitting cleanups if an exception is thrown while we are
intitializing a static local variable.
llvm-svn: 236697
Inner bodies of OpenMP worksharing loop-based constructs with dynamic or guided scheduling are allowed to be marked with !llvm.mem.parallel_loop_access metadata for better optimization. Worksharing constructs with static scheduling cannot be marked this way (according to OpenMP standard "A data dependence between the same logical iterations in two such loops is guaranteed").
Constructs with auto and runtime scheduling are also not marked because automatically chosen scheduling may be static also.
Differential Revision: http://reviews.llvm.org/D9518
llvm-svn: 236693
Fixed codegen for reduction operations min, max, && and ||. Codegen for them is quite similar and I was confused by this similarity.
Also added a call to kmpc_end_reduce() in atomic part of reduction codegen (call to kmpc_end_reduce_nowait() is not required).
Differential Revision: http://reviews.llvm.org/D9513
llvm-svn: 236689
All callers should be passing `CXXConstructorDecl` or
`CXXDestructorDecl` here, so use `cast<>` instead of `dyn_cast<>` when
setting up the `GlobalDecl`.
llvm-svn: 236651
It doesn't make much sense to try to show coverage inside system
macros, and source locations in builtins confuses the coverage
mapping. Just avoid doing this.
Fixes an assert that fired when a __block storage specifier starts a
region.
llvm-svn: 236547
This adds low-level builtins to allow access to all of the z13 vector
instructions. Note that instructions whose semantics can be described
by standard C (including clang extensions) do not get any builtins.
For each instructions whose semantics *cannot* (fully) be described, we
define a builtin named __builtin_s390_<insn> that directly maps to this
instruction. These are intended to be compatible with GCC.
For instructions that also set the condition code, the builtin will take
an extra argument of type "int *" at the end. The integer pointed to by
this argument will be set to the post-instruction CC value.
For many instructions, the low-level builtin is mapped to the corresponding
LLVM IR intrinsic. However, a number of instructions can be represented
in standard LLVM IR without requiring use of a target intrinsic.
Some instructions require immediate integer operands within a certain
range. Those are verified at the Sema level.
Based on a patch by Richard Sandiford.
llvm-svn: 236532
This patch adds support for the z13 architecture type. For compatibility
with GCC, a pair of options -mvx / -mno-vx can be used to selectively
enable/disable use of the vector facility.
When the vector facility is present, we default to the new vector ABI.
This is characterized by two major differences:
- Vector types are passed/returned in vector registers
(except for unnamed arguments of a variable-argument list function).
- Vector types are at most 8-byte aligned.
The reason for the choice of 8-byte vector alignment is that the hardware
is able to efficiently load vectors at 8-byte alignment, and the ABI only
guarantees 8-byte alignment of the stack pointer, so requiring any higher
alignment for vectors would require dynamic stack re-alignment code.
However, for compatibility with old code that may use vector types, when
*not* using the vector facility, the old alignment rules (vector types
are naturally aligned) remain in use.
These alignment rules are not only implemented at the C language level,
but also at the LLVM IR level. This is done by selecting a different
DataLayout string depending on whether the vector ABI is in effect or not.
Based on a patch by Richard Sandiford.
llvm-svn: 236531
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399
llvm-svn: 236491
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399
llvm-svn: 236487
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399
llvm-svn: 236482
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399
llvm-svn: 236480
For tasks codegen for private/firstprivate variables are different rather than for other directives.
1. Build an internal structure of privates for each private variable:
struct .kmp_privates_t. {
Ty1 var1;
...
Tyn varn;
};
2. Add a new field to kmp_task_t type with list of privates.
struct kmp_task_t {
void * shareds;
kmp_routine_entry_t routine;
kmp_int32 part_id;
kmp_routine_entry_t destructors;
.kmp_privates_t. privates;
};
3. Create a function with destructors calls for all privates after end of task region.
kmp_int32 .omp_task_destructor.(kmp_int32 gtid, kmp_task_t *tt) {
~Destructor(&tt->privates.var1);
...
~Destructor(&tt->privates.varn);
return 0;
}
4. Perform initialization of all firstprivate fields (by simple copying for POD data, copy constructor calls for classes) + provide address of a destructor function after kmpc_omp_task_alloc() and before kmpc_omp_task() calls.
kmp_task_t *new_task = __kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry);
CopyConstructor(new_task->privates.var1, *new_task->shareds.var1_ref);
new_task->shareds.var1_ref = &new_task->privates.var1;
...
CopyConstructor(new_task->privates.varn, *new_task->shareds.varn_ref);
new_task->shareds.varn_ref = &new_task->privates.varn;
new_task->destructors = .omp_task_destructor.;
kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task)
Differential Revision: http://reviews.llvm.org/D9370
llvm-svn: 236479
The fact that PGO has a say in how these branch weights are determined
isn't interesting to most of CodeGen, so it makes more sense for this
API to be accessible via CodeGenFunction rather than CodeGenPGO.
llvm-svn: 236380
The underlying problem is that there is currently no way to run
ObjCARCContract from llvm bitcode which is required by ObjC ARC.
This fix the problem by always enable ObjCARCContract pass if
optimization is enabled. The ObjCARC Contract pass has almost no
overhead on code that is not using ARC.
llvm-svn: 236372
No functional change. This just makes it more obvious that the logic
in ComputeRegionCounts only depends on the counter map and local
state.
llvm-svn: 236370
This removes the RegionCounter class, which is only used as a helper
in teh ComputeRegionCounts stmt visitor. This class is just an extra
layer of abstraction that makes the code harder to follow at this
point, and removing it makes the logic quite a bit more direct.
llvm-svn: 236364
This change is the third of 3 patches to add support for specifying
the profile output from the command line via -fprofile-instr-generate=<path>,
where the specified output path/file will be overridden by the
LLVM_PROFILE_FILE environment variable.
This patch adds the necessary support to the clang frontend, and adds a
new test.
The compiler-rt and llvm parts are r236055 and r236288, respectively.
Patch by Teresa Johnson. Thanks!
llvm-svn: 236289
We were assigning the counter for the body of the loop to the loop
variable initialization for some reason here, but our tests completely
lacked coverage for range-for loops. This fixes that and makes the
logic generally more similar to the logic for a regular for.
llvm-svn: 236277
For tasks codegen for private/firstprivate variables are different rather than for other directives.
1. Build an internal structure of privates for each private variable:
struct .kmp_privates_t. {
Ty1 var1;
...
Tyn varn;
};
2. Add a new field to kmp_task_t type with list of privates.
struct kmp_task_t {
void * shareds;
kmp_routine_entry_t routine;
kmp_int32 part_id;
kmp_routine_entry_t destructors;
.kmp_privates_t. privates;
};
3. Create a function with destructors calls for all privates after end of task region.
kmp_int32 .omp_task_destructor.(kmp_int32 gtid, kmp_task_t *tt) {
~Destructor(&tt->privates.var1);
...
~Destructor(&tt->privates.varn);
return 0;
}
4. Perform default initialization of all private fields (no initialization for POD data, default constructor calls for classes) + provide address of a destructor function after kmpc_omp_task_alloc() and before kmpc_omp_task() calls.
kmp_task_t *new_task = __kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry);
DefaultConstructor(new_task->privates.var1);
new_task->shareds.var1_ref = &new_task->privates.var1;
...
DefaultConstructor(new_task->privates.varn);
new_task->shareds.varn_ref = &new_task->privates.varn;
new_task->destructors = .omp_task_destructor.;
kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task)
Differential Revision: http://reviews.llvm.org/D9322
llvm-svn: 236207
Fixed initialization of 'single' region completion + changed type of the third argument of __kmpc_copyprivate() runtime function to size_t.
llvm-svn: 236198
and as artificial local variables in the debug info.
This is a follow-up to r236059. We can't get rid of the local variables
entirely because the gdb buildbot depends on them, but we can mark them
as artificial while still emitting the correct debug info. As I learned
from review comments other compilers also follow this model.
A paired commit in LLVM temporarily relaxes the debug info verifier to
not check the integrity of DW_OP_bit_pieces of artificial variables.
rdar://problem/20730771
llvm-svn: 236125
LLVM r236120 renamed debug info IR constructs to use a `DI` prefix, now
that the `DIDescriptor` hierarchy has been gone for about a week. This
commit was generated using the rename-md-di-nodes.sh upgrade script
attached to PR23080, followed by running clang-format-diff.py on the
`lib/` portion of the patch.
llvm-svn: 236121
This issue was fixed elsewhere in r235396 in a more general way, hence these
changes no longer do anything. Keep the testcase however, to ensure that we
don't regress this for ARM.
llvm-svn: 236104
in the debug info. This patch deletes a hack that emits the members
of local anonymous unions as local variables.
Besides being morally wrong, the existing representation using local
variables breaks internal assumptions about the local variables' storage
size.
Compiling
```
void fn1() {
union {
int i;
char c;
};
i = c;
}
```
with -g -O3 -verify will cause the verifier to fail after SROA splits
the 32-bit storage for the "local variable" c into two pieces because the
second piece is clearly outside the 8-bit range that is expected for a
variable of type char. Given the choice I'd rather fix the debug
representation than weaken the verifier.
Debuggers generally already know how to deal with anonymous unions when
they are members of C++ record types, but they may have problems finding
the local anonymous struct members in the expression evaluator.
rdar://problem/20730771
llvm-svn: 236059
This is just the clang-side of 32-bit SEH. LLVM still needs work, and it
will determinstically fail to compile until it's feature complete.
On x86, all outlined handlers have no parameters, but they do implicitly
take the EBP value passed in and use it to address locals of the parent
frame. We model this with llvm.frameaddress(1).
This works (mostly), but __finally block inlining can break it. For now,
we apply the 'noinline' attribute. If we really want to inline __finally
blocks on 32-bit x86, we should teach the inliner how to untangle
frameescape and framerecover.
Promote the error diagnostic from codegen to sema. It now rejects SEH on
non-Windows platforms. LLVM doesn't implement SEH on non-x86 Windows
platforms, but there's nothing preventing it.
llvm-svn: 236052
ability to generate code that CodeGen likes. Test
cases can use this functionality by calling
// RUN: %clang_cc1 -emit-obj -o /dev/null -ast-merge %t.1.ast -ast-merge %t.2.ast %s
llvm-svn: 236011
When creating a global variable with a type of a struct with bitfields, we must
forcibly set the alignment of the global from the RecordDecl. We must do this so
that the proper bitfield alignment makes its way down to LLVM, since clang will
mangle the bitfields into one large type.
llvm-svn: 235976
This makes sure that the front end is specific about what they're expecting
the backend to produce. Update a FIXME with the idea that the target-features
could be more precise using backend knowledge.
llvm-svn: 235936
Currently clang emits file-scope asm during *both* host and device
compilation modes which is usually a wrong thing to do.
There's no way to attach any attribute to an __asm statement, so
there's no way to differentiate between host-side and device-side
file-scope asm. This patch makes clang to match nvcc behavior and
emit file-scope-asm only during host-side compilation.
Differential Revision: http://reviews.llvm.org/D9270
llvm-svn: 235905
Emit the following code for 'taskwait' directive within tied task:
call i32 @__kmpc_omp_taskwait(<loc>, i32 <thread_id>);
Differential Revision: http://reviews.llvm.org/D9245
llvm-svn: 235836
Emit a code for reduction clause. Next code should be emitted for reductions:
static kmp_critical_name lock = { 0 };
void reduce_func(void *lhs[<n>], void *rhs[<n>]) {
*(Type0*)lhs[0] = ReductionOperation0(*(Type0*)lhs[0], *(Type0*)rhs[0]);
...
*(Type<n>-1*)lhs[<n>-1] =
ReductionOperation<n>-1(*(Type<n>-1*)lhs[<n>-1],
*(Type<n>-1*)rhs[<n>-1]);
}
...
void *RedList[<n>] = {&<RHSExprs>[0], ..., &<RHSExprs>[<n>-1]};
switch (__kmpc_reduce{_nowait}(<loc>, <gtid>, <n>, sizeof(RedList), RedList, reduce_func, &<lock>)) {
case 1:
<LHSExprs>[0] = ReductionOperation0(*<LHSExprs>[0], *<RHSExprs>[0]);
...
<LHSExprs>[<n>-1] = ReductionOperation<n>-1(*<LHSExprs>[<n>-1], *<RHSExprs>[<n>-1]);
__kmpc_end_reduce{_nowait}(<loc>, <gtid>, &<lock>);
break;
case 2:
Atomic(<LHSExprs>[0] = ReductionOperation0(*<LHSExprs>[0], *<RHSExprs>[0]));
...
Atomic(<LHSExprs>[<n>-1] = ReductionOperation<n>-1(*<LHSExprs>[<n>-1], *<RHSExprs>[<n>-1]));
break;
default:;
}
Reduction variables are a kind of a private variables, they have private copies, but initial values are chosen in accordance with the reduction operation.
If sections directive has only single section, then original shared variables are used instead with barrier at the end of the directive.
Differential Revision: http://reviews.llvm.org/D9242
llvm-svn: 235835
#pragma omp sections lastprivate(<var>)
<BODY>;
This construct is translated into something like:
<last_iter> = alloca i32
<init for lastprivates>;
<last_iter> = 0
; No initializer for simple variables or a default constructor is called for objects.
; For arrays perform element by element initialization by the call of the default constructor.
...
OMP_FOR_START(...,<last_iter>, ..); sets <last_iter> to 1 if this is the last iteration.
<BODY>
...
OMP_FOR_END
if (<last_iter> != 0) {
<final copy for lastprivate>; Update original variable with the lastprivate value.
}
call __kmpc_cancel_barrier() ; an implicit barrier to avoid possible data race.
If there is only one section, there is no special code generation, original shared variables are used + barrier is emitted at the end of the directive.
Differential Revision: http://reviews.llvm.org/D9240
llvm-svn: 235834
If there are 2 or more sections in a 'section' directive the following code is generated:
<default init for privates>
@__kmpc_for_static_init_4();
<BODY for sections directive>
@__kmpc_for_static_fini()
If there is only one section, the following code is generated:
if (@__kmpc_single()) {
<default init for privates>
@__kmpc_end_single();
}
Differential Revision: http://reviews.llvm.org/D9239
llvm-svn: 235833
Emit the following code for 'single' directive with 'private' clause:
if (@__kmpc_single()) {
<default init for privates>
@__kmpc_end_single();
}
Differential Revision: http://reviews.llvm.org/D9238
llvm-svn: 235832
Fixes rdar://20621065.
A more elegant fix would preclude this case by defining the
rules such that zero-size classes are always formally empty.
I believe the only extensions which create zero-size classes
right now are flexible arrays and zero-length arrays; it's
not abstractly unreasonable to say that those don't count
as members for the purposes of emptiness, just as zero-width
bitfields don't count. But that's an ABI-affecting change
and requires further discussion; in the meantime, let's not
assert / miscompile.
llvm-svn: 235815
This fixes a crash when we're emitting coverage and a macro appears
between two binary conditional operators, ie, "foo ?: MACRO ?: bar",
and fixes the interaction of macros and conditional operators in
general.
llvm-svn: 235793
Emit the following code for 'single' directive with 'firtstprivate' clause:
if (@__kmpc_single()) {
<init for firstprivates>
@__kmpc_end_single();
}
@__kmpc_cancel_barrier(); // To avoid data race in firstprivate init
Differential Revision: http://reviews.llvm.org/D9223
llvm-svn: 235694
Runtime function for 'copyprivate' directive generates implicit barriers, so no need to emit it.
Differential Revision: http://reviews.llvm.org/D9215
llvm-svn: 235692
If there are 2 or more sections in a 'section' directive the following code is generated:
<init for firstprivates>
@__kmpc_cancel_barrier();// To avoid data race in firstprivate init
@__kmpc_for_static_init_4();
<BODY for sections directive>
@__kmpc_for_static_fini()
If there is only one section, the following code is generated:
if (@__kmpc_single()) {
<init for firstprivates>
@__kmpc_end_single();
}
@__kmpc_cancel_barrier(); // To avoid data race in firstprivate init
Differential Revision: http://reviews.llvm.org/D9214
llvm-svn: 235691
The RegionCounter type does a lot of legwork, but most of it is only
meaningful within the implementation of CodeGenPGO. The uses elsewhere
in CodeGen generally just want to increment or read counters, so do
that directly.
llvm-svn: 235664
In r235553, Clang started emitting lifetime markers more often. This
caused false negative in MSan, because MSan only poisons all allocas
once at function entry. Eventually, MSan should poison allocas at
lifetime start and probably also lifetime end, but until then, let's not
emit markers that aren't going to be useful.
llvm-svn: 235613
Adds codegen for 'atomic capture' constructs with the following forms of expressions/statements:
v = x binop= expr;
v = x++;
v = ++x;
v = x--;
v = --x;
v = x = x binop expr;
v = x = expr binop x;
{v = x; x = binop= expr;}
{v = x; x++;}
{v = x; ++x;}
{v = x; x--;}
{v = x; --x;}
{x = x binop expr; v = x;}
{x binop= expr; v = x;}
{x++; v = x;}
{++x; v = x;}
{x--; v = x;}
{--x; v = x;}
{x = x binop expr; v = x;}
{x = expr binop x; v = x;}
{v = x; x = expr;}
If x and expr are integer and binop is associative or x is a LHS in a RHS of the assignment expression, and atomics are allowed for type of x on the target platform atomicrmw instruction is emitted.
Otherwise compare-and-swap sequence is emitted.
Update of 'v' is not required to be be atomic with respect to the read or write of the 'x'.
bb:
...
atomic load <x>
cont:
<expected> = phi [ <x>, label %bb ], [ <new_failed>, %cont ]
<desired> = <expected> binop <expr>
<res> = cmpxchg atomic &<x>, desired, expected
<new_failed> = <res>.field1;
br <res>field2, label %exit, label %cont
exit:
atomic store <old/new x>, <v>
...
Differential Revision: http://reviews.llvm.org/D9049
llvm-svn: 235573
Summary:
Make sure signed overflow in "x--" is checked with
llvm.ssub.with.overflow intrinsic and is reported as:
"-2147483648 - 1 cannot be represented in type 'int'"
instead of:
"-2147483648 + -1 cannot be represented in type 'int'"
, like we do for unsigned overflow.
Test Plan: clang + compiler-rt regression test suite
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D8236
llvm-svn: 235568
We try to use the member variable "FuncName" here, but we've also used
that name as a parameter. This ends with us getting the length of the
function name wrong when we generate the coverage data.
llvm-svn: 235565
These extra endcatch markers aren't helping identify regions to outline,
so let's get rid of them. LLVM outlines (more or less) from begincatch
to endcatch. Any unwind edge from an enclosed invoke is a transition to
a new exception handler, which has it's own outlining markers.
llvm-svn: 235562
This reverts commit r234700. It turns out that the lifetime markers
were not the cause of Chromium failing but a bug which was uncovered by
optimizations exposed by the markers.
llvm-svn: 235553
Otherwise -fno-omit-frame-pointer and other flags like it aren't
applied.
Basic idea taken from Gao's patch, thanks!
Differential Revision: http://reviews.llvm.org/D9203
llvm-svn: 235537
If condition evaluates to true, the code executes task by calling @__kmpc_omp_task() runtime function.
If condition evaluates to false, the code executes serial version of the code by executing the following code:
call void @__kmpc_omp_task_begin_if0(<loc>, <threadid>, <task_t_ptr, returned by @__kmpc_omp_task_alloc()>);
proxy_task_entry(<gtid>, <task_t_ptr, returned by @__kmpc_omp_task_alloc()>);
call void @__kmpc_omp_task_complete_if0(<loc>, <threadid>, <task_t_ptr, returned by @__kmpc_omp_task_alloc()>);
Also it checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).
Differential Revision: http://reviews.llvm.org/D9143
llvm-svn: 235507
This patch generates helper variables which used as a private copies of the corresponding original variables inside an OpenMP 'for' directive. These generated variables are initialized by default (with the default constructor, if any). In OpenMP region references to original variables are replaced by the references to these private helper variables.
Differential Revision: http://reviews.llvm.org/D9106
llvm-svn: 235503
Patch fixes bugs in codegen for loops with unsigned counters and zero trip count. Previously preconditions for all loops were built using logic (Upper - Lower) > 0. But if the loop is a loop with zero trip count, then Upper - Lower is < 0 only for signed integer, for unsigned we're running into an underflow situation.
In this patch we're using original Lower<Upper condition to check that loop body can be executed at least once. Also this allows to skip code generation for loops, if it is known that preconditions for the loop are always false.
Differential Revision: http://reviews.llvm.org/D9103
llvm-svn: 235500
Add codegen for 'ordered' directive:
__kmpc_ordered(ident_t *, gtid);
<associated statement>;
__kmpc_end_ordered(ident_t *, gtid);
Also for 'for' directives with the dynamic scheduling and an 'ordered' clause added a call to '__kmpc_dispatch_fini_(4|8)[u]()' function after increment expression for loop control variable:
while(__kmpc_dispatch_next(&LB, &UB)) {
idx = LB;
while (idx <= UB) { BODY; ++idx;
__kmpc_dispatch_fini_(4|8)[u](); // For ordered loops only.
} // inner loop
}
Differential Revision: http://reviews.llvm.org/D9070
llvm-svn: 235496
- Changed CUDALaunchBounds arguments from integers to Expr* so they can
be saved in AST for instantiation.
- Added support for template instantiation of launch_bounds attrubute.
- Moved evaluation of launch_bounds arguments to NVPTXTargetCodeGenInfo::
SetTargetAttributes() where it can be done after template instantiation.
- Added a warning on negative launch_bounds arguments.
- Amended test cases.
Differential Revision: http://reviews.llvm.org/D8985
llvm-svn: 235452
An upcoming LLVM commit will remove the `DIArray` and `DITypeArray`
typedefs that shadow `DebugNodeArray` and `MDTypeRefArray`,
respectively. Use those types directly.
llvm-svn: 235412
Code in CodeGenModule::GetOrCreateLLVMGlobal that sets up GlobalValue
object for LLVM external symbols has this comment:
// FIXME: This code is overly simple and should be merged with other global
// handling.
One part does seems to be "overly simple" currently is that this code
never sets any alignment info on the GlobalValue, so that the emitted
IR does not have any align attribute on external globals. This can
lead to unnecessarily inefficient code generation.
This patch adds a GV->setAlignment call to set alignment info.
llvm-svn: 235396
Prepare for the deletion in LLVM of the subclasses of (the already
deleted) `DIScope` by using the raw pointers they were wrapping
directly.
llvm-svn: 235355
Subclasses of (the already deleted) `DIType` will be deleted by an
upcoming LLVM commit. Remove references.
While `DICompositeType` wraps `MDCompositeTypeBase` and `DIDerivedType`
wraps `MDDerivedTypeBase`, most uses of each really meant the more
specific `MDCompositeType` and `MDDerivedType`. I updated accordingly.
llvm-svn: 235350
Something like { void*, void * } would be passed to a function as a [2 x i64], but returned as an i128. This patch unifies the 2 behaviours so that we also return it as a [2 x i64].
This is better for the quality of the IR, and the size of the final LLVM binary as we tend to want to insert/extract values from these types and do so with the insert/extract instructions is less IR than shifting, truncating, and or'ing values.
Reviewed by Tim Northover.
llvm-svn: 235231
LLVM r235111 changed the `DIBuilder` API to stop using `DIDescriptor`
and its subclasses. Rolled into this was some tightening up of types:
- Scopes: `DIDescriptor` => `MDScope*`.
- Generic debug nodes: `DIDescriptor` => `DebugNode*`.
- Subroutine types: `DICompositeType` => `MDSubroutineType*`.
- Composite types: `DICompositeType` => `MDCompositeType*`.
Note that `DIDescriptor` wraps `MDNode`, and `DICompositeType` wraps
`MDCompositeTypeBase`.
It's this new type strictness that requires changes here.
llvm-svn: 235112
Emits the following code for the clause at the beginning of the outlined function for implicit threads:
if (<not a master thread>) {
...
<thread local copy of var> = <master thread local copy of var>;
...
}
<sync point>;
Checking for a non-master thread is performed by comparing of the address of the thread local variable with the address of the master's variable. Master thread always uses original variables, so you always know the address of the variable in the master thread.
Differential Revision: http://reviews.llvm.org/D9026
llvm-svn: 235075
#pragma omp for lastprivate(<var>)
for (i = a; i < b; ++b)
<BODY>;
This construct is translated into something like:
<last_iter> = alloca i32
<lastprivate_var> = alloca <type>
<last_iter> = 0
; No initializer for simple variables or a default constructor is called for objects.
; For arrays perform element by element initialization by the call of the default constructor.
...
OMP_FOR_START(...,<last_iter>, ..); sets <last_iter> to 1 if this is the last iteration.
<BODY>
...
OMP_FOR_END
if (<last_iter> != 0) {
<var> = <lastprivate_var> ; Update original variable with the lastprivate value.
}
call __kmpc_cancel_barrier() ; an implicit barrier to avoid possible data race.
Differential Revision: http://reviews.llvm.org/D8658
llvm-svn: 235074
Things can't both be in comdats and have common linkage, so never give things
in comdats common linkage. Common linkage is only used in .c files, and the
only thing that can trigger a comdat in c is selectany from what I can tell.
Fixes PR23243.
Also address an over-the-shoulder review comment from rnk by moving the
hasAttr<SelectAnyAttr>() in Decl.cpp around a bit. It only makes a minor
difference for selectany on global variables, so it goes well with the rest of
this patch.
http://reviews.llvm.org/D9042
llvm-svn: 235053
This reverts commit r234767, as it was breaking all ARM buildbots for two days and the
assert is not in the code, making it difficult to spot the error, which would keep the
bots red for a few more days. New errors were silently introduced because of this bug,
and we don't want this to escalate.
llvm-svn: 234983
Adds proper codegen for 'firstprivate' clause in for directive. Initially codegen for 'firstprivate' clause was implemented for 'parallel' directive only.
Also this patch emits sync point only after initialization of firstprivate variables, not all private variables. This sync point is not required for privates, lastprivates etc., only for initialization of firstprivate variables.
Differential Revision: http://reviews.llvm.org/D8660
llvm-svn: 234978
Follow up to r234962, start respecting `-emit-llvm-uselists even for
LLVM assembly. Note that the driver never passes this flag; this is
just a interface convenience/consistency for those using `-cc1`
directly. This required LLVM r234969 (and predecessors).
llvm-svn: 234970
Stop relying on `cl::opt` to pass along the driver's decision to
preserve use-lists. Create a new `-cc1` option called
`-emit-llvm-uselists` that does the right thing (when -emit-llvm-bc).
Note that despite its generic name, it *doesn't* do the right thing when
-emit-llvm (LLVM assembly) yet. I'll hook that up soon.
This doesn't really change the behaviour of the driver. The default is
still to preserve use-lists for `clang -emit-llvm` and `clang
-save-temps`, and nothing else. But it stops relying on global state
(and also is a nicer interface for hackers using `clang -cc1`).
llvm-svn: 234962
Reverts the code changes from r234675 but keeps the test case.
We were already maintaining a DenseMap of globals with dynamic
initializers anyway.
Fixes the test case from PR23234.
llvm-svn: 234961
Now that `addBitcodeWriterPass()` requires an explicit bit to preserve
use-list order, send it in from `clang`. It looks like I'll be able to
push this up to the `-cc1` options.
llvm-svn: 234960
Fixed a bug with codegen of variables with array types specified in 'copyprivate' clause of 'single' directive.
Differential Revision: http://reviews.llvm.org/D8914
llvm-svn: 234856