Summary:
This patch fixes an issue detected when firstprivate variables are passed to an OpenMP outlined function's vararg list. Currently they are not compatible with what the runtime library expects, causing malfunctions on some targets.
This patch fixes the issue by moving the casting logic already in place for offloading into the common code that creates the outlined function and its arguments, and updates the regression tests accordingly.
Reviewers: hfinkel, arpith-jacob, carlo.bertolli, kkwli0, ABataev
Subscribers: cfe-commits, caomhin
Differential Revision: http://reviews.llvm.org/D21150
llvm-svn: 272900
Parameters and non-static members of aggregates are still excluded,
and probably should remain that way.
Differential Revision: http://reviews.llvm.org/D19754
llvm-svn: 272859
As noted in the code comment, a potential follow-on would be to remove
the builtins themselves. Other than ord/unord, this already works as
expected. E.g.:
typedef float v4sf __attribute__((__vector_size__(16)));
v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }
Differential Revision: http://reviews.llvm.org/D21268
llvm-svn: 272840
This patch adds intrinsics for mrrc/mrrc2. The mrrc/mrrc2 intrinsics
return a single uint64_t representing two 32-bit values. For consistency,
the mcrr/mcrr2 intrinsics were changed to accept a single uint64_t
instead of two 32-bit values as input.
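A minimal sketch of the resulting usage (ARM-only; assumes the __builtin_arm_mrrc/__builtin_arm_mcrr spellings and placeholder coprocessor/opc1/CRm operands, which must be compile-time constants):

#include <stdint.h>

uint64_t read_pair(void) {
  /* mrrc: both 32-bit halves come back packed into a single uint64_t */
  return __builtin_arm_mrrc(15, 0, 2);
}

void write_pair(uint64_t v) {
  /* mcrr: the 64-bit input is now passed as a single uint64_t */
  __builtin_arm_mcrr(15, 0, v, 2);
}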
Differential Revision: http://reviews.llvm.org/D21179
llvm-svn: 272777
classes.
MSVC actively uses unqualified lookup in dependent bases, lookup at the
instantiation point (non-dependent names may be resolved against things
declared later), etc., and all of this is a major cause of
incompatibility between clang and MSVC.
Clang tries to emulate MSVC behavior but may fail in many cases.
Clang could store the lexed tokens of member function definitions within
the ClassTemplateDecl for later parsing during template instantiation.
This would allow resolving many of the possible issues with lookup in
dependent base classes and removing many existing MSVC-specific
hacks/workarounds from the clang code.
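A small illustration (not from this change) of the dependent-base lookup that MSVC accepts but standard C++ rejects:

template <typename T> struct Base { int value; };

template <typename T> struct Derived : Base<T> {
  int get() { return value; }  // MSVC resolves 'value' at instantiation time;
                               // standard C++ requires 'this->value'.
};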
llvm-svn: 272774
Summary:
This is similar to other loop pragmas like 'vectorize'. Currently it
only has the state values distribute(enable) and distribute(disable). When
one of these is specified, the corresponding loop metadata is generated:
!{!"llvm.loop.distribute.enable", i1 true/false}
As a result, loop distribution will be attempted on the loop even if
Loop Distribution is not enabled globally. Analogously, with 'disable',
distribution can be turned off for an individual loop even when the pass
is otherwise enabled.
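A minimal sketch of the source-level usage (the loop body is purely illustrative):

void f(int *a, int *b, int *c, int *d, int n) {
#pragma clang loop distribute(enable)
  for (int i = 0; i < n; ++i) {
    a[i] = b[i] + c[i];
    d[i] = a[i] * 2;  /* distribution may split this into a second loop */
  }
}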
There are some slight differences compared to the existing loop pragmas.
1. There is no 'assume_safety' variant, which makes its handling slightly
different from 'vectorize'/'interleave'.
2. Unlike the existing loop pragmas, it does not have a corresponding
numeric pragma like 'vectorize' -> 'vectorize_width'. So for the
consistency checks in CheckForIncompatibleAttributes we don't need to
check it against other pragmas; we just need to check for duplicates of
the same pragma.
Reviewers: rsmith, dexonsmith, aaron.ballman
Subscribers: bob.wilson, cfe-commits, hfinkel
Differential Revision: http://reviews.llvm.org/D19403
llvm-svn: 272656
pointer-to-pointer representing the parameter. An aggregate rvalue representing
a pointer does not make sense.
We got away with this weirdness because CGCall happens to blindly load an
RValue in aggregate form in this case, without checking whether an RValue for
the type should be in scalar or aggregate form.
llvm-svn: 272609
We can now use __builtin_nontemporal_store instead of target-specific builtins for naturally aligned nontemporal stores, which avoids the need for handling in CGBuiltin.cpp.
The scalar integer nontemporal (unaligned) store builtins will have to wait, as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned loads/stores.
The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load.
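A minimal sketch of the generic builtin this switches to, assuming a naturally aligned vector type:

typedef float v4sf __attribute__((__vector_size__(16)));

void stream_store(v4sf *dst, v4sf value) {
  /* emits a nontemporal (streaming) store without a target-specific builtin */
  __builtin_nontemporal_store(value, dst);
}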
Differential Revision: http://reviews.llvm.org/D21272
llvm-svn: 272540
These ExprWithCleanups are added for holding a RunCleanupsScope, not
for destructor calls; rather, they are for lifetime markers. This requires
ExprWithCleanups to keep a bit indicating whether it has cleanups with
side effects (e.g. dtor calls).
Differential Revision: http://reviews.llvm.org/D20498
llvm-svn: 272296
Summary:
This should have been a very simple change, but it was greatly
complicated by the construction of new Decls during IR generation.
In particular, we reconstruct the AST function type in order to get the
implicit 'this' parameter into C++ method types.
We also have to worry about FunctionDecls whose types are not
FunctionTypes because CGBlocks.cpp constructs some dummy FunctionDecls
with 'void' type.
Depends on D21114
Reviewers: aprantl, dblaikie
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D21141
llvm-svn: 272198
__builtin_astype does not generate correct LLVM IR for vec3 types. This patch inserts bitcasts to/from vec4 when necessary, in addition to generating the vector shuffle. Sema and codegen tests are added.
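A small OpenCL C illustration (not taken from the tests) of the kind of conversion involved; vec3 values occupy vec4 storage, so codegen now bitcasts through a 4-element vector around the shuffle:

int3 reinterpret(float3 f) {
  return __builtin_astype(f, int3);
}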
Differential Revision: http://reviews.llvm.org/D20133
llvm-svn: 272153
We have an assertion failure if, for example, the definition of an unused
inline function starts in one macro and ends in another. This patch fixes
the issue by finding the common ancestor of the start and end locations
of that function's body and changing the locations accordingly.
Thanks to NAKAMURA Takumi for helping with fixing the test failure on Windows.
Differential Revision: http://reviews.llvm.org/D20997
llvm-svn: 271995
We have an assertion failure if, for example, the definition of an unused
inline function starts in one macro and ends in another. This patch fixes
the issue by finding the common ancestor of the start and end locations
of that function's body and changing the locations accordingly.
Differential Revision: http://reviews.llvm.org/D20997
llvm-svn: 271969
The assertion added earlier was overly strict. We need to strip the pointer
casts (as when constructing the GV). Correct the types (Function or Variable).
llvm-svn: 271750
The `isa' member was previously not given the correct DLL storage. Ensure that
we give the `isa' constant `__CFConstantStringClassReference' the correct DLL
storage. Default to dllimport unless an explicit specification gives it
dllexport storage.
llvm-svn: 271361
Summary:
This is particularly important because some convergent CUDA intrinsics
(e.g. __shfl_down) are implemented in terms of inline asm.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D20836
llvm-svn: 271336
I added this call in r271308. It's redundant because it's dominated by a
call to extendRegion().
Thanks to Justin Bogner for pointing this out!
llvm-svn: 271331
Adjust the constant CFString emission to emit into more appropriate sections on
ELF and COFF targets. It would previously try to use MachO section names
irrespective of the file format.
llvm-svn: 271211
'simd' modifier.
Runtime library defines a new schedule constant, kmp_sch_static_balanced_chunked = 45, for static loop-based directives with chunk adjustment (e.g., simd). Added codegen for this kind of schedule.
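A minimal sketch of a loop that would use this schedule, assuming the OpenMP 4.5 'simd' schedule modifier spelling and an illustrative chunk size:

void saxpy(float a, float *x, float *y, int n) {
#pragma omp parallel for simd schedule(simd:static, 4)
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}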
llvm-svn: 271204
directives.
The 'kmp_task_t' record type gained a new field for the 'priority' clause, and
the representation of the pointer to destructors for privates used
within loop-based directives was changed.
Old representation:

typedef struct kmp_task {            /* GEH: Shouldn't this be aligned somehow? */
  void *shareds;                     /**< pointer to block of pointers to shared vars */
  kmp_routine_entry_t routine;       /**< pointer to routine to call for executing task */
  kmp_int32 part_id;                 /**< part id for the task */
  kmp_routine_entry_t destructors;   /* pointer to function to invoke deconstructors of firstprivate C++ objects */
  /* private vars */
} kmp_task_t;

New representation:

typedef struct kmp_task {            /* GEH: Shouldn't this be aligned somehow? */
  void *shareds;                     /**< pointer to block of pointers to shared vars */
  kmp_routine_entry_t routine;       /**< pointer to routine to call for executing task */
  kmp_int32 part_id;                 /**< part id for the task */
  kmp_cmplrdata_t data1;             /* Two known optional additions: destructors and priority */
  kmp_cmplrdata_t data2;             /* Process destructors first, priority second */
  /* future data */
  /* private vars */
} kmp_task_t;
Also, excessive initialization of the 'destructors' field to 'null' was
removed from codegen when it is known that no destructors will be used.
Currently a special bit in the 'kmp_tasking_flags_t' bitfields is used
for this (the 'destructors_thunk' bitfield).
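A minimal sketch (not from the patch) of a task-generating construct that exercises the 'priority' clause; do_work() is a placeholder:

extern void do_work(void);

void produce(void) {
#pragma omp task priority(10)
  {
    /* higher-priority tasks may be scheduled ahead of lower-priority ones */
    do_work();
  }
}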
llvm-svn: 271201