It can happen that when we only have 1 more register left in the regsave
area we need to store a value bigger than 1 register and therefore we
go to the overflow area. In this case we have to leave the last slot
in the regsave area unused and keep using overflow area. Do this
by storing a limit value to the used register counter in the overflow block.
Issue diagnosed by and solution tested by Mark Millard!
llvm-svn: 261422
Add support for opencl_unroll_hint attribute from OpenCL v2.0 s6.11.5.
Reusing most of metadata generation from CGLoopInfo helper class.
The code is based on Khronos OpenCL compiler:
https://github.com/KhronosGroup/SPIR/tree/spirv-1.0
Patch by Liu Yaxun (Sam)!
Differential Revision: http://reviews.llvm.org/D16686
llvm-svn: 261350
Llvm module object is shared between CodeGenerator and BackendConsumer,
in both classes it is stored as std::unique_ptr, which is not a good
design solution and can cause double deletion error. Usually it does
not occur because in BackendConsumer::HandleTranslationUnit the
ownership of CodeGenerator over the module is taken away. If however
this method is not called, the module is deleted twice and compiler crashes.
As the module owned by BackendConsumer is always the same as CodeGenerator
has, pointer to llvm module can be removed from BackendGenerator.
Differential Revision: http://reviews.llvm.org/D15450
llvm-svn: 261222
Patch fixes bug with codegen for lastprivate loop counters. Also it may
improve performance for lastprivates calculations in some cases.
llvm-svn: 261209
The assert is triggered because isObjCRetainableType() is called on the
canonicalized return type that has been stripped of the typedefs and
attributes attached to it. To fix this assert, this commit gets the
original return type from CurCodeDecl or BlockInfo and uses it instead
of the canoicalized type.
rdar://problem/24470031
Differential Revision: http://reviews.llvm.org/D16914
llvm-svn: 261151
Summary: Use the new pipeline implemented in D17115
Reviewers: tejohnson
Subscribers: joker.eph, cfe-commits
Differential Revision: http://reviews.llvm.org/D17272
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 261045
Expressions inside 'schedule'|'dist_schedule' clause must be captured in
combined directives to avoid possible crash during codegen. Patch
improves handling of such constructs
llvm-svn: 260954
Sync barrier will be emitted after generation of firstprivate variables
only if one of the firstprivate vars is used in lastprivate clause.
llvm-svn: 260877
Summary:
Unlike other outlined regions in OpenMP, offloading entry points have to have be visible (external linkage) for the device side. Using dots in the names of the entries can be therefore problematic for some toolchains, e.g. NVPTX.
Also the patch drops the column information in the unique name of the entry points. The parsing of directives ignore unknown tokens, preventing several target regions to be implemented in the same line. Therefore, the line information is sufficient for the name to be unique. Also, the preprocessor printer does not preserve the column information, causing offloading-entry detection issues if the host uses an integrated preprocessor and the target doesn't (or vice versa).
Reviewers: hfinkel, arpith-jacob, carlo.bertolli, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17179
llvm-svn: 260837
This commit changes the root from "Simple C/C++ TBAA" to "Simple C++ TBAA" for
C++.
The problem is that the type name in the TBAA nodes is generated differently
for C vs C++. If we link an IR file for C with an IR file for C++, since they
have the same root and the type names are different, accesses to the two type
nodes will be considered no-alias, even though the two type nodes are from
the same type in a header file.
The fix is to use different roots for C and C++. Types from C will be treated
conservatively in respect to types from C++.
Follow-up commits will change the C root to "Simple C TBAA" plus some mangling
change for C types to make it a little more aggresive.
llvm-svn: 260567
This reverts commit r260449.
We would supress our emission of vftable definitions if we thought
another translation unit would provide the definition because we saw an
explicit instantiation declaration. This is not the case with
dllimport, we want to synthesize a definition of the vftable regardless.
This fixes PR26569.
llvm-svn: 260548
The current macho linker just copies symbols in section datacoal_nt to
section data, so it doesn't really matter whether or not section
"datacoal_nt" is attached to the global variable.
This is a follow-up to r250370, which made changes in llvm to stop
putting functions and data in the *coal* sections.
rdar://problem/24528611
llvm-svn: 260496
OMPCapturedExprDecl allows caopturing not only of fielddecls, but also
other expressions. It also allows to simplify codegen for several
clauses.
llvm-svn: 260492
Summary:
We can't do the right thing, since there's no right thing to do, but at
least we can not crash the compiler.
Reviewers: majnemer, rnk
Subscribers: cfe-commits, jhen, tra
Differential Revision: http://reviews.llvm.org/D17103
llvm-svn: 260479
Referencing a dllimported vtable is impossible in a constexpr
constructor. It would be friendlier to C++ programmers if we
synthesized a copy of the vftable which referenced imported virtual
functions. This would let us initialize the object in a way which
preserves both the intent to import functionality from another DLL while
also making constexpr work.
Differential Revision: http://reviews.llvm.org/D17061
llvm-svn: 260388
being called with the wrong size: convert CGFunctionInfo to use TrailingObjects
and ask TrailingObjects to provide a working 'operator delete' for us.
llvm-svn: 260181
When handling 'if' statements, we crash if the condition and the consequent
branch are spanned by a single macro expansion.
The crash occurs because of a sanity 'reset' in popRegions(): if an expansion
exactly spans an entire region, we set MostRecentLocation to the start of the
expansion (its 'include location'). This ensures we don't handleFileExit()
ourselves out of the expansion before we're done processing all of the regions
within it. This is tested in test/CoverageMapping/macro-expressions.c.
This causes a problem when an expansion spans both the condition and the
consequent branch of an 'if' statement. MostRecentLocation is updated to the
start of the 'if' statement in popRegions(), so the file for the expansion
isn't exited by the time we're done handling the statement. We then crash with
'fatal: File exit not handled before popRegions'.
The fix for this is to detect these kinds of expansions, and conservatively
update MostRecentLocation to the end of expansion region containing the
conditional. I've added tests to make sure we don't have the same problem with
other kinds of statements.
rdar://problem/23630316
Differential Revision: http://reviews.llvm.org/D16934
llvm-svn: 260129
OpenMP 4.5 introduces privatization of non-static data members of current class in non-static member functions.
To correctly handle such kind of privatization a new (pseudo)declaration VarDecl-based node is added. It allows to reuse an existing code for capturing variables in Lambdas/Block/Captured blocks of code for correct privatization and codegen.
llvm-svn: 260077
A dllimport'd vtable always points one past the RTTI data, this means
that the initializer will never end up referencing the data. Our
emission is a harmless waste.
llvm-svn: 260062
Summary:
Different devices may in some cases require different code generation schemes in order to implement OpenMP. This is required not only for performance reasons, but also because it may not be possible to have the current (default) implementation working for these devices. E.g. GPU's cannot implement the same scheme a target such as powerpc or x86b would use, in the sense that it does not have the ability to fork threads, instead all the threads are always executing and need to be managed by the implementation.
This patch proposes a reorganization of the code in the OpenMP code generation to pave the way to have specialized implementation of OpenMP support. More than a "real" patch this is more a request for comments in order to understand if what is proposed is acceptable or if there are better/easier ways to do it.
In this patch part of the common OpenMP codegen infrastructure is moved to a new file under a new namespace (CGOpenMPCommon) so it can be shared between the default implementation and the specialized one. When CGOpenMPRuntime is created, an attempt to select a specialized implementation is done.
In the patch a specialization for nvptx targets is done which currently checks if the target is an OpenMP device and trap if it is not.
Let me know comments suggestions you may have.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: Hahnfeld, cfe-commits, fraggamuffin, caomhin, jholewinski
Differential Revision: http://reviews.llvm.org/D16784
llvm-svn: 259977
It is possible for enums to be created as part of their own
declcontext. We need to cache a placeholder to avoid the type being
created twice before hitting the cache.
<rdar://problem/24493203>
llvm-svn: 259975
Because the Decl is explicitly passed as nullptr further up the call chain, it
is possible to invoke isa on a nullptr, which will assert. Guard against the
nullptr.
Take the opportunity to reuse the helper method rather than re-implementing this
logic.
llvm-svn: 259874
This patch changes cc1 option -fprofile-instr-generate to an enum option
-fprofile-instrument={clang|none}. It also changes cc1 options
-fprofile-instr-generate= to -fprofile-instrument-path=.
The driver level option -fprofile-instr-generate and -fprofile-instr-generate=
remain intact. This change will pave the way to integrate new PGO
instrumentation in IR level.
Review: http://reviews.llvm.org/D16730
llvm-svn: 259811
Codegen for array sections/array subscripts worked only for expressions with arrays as base. Patch fixes codegen for bases with pointer/reference types.
llvm-svn: 259776
Avoid crashing when printing diagnostics for vtable-related CFI
errors. In diagnostic mode, the frontend does an additional check of
the vtable pointer against the set of all known vtable addresses and
lets the runtime handler know if it is safe to inspect the vtable.
http://reviews.llvm.org/D16823
llvm-svn: 259716