This patch adds OMPIRBuilder support for the simd directive (without any clause). This will be a first step towards lowering simd directive in LLVM_Flang. The patch uses existing CanonicalLoop infrastructure of IRBuilder to add the support. Also adds necessary code to add llvm.access.group and llvm.loop metadata wherever needed.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114379
This patch changes the visiblity of variables declared within a declare
target directive. Variable declarations within a declare target
directive need to be externally visible to the plugin for initialization
or reading. Previously this would cause runtime errors where the named
global could not be found because it was not included in the symbol
table.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D117362
This patch changes the visiblity of variables declared within a declare
target directive. Variable declarations within a declare target
directive need to be externally visible to the plugin for initialization
or reading. Previously this would cause runtime errors where the named
global could not be found because it was not included in the symbol
table.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D117362
This patch changes the visibility of the `__omp_rtl_debug_kind` variable
to be hidden. These variables are only used by the plugin so they do not
need to be read externally. Previously the default visibility prevented
these variables from being completely eliminated in the module.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D117320
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
Since 2959e082e1, we conservatively
assume all inputs are enabled by default. This isn't the best
interface for controlling these anyway, since it's not granular and
only allows trimming the last fields.
EHTerminateScope is used to implement C++ noexcept semantics. Per C++
[except.terminate], it is implemented-defined whether no, some, or all
cleanups are run prior to terminatation.
Therefore, the code to run cleanups on the way towards termination is
unnecessary, and may be omitted.
After this change, we will still run some cleanups: any cleanups in a
function called from the noexcept function will continue to run, while
those in the noexcept function itself will not.
Differential Revision: https://reviews.llvm.org/D113620
Use ASTContext::getTypeDeclType() to get type of omp_interop_t since
TypeDecl::getTypeForDecl() may return null if TypeForDecl is not
setup yet.
Handle functions where the function type is under an AttributedType.
Differential Revision: https://reviews.llvm.org/D117172
Passing any feature in the device-isa trait which is not supported by the host
was causing a compilation failure.
Differential Revision: https://reviews.llvm.org/D116549
Alternative to D116817.
This introduces a new value-based folding interface for Or (FoldOr),
which takes 2 values and returns an existing Value or a constant if the
Or can be simplified. Otherwise nullptr is returned. This replaces the
more restrictive CreateOr which takes 2 constants.
This is the used to implement a folder that uses InstructionSimplify.
The logic to simplify `Or` instructions is moved there. Subsequent
patches are going to transition other CreateXXX to the more general
FoldXXX interface.
Reviewed By: nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D116935
The `EmitDeclareOfAutoVariable` introduced in D114504 and D115510 has a
precondition that cannot be violated. It is unclear if we should call it
directly given the sparse usage in clang but for now we should at least
not crash if the debug info kind is too low.
Fixes#52938.
Differential Revision: https://reviews.llvm.org/D116865
The fold for merging a GEP of GEP into a single GEP currently bails
if doing so would result in notional overindexing. The justification
given in the comment above this check is dangerously incorrect: GEPs
with notional overindexing are perfectly fine, and if some code
treats them incorrectly, then that code is broken, not the GEP.
Such a GEP might legally appear in source IR, so only preventing
its creation cannot be sufficient. (The constant folder also ends
up canonicalizing the GEP to remove the notional overindexing, but
that's neither here nor there.)
This check dates back to
bd4fef4a89,
and as far as I can tell the original issue this was trying to
patch around has since been resolved.
Differential Revision: https://reviews.llvm.org/D116587
One of the unused ident_t fields now holds the size of the string
(=const char *) field so we have an easier time dealing with those
in the future.
Differential Revision: https://reviews.llvm.org/D113126
This patch changes the default aligntment from 8 to 16, and encodes this
information in the `__kmpc_alloc_shared` runtime call to communicate it
to the HeapToStack pass. The previous alignment of 8 was not sufficient
for the maximum size of primitive types on 64-bit systems, and needs to
be increaesd. This reduces the amount of space availible in the data
sharing stack, so this implementation will need to be improved later to
include the alignment requirements in the allocation call, and use it
properly in the data sharing stack in the runtime.
Depends on D115888
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D115971
This patch adds the support for `atomic compare` in parser. The support
in Sema and CodeGen will come soon. For now, it simply eimits an error when it
is encountered.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D115561
MakeNaturalAlignAddrLValue() expects the pointee type, but the
pointer type was passed. As a result, the natural alignment of
the pointer (usually 8) was always used in place of the natural
alignment of the value type.
Differential Revision: https://reviews.llvm.org/D116171
Need to look through the base of the member function calls at the DSA
analysis stage to correctly capture implicit class instances.
Differential Revision: https://reviews.llvm.org/D115902
The problem with the old scheme is that we would need to keep track of
the "next region" and reset the num_threads value after it. The new RT
doesn't do it and an assertion is triggered. The old RT doesn't do it
either, I haven't tested it but I assume a num_threads clause might
impact multiple parallel regions "accidentally". Further, in SPMD mode
num_threads was simply ignored, for some reason beyond me.
In any case, parallel_51 is designed to take the clause value directly,
so let's do that instead.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D113623
WG14 adopted the _ExtInt feature from Clang for C23, but renamed the
type to be _BitInt. This patch does the vast majority of the work to
rename _ExtInt to _BitInt, which accounts for most of its size. The new
type is exposed in older C modes and all C++ modes as a conforming
extension. However, there are functional changes worth calling out:
* Deprecates _ExtInt with a fix-it to help users migrate to _BitInt.
* Updates the mangling for the type.
* Updates the documentation and adds a release note to warn users what
is going on.
* Adds new diagnostics for use of _BitInt to call out when it's used as
a Clang extension or as a pre-C23 compatibility concern.
* Adds new tests for the new diagnostic behaviors.
I want to call out the ABI break specifically. We do not believe that
this break will cause a significant imposition for early adopters of
the feature, and so this is being done as a full break. If it turns out
there are critical uses where recompilation is not an option for some
reason, we can consider using ABI tags to ease the transition.
Need to do the analysis of the captured expressions in the clauses.
Previously the compiler ignored them and it may lead to a compiler crash
trying to get the address of the mapped variables.
Differential Revision: https://reviews.llvm.org/D114546
Currently the last value of linear is calculated as var = init + num_iters * step.
Replaced it with var = var_priv, i.e. original variable gets the value
of the last private copy.
Differential Revision: https://reviews.llvm.org/D105151
Need to postpone anlysis of the ranged for loops till the actual
instantiation to avoid erroneous emission of error messages.
Differential Revision: https://reviews.llvm.org/D114560
Need to postpone analysis for addressable lvalue in a depend clause with
iterators, otherwise the incorrect error message is emitted.
Differential Revision: https://reviews.llvm.org/D114653
Fix for the case when there are no instructions in the entry basic block before the call
to `createSections`
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114143
Currently variables appearing inside private/firstprivate/lastprivate
clause of openmp task construct are not visible inside lldb debugger.
This is because compiler does not generate debug info for it.
Please consider the testcase debug_private.c attached with patch.
```
28 #pragma omp task shared(res) private(priv1, priv2) firstprivate(fpriv)
29 {
30 priv1 = n;
31 priv2 = n + 2;
32 printf("Task n=%d,priv1=%d,priv2=%d,fpriv=%d\n",n,priv1,priv2,fpriv);
33
-> 34 res = priv1 + priv2 + fpriv + foo(n - 1);
35 }
36 #pragma omp taskwait
37 return res;
(lldb) p priv1
error: <user expression 0>:1:1: use of undeclared identifier 'priv1'
priv1
^
(lldb) p priv2
error: <user expression 1>:1:1: use of undeclared identifier 'priv2'
priv2
^
(lldb) p fpriv
error: <user expression 2>:1:1: use of undeclared identifier 'fpriv'
fpriv
^
```
After the current patch, lldb is able to show the variables
```
(lldb) p priv1
(int) $0 = 10
(lldb) p priv2
(int) $1 = 12
(lldb) p fpriv
(int) $2 = 14
```
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D114504
Eachempati.
This patch adds clang (parsing, sema, serialization, codegen) support for the 'depend' clause on the 'taskwait' directive.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D113540
After the changes introduced by D106799 it is possible to tag
outlined function with both AlwaysInline and NoInline attributes using
-fno-inline command line options.
This issue is similiar to D107649.
Differential Revision: https://reviews.llvm.org/D112645
Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf`
which takes the same const char*, void* arguments as cuda vprintf and also
passes the size of the void* alloca which will be needed by a non-stub
implementation of `__llvm_omp_vprintf` for amdgpu.
This removes the amdgpu link error on any printf in a target region in favour
of silently compiling code that doesn't print anything to stdout.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112680
at the start of the entry block, which in turn would aid better code transformation/optimization.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D110257
This patch removes the assumption propagation that was added in D110655
primarily to get assumption informatino on opaque call sites for
optimizations. The analysis done in D111445 allows us to do this more
intelligently in the back-end.
Depends on D111445
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D111463
The existing CGOpenMPRuntimeAMDGCN and CGOpenMPRuntimeNVPTX classes are
just code bloat. By removing them, the codebase gets a bit cleaner.
Reviewed By: jdoerfert, JonChesterfield, tianshilei1992
Differential Revision: https://reviews.llvm.org/D113421