This enabled opaque pointers by default in LLVM. The effect of this
is twofold:
* If IR that contains *neither* explicit ptr nor %T* types is passed
to tools, we will now use opaque pointer mode, unless
-opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
It is possible to opt-out by calling setOpaquePointers(false) on
LLVMContext.
A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.
Differential Revision: https://reviews.llvm.org/D126689
This patch adds basic support for `omp task` to the OpenMPIRBuilder.
The outlined function after code extraction is called from a wrapper function with appropriate arguments. This wrapper function is passed to the runtime calls for task allocation.
This approach is different from the Clang approach - clang directly emits the runtime call to the outlined function. The outlining utility (OutlineInfo) simply outlines the code and generates a function call to the outlined function. After the function has been generated by the outlining utility, there is no easy way to alter the function arguments without meddling with the outlining itself. Hence the wrapper function approach is taken.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D71989
The callback is expected to create a branch to the ContinuationBB (sometimes called FiniBB in some lambdas) argument when finishing. This creates problems:
1. The InsertPoint used for CodeGenIP does not need to be the end of a block. If it is not, a naive callback will insert a branch instruction into the middle of the block.
2. The BasicBlock the CodeGenIP is pointing to may or may not have a terminator. There is an conflict where to branch to if the block already has a terminator.
3. Some API functions work only with block having a terminator. Some workarounds have been used to insert a temporary terminator that is removed again.
4. Some callbacks are sensitive to whether the BasicBlock has a terminator or not. This creates a callback ordering problem where different callback may have different behaviour depending on whether a previous callback created a terminator or not. The problem also exists for FinalizeCallbackTy where some callbacks do create branch to another "continue" block, but unlike BodyGenCallbackTy does not receive the target as argument. This is not addressed in this patch.
With this patch, the callback receives an CodeGenIP into a BasicBlock where to insert instructions. If it has to insert control flow, it can split the block at that position as needed but otherwise no separate ContinuationBB is needed. In particular, a callback can be empty without breaking the emitted IR. If the caller needs the control flow to branch to a specific target, it can insert the branch instruction itself and pass an InsertPoint before the terminator to the callback.
Certain frontends such as Clang may expect the current IRBuilder position to be at the end of a basic block. In this case its callbacks must split the block at CodeGenIP before setting the IRBuilder position such that the instructions after CodeGenIP are moved to another basic block and before returning create a new branch instruction to the split block.
Some utility functions such as `splitBB` are supporting correct splitting of BasicBlocks, independent of whether they have a terminator or not, returning/setting the InsertPoint of an IRBuilder to the end of split predecessor block, and optionally omitting creating a branch to the split successor block to be added later.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D118409
This reverts commit af0285122f.
The test "libomp::loop_dispatch.c" on builder
openmp-gcc-x86_64-linux-debian fails from time-to-time.
See #54969. This patch is unrelated.
The OMPScheduleType enum stores the constants from libomp's internal sched_type in kmp.h and are used by several kmp API functions. The enum values have an internal structure, namely each scheduling algorithm (e.g.) exists in four variants: unordered, orderend, normerge unordered, and nomerge ordered.
This patch (basically a followup to D114940) splits the "ordered" and "nomerge" bits into separate flags, as was already done for the "monotonic" and "nonmonotonic", so we can apply bit flags operations on them. It also now contains all possible combinations according to kmp's sched_type. Deriving of the OMPScheduleType enum from clause parameters has been moved form MLIR's OpenMPToLLVMIRTranslation.cpp to OpenMPIRBuilder to make available for clang as well. Since the primary purpose of the flag is the binary interface to libomp, it has been made more private to LLVMFrontend. The primary interface for generating worksharing-loop using OpenMPIRBuilder code becomes `applyWorkshareLoop` which derives the OMPScheduleType automatically and calls the appropriate emitter function.
While this is mostly a NFC refactor, it still applies the following functional changes:
* The logic from OpenMPToLLVMIRTranslation to derive the OMPScheduleType also applies to clang. Most notably, it now applies the nonmonotonic flag for non-static schedules by default.
* In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was previously not applied if the simd modifier was used. I assume this was a bug, since the effect was due to `loop.schedule_modifier()` returning `mlir::omp::ScheduleModifier::none` instead of `llvm::Optional::None`.
* In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was set even if ordered was specified, in breach to what the comment before citing the OpenMP specification says. I assume this was an oversight.
The ordered flag with parameter was not considered in this patch. Changes will need to be made (e.g. adding/modifying function parameters) when support for it is added. The lengthy names of the enum values can be discussed, for the moment this is avoiding reusing previously existing enum value names such as `StaticChunked` to avoid confusion.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D123403
This patch supports ordered clause specified without parameter in
worksharing-loop directive in the OpenMPIRBuilder and lowering MLIR to
LLVM IR.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114940
This patch adds the nowait parameter to `createSingle` in
OpenMPIRBuilder and handling for IR generation from OpenMP Dialect.
Also added tests for the same.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D122371
This patch fixes the condition for emitting atomic update using
`atomicrmw` instruction or compare-exchange loop.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D121546
Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads.
This patch includes the following related changes:
* Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder.
* Remove the chunk parameter from applyStaticWorkshareLoop, it is ignored by the runtime. Change the value for the value passed to the init function to 0, as without OpenMPIRBuilder.
* Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both, applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop.
* Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause.
Differential Revision: https://reviews.llvm.org/D114413
This patch fixes `createAtomicUpdate` for lowering with float types.
Test added for the same.
This patch also changes the alloca argument for createAtomicUpdate and
createAtomicCapture from `Instruction*` to `InsertPointTy`. This is in
line with the other functions of the OpenMPIRBuilder class which take
AllocaIP as an `InsertPointTy`.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D118227
With opaque pointers, we can no longer derive this from the pointer
type, so we need to explicitly provide the element type the atomic
operation should work with.
Differential Revision: https://reviews.llvm.org/D118359
Summary:
This patch modifies code generation in OpenMPIRBuilder to pass arguments
to the parallel region outlined function in an aggregate (struct),
besides the global_tid and bound_tid arguments. It depends on the
updated CodeExtractor (see D96854) for support. It mirrors functionality
of Clang codegen (see D102107).
Differential Revision: https://reviews.llvm.org/D110114
When a Builder methods accepts multiple InsertPoints, when both point to
the same position, inserting instructions at one position will "move" the
other after the inserted position since the InsertPoint is pegged to the
instruction following the intended InsertPoint. For instance, when
creating a parallel region at Loc and passing the same position as AllocaIP,
creating instructions at Loc will "move" the AllocIP behind the Loc
position.
To avoid this ambiguity, add an assertion checking this condition and
fix the unittests.
In case of AllocaIP, an alternative solution could be to implicitly
split BasicBlock at InsertPoint, using the first as AllocaIP, the second
for inserting the instructions themselves. However, this solution is
specific to AllocaIP since AllocaIP will always have to be first. Hence,
this is an argument to generally handling ambiguous InsertPoints as API
sage error.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D117226
This patch adds OMPIRBuilder support for the simd directive (without any clause). This will be a first step towards lowering simd directive in LLVM_Flang. The patch uses existing CanonicalLoop infrastructure of IRBuilder to add the support. Also adds necessary code to add llvm.access.group and llvm.loop metadata wherever needed.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114379
OpenMP runtime requires depend vec with i64 type and the alignment of
store instruction should be set as 8.
Reviewed By: kiranchandramohan, shraiysh
Differential Revision: https://reviews.llvm.org/D116300
This patch adds lowering from omp.sections and omp.section (simple lowering along with the nowait clause) to LLVM IR.
Tests for the same are also added.
Reviewed By: ftynse, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D115030
Make the reduction handling in OpenMPIRBuilder compatible with
opaque pointers by explicitly storing the element type in ReductionInfo,
and also passing it to the atomic reduction callback, as at least
the ones in the test need the type there.
This doesn't make things fully compatible yet, there are other
uses of element types in this class. I also left one
getPointerElementType() call in mlir, because I'm not familiar
with that area.
Differential Revison: https://reviews.llvm.org/D115638
Fix for the case when there are no instructions in the entry basic block before the call
to `createSections`
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114143
This patch adds the inclusive clause (which was missed in previous
reorganization - https://reviews.llvm.org/D110903) in omp.wsloop operation.
Added a test for validating it.
Also fixes the order clause, which was not accepting any values. It now accepts
"concurrent" as a value, as specified in the standard.
Reviewed By: kiranchandramohan, peixin, clementval
Differential Revision: https://reviews.llvm.org/D112198
Recommit of 707ce34b06. Don't introduce a
dependency to the LLVMPasses component, instead register the required
passes individually.
Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are:
* `unrollLoopFull`
* `unrollLoopPartial`
* `unrollLoopHeuristic`
`unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heurstics as user by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic` allows greater flexibility.
With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism.
Reviewed By: jdoerfert, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D107764
Add support for ordered directive in the OpenMPIRBuilder.
This patch also modidies clang to use the ordered directive when the
option -fopenmp-enable-irbuilder is enabled.
Also fix one ICE when parsing one canonical for loop with the relational
operator LE or GE in openmp region by replacing unary increment
operation of the expression of the variable "Expr A" minus the variable
"Expr B" (++(Expr A - Expr B)) with binary addition operation of the
experssion of the variable "Expr A" minus the variable "Expr B" and the
expression with constant value "1" (Expr A - Expr B + "1").
Reviewed By: Meinersbur, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D107430
Breaks build with -DBUILD_SHARED_LIBS=ON
```
CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle):
"LLVMFrontendOpenMP" of type SHARED_LIBRARY
depends on "LLVMPasses" (weak)
"LLVMipo" of type SHARED_LIBRARY
depends on "LLVMFrontendOpenMP" (weak)
"LLVMCoroutines" of type SHARED_LIBRARY
depends on "LLVMipo" (weak)
"LLVMPasses" of type SHARED_LIBRARY
depends on "LLVMCoroutines" (weak)
depends on "LLVMipo" (weak)
At least one of these targets is not a STATIC_LIBRARY. Cyclic dependencies are allowed only among static libraries.
CMake Generate step failed. Build files cannot be regenerated correctly.
```
This reverts commit 707ce34b06.
Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are:
* `unrollLoopFull`
* `unrollLoopPartial`
* `unrollLoopHeuristic`
`unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heurstics as user by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic` allows greater flexibility.
With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism.
Reviewed By: jdoerfert, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D107764
Add in-source documentation on how CanonicalLoopInfo is intended to be used. In particular, clarify what parts of a CanonicalLoopInfo is considered part of the loop, that those parts must be side-effect free, and that InsertPoints to instructions outside those parts can be expected to be preserved after method calls implementing loop-associated directives.
CanonicalLoopInfo are now invalidated after it does not describe canonical loop anymore and asserts when trying to use it afterwards.
In addition, rename `createXYZWorkshareLoop` to `applyXYZWorkshareLoop` and remove the update location to avoid that the impression that they insert something from scratch at that location where in reality its InsertPoint is ignored. createStaticWorkshareLoop does not return a CanonicalLoopInfo anymore. First, it was not a canonical loop in the clarified sense (containing side-effects in form of calls to the OpenMP runtime). Second, it is ambiguous which of the two possible canonical loops it should actually return. It will not be needed before a feature expected to be introduced in OpenMP 6.0
Also see discussion in D105706.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D107540
This introduces a builder function for emitting IR performing reductions in
OpenMP. Reduction variable privatization and initialization to the
reduction-neutral value is expected to be handled separately. The caller
provides the reduction functions. Further commits can provide implementation of
reduction functions for the reduction operators defined in the OpenMP
specification.
This implementation was tested on an MLIR fork targeting OpenMP from C and
produced correct executable code.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D104928
Need to emit a call for __kmpc_cancel_barrier in the exit block for
__kmpc_cancel function call if cancellation of the parallel block is
requested.
Differential Revision: https://reviews.llvm.org/D103646
When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.
Also implements tests for the variants.
Differential Revision: https://reviews.llvm.org/D102008
When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.
Also implements tests for the variants.
Differential Revision: https://reviews.llvm.org/D102008
When using parallel loop construct, the OpenMP specification allows for
guided, auto and runtime as scheduling variants (as well as static and
dynamic which are already supported).
This adds the translation from MLIR to LLVM-IR for these scheduling
variants.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101435