Commit Graph

102 Commits

Author SHA1 Message Date
Johannes Doerfert e2cfbfcc0c [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
2021-07-10 17:53:56 -05:00
Nico Weber d3e7491333 Revert Attributor patch series
Broke check-clang, see https://reviews.llvm.org/D102307#2869065
Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`
2021-07-10 16:15:55 -04:00
Johannes Doerfert 1d5711c3ee [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
2021-07-10 12:32:50 -05:00
Alexey Bataev c84a5448b5 [OPENMP]Fix PR50129: omp cancel parallel not working as expected.
Need to emit a call for __kmpc_cancel_barrier in the exit block for
__kmpc_cancel function call if cancellation of the parallel block is
requested.

Differential Revision: https://reviews.llvm.org/D103646
2021-06-04 08:24:55 -07:00
Mats Petersson 9091ecdae0 [OpenMP]Add support for workshare loop modifier in lowering
When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.

Also implements tests for the variants.

Differential Revision: https://reviews.llvm.org/D102008
2021-05-27 15:33:05 +01:00
Mats Petersson 86627be233 Revert "[OpenMP]Add support for workshare loop modifier in lowering"
This reverts commit ea4c5fb04c.
2021-05-27 13:09:47 +01:00
Mats Petersson ea4c5fb04c [OpenMP]Add support for workshare loop modifier in lowering
When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.

Also implements tests for the variants.

Differential Revision: https://reviews.llvm.org/D102008
2021-05-27 12:28:27 +01:00
Vitaly Buka 676a789a5b [NFC][OMP] Fix 'unused' warning 2021-05-24 17:14:38 -07:00
Fady Ghanim b43bb33eb5 [NFC] Removing leftover debug code
Removing a missed value::dump() used to debug during development of
OMPBuilder atomic.
2021-05-23 19:13:33 -04:00
Fady Ghanim 766ad7d0aa [OpenMP][OMPIRBuilder]Adding support for `omp atomic`
This patch adds support for generating `omp atomic` for all different
atomic clauses
2021-05-23 17:44:09 -04:00
Mats Petersson 7280f4b279 [OpenMP][MLIR]Add support for guided, auto and runtime scheduling
When using parallel loop construct, the OpenMP specification allows for
guided, auto and runtime as scheduling variants (as well as static and
dynamic which are already supported).

This adds the translation from MLIR to LLVM-IR for these scheduling
variants.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101435
2021-05-10 09:18:52 +00:00
Valentin Clement 63f8226f25 [OpenMPIRBuilder] Add createOffloadMaptypes and createOffloadMapnames functions
Add function to create the offload_maptypes and the offload_mapnames globals. These two functions
are used in clang. They will be used in the Flang/MLIR lowering as well.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D101503
2021-05-03 15:42:32 -04:00
Chirag Khandelwal fbd3548d1c [LLVM][OpenMP] Adding support for OpenMP sections construct in OpenMPIRBuilder
This patch adds section support in the OpenMP IRBuilder module, along with a test for the same.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D89671
2021-04-29 18:39:49 +05:30
Mats Petersson 517c3aee4d [OpenMP IRBuilder, MLIR] Add support for OpenMP do schedule dynamic
The implementation supports static schedule for Fortran do loops. This
implements the dynamic variant of the same concept.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D97393
2021-04-16 16:09:49 +01:00
cchen e0c2125d1d [OpenMP] Added codegen for masked directive
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D100514
2021-04-15 12:55:07 -05:00
Joseph Huber 8140d0ec4a [OpenMP] Change OMPIRBuilder to append function attributes
Summary:
Currently the OMPIRBuilder overwrites the function's existing attributes
when it assigns the ones defined in OMPKinds.def. This changes the
behaviour to append the current function's attributes with them instead.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D98740
2021-03-24 09:08:29 -04:00
Valentin Clement d709dcc090 [openacc][openmp] Reduce number of generated file and prefer inclusion of .inc
Follow up from D92955 and D83636. This patch makes the base cpp files
OMP.cpp and ACC.cpp normal files and they now include the XXX.inc file
generated by tablegen. This reduces the number of file generated by the
DirectiveEmitter backend and makes it closer to the proposal in D83636.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D93560
2021-03-23 09:16:53 -04:00
Johannes Doerfert 49ed3032ff Revert "[OpenMP] Do not propagate match extensions to nested contexts"
Two tests failed for some reason, need to investigate:
  https://lab.llvm.org/buildbot/#/builders/109/builds/10399

This reverts commit ad9e98b8ef.
2021-03-11 23:48:36 -06:00
Johannes Doerfert ad9e98b8ef [OpenMP] Do not propagate match extensions to nested contexts
If we have nested declare variant context, it doesn't make sense to
inherit the match extension from the parent. Instead, just skip it.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95764
2021-03-11 23:31:21 -06:00
Nikita Popov 46354bac76 [OpaquePtrs] Remove some uses of type-less CreateLoad APIs (NFC)
Explicitly pass loaded type when creating loads, in preparation
for the deprecation of these APIs.

There are still a couple of uses left.
2021-03-11 14:40:57 +01:00
Michael Kruse b119120673 [clang][OpenMP] Use OpenMPIRBuilder for workshare loops.
Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to:
 * Only the worksharing-loop directive.
 * Recognizes only the nowait clause.
 * No loop nests with more than one loop.
 * Untested with templates, exceptions.
 * Semantic checking left to the existing infrastructure.

This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics:
 * The distance function: How many loop iterations there will be before entering the loop nest.
 * The loop variable function: Conversion from a logical iteration number to the loop variable.

These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang.

The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does.

For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting.

Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D94973
2021-03-04 22:52:59 -06:00
Michael Kruse 26b5be66f9 [OpenMPIRBuilder] Implement collapseLoops.
The collapseLoops method implements a transformations facilitating the implementation of the collapse-clause. It takes a list of loops from a loop nest and reduces it to a single loop that can be used by other methods that are implemented on just a single loop, such as createStaticWorkshareLoop.

This patch shares some changes with D92974 (such as adding some getters to CanonicalLoopNest), used by both patches.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D93268
2021-02-03 19:12:02 -06:00
Michael Kruse b890fafe67 [OpenMPIRBuilder] Silence compiler warning. NFC.
Address the compiler warning
OMPIRBuilder.cpp:1232:27: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [-Wsign-compare]
2021-01-23 21:00:37 -06:00
Michael Kruse b7dee667b6 [OpenMPIRBuilder] Implement tileLoops.
The  tileLoops method implements the code generation part of the tile directive introduced in OpenMP 5.1. It takes a list of loops forming a loop nest, tiles it, and returns the CanonicalLoopInfo representing the generated loops.

The implementation takes n CanonicalLoopInfos, n tile size Values and returns 2*n new CanonicalLoopInfos. The input CanonicalLoopInfos are invalidated and BBs not reused in the new loop nest removed from the function.

In a modified version of D76342, I was able to correctly compile and execute a tiled loop nest.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D92974
2021-01-23 19:39:29 -06:00
Kazu Hirata cfa241680f [llvm] Don't include StringSwitch.h where unnecessary (NFC) 2021-01-21 19:59:48 -08:00
Giorgis Georgakoudis 9751705512 [OpenMPOpt][WIP] Expand parallel region merging
The existing implementation of parallel region merging applies only to
consecutive parallel regions that have speculatable sequential
instructions in-between. This patch lifts this limitation to expand
merging with any sequential instructions in-between, except calls to
unmergable OpenMP runtime functions. In-between sequential instructions
in the merged region are sequentialized in a "master" region and any
output values are broadcasted to the following parallel regions and the
sequential region continuation of the merged region.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D90909
2021-01-11 08:06:23 -08:00
Johannes Doerfert 0b0f2e6ee0 [OpenMP][FIX] Avoid string literal comparison, use `StringRef::equal` 2021-01-07 14:53:20 -06:00
Johannes Doerfert d970a285b8 [OpenMP][Fix] Make the arch selector for x86_64 work
The triple uses a bar "x86-64" instead of an underscore. Since we
have troubles accepting x86-64 as an identifier, we stick with
x86_64 in the frontend and translate it explicitly.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D93786
2021-01-07 14:31:18 -06:00
Brandon Bergren 2288319733 [PowerPC] Enable OpenMP for powerpcle target. [5/5]
Enable OpenMP for powerpcle to match the rest of powerpc*.

Update tests.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92445
2021-01-02 12:18:07 -06:00
Michael Kruse ba0265a8d8 [OpenMPIRBuilder] Various changes required for tileLoops.
Extract some changes not directly related to tileLoops out of D92974:
 * Refactor `createLoopSkeleton` out of `createCanonicalLoop`.
 * Introduce `ComputeIP` parameter to the `createCanonicalLoop` overload inserts instructions to compute the trip count. Specifying the location is necessary to make these instructions appear before the outermost loop of a loop nest that is tiled.
 * Introduce `Name` parameter to `createCanonicalLoop`. This can help better understanding the origin of values of basic blocks with many loops. The default value is "loop" instead of "for" which could be confused with the "for directive" (aka worksharing-loop) and does not apply to Fortran.
 * Remove `CanonicalLoopInfo::eraseFromParent` which is currently unused and untested and was added in anticipation to be used by `tileLoops`. `eraseFromParent` has shown to be insufficient when more than a single loop is involved and is replaced by `removeUnusedBlocksFromParent` in D92974.

Reviewed By: SouraVX

Differential Revision: https://reviews.llvm.org/D93088
2020-12-11 11:37:45 -06:00
Alex Zinenko f31704f8ae [OpenMPIRBuilder] Put the barrier in the exit block in createWorkshapeLoop
The original code was inserting the barrier at the location given by the
caller. Make sure it is always inserted at the end of the loop exit block
instead.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D92849
2020-12-09 11:33:04 +01:00
Alex Zinenko c102c783cd [OpenMPIRBuilder] introduce createStaticWorkshareLoop
Introduce a function that creates a statically-scheduled workshare loop
out of a canonical loop created earlier by the OpenMPIRBuilder. This
basically amounts to injecting runtime calls to the preheader and the
after block and updating the trip count. Static scheduling kind is
currently hardcoded and needs to be extracted from the runtime library
into common TableGen definitions.

Differential Revision: https://reviews.llvm.org/D92476
2020-12-07 22:30:59 +01:00
Alex Zinenko 240dd92432 [OpenMPIRBuilder] forward arguments as pointers to outlined function
OpenMPIRBuilder::createParallel outlines the body region of the parallel
construct into a new function that accepts any value previously defined outside
the region as a function argument. This function is called back by OpenMP
runtime function __kmpc_fork_call, which expects trailing arguments to be
pointers. If the region uses a value that is not of a pointer type, e.g. a
struct, the produced code would be invalid. In such cases, make createParallel
emit IR that stores the value on stack and pass the pointer to the outlined
function instead. The outlined function then loads the value back and uses as
normal.

Reviewed By: jdoerfert, llitchev

Differential Revision: https://reviews.llvm.org/D92189
2020-12-02 14:59:41 +01:00
Nick Lewycky fe43168348 Creating a named struct requires only a Context and a name, but looking up a struct by name requires a Module. The method on Module merely accesses the LLVMContextImpl and no data from the module itself, so this patch moves getTypeByName to a static method on StructType that takes a Context and a name.
There's a small number of users of this function, they are all updated.

This updates the C API adding a new method LLVMGetTypeByName2 that takes a context and a name.

Differential Revision: https://reviews.llvm.org/D78793
2020-11-30 11:34:12 -08:00
Alex Richardson 51e09e1d5a [AMDGPU] Set the default globals address space to 1
This will ensure that passes that add new global variables will create them
in address space 1 once the passes have been updated to no longer default
to the implicit address space zero.
This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't
already to present to ensure bitcode backwards compatibility.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D84345
2020-11-20 15:46:53 +00:00
serge-sans-paille 733f7b5084 Revert "[build] normalize components dependencies"
This reverts commit c6ef6e1690.

Basically, publicly linked libraries have a different semantic than components,
which link libraries privately.

Differential Revision: https://reviews.llvm.org/D91461
2020-11-18 19:23:11 +01:00
serge-sans-paille c6ef6e1690 [build] normalize components dependencies
Use LINK_COMPONENTS instead of explicit target_link_libraries for components.
This avoids redundancy and potential inconsistencies.

Differential Revision: https://reviews.llvm.org/D91461
2020-11-17 10:42:34 +01:00
Mehdi Amini ac06b1af40 Revert "Switch libLLVMFrontendOpenACC to be a regular CMake library and not a "component""
This reverts commit e7ed276532.

Build is broken with  -DLLVM_LINK_LLVM_DYLIB=ON
2020-11-14 04:14:17 +00:00
Mehdi Amini e7ed276532 Switch libLLVMFrontendOpenACC to be a regular CMake library and not a "component"
This library is only used in Flang at the moment and not tested withing LLVM.
Having it as a component is breaking llvm-config:

  $ bin/llvm-config --shared-mode
  llvm-config: error: component libraries and shared library

  llvm-config: error: missing: [...]/lib/libLLVMFrontendOpenACC.a

This will reverted when unit-tests are provided for it.

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D91470
2020-11-14 02:18:25 +00:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Michael Kruse e5dba2d7e5 [OMPIRBuilder] Start 'Create' methods with lower case. NFC.
For consistency with the IRBuilder, OpenMPIRBuilder has method names starting with 'Create'. However, the LLVM coding style has methods names starting with lower case letters, as all other OpenMPIRBuilder already methods do. The clang-tidy configuration used by Phabricator also warns about the naming violation, adding noise to the reviews.

This patch renames all `OpenMPIRBuilder::CreateXYZ` methods to `OpenMPIRBuilder::createXYZ`, and updates all in-tree callers.

I tested check-llvm, check-clang, check-mlir and check-flang to ensure that I did not miss a caller.

Reviewed By: mehdi_amini, fghanim

Differential Revision: https://reviews.llvm.org/D91109
2020-11-09 19:35:11 -06:00
Michael Kruse f44ee0f5e7 [OpenMPIRBuilder] Implement CreateCanonicalLoop.
CreateCanonicalLoop generates a standardized control flow structure for OpenMP canonical for loops. The structure can be consumed by loop-associated directives such as worksharing-loop, distribute, simd etc. as well as loop transformations such as tile and unroll.

This is a first design without considering all complexities yet. The control-flow emits more basic block than strictly necessary, but these will be optimized by CFGSimplify anyway, provide a nice separation of concerns and might later be useful with more complex scenarios. I successfully implemented a basic tile construct using this API, which is not part of this patch.

The fundamental building block is the CreateCanonicalLoop that only takes the loop trip count and operates on the logical iteration spaces only. An overloaded CreateCanonicalLoop for using LB, UB, Increment is provided as well, but at least for C++, Clang will need to implement a loop counter to logical induction variable mapping anyway, since iterator overload resolution cannot be done in LLVMFrontend.

As there currently is no user for CreateCanonicalLoop, it is only called from unittests. Similarly, CanonicalLoopInfo::eraseFromParent() is used in my file implementation and might be generally useful for implementing loop-associated constructs, but is not used in this patch itself.

The following non-exhaustive list describes not yet covered items:
 * collapse clause (including non-rectangular and non-perfectly nested); idea is to provide a OpenMPIRBuilder::collapseLoopNest method consuming multiple nested loops and returning a new CanonicalLoopInfo that can be used for loop-associated directives.
 * simarly: ordered clause for DOACROSS loops
 * branch weights
 * Cancellation point (?)
 * AllocaIP
 * break statement (if needed at all)
 * Exceptions (if not completely handled in the front-end)
  * Using it in Clang; this requires implementing at least one loop-associated construct.
 * ...

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D90830
2020-11-09 15:03:32 -06:00
Jon Chesterfield 09bc755dea [OpenMP] Emit calls to int64_t functions for amdgcn
[OpenMP] Emit calls to int64_t functions for amdgcn

Two functions, syncwarp and active_thread_mask, return lanemask_t. Currently
this is assumed to be int32, which is true for nvptx. Patch makes the type
target architecture dependent.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89746
2020-10-22 15:02:47 +01:00
Fady Ghanim aaa93a681b [OpenMP][OMPBuilder] Adding support for `omp single`
This adds support for generating `omp single`, and necessary calls for
`copyprivate` clause.

Differential Revision: https://reviews.llvm.org/D85617
2020-08-16 01:15:16 -04:00
Johannes Doerfert 9240e48a58 [OpenMP][OMPIRBuilder] Use the source (=directory + filename) for locations
Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D85938
2020-08-14 08:59:25 -05:00
Johannes Doerfert fa5d22a045 [OpenMP][NFC] Reuse OMPIRBuilder `struct ident_t` handling in Clang
Replace the `ident_t` handling in Clang with the methods offered by the
OMPIRBuilder. This cuts down on the clang code as well as the
differences between the two, making further transitions easier. Tests
have changed but there should not be a real functional change. The most
interesting difference is probably that we stop generating local ident_t
allocations for now and just use globals. Given that this happens only
with debug info, the location part of the `ident_t` is probably bigger
than the test anyway. As the location part is already a global, we can
avoid the allocation, memcpy, and store in favor of a constant global
that is slightly bigger. This can be revisited if there are
complications.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D80735
2020-08-10 17:13:26 -05:00
Johannes Doerfert 19756ef53a [OpenMP][IRBuilder] Support allocas in nested parallel regions
We need to keep track of the alloca insertion point (which we already
communicate via the callback to the user) as we place allocas as well.

Reviewed By: fghanim, SouraVX

Differential Revision: https://reviews.llvm.org/D82470
2020-07-30 10:19:39 -05:00
Johannes Doerfert ee05167cc4 [OpenMP] Allow traits for the OpenMP context selector `isa`
It was unclear what `isa` was supposed to mean so we did not provide any
traits for this context selector. With this patch we will allow *any*
string or identifier. We use the target attribute and target info to
determine if the trait matches. In other words, we will check if the
provided value is a target feature that is available (at the call site).

Fixes PR46338

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D83281
2020-07-29 10:22:27 -05:00
Johannes Doerfert 7af287d0d9 [OpenMP][IRBuilder] Support nested parallel regions
During code generation we might change/add basic blocks so keeping a
list of them is fairly easy to break. Nested parallel regions were
enough. The new scheme does recompute the list of blocks to be outlined
once it is needed.

Reviewed By: anchu-rajendran

Differential Revision: https://reviews.llvm.org/D82722
2020-07-14 22:39:06 -05:00
Valentin Clement 0a90ffa772 [flang][openacc] OpenACC 3.0 parser
Summary:
This patch introduce the parser for OpenACC 3.0 in Flang. It uses the same TableGen mechanism
than OpenMP.

Reviewers: nvdatian, sscalpone, tskeith, klausler, ichoyjx, jdoerfert, DavidTruby

Reviewed By: klausler

Subscribers: MaskRay, SouraVX, mgorny, hiraditya, jfb, sstefan1, llvm-commits

Tags: #llvm, #flang

Differential Revision: https://reviews.llvm.org/D83649
2020-07-14 14:29:40 -04:00