Commit Graph

14592 Commits

Author SHA1 Message Date
David Blaikie 8d9ddd4f50 DebugInfo: STN: Handle unreconstitutable types in function types 2021-09-23 21:13:16 -07:00
David Blaikie 25ac0d3c73 DebugInfo: Implement the -gsimple-template-names functionality
This excludes certain names that can't be rebuilt from the available
DWARF:

* Atomic types - no DWARF differentiating int from atomic int.
* Vector types - enough DWARF (an attribute on the array type) to do
  this, but I haven't written the extra code to add the attributes
  required for this
* Lambdas - ambiguous with any other unnamed class
* Unnamed classes/enums - would need column info for the type in
  addition to file/line number
* noexcept function types - not encoded in DWARF
2021-09-23 19:58:32 -07:00
Thomas Lively 2f519825ba [WebAssembly] Add prototype relaxed SIMD fma/fms instructions
Add experimental clang builtins, LLVM intrinsics, and backend definitions for
the new {f32x4,f64x2}.{fma,fms} instructions in the relaxed SIMD proposal:
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.
Do not allow these instructions to be selected without explicit user opt-in.

Differential Revision: https://reviews.llvm.org/D110295
2021-09-23 11:01:36 -07:00
Hongtao Yu d9b511d8e8 [CSSPGO] Set PseudoProbeInserter as a default pass.
Currenlty PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e, through the linker or llc), where user has always to pass -pseudo-probe-for-profiling explictly. I'm making the pass a default pass that requires no command line arg to trigger, but will be actually run depending on whether the CU comes with `llvm.pseudo_probe_desc` metadata.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110209
2021-09-22 09:09:48 -07:00
Shilei Tian ca999f7191 [OpenMP][Offloading] Use bitset to indicate execution mode instead of value
The execution mode of a kernel is stored in a global variable, whose value means:
- 0 - SPMD mode
- 1 - indicates generic mode
- 2 - SPMD mode execution with generic mode semantics

We are going to add support for SIMD execution mode. It will be come with another
execution mode, such as SIMD-generic mode. As a result, this value-based indicator
is not flexible.

This patch changes to bitset based solution to encode execution mode. Each
position is:
[0] - generic mode
[1] - SPMD mode
[2] - SIMD mode (will be added later)

In this way, `0x1` is generic mode, `0x2` is SPMD mode, and `0x3` is SPMD mode
execution with generic mode semantics. In the future after we add the support for
SIMD mode, `0b1xx` will be in SIMD mode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D110029
2021-09-22 11:40:52 -04:00
Florian Hahn ea21d688dc
[Matrix] Emit assumption that matrix indices are valid.
The matrix extension requires the indices for matrix subscript
expression to be valid and it is UB otherwise.

extract/insertelement produce poison if the index is invalid, which
limits the optimizer to not be bale to scalarize load/extract pairs for
example, which causes very suboptimal code to be generated when using
matrix subscript expressions with variable indices for large matrixes.

This patch updates IRGen to emit assumes to for index expression to
convey the information that the index must be valid.

This also adjusts the order in which operations are emitted slightly, so
indices & assumes are added before the load of the matrix value.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D102478
2021-09-22 12:27:37 +01:00
David Blaikie 2ff049b12e DebugInfo: Don't use preferred template names in debug info
Using the preferred name creates a mismatch between the textual name of
a type and the DWARF tags describing the parameters as well as possible
inconsistency between DWARF producers (like Clang and GCC, or
older/newer Clang versions, etc).
2021-09-21 20:08:16 -07:00
David Blaikie db6f1e8a88 DebugInfo: Don't suppress inline namespaces when printing template template parameter names 2021-09-21 19:30:13 -07:00
David Blaikie d31dfc3011 DebugInfo: Unify some printing policy adjustments 2021-09-21 19:30:12 -07:00
Giorgis Georgakoudis ac90dfc43a Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 1d66649adf.

Revert to fix AMG GPU issue.
2021-09-21 13:20:39 -07:00
Matheus Izvekov d9308aa39b [clang] don't mark as Elidable CXXConstruct expressions used in NRVO
See PR51862.

The consumers of the Elidable flag in CXXConstructExpr assume that
an elidable construction just goes through a single copy/move construction,
so that the source object is immediately passed as an argument and is the same
type as the parameter itself.

With the implementation of P2266 and after some adjustments to the
implementation of P1825, we started (correctly, as per standard)
allowing more cases where the copy initialization goes through
user defined conversions.

With this patch we stop using this flag in NRVO contexts, to preserve code
that relies on that assumption.
This causes no known functional changes, we just stop firing some asserts
in a cople of included test cases.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D109800
2021-09-21 21:41:20 +02:00
Giorgis Georgakoudis 1d66649adf [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert, jhuber6

Differential Revision: https://reviews.llvm.org/D102107
2021-09-21 10:50:04 -07:00
Wang, Pengfei 227673398c [X86] Always check the size of SourceTy before getting the next type
D109607 results in a regression in llvm-test-suite.
The reason is we didn't check the size of SourceTy, so that we will
return wrong SSE type when SourceTy is overlapped.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D110037
2021-09-20 23:34:19 +08:00
alokmishra.besu 000875c127 OpenMP 5.0 metadirective
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.

A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h

Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91944
2021-09-18 13:40:44 -05:00
Nico Weber 31cca21565 Revert "OpenMP 5.0 metadirective"
This reverts commit c7d7b98e52.
Breaks tests on macOS, see comment on https://reviews.llvm.org/D91944
2021-09-18 09:10:37 -04:00
Adrian Prantl 843390c58a Apply proper source location to fallthrough switch cases.
This fixes a bug in clang where, when clang sees a switch with a
fallthrough to a default like this:

static void funcA(void) {}
static void funcB(void) {}

int main(int argc, char **argv) {

switch (argc) {
    case 0:
        funcA();
        break;
    case 10:
    default:
        funcB();
        break;
}
}

It does not add a proper debug location for that switch case, such as
case 10: above.

Patch by Shubham Rastogi!

Differential Revision: https://reviews.llvm.org/D109940
2021-09-17 14:45:04 -07:00
cchen 9ff848c5cd Revert "[OpenMP] Use irbuilder as default for masked and master construct"
This reverts commit 2908fc0d3f.
2021-09-17 16:44:09 -05:00
alokmishra.besu 347f3c186d OpenMP 5.0 metadirective
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.

A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h

Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91944
2021-09-17 16:30:06 -05:00
cchen 7efb825382 Revert "OpenMP 5.0 metadirective"
This reverts commit c7d7b98e52.
2021-09-17 16:14:16 -05:00
cchen c7d7b98e52 OpenMP 5.0 metadirective
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.

A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h

Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91944
2021-09-17 16:03:13 -05:00
cchen 2908fc0d3f [OpenMP] Use irbuilder as default for masked and master construct
Use irbuilder as default and remove redundant Clang codegen for masked construct and master construct.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D100874
2021-09-17 15:54:11 -05:00
Erich Keane e3b10525b4 Make multiversioning work with internal linkage
We previously made all multiversioning resolvers/ifuncs have weak
ODR linkage in IR, since we NEED to emit the whole resolver every time
we see a call, but it is not necessarily the place where all the
definitions live.

HOWEVER, when doing so, we neglected the case where the versions have
internal linkage.  This patch ensures we do this, so you don't get weird
behavior with static functions.
2021-09-17 05:56:38 -07:00
Wang, Pengfei e9e1d4751b [X86] Refactor GetSSETypeAtOffset to fix pr51813
D105263 adds support for _Float16 type. It introduced a bug (pr51813) that generates a <4 x half> type instead the default double when passing blank structure by SSE registers.

Although I doubt it may expose a bug somewhere other than D105263, it's good to avoid return half type when no half type in arguments.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D109607
2021-09-17 10:51:59 +08:00
Aaron Ballman aefb81a33a Removing some spurious whitespace; NFC 2021-09-16 13:14:08 -04:00
Arnold Schwaighofer f670c5aeee Add a new frontend flag `-fswift-async-fp={auto|always|never}`
Summary:
Introduce a new frontend flag `-fswift-async-fp={auto|always|never}`
that controls how code generation sets the Swift extended async frame
info bit. There are three possibilities:

* `auto`: which determines how to set the bit based on deployment target, either
statically or dynamically via `swift_async_extendedFramePointerFlags`.
* `always`: default, always set the bit statically, regardless of deployment
target.
* `never`: never set the bit, regardless of deployment target.

Differential Revision: https://reviews.llvm.org/D109451
2021-09-16 08:48:51 -07:00
Yaxun (Sam) Liu abe8b354e3 Fix vtbl field addr space
Storing the vtable field of an object should use the same address space as
the this pointer. Currently it is assumed to be addr space 0 but this may not
be true.

This assumption (added in 054cc3b1b4) caused
issues for the out-of-tree CHERI targets.

Reviewed by: John McCall, Alexander Richardson

Differential Revision: https://reviews.llvm.org/D109841
2021-09-16 10:57:31 -04:00
Zarko Todorovski 1b0a71c5fc [PowerPC][AIX] Add support for varargs for complex types on AIX
Remove the previous error and add support for special handling of small
complex types as in PPC64 ELF ABI. As in, generate code to load from
varargs location and pack it in a temp variable, then return a pointer to
the struct.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D106393
2021-09-16 09:38:03 -04:00
Bjorn Pettersson 8f8616655c [NewPM] Use a separate struct for ModuleThreadSanitizerPass
Split ThreadSanitizerPass into ThreadSanitizerPass (as a function
pass) and ModuleThreadSanitizerPass (as a module pass).
Main reason is to make sure that we have a unique mapping from
ClassName to PassName in the new passmanager framework, making it
possible to correctly identify the passes when dealing with options
such as -print-after and -print-pipeline-passes.

This is a follow-up to D105006 and D105007.
2021-09-16 14:58:42 +02:00
Bjorn Pettersson ab41eef9ac [NewPM] Use a separate struct for ModuleMemorySanitizerPass
Split MemorySanitizerPass into MemorySanitizerPass (as a function
pass) and ModuleMemorySanitizerPass (as a module pass).
Main reason is to make sure that we have a unique mapping from
ClassName to PassName in the new passmanager framework, making it
possible to correctly identify the passes when dealing with options
such as -print-after and -print-pipeline-passes.

This is a follow-up to D105006 and D105007.
2021-09-16 14:58:42 +02:00
David Blaikie 8264846c0e Senticify some comments - post-commit review for e4b9f5e851
Based on feedback from Paul Robinson.
2021-09-15 13:59:11 -07:00
Walter Lee 66c6bbe7ff Put code that avoids heapifying local blocks behind a flag
This change puts the functionality in commit
c5792aa90f behind a flag that is off by
default.  The original commit is not in Apple's Clang fork (and blocks
are an Apple extension in the first place), and there is one known
issue that needs to be addressed before it can be enabled safely.

Differential Revision: https://reviews.llvm.org/D108243
2021-09-14 14:06:05 -04:00
David Blaikie 13e34f9fc1 Fixup some formatting from a recent commit 2021-09-14 00:41:19 -07:00
David Blaikie e4b9f5e851 DebugInfo: Add support for template parameters with reference qualifiers
Followon from the previous commit supporting cvr qualifiers.
2021-09-14 00:39:47 -07:00
David Blaikie db4ff98bf9 DebugInfo: Add support for template parameters with qualifiers
eg: t1<void () const> - DWARF doesn't have a particularly nice way to
encode this, for real member function types (like `void (t1::*)()
const`) the const-ness is encoded in the type of the artificial first
parameter. But `void () const` has no parameters, so encode it like a
normal const-qualified type, using DW_TAG_const_type. (similarly for
restrict and volatile)

Reference qualifiers (& and &&) coming in a separate commit shortly.
2021-09-14 00:04:40 -07:00
Xiang1 Zhang c81d6ab875 [X86] Adjust Keylocker handle mem size
Reviewed By: Topper Craig

Differential Revision: https://reviews.llvm.org/D109488
2021-09-13 18:03:27 +08:00
Xiang1 Zhang bdce8d40c6 Revert "[X86] Adjust Keylocker handle mem size"
This reverts commit 3731de6b7f.
2021-09-13 18:00:46 +08:00
Xiang1 Zhang 3731de6b7f [X86] Adjust Keylocker handle mem size
Reviewed By: Topper Craig

Differential Revision: https://reviews.llvm.org/D109354
2021-09-13 17:59:33 +08:00
Joseph Huber 29b44ca896 [OpenMP] Add flag for setting debug in the offloading device
This patch introduces the flags `-fopenmp-target-debug` and
`-fopenmp-target-debug=` to set the value of a global in the device.
This will be used to enable or disable debugging features statically in
the device runtime library.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D109544
2021-09-10 18:19:19 -04:00
Roman Lebedev 85ba583eba
[NFCI][clang] Move allocation alignment manifestation for malloc-like into Sema from Codegen
... so that it happens right next to `AddKnownFunctionAttributesForReplaceableGlobalAllocationFunction()`,
which is good for consistency.
2021-09-10 20:49:28 +03:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Akira Hatanaka 59cc39ae14 [ObjC][ARC] Use the addresses of the ARC runtime functions instead of
integer 0/1 for the operand of bundle "clang.arc.attachedcall"

This should make it easier to understand what the IR is doing and also
simplify some of the passes as they no longer have to translate the
integer values to the runtime functions.

Differential Revision: https://reviews.llvm.org/D102996
2021-09-08 11:56:22 -07:00
Wang, Pengfei e6e8d25920 [X86][mingw] Modify the alignment of __m128/__m256/__m512 vector type for mingw
This is a follow up patch after D78564 and D108887.

Martin helped to confirm the alignment in GCC mingw is the same as the
size of vector. https://reviews.llvm.org/D108887#inline-1040893

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D109265
2021-09-06 20:28:09 +08:00
Qiu Chaofan fae0dfa642 [Clang] Add __ibm128 type to represent ppc_fp128
Currently, we have no front-end type for ppc_fp128 type in IR. PowerPC
target generates ppc_fp128 type from long double now, but there's option
(-mabi=(ieee|ibm)longdouble) to control it and we're going to do
transition from IBM extended double-double ppc_fp128 to IEEE fp128 in
the future.

This patch adds type __ibm128 which always represents ppc_fp128 in IR,
as what GCC did for that type. Without this type in Clang, compilation
will fail if compiling against future version of libstdcxx (which uses
__ibm128 in headers).

Although all operations in backend for __ibm128 is done by software,
only PowerPC enables support for it.

There's something not implemented in this commit, which can be done in
future ones:

- Literal suffix for __ibm128 type. w/W is suitable as GCC documented.
- __attribute__((mode(IF))) should be for __ibm128.
- Complex __ibm128 type.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D93377
2021-09-06 18:00:58 +08:00
Michael Kruse 650bbc5620 [OpenMP][OpenMPIRBuilder] Implement loop unrolling.
Recommit of 707ce34b06. Don't introduce a
dependency to the LLVMPasses component, instead register the required
passes individually.

Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are:

 * `unrollLoopFull`
 * `unrollLoopPartial`
 * `unrollLoopHeuristic`

`unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heurstics as user by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic` allows greater flexibility.

With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism.

Reviewed By: jdoerfert, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D107764
2021-09-04 19:18:58 -05:00
Kazuaki Ishizaki 8f77dc459e [clang] NFC: Fix trivial typo in comments and document
`the the` -> `the`

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D77470
2021-09-04 12:59:42 +05:30
PeixinQiao a42380ce83 [OMPIRBuilder] Add ordered directive to OMPBuilder
Add support for ordered directive in the OpenMPIRBuilder.

This patch also modidies clang to use the ordered directive when the
option -fopenmp-enable-irbuilder is enabled.

Also fix one ICE when parsing one canonical for loop with the relational
operator LE or GE in openmp region by replacing unary increment
operation of the expression of the variable "Expr A" minus the variable
"Expr B" (++(Expr A - Expr B)) with binary addition operation of the
experssion of the variable "Expr A" minus the variable "Expr B" and the
expression with constant value "1" (Expr A - Expr B + "1").

Reviewed By: Meinersbur, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D107430
2021-09-03 09:37:58 +08:00
David Blaikie 5fb3f43778 Fully qualify template template parameters when printing
I discovered this quirk when working on some DWARF - AST printing prints
type template parameters fully qualified, but printed template template
parameters the way they were written syntactically, or wholely
unqualified - instead, we should print them consistently with the way we
print type template parameters: fully qualified.

The one place this got weird was for partial specializations like in
ast-print-temp-class.cpp - hence the need for checking for
TemplateNameDependenceScope::DependentInstantiation template template
parameters. (not 100% sure that's the right solution to that, though -
open to ideas)

Differential Revision: https://reviews.llvm.org/D108794
2021-09-02 15:04:34 -07:00
Roman Lebedev 3f1f08f0ed
Revert @llvm.isnan intrinsic patchset.
Please refer to
https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html
(and that whole thread.)

TLDR: the original patch had no prior RFC, yet it had some changes that
really need a proper RFC discussion. It won't be productive to discuss
such an RFC, once it's actually posted, while said patch is already
committed, because that introduces bias towards already-committed stuff,
and the tree is potentially in broken state meanwhile.

While the end result of discussion may lead back to the current design,
it may also not lead to the current design.

Therefore i take it upon myself
to revert the tree back to last known good state.

This reverts commit 4c4093e6e3.
This reverts commit 0a2b1ba33a.
This reverts commit d9873711cb.
This reverts commit 791006fb8c.
This reverts commit c22b64ef66.
This reverts commit 72ebcd3198.
This reverts commit 5fa6039a5f.
This reverts commit 9efda541bf.
This reverts commit 94d3ff09cf.
2021-09-02 13:53:56 +03:00
Roman Lebedev 50634deaa5
Revert "[OpenMP][OpenMPIRBuilder] Implement loop unrolling."
Breaks build with -DBUILD_SHARED_LIBS=ON
```
CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle):
  "LLVMFrontendOpenMP" of type SHARED_LIBRARY
    depends on "LLVMPasses" (weak)
  "LLVMipo" of type SHARED_LIBRARY
    depends on "LLVMFrontendOpenMP" (weak)
  "LLVMCoroutines" of type SHARED_LIBRARY
    depends on "LLVMipo" (weak)
  "LLVMPasses" of type SHARED_LIBRARY
    depends on "LLVMCoroutines" (weak)
    depends on "LLVMipo" (weak)
At least one of these targets is not a STATIC_LIBRARY.  Cyclic dependencies are allowed only among static libraries.
CMake Generate step failed.  Build files cannot be regenerated correctly.
```

This reverts commit 707ce34b06.
2021-09-02 12:42:23 +03:00
Michael Kruse 707ce34b06 [OpenMP][OpenMPIRBuilder] Implement loop unrolling.
Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are:

 * `unrollLoopFull`
 * `unrollLoopPartial`
 * `unrollLoopHeuristic`

`unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heurstics as user by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic` allows greater flexibility.

With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism.

Reviewed By: jdoerfert, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D107764
2021-09-02 02:37:25 -05:00