Commit Graph

3720 Commits

Author SHA1 Message Date
Tim Shen b762bbd4c8 [MLIR] change NVVM.mma.sync to the most useful variant.
Summary:
the .row.col variant turns out to be the popular one, contrary to what I
thought as .row.row. Since .row.col is so prevailing (as I inspect
cuDNN's behavior), I'm going to remove the .row.row support here, which
makes the patch a little bit easier.

Reviewers: ftynse

Subscribers: jholewinski, bixia, sanjoy.google, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74655
2020-02-18 17:57:04 -08:00
Tim Shen f581e655ec [MLIR] Add std.assume_alignment op.
Reviewers: ftynse, nicolasvasilache, andydavis1

Subscribers: bixia, sanjoy.google, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74378
2020-02-18 17:55:07 -08:00
River Riddle a82b63a741 [mlir][DialectConversion] Forward capture callback to fix build on older
GCC

Older GCC confuses the type of 'callback' after it gets captured, so
add a forward capture to move it properly.
2020-02-18 17:43:05 -08:00
River Riddle 0d7ff220ed [mlir] Refactor TypeConverter to add conversions without inheritance
Summary:
This revision refactors the TypeConverter class to not use inheritance to add type conversions. It instead moves to a registration based system, where conversion callbacks are added to the converter with `addConversion`. This method takes a conversion callback, which must be convertible to any of the following forms(where `T` is a class derived from `Type`:
* Optional<Type> (T type)
   - This form represents a 1-1 type conversion. It should return nullptr
     or `llvm::None` to signify failure. If `llvm::None` is returned, the
     converter is allowed to try another conversion function to perform
     the conversion.
* Optional<LogicalResult>(T type, SmallVectorImpl<Type> &results)
   - This form represents a 1-N type conversion. It should return
     `failure` or `llvm::None` to signify a failed conversion. If the new
     set of types is empty, the type is removed and any usages of the
     existing value are expected to be removed during conversion. If
     `llvm::None` is returned, the converter is allowed to try another
     conversion function to perform the conversion.

When attempting to convert a type, the TypeConverter walks each of the registered converters starting with the one registered most recently.

Differential Revision: https://reviews.llvm.org/D74584
2020-02-18 16:17:48 -08:00
MaheshRavishankar a8355b5c0f [mlir][Linalg] Allow specifiying zero-rank shaped type operands to linalg.generic ops.
Fixing a bug where using a zero-rank shaped type operand to
linalg.generic ops hit an unrelated assert. This also meant that
lowering the operation to loops was not supported. Adding roundtrip
tests and lowering to loops test for zero-rank shaped type operand
with fixes to make the test pass.

Differential Revision: https://reviews.llvm.org/D74638
2020-02-18 13:23:28 -08:00
Alex Zinenko 870c1fd4c8 [mlir] NFC: rename LLVMOpLowering to ConvertToLLVMPattern
This better reflects the nature of the class and matches the current
naming scheme.

Differential Revision: https://reviews.llvm.org/D74774
2020-02-18 22:19:58 +01:00
River Riddle 94a4ca4bf3 [mlir] Add a TypeRange class that functions similar to ValueRange.
Summary: This class wraps around the various different ways to construct a range of Type, without forcing the materialization of that range into a contiguous vector.

Differential Revision: https://reviews.llvm.org/D74646
2020-02-18 11:37:24 -08:00
Jacques Pienaar fa7d04a0d3 [mlir] Add short readme.txt to docs directory
Summary:
Refer folks to the main website and make it explicit that the rendered
output is what is of interest and that the GitHub viewing experience may
not match (even though we are trying to keep it as close as possible, the
renderers do differ).

Differential Revision: https://reviews.llvm.org/D74739
2020-02-18 08:35:22 -08:00
Alex Zinenko 0f04384daf [mlir] NFC: Rename LLVMOpLowering::lowering to LLVMOpLowering::typeConverter
The existing name is an artifact dating back to the times when we did not have
a dedicated TypeConverter infrastructure. It is also confusing with with the
name of classes using it.

Differential revision: https://reviews.llvm.org/D74707
2020-02-18 15:57:10 +01:00
Jacques Pienaar 1842fd50d2 [mlir] Fix multiple titles
We have one title in every doc which corresponds to `#`, in the some
there are multiple and it is expected to be h1 headers (visual elements
rather than organizational). Indent every nesting by one in all of the
docs with multiple titles.

Also fixing trailing whitespace.
2020-02-17 13:55:46 -08:00
Benjamin Kramer 564a9de28e Hide implementation details. NFC> 2020-02-17 17:55:23 +01:00
Pierre Oechsel 0acd7e02f2 [mlir] Linalg: Extend promotion to non f32 buffers.
Summary:
Linalg's promotion pass was only supporting f32 buffers due to how the
zero value was build for the `fill` operation.

Moreover, `promoteSubViewOperands` was returning a vector with one entry
per float subview while omitting integer subviews. For a program
with only integer subviews the return vector would be of size 0.
However, `promoteSubViewsOperands` would try to access a non zero
number of entries of this vector, resulting in a sefgault.

Reviewers: nicolasvasilache, ftynse

Reviewed By: ftynse

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74532
2020-02-17 15:56:49 +01:00
Benjamin Kramer 5fc5c7db38 Strength reduce vectors into arrays. NFCI. 2020-02-17 15:37:35 +01:00
River Riddle 7a551600d1 [mlir] Address post commit feedback of D73590 for SymbolsAndSymbolTables.md 2020-02-16 21:07:20 -08:00
riverriddle@google.com 857b655d7a [mlir] Allow adding extra class declarations to interfaces.
Summary: This matches the similar feature on operation definitions.

Reviewers: jpienaar, antiagainst

Reviewed By: jpienaar, antiagainst

Subscribers: mehdi_amini, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74438
2020-02-15 23:54:42 -08:00
River Riddle 9b07512fd3 [mlir][Parser][NFC] Remove several usages of getEncodedSourceLocation
Summary: getEncodedSourceLocation can be very costly to compute, especially if the input line becomes very long. This revision inlines some of the verification of a few `getChecked` methods to avoid the materialization of an encoded source location.

Differential Revision: https://reviews.llvm.org/D74587
2020-02-15 23:52:23 -08:00
Uday Bondhugula 2101590a78 NFC: add indexing operator for ArrayAttr
Summary: - add ArrayAttr::operator[](unsigned idx)

Differential Revision: https://reviews.llvm.org/D74663
2020-02-14 22:54:37 -08:00
Diego Caballero d7058acc14 [mlir] Add MemRef filter to affine data copy optimization
This patch extends affine data copy optimization utility with an
optional memref filter argument. When the memref filter is used, data
copy optimization will only generate copies for such a memref.

Note: this patch is just porting the memref filter feature from Uday's
'hop' branch: https://github.com/bondhugula/llvm-project/tree/hop.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D74342
2020-02-14 13:41:45 -08:00
Alexandre Ganea 8404aeb56a [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.

== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.

By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.

This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.

== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".

== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).

When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.

When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.

Differential Revision: https://reviews.llvm.org/D71775
2020-02-14 10:24:22 -05:00
Mehdi Amini 850cb135a3 Do not build the CUBIN conversion pass when NVPTX Backend isn't configured
This pass would currently build, but fail to run when this backend isn't
linked in. On the other hand, we'd like it to initialize only the NVPTX
backend, which isn't possible if we continue to build it without the
backend available. Instead of building a broken configuration, let's
skip building the pass entirely.

Differential Revision: https://reviews.llvm.org/D74592
2020-02-14 09:33:12 +00:00
Alex Zinenko 39cb2a8fc7 [mlir] Fix argument attribute attribute reassignment in ConvertStandardToLLVM
The commit switching the calling convention for memrefs (5a1778057)
inadvertently introduced a bug in the function argument attribute conversion:
due to incorrect indexing of function arguments it was not assigning the
attributes to the arguments beyond those generated from the first original
argument. This was not caught in the commit since the test suite does have a
test for converting multi-argument functions with argument attributes. Fix the
bug and add relevant tests.
2020-02-14 10:22:33 +01:00
Eric Christopher f3b933266a Remove unused lambda argument. 2020-02-13 17:24:55 -08:00
River Riddle 5756bc4382 [mlir][DeclarativeParser] Add support for formatting enum attributes in the string form.
Summary: This revision adds support to the declarative parser for formatting enum attributes in the symbolized form. It uses this new functionality to port several of the SPIRV parsers over to the declarative form.

Differential Revision: https://reviews.llvm.org/D74525
2020-02-13 17:11:48 -08:00
aartbik b21c799952 [mlir] [VectorOps] Initial framework for progressively lowering vector.contract
Summary:
This sets the basic framework for lowering vector.contract progressively
into simpler vector.contract operations until a direct vector.reduction
operation is reached. More details will be filled out progressively as well.

Reviewers: nicolasvasilache

Reviewed By: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74520
2020-02-13 15:07:57 -08:00
Denis Khalikov a062a3ed7f [mlir][spirv] Add ConvertGpuLaunchFuncToVulkanCallsPass
Implement a pass to convert gpu.launch_func op into a sequence of
Vulkan runtime calls. The Vulkan runtime API surface is huge so currently we
don't expose separate external functions in IR for each of them, instead we
expose a few external functions to wrapper libraries which manages
Vulkan runtime.

Differential Revision: https://reviews.llvm.org/D74549
2020-02-13 14:10:07 -05:00
Stephan Herhut 715783d415 [MLIR][GPU] Implement initial mapping from loop.parallel to gpu.launch.
Summary:
To unblock other work, this implements basic lowering based on mapping
attributes that have to be provided on all loop.parallel. The lowering
does not yet support reduce.

Differential Revision: https://reviews.llvm.org/D73893
2020-02-13 16:54:16 +01:00
Alexander Belyaev 70e6ed1db7 Add '#include <functional>` to PassManager.h.
Summary:
On some platforms the build fails "std::function is not found". The include is used in
PassManager::IRPrinterConfig::enableIRPrinting.

Differential Revision: https://reviews.llvm.org/D74469
2020-02-13 14:43:21 +01:00
Abdurrahman Akkas 2e8c112ecf [mlir] Add elementAttr to TypedArrayAttrBase.
In code generators, one can automate the translation of typed ArrayAttrs
if element attribute translators are already implemented. However, the
type of the element attribute is lost at the construction of
TypedArrayAttrBase. With this change one can inspect the element type
and generate the translation logic automatically, which will reduce the
code repetition.

Differential Revision: https://reviews.llvm.org/D73579
2020-02-13 09:25:27 +01:00
Kern Handa 005b720373 [NFC][mlir] Adding some helpful EDSC intrinsics
Differential Revision: https://reviews.llvm.org/D74119
2020-02-13 09:21:17 +01:00
River Riddle a134ccbbeb [mlir][DeclarativeParser] Move operand type resolution into a functor to
share code.

This reduces the duplication for the two different cases.
2020-02-12 23:56:07 -08:00
River Riddle c74150e75f [mlir][ODS][NFC] Mark OpaqueType as a buildable type.
This allows for using it in the declarative assembly form, among other
things.
2020-02-12 23:51:38 -08:00
Frank Laub fdc7a16a82 [MLIR][Affine] Add affine.parallel op
Summary:
As discussed in https://llvm.discourse.group/t/rfc-add-affine-parallel/350, this is the first in a series of patches to bring in support for the `affine.parallel` operation.

This first patch adds the IR representation along with custom printer/parser implementations.

Reviewers: bondhugula, herhut, mehdi_amini, nicolasvasilache, rriddle, earhart, jbruestle

Reviewed By: bondhugula, nicolasvasilache, rriddle, earhart, jbruestle

Subscribers: jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74288
2020-02-12 18:00:24 -08:00
Nicolas Vasilache 10382ebe8f [mlir][Linalg] Fix build warnings 2020-02-12 16:50:40 -05:00
Tobias Gysi 4f865b7794 [mlir] support creating memref descriptors from static shape with non-zero offset
This patch adapts the method MemRefDescriptor::fromStaticShape to
support static non-zero offsets. The updated method uses the
getStridesAndOffset method to extract strides and offset. The patch also
adapts the test cases since sizes and strides are now set in forward
instead of reverse order.

Differential Revision: https://reviews.llvm.org/D74474
2020-02-12 22:40:49 +01:00
Valentin Clement 56aba9699d [MLIR] Fix wrong header for mlir-cuda-runner
Just updated the wrong header probably copied from the mlir-cpu-runner

Differential Revision: https://reviews.llvm.org/D74497
2020-02-12 22:35:46 +01:00
Nicolas Vasilache bfaf535791 [mlir][Linalg] Refactor in preparation for automatic Linalg "named" ops.
This revision prepares the ground for declaratively defining Linalg "named" ops.
Such named ops form the backbone of operations that are ubiquitous in the ML
application domain.

This revision closely related to the definition of a "Tensor Computation
Primitives Dialect" and demonstrates that ops can be expressed as declarative
configurations of the `linalg.generic` op.

Differential Revision: https://reviews.llvm.org/D74491
2020-02-12 14:47:40 -05:00
Nicolas Vasilache 137415ad28 [mlir][EDSC][Linalg] Compose linalg_matmul and vector.contract
Summary:
This revision allows model builder to create a linalg_matmul whose body
is a vector.contract. This shows the abstractions compose nicely.

Differential Revision: https://reviews.llvm.org/D74457
2020-02-12 13:50:50 -05:00
River Riddle c832145960 [mlir] Allow constructing a ValueRange from an ArrayRef<BlockArgument>
Summary: This was a missed case when ValueRange was originally added, and allows for constructing a ValueRange from the arguments of a block.

Differential Revision: https://reviews.llvm.org/D74363
2020-02-12 09:48:44 -08:00
Alex Zinenko 5ae9c4c868 [mlir] Linalg fusion: ignore indexed_generic producers
They are currently not supported and we should not attempt fusing them.
2020-02-12 15:13:21 +01:00
Pierre Oechsel fd11cda251 [mlir] StdToLLVM: Add error when the sourceMemRef of a subview is not a llvm type.
A memref_cast casting to a memref with a non identity map can't be
lowered to llvm. Take the following case:

```

func @invalid_memref_cast(%arg0: memref<?x?xf64>) {
  %c1 = constant 1 : index
  %c0 = constant 0 : index
  %5 = memref_cast %arg0 : memref<?x?xf64> to memref<?x?xf64, #map1>
  %25 = std.subview %5[%c0, %c0][%c1, %c1][] : memref<?x?xf64, #map1> to memref<?x?xf64, #map1>
  return
}
```

When lowering the subview mlir was assuming `%5` to have an llvm type
(which is not the case as mlir failed to lower the memref_cast).

Differential Revision: https://reviews.llvm.org/D74466
2020-02-12 15:13:18 +01:00
Stephan Herhut 864110b5b4 [MLIR][CUDA] Fix build file for mlir-cuda-runner
Summary:
This was broken recently when moving from dialect registration via
static initializers to explicit intialization.

Differential Revision: https://reviews.llvm.org/D74480
2020-02-12 15:10:51 +01:00
Lei Zhang d3e7816d85 [mlir][spirv] Introduce spv.func
Thus far we have been using builtin func op to model SPIR-V functions.
It was because builtin func op used to have special treatment in
various parts of the core codebase (e.g., pass pipelines, etc.) and
it's easy to bootstrap the development of the SPIR-V dialect. But
nowadays with general op concepts and region support we don't have
such limitations and it's time to tighten the SPIR-V dialect for
completeness.

This commits introduces a spv.func op to properly model SPIR-V
functions. Compared to builtin func op, it can provide the following
benefits:

* We can control the full op so we can integrate SPIR-V information
  bits (e.g., function control) in a more integrated way and define
  our own assembly form and enforcing better verification.
* We can have a better dialect and library boundary. At the current
  moment only functions are modelled with an external op. With this
  change, all ops modelling SPIR-V concpets will be spv.* ops and
  registered to the SPIR-V dialect.
* We don't need to special-case func op anymore when creating
  ConversionTarget declaring SPIR-V dialect as legal. This is quite
  important given we'll see more and more conversions in the future.

In the process, bumps a few FuncOp methods to the FunctionLike trait.

Differential Revision: https://reviews.llvm.org/D74226
2020-02-12 07:46:43 -05:00
Mehdi Amini 7b635880ab Fix MLIR build when the NVPTX target isn't configured
Differential Revision: https://reviews.llvm.org/D74472
2020-02-12 12:38:45 +00:00
Mehdi Amini c64770506b Remove static registration for dialects, and the "alwayslink" hack for passes
In the previous state, we were relying on forcing the linker to include
all libraries in the final binary and the global initializer to self-register
every piece of the system. This change help moving away from this model, and
allow users to compose pieces more freely. The current change is only "fixing"
the dialect registration and avoiding relying on "whole link" for the passes.
The translation is still relying on the global registry, and some refactoring
is needed to make this all more convenient.

Differential Revision: https://reviews.llvm.org/D74461
2020-02-12 09:13:02 +00:00
Marius Brehler a9a305716b [mlir] Revise naming of MLIROptMain and MLIRMlirOptLib
* Rename CMake target MLIROptMain to MLIROptLib:
   The target provides the main library
* Rename CMake target MLIRMlirOptLib to MLIRMlirOptMain:
   The target provides the main() entry function

At the moment, the Bazel configuration of TenorFlow maps the target
MlirOptLib to "lib/Support/MlirOptMain.cpp" and MlirOptMain to
"tools/mlir-opt/mlir-opt.cpp". This is the other way around in the CMake
configuration. As discussed in the context of the pull request
https://github.com/tensorflow/tensorflow/pull/36301, it seems useful to
revise the naming in the MLIR repo.

Differential Revision: https://reviews.llvm.org/D73778
2020-02-12 09:46:09 +01:00
Alexander Belyaev 7e5d8a34e3 [MLIR] Support memrefs with complex element types.
Differential Revision: https://reviews.llvm.org/D74307
2020-02-12 09:07:15 +01:00
Mehdi Amini d6a5c31c0f Removed declared but non-existent createMaterializeVectorsPass() (NFC) 2020-02-12 02:06:03 +00:00
Jacques Pienaar 7baf2a434c [mlir] Start Shape dialect
* Add basic skeleton for Shape dialect;
* Add description of types and ops to be used;

Differential Revision: https://reviews.llvm.org/D73944
2020-02-11 14:42:59 -08:00
Andy Davis 40b2eb3530 [mlir][AffineOps] Adds affine loop fusion transformation function to LoopFusionUtils.
Summary:
Adds affine loop fusion transformation function to LoopFusionUtils.
Updates TestLoopFusion utility to run loop fusion transformation until a fixed point is reached.
Adds unit tests to test the transformation.
Includes ASAN bug fix for D73190.

Reviewers: bondhugula, dcaballe

Reviewed By: bondhugula, dcaballe

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74330
2020-02-11 13:56:26 -08:00
Andy Davis 813bfffec3 [mlir][VectorOps] Adds canonicalization rewrite patterns for vector ShapeCastOp.
Summary:
Adds two rewrite patterns for the vector ShapeCastOp.
*) ShapeCastOp decomposer: decomposes ShapeCastOp on tuple-of-vectors to multiple ShapeCastOps each on vector types.
*) ShapeCastOp folder: folds canceling shape cast ops (e.g. shape_cast A -> B followed by shape_cast B -> A) away.

Reviewers: nicolasvasilache, aartbik

Reviewed By: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74327
2020-02-11 13:11:45 -08:00