- also remove stale terminology/references in docs
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#148
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/148 from bondhugula:cleanup e846b641a3c2936e874138aff480a23cdbf66591
PiperOrigin-RevId: 271618279
The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues.
Solving the ABI problem generally is a very complex problem and most likely involves taking
a dependence on clang that we do not want atm.
A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implement.
PiperOrigin-RevId: 271591708
linalg_integration_test.mlir and simple.mlir were temporarily disabled due to an OSS-only failure.
The issue is that, once created, an llvm::Error must be explicitly checked before it can be discarded or overwritten.
This CL fixes the issue and reenable the test.
PiperOrigin-RevId: 271589651
This commit introduces the ROCDL Dialect (i.e. the ROCDL ops + the code to lower those ROCDL ops to LLWM intrinsics/functions). Think of ROCDL Dialect as analogous to the NVVM Dialect, but for AMD GPUs. This patch contains just the essentials needed to get a simple example up and running. We expect to make further additions to the ROCDL Dialect.
This is the first of 3 commits, the follow-up will be:
* add a pass that lowers GPU Dialect to ROCDL Dialect
* add a "mlir-rocm-runner" utility
Closestensorflow/mlir#146
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/146 from deven-amd:deven-rocdl-dialect e78e8005c75a78912631116c78dc844fcc4b0de9
PiperOrigin-RevId: 271511259
This CL modifies the linalg-fusion pass such that it does not tile anymore as part of the pass. Tiling is a separate concern that enables linalg fusion but should happen before.
This makes fusion more composable with other decisions.
In particular the fusion pass now becomes greedy and only applies the transformation on a best-effort basis.
This should also let fusion work in a multi-hop fashion with chains of producer/consumers.
Since the fusion pass does not perform tiling anymore, tests are rewritten to be in pretiled form and make the intent of the test clearer (albeit more verbose).
PiperOrigin-RevId: 271357741
The support for functions taking and returning memrefs of floats was introduced
in the first version of the runner, created before MLIR had reliable lowering
of allocation/deallocation to library calls. It forcibly runs MLIR
transformation convering affine, loop and standard dialects into the LLVM
dialect, unlike the other runner flows that accept the LLVM dialect directly.
Memref support leads to more complex layering and is generally fragile. Drop
it in favor of functions returning a scalar, or library-based function calls to
print memrefs and other data structures.
PiperOrigin-RevId: 271330839
The reduction operation is currently fixed to "add", and the scope is fixed to "workgroup".
The implementation is currently limited to sizes that are multiple 32 (warp size) and no larger than 1024.
PiperOrigin-RevId: 271290265
Support the OpBitcast instruction of SPIR-V using the spv.Bitcast
operation. The semantics implemented in the dialect differ from the
SPIR-V spec in that the dialect does not allow conversion to/from
pointer types from/to non-pointer types.
PiperOrigin-RevId: 271255957
Call llvm::outs().flush() to make sure we don't mix streams.
Remove CHECK-LABEL to avoid assuming the relative order
between the additional info and the output IR.
PiperOrigin-RevId: 271131100
1) Process and ignore the following debug instructions: OpSource,
OpSourceContinued, OpSourceExtension, OpString, OpModuleProcessed.
2) While processing OpTypeInt instruction, ignore the signedness
specification. Currently MLIR doesnt make a distinction between signed
and unsigned integer types.
3) Process and ignore BufferBlock decoration (similar to Buffer
decoration). StructType needs to be enhanced to track this attribute
since its needed for proper validation checks.
4) Report better error for unhandled instruction during
deserialization.
PiperOrigin-RevId: 271057060
A base class is added to implement all GLSL Binary operations and is
used to implement the FMax operation. The existing framework already
generates all the necessary (de)serialization code.
PiperOrigin-RevId: 271037166
This change adds support for documenting interfaces and their methods. A tablegen generator for the interface documentation is also added(gen-op-interface-doc).
Documentation is added to an OpInterface via the `description` field:
def MyOpInterface : OpInterface<"MyOpInterface"> {
let description = [{
My interface is very interesting.
}];
}
Documentation is added to an InterfaceMethod via a new `description` field that comes right before the optional body:
InterfaceMethod<"void", "foo", (ins), [{
This is the foo method.
}]>,
PiperOrigin-RevId: 270965485
- introduce splat op in standard dialect (currently for int/float/index input
type, output type can be vector or statically shaped tensor)
- implement LLVM lowering (when result type is 1-d vector)
- add constant folding hook for it
- while on Ops.cpp, fix some stale names
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#141
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/141 from bondhugula:splat 48976a6aa0a75be6d91187db6418de989e03eb51
PiperOrigin-RevId: 270965304
The RFC for unifying Linalg and Affine compilation passes into an end-to-end flow with a predictable ABI and linkage to external function calls raised the question of why we have variable sized descriptors for memrefs depending on whether they have static or dynamic dimensions (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
This CL standardizes the ABI on the rank of the memrefs.
The LLVM struct for a memref becomes equivalent to:
```
template <typename Elem, size_t Rank>
struct {
Elem *ptr;
int64_t sizes[Rank];
};
```
PiperOrigin-RevId: 270947276
This CL uses the newly added -split-input-file CLI option to
mlir-translate to combine certain (de)serialization tests.
It also renames certain test filenames.
PiperOrigin-RevId: 270816324
According to SPIR-V spec, spirv::CompositeType includes
spirv::RuntimeArrayType. This allows using objects of
spirv::RuntimeArrayType with spirv::AccessChainOp.
PiperOrigin-RevId: 270809492
Similar to mlir-opt, having a -split-input-file mode is quite useful
in mlir-translate. It allows to put logically related tests in the
same test file for better organization.
PiperOrigin-RevId: 270805467
Sdd support in deserializer for OpMemberName instruction. For now
the name is just processed and not associated with the
spirv::StructType being built. That needs an enhancement to
spirv::StructTypes itself.
Add tests to check for errors reported during deserialization with
some refactoring to common out some utility functions.
PiperOrigin-RevId: 270794524
The existing logic to parse spirv::StructTypes is very brittle. This
change simplifies the parsing logic a lot. The simplification also
allows for memberdecorations to be separated by commas instead of
spaces (which was an artifact of the existing parsing logic). The
change also needs a modification to mlir::parseType to return the
number of chars parsed. Adding a new parseType method to do so.
Also allow specification of spirv::StructType with no members.
PiperOrigin-RevId: 270739672
Using the two call interfaces, CallOpInterface and CallableOpInterface, this change adds support for an initial multi-level CallGraph. This call graph builds a set of nodes for each callable region, and connects them via edges. An edge may be any of the following types:
* Abstract
- An edge not produced by a call operation, used for connecting to internal nodes from external nodes.
* Call
- A call edge is an edge defined via a call-like operation.
* Child
- This is an artificial edge connecting nested callgraph nodes.
This callgraph will be used, and improved upon, to begin supporting more interesting interprocedural analyses and transformation. In a followup, this callgraph will be used to support more complex inlining support.
PiperOrigin-RevId: 270724968
These two operation interfaces will be used in a followup to support building a callgraph:
* CallOpInterface
- Operations providing this interface are call-like, and have a "call" target. A call target may be a symbol reference, via SymbolRefAttr, or a SSA value.
* CallableOpInterface
- Operations providing this interfaces define destinations to call-like operations, e.g. FuncOp. These operations may define any number of callable regions.
PiperOrigin-RevId: 270723300
This fixes a problem with current save-restore pattern of diagnostics handlers, as there may be a thread race between when the previous handler is destroyed. For example, this occurs when using multiple ParallelDiagnosticHandlers asynchronously:
Handler A
Handler B | - LifeTime - | Restore A here.
Handler C | --- LifeTime ---| Restore B after it has been destroyed.
The new design allows for multiple handlers to be registered in a stack like fashion. Handlers can return success() to signal that they have fully processed a diagnostic, or failure to propagate otherwise.
PiperOrigin-RevId: 270720625
Roll forward of commit 5684a12.
When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module.
PiperOrigin-RevId: 270639748
The CL adds a rounding mode flag to the class and changes the default to rmNearestTiesToAway from rmNearestTiesToEven because 1) Tensorflow QuantizeV2 ops uses rmNearestTiesToAway; 2) the specialization only implements rmNearestTiesToAway.
PiperOrigin-RevId: 270600739