This CL adds an optional third argument to the vector.outerproduct instruction.
When the third argument is specified, it is added to the result of the outer product, and the combination is lowered to an FMA intrinsic when the lowering supports it.
In the future, we can add an attribute on the `vector.outerproduct` instruction to select the pair of operations to emit code for (e.g. "+/*", "max/+", "min/+", "log/exp" ...).
This CL additionally performs minor cleanups in the vector lowering and adds tests to improve coverage.
This has been independently verified to result in proper FMA instructions for Haswell, as follows.
Input:
```
func @outerproduct_add(%arg0: vector<17xf32>, %arg1: vector<8xf32>, %arg2: vector<17x8xf32>) -> vector<17x8xf32> {
%2 = vector.outerproduct %arg0, %arg1, %arg2 : vector<17xf32>, vector<8xf32>
return %2 : vector<17x8xf32>
}
```
Command:
```
mlir-opt vector-to-llvm.mlir -vector-lower-to-llvm-dialect --disable-pass-threading | mlir-opt -lower-to-cfg -lower-to-llvm | mlir-translate --mlir-to-llvmir | opt -O3 | llc -O3 -march=x86-64 -mcpu=haswell -mattr=fma,avx2
```
Output:
```
outerproduct_add: # @outerproduct_add
# %bb.0:
...
vmovaps 112(%rbp), %ymm8
vbroadcastss %xmm0, %ymm0
...
vbroadcastss 64(%rbp), %ymm15
vfmadd213ps 144(%rbp), %ymm8, %ymm0 # ymm0 = (ymm8 * ymm0) + mem
...
vfmadd213ps 400(%rbp), %ymm8, %ymm9 # ymm9 = (ymm8 * ymm9) + mem
...
```
PiperOrigin-RevId: 263743359
Generate the EnumAttr to represent BuiltIns in the SPIR-V dialect. A
builtin can be specified as a StringAttr whose value is the
name of the builtin. Extend Decoration (de)serialization to handle
BuiltIns.
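As a rough sketch, the op, attribute key, and type below are assumptions for illustration; only the StringAttr-with-builtin-name encoding is what this CL describes:
```
// Illustrative fragment: the builtin is given as a StringAttr whose value is
// the BuiltIn's name; (de)serialization maps it to an OpDecorate instruction.
%0 = "spv.Variable"() {built_in = "GlobalInvocationId"} : () -> !spv.ptr<vector<3xi32>, Input>
```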
Also fix an error in the SPIR-V dialect generator script.
PiperOrigin-RevId: 263596624
This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.
This CL adds the vector.outerproduct operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL.
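For reference, a sketch of the op in its basic two-operand form (vector shapes chosen for illustration):
```
func @outerproduct(%arg0: vector<4xf32>, %arg1: vector<8xf32>) -> vector<4x8xf32> {
  %0 = vector.outerproduct %arg0, %arg1 : vector<4xf32>, vector<8xf32>
  return %0 : vector<4x8xf32>
}
```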
PiperOrigin-RevId: 262552027
This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.
This CL adds the vector.extractelement operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL.
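A roundtrip-style sketch of the op; the position syntax and types here are illustrative assumptions:
```
func @extractelement(%arg0: vector<16xf32>) -> f32 {
  // Extract a single element from a 1-D vector.
  %0 = vector.extractelement %arg0[0 : i32] : vector<16xf32>
  return %0 : f32
}
```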
PiperOrigin-RevId: 262545089
This trait provides the ensureTerminator() utility function and
the checks to make sure a spv.module is indeed terminated with
spv._module_end.
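An illustrative sketch (the module header spelling is an assumption); the point is the terminator that ensureTerminator() guarantees:
```
spv.module "Logical" "GLSL450" {
  spv._module_end
}
```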
PiperOrigin-RevId: 261664153
This CL extends the existing spv.constant op to also support
specialization constants by adding an extra unit attribute
to it.
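A sketch of the distinction; the printed `spec` keyword for the unit attribute is an assumption for illustration:
```
%0 = spv.constant 42 : i32        // ordinary constant
%1 = spv.constant spec 42 : i32   // specialization constant
```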
PiperOrigin-RevId: 261194869
During serialization, the operand number must be used to get the
values associated with an operand. Using the argument number in the Op
specification was wrong, since some of the elements in the arguments
list might be attributes on the operation. This resulted in a segfault
during serialization.
Add a test that exercises that path.
PiperOrigin-RevId: 260977758
All non-argument attributes specified for an operation are treated as
decorations on the result value and (de)serialized using the OpDecorate
instruction. An error is generated if an attribute is not an argument
and its name doesn't correspond to a Decoration enum. The names of the
attributes that represent decorations are the snake-case-ified
version of the Decoration name.
Add utility methods to convert to snake-case and camel-case.
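A sketch in the generic op form; the op and the chosen decoration are assumptions for illustration:
```
// RelaxedPrecision -> relaxed_precision: a non-argument attribute whose name
// is the snake-case form of a Decoration, (de)serialized via OpDecorate.
%0 = "spv.FMul"(%a, %b) {relaxed_precision} : (f32, f32) -> f32
```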
PiperOrigin-RevId: 260792638
This CL adds an initial implementation for translating a kernel
function in the GPU dialect (used with a gpu.launch_kernel op) to a
spv.module. The original function is translated into an entry
function.
Most of the heavy lifting is done by adding TypeConversion and other
utility functions/classes that provide most of the functionality to
translate from the Standard dialect to the SPIR-V dialect. These are
intended to be reusable in the implementation of different dialect
conversion pipelines.
Note: Some of the files have been renamed to be consistent with
the convention used by the other conversion frameworks.
PiperOrigin-RevId: 260759165
Automatic generation of spirv::AccessChainOp (de)serialization needs
the (de)serialization emitters to handle arguments specified as
Variadic<...>. To handle this correctly, such an argument can only be
the last entry in the arguments list.
Add a test that (de)serializes spirv::AccessChainOp.
PiperOrigin-RevId: 260532598
AccessChainOp creates a pointer into a composite object that can be used with
OpLoad and OpStore.
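A sketch of the intended usage pattern; the exact assembly of the SPIR-V ops is illustrative:
```
func @access_chain(%arg0: !spv.ptr<!spv.array<4 x f32>, Function>, %arg1: i32) {
  // Pointer into the composite, then load/store through it.
  %0 = spv.AccessChain %arg0[%arg1] : !spv.ptr<!spv.array<4 x f32>, Function>
  %1 = spv.Load "Function" %0 : f32
  spv.Store "Function" %0, %1 : f32
  return
}
```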
Closes tensorflow/mlir#52
PiperOrigin-RevId: 260035676
Per tacit agreement, individual dialects should now live in lib/Dialect/Name
with headers in include/mlir/Dialect/Name and tests in test/Dialect/Name.
PiperOrigin-RevId: 259896851
Some TensorFlow simulated quantize ops such as QuantizeAndDequantizeV2Op have
an attribute for the sign of the quantization, so quant_ConstFakeQuant should be
able to represent it; a new attribute is added for this.
The method for converting these attributes to a QuantizedType is updated to
handle this new argument.
PiperOrigin-RevId: 258810290
We only exercise the Broadcastable trait verifier and don't care about mutations, so all CHECK statements and the FileCheck invocation were removed.
PiperOrigin-RevId: 258662882
Currently, the Broadcastable trait also rejects instances where the op result has a shape other than what can be statically inferred from the operand shapes, even if the result shape is compatible with the inferred broadcasted shape.
For example:
(tensor<3x2xi32>, tensor<*xi32>) -> tensor<4x3x2xi32>
(tensor<2xi32>, tensor<2xi32>) -> tensor<*xi32>
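As a sketch of such instances in op form (tf.Add stands in for any op carrying the trait):
```
%0 = "tf.Add"(%a, %b) : (tensor<3x2xi32>, tensor<*xi32>) -> tensor<4x3x2xi32>
%1 = "tf.Add"(%c, %d) : (tensor<2xi32>, tensor<2xi32>) -> tensor<*xi32>
```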
PiperOrigin-RevId: 258647493
The current syntax separates the name and value with ':', but ':' is already overloaded by several other things (e.g. trailing types). This makes the syntax difficult to parse in some situations:
Old:
"foo: 10 : i32"
New:
"foo = 10 : i32"
PiperOrigin-RevId: 255097928
This is the standard syntax for types on operations, and is also already used by IntegerAttr and FloatAttr.
Example:
dense<5> : tensor<i32>
dense<[3]> : tensor<1xi32>
PiperOrigin-RevId: 255069157
This name has caused some confusion because it suggests that it's running op verification (and that this verification isn't getting run by default).
PiperOrigin-RevId: 254035268
Adding the additional layer of directory was discussed offline and matches the Target/ tree. The names match the de facto convention we seem to be following, where the C++ namespace is ^(.+)Ops/$ matched against the directory name.
This is in preparation for patching the Quantizer into this tree, which would have been confusing without moving the Quantization dialect to its more proper home. It is left to others to move other dialects if desired.
Tested:
ninja check-mlir
PiperOrigin-RevId: 248171982
TensorFlow comparison ops like tf.Less support broadcast behavior, but the result
type has a different element type than the input types. Extend the Broadcastable trait
to allow such cases. tf.Less is added to demonstrate this.
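A sketch of the case being allowed (shapes chosen for illustration):
```
// Operands broadcast, but the result element type (i1) differs from the
// input element type (f32).
%0 = "tf.Less"(%a, %b) : (tensor<2x3xf32>, tensor<3xf32>) -> tensor<2x3xi1>
```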
PiperOrigin-RevId: 237846127
This lets us use this function to deduce broadcasted shapes elsewhere.
Also added support for unknown dimensions, following TensorFlow behavior.
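A sketch of a broadcast involving an unknown dimension (the op and shapes are chosen for illustration):
```
// The unknown leading dimension is accepted as broadcast-compatible with the
// other operand's trailing shape.
%0 = "tf.Add"(%a, %b) : (tensor<?x3xf32>, tensor<3xf32>) -> tensor<?x3xf32>
```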
PiperOrigin-RevId: 237846065
* Add a common broadcastable binary adder in TF ops and use it for a few ops;
  - Sub and Mul are added here.
* Change the prepare lowering to use TF variants;
* Add some more legalization patterns.
PiperOrigin-RevId: 233310952
That allows the TensorFlow Add and Div ops to use the Broadcastable op trait instead of
the more restrictive SameValueType op trait.
That in turn allows TensorFlow ops to be registered by defining GET_OP_LIST and
including the generated ops file. Currently, the tf-raise-control-flow pass tests
use dynamic shapes in the tf.Add op, and AddOp can't be registered without
supporting dynamic shapes.
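For instance, the dynamically shaped tf.Add used by those tests might look like this (shapes are illustrative):
```
%0 = "tf.Add"(%a, %b) : (tensor<*xf32>, tensor<*xf32>) -> tensor<*xf32>
```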
TESTED with unit tests
PiperOrigin-RevId: 232927998
The operand and result types of binary ops are not necessarily the
same. For those binary ops, we cannot print them in the short-form assembly.
Enhance impl::printBinaryOp to consider operand and result types
when selecting which assembly form to use.
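A sketch of the two forms (the ops are chosen for illustration):
```
// Matching operand and result types: the short form can be used.
%0 = addf %a, %b : f32
// Differing result type (e.g. a comparison): fall back to the generic form.
%1 = "tf.Less"(%c, %d) : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xi1>
```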
PiperOrigin-RevId: 229608142
We also need the broadcast logic in the TensorFlow dialect. Move it to a
Dialect/ directory for a broader scope. This Dialect/ directory is intended
for code that is not in core IR but can potentially be shared by multiple dialects.
Apart from fixing the TensorFlow op TableGen to use this trait, this CL only
contains mechanical code shuffling.
PiperOrigin-RevId: 229563911