A mismatch in the function declaration and function definition,
prevented the implementation of the createGPUToSPIRVLoweringPass from
being exposed.
PiperOrigin-RevId: 282419815
To simplify the lowering into SPIR-V, while still respecting the ABI
requirements of SPIR-V/Vulkan, split the process into two
1) While lowering a function to SPIR-V (when the function is an entry
point function), allow specifying attributes on arguments and
function itself that describe the ABI of the function.
2) Add a pass that materializes the ABI described in the function.
Two attributes are needed.
1) Attribute on arguments of the entry point function that describe
the descriptor_set, binding, storage class, etc, of the
spv.globalVariable this argument will be replaced by
2) Attribute on function that specifies workgroup size, etc. (for now
only workgroup size).
Add the pass -spirv-lower-abi-attrs to materialize the ABI described
by the attributes.
This change makes the SPIRVBasicTypeConverter class unnecessary and is
removed, further simplifying the SPIR-V lowering path.
PiperOrigin-RevId: 282387587
Memref_cast supports cast from static shape to dynamic shape
memrefs. The same should be true for strides as well, i.e a memref
with static strides can be casted to a memref with dynamic strides.
PiperOrigin-RevId: 282381862
This is the counterpart of vector.extractelement op and has the same
limitations at the moment (static I64IntegerArrayAttr to express position).
This restriction will be filterd in the future.
LLVM lowering will be added in a subsequent commit.
PiperOrigin-RevId: 282365760
Introduce a new function-like operation to the GPU dialect to provide a
placeholder for the execution semantic description and to add support for GPU
memory hierarchy. This aligns with the overall goal of the dialect to expose
the common abstraction layer for GPU devices, in particular by providing an
MLIR unit of semantics (i.e. an operation) for memory modeling.
This proposal has been discussed in the mailing list:
https://groups.google.com/a/tensorflow.org/d/msg/mlir/RfXNP7Hklsc/MBNN7KhjAgAJ
As decided, the "convergence" aspect of the execution model will be factored
out into a new discussion and therefore is not included in this commit. This
commit only introduces the operation but does not hook it up with the remaining
flow. The intention is to develop the new flow while keeping the old flow
operational and do the switch in a simple, separately reversible commit.
PiperOrigin-RevId: 282357599
This CL added necessary files and settings for using DRR to
write SPIR-V canonicalization patterns and also converted the
patterns for spv.Bitcast and spv.LogicalNot.
PiperOrigin-RevId: 282132786
The check in isValidSymbol, as far as a DimOp result went, checked if
the dim op was on a top-level memref. However, any alloc'ed, view, or
subview memref would be fine as long as the corresponding dimension of
that memref is either a static one or was in turn created using a valid
symbol in the case of dynamic dimensions.
Reported-by: Jose Gomez
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#252
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/252 from bondhugula:symbol 7b57dc394df9375e651f497231c6e4525a32a662
PiperOrigin-RevId: 282097114
Support for including a file multiple times was added in tablegen, removing the need for these extra guards. This is because we already insert c/c++ style header guards within each of the specific .td files.
PiperOrigin-RevId: 282076728
Add a canonicalizer for `spirv::LogicalNotOp`.
Converts:
* spv.LogicalNot(spv.IEqual(...)) -> spv.INotEqual(...)
* spv.LogicalNot(spv.INotEqual(...)) -> spv.IEqual(...)
* spv.LogicalNot(spv.LogicalEqual(...)) -> spv.LogicalNotEqual(...)
* spv.LogicalNot(spv.LogicalNotEqual(...)) -> spv.LogicalEqual(...)
Also moved the test for spv.IMul to arithemtic tests.
Closestensorflow/mlir#256
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/256 from denis0x0D:sandbox/canon_logical_not 76ab5787b2c777f948c8978db061d99e76453d44
PiperOrigin-RevId: 282012356
Depending on which of the offsets, sizes, or strides are constant, the
subview op can be canonicalized in different ways. Add such
canonicalizations, which generalize the existing approach of
canonicalizing subview op only if all of offsets, sizes and shapes are
constants.
PiperOrigin-RevId: 282010703
Change vector op names from VectorFooOp to Vector_FooOp and from
vector::VectorFooOp to vector::FooOp.
Closestensorflow/mlir#257
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/257 from Kayjukh:master dfc3a0e04114885aaec8740d5951d6984d6e1577
PiperOrigin-RevId: 281967461
This changes changes the OpDefinitionsGen to automatically add the OpAsmOpInterface for operations with multiple result groups using the provided ODS names. We currently just limit the generation to multi-result ops as most single result operations don't have an interesting name(result/output/etc.). An example is shown below:
// The following operation:
def MyOp : ... {
let results = (outs AnyType:$first, Variadic<AnyType>:$middle, AnyType);
}
// May now be printed as:
%first, %middle:2, %0 = "my.op" ...
PiperOrigin-RevId: 281834156
This will make it easier to scale out test patterns and build specific passes that do not interfere with independent testing.
PiperOrigin-RevId: 281736335
Due to legacy reasons, a newline character followed by two spaces was always
inserted before the attributes of the function Op in pretty form. This breaks
formatting when functions are nested in some other operations. Don't print the
newline and just put the attributes on the same line, which is also more
consistent with module Op. Line breaking aware of indentation can be introduced
separately into the parser if deemed useful.
PiperOrigin-RevId: 281721793
This moves the different canonicalizations of regions into one place and invokes them in the fixed-point iteration of the canonicalizer.
PiperOrigin-RevId: 281617072
This is a simple multi-level DCE pass that operates pretty generically on
the IR. Its key feature compared to the existing peephole dead op folding
that happens during canonicalization is being able to delete recursively
dead cycles of the use-def graph, including block arguments.
PiperOrigin-RevId: 281568202
The current SubViewOp specification allows for either all offsets,
shape and stride to be dynamic or all of them to be static. There are
opportunities for more fine-grained canonicalization based on which of
these are static. For example, if the sizes are static, the result
memref is of static shape. The specification of SubViewOp is modified
to allow on or more of offsets, shapes and strides to be statically
specified. The verification is updated to ensure that the result type
of the subview op is consistent with which of these are static and
which are dynamic.
PiperOrigin-RevId: 281560457
This CL uses the pattern rewrite infrastructure to implement a simple VectorOps -> VectorOps legalization strategy to unroll coarse-grained vector operations into finer grained ones.
The transformation is written using local pattern rewrites to allow composition with other rewrites. It proceeds by iteratively introducing fake cast ops and cleaning canonicalizing or lowering them away where appropriate.
This is an example of writing transformations as compositions of local pattern rewrites that should enable us to make them significantly more declarative.
PiperOrigin-RevId: 281555100
This interface provides more fine-grained hooks into the AsmPrinter than the dialect interface, allowing for operations to define the asm name to use for results directly on the operations themselves. The hook is also expanded to enable defining named result "groups". Get a special name to use when printing the results of this operation.
The given callback is invoked with a specific result value that starts a
result "pack", and the name to give this result pack. To signal that a
result pack should use the default naming scheme, a None can be passed
in instead of the name.
For example, if you have an operation that has four results and you want
to split these into three distinct groups you could do the following:
setNameFn(getResult(0), "first_result");
setNameFn(getResult(1), "middle_results");
setNameFn(getResult(3), ""); // use the default numbering.
This would print the operation as follows:
%first_result, %middle_results:2, %0 = "my.op" ...
PiperOrigin-RevId: 281546873
The `vector.strided_slice` takes an n-D vector, k-D `offsets` integer array attribute, a
k-D `sizes` integer array attribute, a k-D `strides` integer array attribute and extracts
the n-D subvector at the proper offset.
Returns an n-D vector where the first k-D dimensions match the `sizes` attribute.
The returned subvector contains the elements starting at offset `offsets` and ending at
`offsets + sizes`.
Example:
```
%1 = vector.strided_slice %0
{offsets : [0, 2], sizes : [2, 4], strides : [1, 1]}:
vector<4x8x16xf32> // returns a vector<2x4x16xf32>
```
This op will be useful for progressive lowering within the VectorOp dialect.
PiperOrigin-RevId: 281352749
This method is needed for N->1 conversion patterns to retrieve remapped
Values used in the original N operations.
Closestensorflow/mlir#237
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/237 from dcaballe:dcaballe/getRemappedValue 1f64fadcf2b203f7b336ff0c5838b116ae3625db
PiperOrigin-RevId: 281321881
The command-line flag name `lower-to-llvm` for the pass performing dialect
conversion from the Standard dialect to the LLVM dialect is misleading and
inconsistent with most of the conversion passses. It leads the user to believe
that there are no restrictions on what can be converted, while in fact only a
subset of the Standard dialect can be converted (with operations from other
dialects converted by separate passes). Use `convert-std-to-llvm` that better
reflects what the pass does and is consistent with most other conversions.
PiperOrigin-RevId: 281238797
Iterates each element to build the array. This includes a little refactor to
combine bool/int/float into a function, since they are similar. The only
difference is calling different function in the end.
PiperOrigin-RevId: 281210288
Convert chained `spirv::BitcastOp` operations into
one `spirv::BitcastOp` operation.
Closestensorflow/mlir#238
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/238 from denis0x0D:sandbox/canon_bitcast 4352ed4f81b959ec92f849c599e733b62a99c010
PiperOrigin-RevId: 281129234
The assertion was introduced in the early days of dialect conversion
infrastructure when we had the matching function separate from the rewriting
function. The infrastructure evolved to have a common matchAndRewrite function
and the separate matching function was dropped without chaning the rewriting
that became matchAndRewrite. This has led to assertion being triggered. Return
a matchFailure instead of failing an assertion on unsupported types.
Closestensorflow/mlir#230
PiperOrigin-RevId: 281113741
This CL utilizies the more robust fusion feasibility analysis being built out in LoopFusionUtils, which will eventually be used to replace the current affine loop fusion pass.
PiperOrigin-RevId: 281112340
This improves consistency and will concretely avoid collisions between VectorExtractElementOp and ExtractElementOp when they are included in the same transforms / rewrites.
PiperOrigin-RevId: 281101588
This CL added op definitions for a few bit operations:
* OpBitFieldInsert
* OpBitFieldSExtract
* OpBitFieldUExtract
Closestensorflow/mlir#233
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/233 from denis0x0D:sandbox/bit_field_ops e7fd85b00d72d483d7992dc42b9cc4d673903455
PiperOrigin-RevId: 280691816
This modification will allow to easily plug lowering of linalg ops to different types of loops (affine, loop.for and other future constructs).
This is purely NFC for now.
PiperOrigin-RevId: 280652186
This is step 1/n in refactoring infrastructure along the Vector dialect to make it ready for retargetability and composable progressive lowering.
PiperOrigin-RevId: 280529784
Refactoring the conversion from StandardOps/GPU dialect to SPIR-V
dialect:
1) Move the SPIRVTypeConversion and SPIRVOpLowering class into SPIR-V
dialect.
2) Add header files that expose functions to add patterns for the
dialects to SPIR-V lowering, as well as a pass that does the
dialect to SPIR-V lowering.
3) Make SPIRVOpLowering derive from OpLowering class.
PiperOrigin-RevId: 280486871
The `Operator` class keeps an `arguments` field, which contains pointers
to `operands` and `attributes` elements. Thus it must be populated after
`operands` and `attributes` are finalized so to have stable pointers.
SmallVector may re-allocate when still having new elements added, which
will invalidate pointers.
PiperOrigin-RevId: 280466896
Previous commits removed all uses of LLVMTypeConverter::k*PosInMemRefDescriptor
outside of the MemRefDescriptor class. These numbers are an implementation
detail and can be hidden under a layer of more semantic APIs.
PiperOrigin-RevId: 280442444
Following up on the consolidation of MemRef descriptor conversion, update
Vector-to-LLVM conversion to use the helper class that abstracts away the
implementation details of the MemRef descriptor. This also makes the types of
the attributes in emitted llvm.insert/extractelement operations consistently
i64 instead of a mix of index and i64.
PiperOrigin-RevId: 280441451
This CL moves VectorOps to Tablegen and cleans up the implementation.
This is almost NFC but 2 changes occur:
1. an interface change occurs in the padding value specification in vector_transfer_read:
the value becomes non-optional. As a shortcut we currently use %f0 for all paddings.
This should become an OpInterface for vectorization in the future.
2. the return type of vector.type_cast is trivial and simplified to `memref<vector<...>>`
Relevant roundtrip and invalid tests that used to sit in core are moved to the vector dialect.
The op documentation is moved to the .td file.
PiperOrigin-RevId: 280430869
Following up on the consolidation of MemRef descriptor conversion, update
Linalg-to-LLVM conversion to use the helper class that abstracts away the
implementation details of the MemRef descriptor. This required MemRefDescriptor
to become publicly visible. Since this conversion is heavily EDSC-based,
introduce locally an additional wrapper that uses builder and location pointed
to by the EDSC context while emitting descriptor manipulation operations.
PiperOrigin-RevId: 280429228
Memref descriptor is becoming increasingly complex. Memrefs are manipulated by
multiple standard instructions, each of which has a non-trivial lowering to the
LLVM dialect. This leads to verbose code that manipulates the descriptors
exposing the internals of insert/extractelement opreations. Implement a wrapper
class that contains a memref descriptor and provides semantically named methods
that build the primitive IR operations instead.
PiperOrigin-RevId: 280371225
Expand local scope printing to skip printing aliases as aliases are printed out at the top of a module and may not be part of the output generated by local scope print.
PiperOrigin-RevId: 280278617
This CL uses the now standard std.subview in linalg.
Two shortcuts are currently taken to allow this port:
1. the type resulting from a view is currently degraded to fully dynamic to pass the SubViewOp verifier.
2. indexing into SubViewOp may access out of bounds since lowering to LLVM does not currently enforce it by construction.
These will be fixed in subsequent commits after discussions.
PiperOrigin-RevId: 280250129
This is a quite complex operation that users are likely to attempt to write
themselves and get wrong (citation: users=me).
Ideally, we could pull this into FunctionLike, but for now, the
FunctionType rewriting makes it FuncOp specific. We would need some hook
for rewriting the function type (which for LLVM's func op, would need to
rewrite the underlying LLVM type).
PiperOrigin-RevId: 280234164
This refactors the implementation of block signature(type) conversion to not insert fake cast operations to perform the type conversion, but to instead create a new block containing the proper signature. This has the benefit of enabling the use of pre-computed analyses that rely on mapping values. It also leads to a much cleaner implementation overall. The major user facing change is that applySignatureConversion will now replace the entry block of the region, meaning that blocks generally shouldn't be cached over calls to applySignatureConversion.
PiperOrigin-RevId: 280226936
The current implementation silently fails if the '@' identifier isn't present, making it similar to the 'optional' parse methods. This change renames the current implementation to 'Optional' and adds a new 'parseSymbolName' that emits an error.
PiperOrigin-RevId: 280214610
Since VariableOp is serialized during processBlock, we add two more fields,
`functionHeader` and `functionBody`, to collect instructions for a function.
After all the blocks have been processed, we append them to the `functions`.
Also, fix a bug in processGlobalVariableOp. The global variables should be
encoded into `typesGlobalValues`.
PiperOrigin-RevId: 280105366
Lowering of CmpIOp, DivISOp, RemISOp, SubIOp and SelectOp to SPIR-V
dialect enables the lowering of operations generated by AffineExpr ->
StandardOps conversion into the SPIR-V dialect.
PiperOrigin-RevId: 280039204
The elements of a DictionaryAttr are guaranteed to be sorted by name, so we can use a more efficient lookup when searching for an attribute.
PiperOrigin-RevId: 280035488
Existing check that sets FuncOp to be dynamically legal was just
checking that the types of the argument are SPIR-V compatible. Since
the current conversion from GPU to SPIR-V does not handle lowering
non-kernel functions, change the legality check to verify that the
FuncOp has the gpu.kernel attribute and has void(void) return type.
PiperOrigin-RevId: 280032782
During deserialization, the loop header block will be moved into the
spv.loop's region. If the loop header block has block arguments,
we need to make sure it is correctly carried over to the block where
the new spv.loop resides.
During serialization, we need to make sure block arguments from the
spv.loop's entry block are not silently dropped.
PiperOrigin-RevId: 280021777
It is often helpful to inspect the operation that the error/warning/remark/etc. originated from, especially in the context of debugging or in the case of a verifier failure. This change adds an option 'mlir-print-op-on-diagnostic' that attaches the operation as a note to any diagnostic that is emitted on it via Operation::emit(Error|Warning|Remark). In the case of an error, the operation is printed in the generic form.
PiperOrigin-RevId: 280021438
loop::ForOp can be lowered to the structured control flow represented
by spirv::LoopOp by making the continue block of the spirv::LoopOp the
loop latch and the merge block the exit block. The resulting
spirv::LoopOp has a single back edge from the continue to header
block, and a single exit from header to merge.
PiperOrigin-RevId: 280015614
This causes the AsmPrinter to use a local value numbering when printing the IR, allowing for the printer to be used safely in a local context, e.g. to ensure thread-safety when printing the IR. This means that the IR printing instrumentation can also be used during multi-threading when module-scope is disabled. Operation::dump and DiagnosticArgument(Operation*) are also updated to always print local scope, as this is the most common use case when debugging.
PiperOrigin-RevId: 279988203
This CL adds an extra pointer to the memref descriptor to allow specifying alignment.
In a previous implementation, we used 2 types: `linalg.buffer` and `view` where the buffer type was the unit of allocation/deallocation/alignment and `view` was the unit of indexing.
After multiple discussions it was decided to use a single type, which conflates both, so the memref descriptor now needs to carry both pointers.
This is consistent with the [RFC-Proposed Changes to MemRef and Tensor MLIR Types](https://groups.google.com/a/tensorflow.org/forum/#!searchin/mlir/std.view%7Csort:date/mlir/-wKHANzDNTg/4K6nUAp8AAAJ).
PiperOrigin-RevId: 279959463
This change allows for adding additional nested references to a SymbolRefAttr to allow for further resolving a symbol if that symbol also defines a SymbolTable. If a referenced symbol also defines a symbol table, a nested reference can be used to refer to a symbol within that table. Nested references are printed after the main reference in the following form:
symbol-ref-attribute ::= symbol-ref-id (`::` symbol-ref-id)*
Example:
module @reference {
func @nested_reference()
}
my_reference_op @reference::@nested_reference
Given that SymbolRefAttr is now more general, the existing functionality centered around a single reference is moved to a derived class FlatSymbolRefAttr. Followup commits will add support to lookups, rauw, etc. for scoped references.
PiperOrigin-RevId: 279860501
This operation is a companion operation to the std.view operation added as proposed in "Updates to the MLIR MemRefType" RFC.
PiperOrigin-RevId: 279766410
This code should be exercised using the existing kernel outlining unit test, but
let me know if I should add a dedicated unit test using a fake call instruction
as well.
PiperOrigin-RevId: 279436321
This also previously triggered the warning:
warning: missing field 'isRecursivelyLegal' initializer [-Wmissing-field-initializers]
legalOperations[op] = {action};
^
PiperOrigin-RevId: 279399175
This CL added op definitions for a few bit operations:
* OpShiftLeftLogical
* OpShiftRightArithmetic
* OpShiftRightLogical
* OpBitCount
* OpBitReverse
* OpNot
Also moved the definition of spv.BitwiseAnd to follow the
lexicographical order.
Closestensorflow/mlir#215
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/215 from denis0x0D:sandbox/bit_ops d9b0852b689ac6c4879a9740b1740a2357f44d24
PiperOrigin-RevId: 279350470
MLIR translation tools can emit diagnostics and we want to be able to check if
it is indeed the case in tests. Reuse the source manager error handlers
provided for mlir-opt to support the verification in mlir-translate. This
requires us to change the signature of the functions that are registered to
translate sources to MLIR: it now takes a source manager instead of a memory
buffer.
PiperOrigin-RevId: 279132972
Now that a view op has graduated to the std dialect, we can update Linalg to use it and remove ops that have become obsolete. As a byproduct, the linalg buffer and associated ops can also disappear.
PiperOrigin-RevId: 279073591
This CL ports the lowering of linalg.view to the newly introduced std.view.
Differences in implementation relate to std.view having slightly different semantics:
1. a static or dynamic offset can be specified.
2. the size of the (contiguous) shape is passed instead of a range.
3. static size and stride information is extracted from the memref type rather than the range.
Besides these differences, lowering behaves the same.
A future CL will update Linalg to use this unified infrastructure.
PiperOrigin-RevId: 278948853
This is useful for making matching cases where a non-zero value is required more readable, such as the results of a constant comparison that are expected to be equal.
PiperOrigin-RevId: 278932874
Many operations with regions add an additional 'attributes' prefix when printing the attribute dictionary to differentiate it from the region body. This leads to duplicated logic for detecting when to actually print the attribute dictionary.
PiperOrigin-RevId: 278747681
This allows GlobalOp to either take a value attribute (for simple constants) or a region that can
contain IR instructions (that must be constant-foldable) to create a ConstantExpr initializer.
Example:
// A complex initializer is constructed with an initializer region.
llvm.mlir.global constant @int_gep() : !llvm<"i32*"> {
%0 = llvm.mlir.addressof @g2 : !llvm<"i32*">
%1 = llvm.mlir.constant(2 : i32) : !llvm.i32
%2 = llvm.getelementptr %0[%1] : (!llvm<"i32*">, !llvm.i32) -> !llvm<"i32*">
llvm.return %2 : !llvm<"i32*">
}
PiperOrigin-RevId: 278717836
This adds an importer from LLVM IR or bitcode to the LLVM dialect. The importer is registered with mlir-translate.
Known issues exposed by this patch but not yet fixed:
* Globals' initializers are attributes, which makes it impossible to represent a ConstantExpr. This will be fixed in a followup.
* icmp returns i32 rather than i1.
* select and a couple of other instructions aren't implemented.
* llvm.cond_br takes its successors in a weird order.
The testing here is known to be non-exhaustive.
I'd appreciate feedback on where this functionality should live. It looks like the translator *from MLIR to LLVM* lives in Target/, but the SPIR-V deserializer lives in Dialect/ which is why I've put this here too.
PiperOrigin-RevId: 278711683
A pattern rewriter hook, mergeBlock, is added that allows for merging the operations of one block into the end of another. This is used to support a canonicalization pattern for branch operations that folds the branch when the successor has a single predecessor(the branch block).
Example:
^bb0:
%c0_i32 = constant 0 : i32
br ^bb1(%c0_i32 : i32)
^bb1(%x : i32):
return %x : i32
becomes:
^bb0:
%c0_i32 = constant 0 : i32
return %c0_i32 : i32
PiperOrigin-RevId: 278677825
This simplifies the implementation quite a bit, and removes the need for explicit string munging. One change is made to some of the enum elements of SPV_DimAttr to ensure that they are proper identifiers; The string form is now prefixed with 'Dim'.
PiperOrigin-RevId: 278027132
This simplifies the implementation, and removes the need to do explicit string manipulation. A utility method 'parseDimensionList' is added to the DialectAsmParser to simplify defining types and attributes that contain shapes.
PiperOrigin-RevId: 278020604
This greatly simplifies the implementation and removes custom parser functionality. The necessary methods are added to the DialectAsmParser.
PiperOrigin-RevId: 278015983
Now that a proper parser is passed to these methods, there isn't a need to explicitly pass a source location. The source location can be recovered from the parser as necessary. This removes the need to explicitly decode an SMLoc in the case where we don't need to, which can be expensive.
This requires adding some basic nesting support to the parser for supporting nested parsers to allow for remapping source locations of the nested parsers to the top level parser for accurate diagnostics. This is due to the fact that the attribute and type parsers use different source buffers than the top level parser, as they may be represented in string form.
PiperOrigin-RevId: 278014858
These classes are functionally similar to the OpAsmParser/Printer classes and provide hooks for parsing attributes/tokens/types/etc. This change merely sets up the base infrastructure and updates the parser hooks, followups will add hooks as needed to simplify existing handrolled dialect parsers.
This has various different benefits:
*) Attribute/Type parsing is much simpler to define.
*) Dialect attributes/types that contain other attributes/types can now use aliases.
*) It provides a 'spec' with which we may use in the future to auto-generate parsers/printers.
*) Error messages emitted by attribute/type parsers can provide character exact locations rather than "beginning of the string"
PiperOrigin-RevId: 278005322
BitEnumAttr is a mechanism for modelling attributes whose value is
a bitfield. It should not be scoped to the SPIR-V dialect and can
be used by other dialects too.
This CL is mostly shuffling code around and adding tests and docs.
Functionality changes are:
* Fixed to use `getZExtValue()` instead of `getSExtValue()` when
getting the value from the underlying IntegerAttr for a case.
* Changed to auto-detect whether there is a case whose value is
all bits unset (i.e., zero). If so handle it specially in all
helper methods.
PiperOrigin-RevId: 277964926
The current lowering of loops to GPU only supports lowering of loop
nests where the loops mapped to workgroups and workitems are perfectly
nested. Here a new lowering is added to handle lowering of imperfectly
nested loop body with the following properties
1) The loops partitioned to workgroups are perfectly nested.
2) The loop body of the inner most loop partitioned to workgroups can
contain one or more loop nests that are to be partitioned across
workitems. Each individual loops nests partitioned to workitems should
also be perfectly nested.
3) The number of workgroups and workitems are not deduced from the
loop bounds but are passed in by the caller of the lowering as values.
4) For statements within the perfectly nested loop nest partitioned
across workgroups that are not loops, it is valid to have all threads
execute that statement. This is NOT verified.
PiperOrigin-RevId: 277958868
This CL adds a simple pattern for specifying producer-consumer fusion on Linalg operations.
Implementing such an extension reveals some interesting properties.
Since Linalg operates on a buffer abstraction, the output buffers are specified as in/out parameters to the ops. As a consequence, there are no SSA use-def chains and one cannot specify complex dag input patterns with the current infrastructure.
Instead this CL uses constraints based on the existing linalg dependence analysis to focus the pattern and refine patterns based on the type of op that last wrote in a buffer.
This is a very local property and is less powerful than the generic dag specification based on SSA use-def chains.
This will be generalized in the future.
PiperOrigin-RevId: 277931503
Upstream LLVM gained support for #ifndef with https://reviews.llvm.org/D61888
This is changed mechanically via the following command:
find . -name "*.td" -exec sed -i -e ':a' -e 'N' -e '$!ba' -e 's/#ifdef \([A-Z_]*\)\n#else/#ifndef \1/g' {} \;
PiperOrigin-RevId: 277789427
MLIR const-correctness policy is to avoid having `const` on IR objects.
LinalgDependenceGraph is not an IR object but an auxiliary data structure.
Furthermore, it is not updated once constructed unlike IR objects. Add const
qualifiers to get* and find* methods of LinalgDependenceGraph since they are
not modifying the graph. This allows transformation functions that require the
dependence graph to take it by const-reference, clearly indicating that they
are not modifying it (and that the graph may have to be recomputed after the
transformation).
PiperOrigin-RevId: 277731608
This CL added op definitions for a few cast operations:
* OpConvertFToU
* OpConvertFToS
* OpConvertSToF
* OpConvertUToF
* OpUConvert
* OpSConvert
* OpFConvert
Also moved the definition of spv.Bitcast to the new file.
Closestensorflow/mlir#208 and tensorflow/mlir#174
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/208 from denis0x0D:sandbox/cast_ops 79bc9b37398aafddee6cf6beb301807988fe67f9
PiperOrigin-RevId: 277587891
Rewrite patterns may make modifications to the CFG, including dropping edges between blocks. This change adds a simple unreachable block elimination run at the end of each iteration to ensure that the CFG remains valid.
PiperOrigin-RevId: 277545805
Linalg ops provide a good anchor for pattern matching/rewriting transformations.
This CL adds a simple example of how multi-level tiling may be specified by attaching a simple StringAttr to ops as they are transformed so we can easily specify partial lowering to control transformation application.
This is a first stab at taking advantage of higher-level information contained in Linalg ops and will evolve in the future.
PiperOrigin-RevId: 277497958
This CL fixed gen_spirv_dialect.py to support nested delimiters when
chunking existing ODS entries in .td files and to allow ops without
correspondence in the spec. This is needed to pull in the definition
of OpUnreachable.
PiperOrigin-RevId: 277486465
When we removed a pattern, we removed it from worklist but not from
worklistMap. Then, when we tried to add a new pattern on the same Operation
again, the pattern wasn't added since it already existed in the
worklistMap (but not in the worklist).
Closestensorflow/mlir#211
PiperOrigin-RevId: 277319669
This removes a bunch of special tailored DFS code in favor of the common
LLVM utility. Besides, we avoid recursion with system stack given that
llvm::depth_first_ext is iterator based and maintains its own stack.
PiperOrigin-RevId: 277272961
The SelectOp always has the same result type as its true/false
value. Add a builder method that uses the operand type to get the
result type.
PiperOrigin-RevId: 277217978
This CL adds another control flow instruction in SPIR-V: OpPhi.
It is modelled as block arguments to be idiomatic with MLIR.
See the rationale.md doc for "Block Arguments vs PHI nodes".
Serialization and deserialization is updated to convert between
block arguments and SPIR-V OpPhi instructions.
PiperOrigin-RevId: 277161545
For ops that recursively re-enter the parser to parse an operation (such as
ops with a "wraps" pretty form), this ensures that the wrapped op will parse
its location, which can then be used for the locations of the wrapping op
and any other implicit ops.
PiperOrigin-RevId: 277152636
This will be used to specify declarative Linalg transformations in a followup CL. In particular, the PatternRewrite mechanism does not allow folding and has its own way of tracking erasure.
PiperOrigin-RevId: 277149158
In some cases, it may be desirable to mark entire regions of operations as legal. This provides an additional granularity of context to the concept of "legal". The `ConversionTarget` supports marking operations, that were previously added as `Legal` or `Dynamic`, as `recursively` legal. Recursive legality means that if an operation instance is legal, either statically or dynamically, all of the operations nested within are also considered legal. An operation can be marked via `markOpRecursivelyLegal<>`:
```c++
ConversionTarget &target = ...;
/// The operation must first be marked as `Legal` or `Dynamic`.
target.addLegalOp<MyOp>(...);
target.addDynamicallyLegalOp<MySecondOp>(...);
/// Mark the operation as always recursively legal.
target.markOpRecursivelyLegal<MyOp>();
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp, MySecondOp>([](Operation *op) { ... });
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp>([](MyOp op) { ... });
```
PiperOrigin-RevId: 277086382
This allows for parsing things like:
%name_1, %name_2:5, %name_3:2 = "my.op" ...
This is useful for operations that have groups of variadic result values. The
total number of results is expected to match the number of results defined by
the operation.
PiperOrigin-RevId: 276703280
Combine chained `spirv::AccessChainOp` operations into one
`spirv::AccessChainOp` operation.
Closestensorflow/mlir#198
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/198 from denis0x0D:sandbox/canon_access_chain 0cb87955a85511071143d62637ff939d0dabc2bd
PiperOrigin-RevId: 276609345
This allows for them to be used on other non-function, or even other function-like, operations. The algorithms are already generic, so this is simply changing the derived pass type. The majority of this change is just ensuring that the nesting of these passes remains the same, as the pass manager won't auto-nest them anymore.
PiperOrigin-RevId: 276573038
MLIRIR includes generated header for interfaces, including these headers require
an extra dependency to ensure these headers are generated before we attempt to
build MLIREDSCInterface.
PiperOrigin-RevId: 276518255
This simplifies defining expected-* directives when there are multiple that apply to the next or previous line. @below applies the directive to the next non-designator line, i.e. the next line that does not contain an expected-* designator. @above applies to the previous non designator line.
Examples:
// Expect an error on the next line that does not contain a designator.
// expected-remark@below {{remark on function below}}
// expected-remark@below {{another remark on function below}}
func @bar(%a : f32)
// Expect an error on the previous line that does not contain a designator.
func @baz(%a : f32)
// expected-remark@above {{remark on function above}}
// expected-remark@above {{another remark on function above}}
PiperOrigin-RevId: 276369085
The ExecutionEngine was updated recently to only take the LLVM dialect as
input. Memrefs are no longer expected in the signature of the entry point
function by the executor so there is no need to allocate and free them. The
code in MemRefUtils is therefore dead and furthermore out of sync with the
recent evolution of memref type to support strides. Drop it.
PiperOrigin-RevId: 276272302
Previously DRR assumes attributes to appear after operands. This was the
previous requirements on ODS, but that has changed some time ago. Fix
DRR to also support interleaved operands and attributes.
PiperOrigin-RevId: 275983485
We will use block arguments as the way to model SPIR-V OpPhi in
the SPIR-V dialect.
This CL also adds a few useful helper methods to both ops to
get the block arguments.
Also added tests for branch weight (de)serialization.
PiperOrigin-RevId: 275960797
nvvm.shfl.sync.bfly optionally returns a predicate whether source lane was active. Support for this was added to clang in https://reviews.llvm.org/D68892.
Add an optional 'pred' unit attribute to the instruction to return this predicate. Specify this attribute in the partial warp reduction so we don't need to manually compute the predicate.
PiperOrigin-RevId: 275616564
Refactor the implementation to be much cleaner by adding a `make_second_range` utility to walk the `second` value of a range of pairs.
PiperOrigin-RevId: 275598985
This allows dialect-specific attributes to be attached to func results. (or more specifically, FunctionLike ops).
For example:
```
func @f() -> (i32 {my_dialect.some_attr = 3})
```
This attaches my_dialect.some_attr with value 3 to the first result of func @f.
Another more complex example:
```
func @g() -> (i32, f32 {my_dialect.some_attr = "foo", other_dialect.some_other_attr = [1,2,3]}, i1)
```
Here, the second result has two attributes attached.
PiperOrigin-RevId: 275564165
This allows mixing linalg operations with vector transfer operations (with additional modifications to affine ops) and is a step towards solving tensorflow/mlir#189.
PiperOrigin-RevId: 275543361
A VectorTypeCastOp can only be used to lower between statically sized contiguous memrefs of scalar and matching vector type. The sizes and strides are thus fully static and easy to determine.
A relevant test is added.
This is a step towards solving tensorflow/mlir#189.
PiperOrigin-RevId: 275538981
This CL adds support for loop.for operations in EDSC and adds a test.
This will be used in a followup commit to implement lowering of vector_transfer ops so that it works more generally and is not subject to affine constraints.
PiperOrigin-RevId: 275349796
This CL creates a new Linalg promotion pass that operates on SubViewOp and decouples it from Linalg tiling. This is mostly moving code around.
PiperOrigin-RevId: 275329213
Add a canonicalization pattern for spv.selection operation.
Convert spv.selection operation to spv.Select based on
simple pattern.
Closestensorflow/mlir#183
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/183 from denis0x0D:sandbox/canon_select 43d04d923272dd60b9da39f70bdbc51a5168db62
PiperOrigin-RevId: 275312748
'_' is used frequently enough as the separator of words in symbols.
We should allow it in dialect symbols when considering pretty printing.
Also updated LangRef.md regarding pretty form.
PiperOrigin-RevId: 275312494
Previously when we bind a symbol to an op in DRR, it means to capture
the op's result(s) and later references will be expanded to result(s).
This means for ops without result, we are replacing the symbol with
nothing. This CL treats non-result op capturing and referencing as a
special case to mean the op itself.
PiperOrigin-RevId: 275269702
It's usually hard to understand what went wrong if mlir-tblgen
crashes on some input. This CL adds a few useful LLVM_DEBUG
statements so that we can use mlir-tblegn -debug to figure
out the culprit for a crash.
PiperOrigin-RevId: 275253532
We just need to implement a few interface hooks to DialectInlinerInterface
and CallOpInterface to gain the benefits of an inliner. :)
Right now only supports some trivial cases:
* Inlining single block with spv.Return/spv.ReturnValue
* Inlining multi block with spv.Return
* Inlining spv.selection/spv.loop without return ops
More advanced cases will require block argument and Phi support.
PiperOrigin-RevId: 275151132
This Chapter now introduces and makes use of the Interface concept
in MLIR to demonstrate ShapeInference.
END_PUBLIC
Closestensorflow/mlir#191
PiperOrigin-RevId: 275085151
Makes the spv.module generated by the GPU to SPIR-V conversion SPIR-V
spec compliant (validated using spirv-val from Vulkan tools).
1) Separate out the VulkanLayoutUtils from
DecorateSPIRVCompositeTypeLayoutPass to make it reusable within the
Type converter in SPIR-V lowering infrastructure. This is used to
compute the layout of the !spv.struct used in global variable type
description.
2) Set the capabilities of the spv.module to Shader (needed for use of
Logical Memory Model, and the extensions to
SPV_KHR_storage_buffer_storage_class for use of Storage Buffer)
PiperOrigin-RevId: 275081486
In addition to specifying the type of accumulation through the 'op' attribute, the accumulation can now also be specified as arbitrary code region.
Adds a gpu.yield op to specify the result of the accumulation.
Also support more types (integers) and accumulations (mul).
PiperOrigin-RevId: 275065447
The current SignatureConversion framework (part of DialectConversion)
allows remapping input arguments to a function from 1->0, 1->1 or
1->many arguments during conversion. Another case is where the
argument itself is dropped, but it's use are remapped to another
Value*.
An example of this is: The Vulkan/SPIR-V spec requires entry functions
to be of type void(void). The GPU -> SPIR-V conversion implemented
this without having the DialectConversion framework track the
remapping that lead to some undefined behavior. The changes here
addresses that.
PiperOrigin-RevId: 275059656
b843cc5d5a introduced a new op LICM transformation and a LoopLike interface,
but missed the CMake aspects of it. This should fix the build.
PiperOrigin-RevId: 275038533
The SpecId decoration is the handle for providing external specialization.
Similar to descriptor set and binding on global variables, we directly
bake it into assembly parsing and printing.
PiperOrigin-RevId: 274893879
This CL adds a missing lowering for splat of multi-dimensional vectors.
Additional support is also added to the runtime utils library to allow printing memrefs with such vectors.
PiperOrigin-RevId: 274794723
Python bindings currently currently provide a makeScalarType function that
constructs one of the predefined types. It was implemented in the bindings
directly to circumvent the absence of standalone type parsing function. Now
that mlir::parseType has been made available, rely on the core parsing
procedure to construct types from strings in the bindings.
This changes includes a library reshuffling that splits out "CoreAPIs"
implementing the binding helper APIs into a separate library and makes that
dependent on the Parser library.
PiperOrigin-RevId: 274794516
The value defined in a loop was not being used and the function producing it
re-evaluated instead. Use the value to avoid both the warning and the
re-evaluation.
PiperOrigin-RevId: 274794459
When dealing with regions, or other patterns that need to generate temporary operations, it is useful to be able to replace other operations than the root op being matched. Before this PR, these operations would still be considered for legalization meaning that the conversion would either fail, erroneously need to mark these ops as legal, or add unnecessary patterns.
PiperOrigin-RevId: 274598513
When the implementation of the strided memref [RFC](https://groups.google.com/a/tensorflow.org/forum/#!msg/mlir/MaL8m2nXuio/1scRqZa6AQAJ) landed, linalg started using this type instead of the now retired !linalg.view.
As static and partially static cases appear, the stride information needs to be maintained properly. In particular, the result type of the subview op was generally incorrect.
This CL fixes the issue by computing a return type that:
1. always has dynamic sizes, which is generally the only correct way to construct a subview in the absence of data padding and/or code versioning.
2. has the same strides as the base strided memref.
Point 1. above can be further refined but will needs further analysis and canonicalization to optimize the particular case where:
1. The base memref has static size along a given dimension.
2. The subview size can be statically derived (e.g. after canonicalization).
3. *And* the subview size is an even divisor of the base memref.
This 3rd constraint is well-known in the case of tiled layouts that don't assume implicit padding: the boundary tile may be only partial and has size given by `problem_size % tile_size`.
Tests are updated as appropriate.
PiperOrigin-RevId: 274578624
This fixes an omission that prevents Linalg to lower generic ops regions operating on ops in the VectorOps dialect.
To achieve this we simply need to `populateVectorToLLVMConversionPatterns` in the conversion.
Relevant tests are added.
PiperOrigin-RevId: 274577325
Originally, the lowering of `alloc` operations has been computing the number of
bytes to allocate when lowering based on the properties of MLIR type. This does
not take into account type legalization that happens when compiling LLVM IR
down to target assembly. This legalization can widen the type, potentially
leading to out-of-bounds accesses to `alloc`ed data due to mismatches between
address computation that takes the widening into account and allocation that
does not. Use the LLVM IR's equivalent of `sizeof` to compute the number of
bytes to be allocated:
%0 = getelementptr %type* null, %indexType 0
%1 = ptrtoint %type* %0 to %indexType
adapted from
http://nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt
PiperOrigin-RevId: 274159900
Similarly to `llvm.mlir.undef`, this auxiliary operation creates an SSA value
that corresponds to `null` in LLVM IR. This operation is necessary to model
sizeof(<...>) behavior when allocating memory.
PiperOrigin-RevId: 274158760
- dropping what looks like outdated code post some of the previous
updates
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#179
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/179 from bondhugula:llfix 2a72ea441fe1b3924802273ffbe9870afeb90f91
PiperOrigin-RevId: 274158273
On failure, the IR is likely to be in an invalid state, meaning the custom printer for some operations may now crash. Using the generic op form prevents this from happening.
PiperOrigin-RevId: 274104146
This cl adds support for generating a .mlir file containing a reproducer for crashes and failures that happen during pass execution. The reproducer contains a comment detailing the configuration of the pass manager(e.g. the textual description of the pass pipeline that the pass manager was executing), along with the original input module.
Example Output:
// configuration: -pass-pipeline='func(cse, canonicalize), inline'
// note: verifyPasses=false
module {
...
}
PiperOrigin-RevId: 274088134
In Standard to LLVM dialect conversion, the binary op conversion pattern
implicitly assumed some operands were of LLVM IR dialect type. This is not
necessarily true, for example if the Ops that produce those operands did not
match the existing convresion patterns. Check if all operands are of LLVM IR
dialect type and if not, fail to patch the binary op pattern.
Closestensorflow/mlir#168
PiperOrigin-RevId: 274063207
Translation to LLVM expects the entry module to have only specific types of ops
that correspond to LLVM IR entities allowed in a module. Currently those are
restricted to functions and globals. Introduce an additional check at the
module level. Inside individual functions, the check for supported Ops is
already performed, but it accepts all LLVM dialect Ops and wouldn't be
immediately applicable at the module level.
PiperOrigin-RevId: 274058651
The lowering is specified as a pattern and is done only if the result
is a SPIR-V scalar type or vector type.
Handling ConstantOp with index return type needs special handling
since SPIR-V dialect does not have index types. Based on the bitwidth
of the attribute value, either i32 or i64 is chosen.
Other constant lowerings are left as a TODO.
PiperOrigin-RevId: 274056805
This will allow for inlining newly devirtualized calls, as well as give a more accurate cost model(when we have one). Currently canonicalization will only run for nodes that have no child edges, as the child nodes may be erased during canonicalization. We can support this in the future, but it requires more intricate deletion tracking.
PiperOrigin-RevId: 274011386
When an operation with regions gets replaced, we currently require that all of the remaining nested operations are still converted even though they are going to be replaced when the rewrite is finished. This cl adds a tracking for a minimal set of operations that are known to be "dead". This allows for ignoring the legalization of operations that are won't survive after conversion.
PiperOrigin-RevId: 274009003
This function-like operation allows one to define functions that have wrapped
LLVM IR function type, in particular variadic functions. The operation was
added in parallel to the existing lowering flow, this commit only switches the
flow to use it.
Using a custom function type makes the LLVM IR dialect type system more
consistent and avoids complex conversion rules for functions that previously
had to use the built-in function type instead of a wrapped LLVM IR dialect type
and perform conversions during the analysis.
PiperOrigin-RevId: 273910855
Allow printing out pipelines in a format that is as close as possible to the
textual pass pipeline format. Individual passes can override the print function
in order to format any options that may have been used to construct that pass.
PiperOrigin-RevId: 273813627
The lowering infrastructure needs to be enhanced to lower into a
spv.Module that is consistent with the SPIR-V spec. The following
changes are needed
1) The Vulkan/SPIR-V validation rules dictates entry functions to have
signature of void(void). This requires changes to the function
signature conversion infrastructure within the dialect conversion
framework. When an argument is dropped from the original function
signature, a function can be specified that when invoked will return
the value to use as a replacement for the argument from the original
function.
2) Some changes to the type converter to make the converted type
consistent with the Vulkan/SPIR-V validation rules,
a) Add support for converting dynamically shaped tensors to
spv.rtarray type.
b) Make the global variable of type !spv.ptr<!spv.struct<...>>
3) Generate the entry point operation for the kernel functions and
automatically compute all the interface variables needed
PiperOrigin-RevId: 273784229
This PR is a stepping stone towards supporting generic multi-store
source loop nests in affine loop fusion. It extends the algorithm to
support fusion of multi-store loop nests that:
1. have only one store that writes to a function-local live out, and
2. the remaining stores are involved in loop nest self dependences
or no dependences within the function.
Closestensorflow/mlir#162
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/162 from dcaballe:dcaballe/multi-output-fusion 7fb7dec6fe8b45f5ce176f018bfe37b256420c45
PiperOrigin-RevId: 273773907
Currently SameOperandsAndResultShape trait allows operands to have tensor<*xf32> and tensor<2xf32> but doesn't allow tensor<?xf32> and tensor<10xf32>.
Also, use the updated shape compatibility helper function in TensorCastOp::areCastCompatible method.
PiperOrigin-RevId: 273658336
This enhances the symbol table utility methods to handle the case where an unknown operation may define a symbol table. When walking symbols, we now collect all symbol uses before allowing the user to iterate. This prevents the user from assuming that all symbols are actually known before performing a transformation.
PiperOrigin-RevId: 273651963
This allows individual passes to define options structs and for these options to be parsed per instance of the pass while building the pass pipeline from the command line provided textual specification.
The user can specify these per-instance pipeline options like so:
```
struct MyPassOptions : public PassOptions<MyPassOptions> {
Option<int> exampleOption{*this, "flag-name", llvm:🆑:desc("...")};
List<int> exampleListOption{*this, "list-flag-name", llvm:🆑:desc("...")};
};
static PassRegistration<MyPass, MyPassOptions> pass("my-pass", "description");
```
PiperOrigin-RevId: 273650140
The restriction that symbols can only have identifier names is arbitrary, and artificially limits the names that a symbol may have. This change adds support for parsing and printing symbols that don't fit in the 'bare-identifier' grammar by printing the reference in quotes, e.g. @"0_my_reference" can now be used as a symbol name.
PiperOrigin-RevId: 273644768
This is matching what the runtime library is expecting.
Closestensorflow/mlir#171
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/171 from deven-amd:deven-rocdl-device-func-i64 80762629a8c34e844ebdc542b34dd783990db9db
PiperOrigin-RevId: 273640767
Add a pass to decorate the composite types used by
composite objects in the StorageBuffer, PhysicalStorageBuffer,
Uniform, and PushConstant storage classes with layout information.
Closestensorflow/mlir#156
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/156 from denis0x0D:sandbox/layout_info_decoration 7c50840fd38ca169a2da7ce9886b52b50c868b84
PiperOrigin-RevId: 273634140
This is similar to the `inlineRegionBefore` hook, except the original blocks are unchanged. The region to be cloned *must* not have been modified during the conversion process at the point of cloning, i.e. it must belong an operation that has yet to be converted, or the operation that is currently being converted.
PiperOrigin-RevId: 273622533
- bodies would earlier appear in the order (i, i+3, i+2, i+1) instead of
(i, i+1, i+2, i+3) for example for factor 4.
- clean up hardcoded test cases
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#170
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/170 from bondhugula:ujam b66b405b2b1894a03b376952e32a9d0292042665
PiperOrigin-RevId: 273613131
MLIR uses symbol references to model references to many global entities, such as functions/variables/etc. Before this change, there is no way to actually reason about the uses of such entities. This change provides a walker for symbol references(via SymbolTable::walkSymbolUses), as well as 'use_empty' support(via SymbolTable::symbol_use_empty). It also resolves some deficiencies in the LangRef definition of SymbolRefAttr, namely the restrictions on where a SymbolRefAttr can be stored, ArrayAttr and DictionaryAttr, and the relationship with operations containing the SymbolTable trait.
PiperOrigin-RevId: 273549331
During the conversion, both the original and the converted function may coexist
in the module and have the same symbol name. There is no guarantee which of the
two will be found by the symbol lookup. Avoid returning the result of the
library function lookup when lowering Linalg to Standard or LLVM. Use the
symbol reference instead. After the conversion completes, only one symbol will
remain and the Ops using SymbolRefAttrs will be referring to the correct one.
PiperOrigin-RevId: 273510079
Originally, we were attaching attributes containing CUBIN blobs to the kernel
function called by `gpu.launch_func`. This kernel is now contained in a nested
module that is used as a compilation unit. Attach compiled CUBIN blobs to the
module rather than to the function since we were compiling the module. This
also avoids duplication of the attribute on multiple kernels within the same
module.
PiperOrigin-RevId: 273497303
Originally, the CUBIN getter function was introduced as a mechanism to
circumvent the absence of globals in the LLVM dialect. It would allocate memory
and populate it with the CUBIN data. LLVM dialect now supports globals and they
are already used to store CUBIN data, making the getter function a trivial
address computation of a global. Emit the address computation directly at the
place of `gpu.launch_func` instead of putting it in a function and calling it.
This simplifies the conversion flow and prepares it for using the
DialectConversion infrastructure.
PiperOrigin-RevId: 273496221
Now that the accessor function is a trivial getter of the global variable, it
makes less sense to have the getter generation as a separate pass. Move the
getter generation into the lowering of `gpu.launch_func` to CUDA calls. This
change is mostly code motion, but the process can be simplified further by
generating the addressof inplace instead of using a call. This is will be done
in a follow-up.
PiperOrigin-RevId: 273492517
The kernel function called by gpu.launch_func is now placed into an isolated
nested module during the outlining stage to simplify separate compilation.
Until recently, modules did not have names and could not be referenced. This
limitation was circumvented by introducing a stub kernel at the same name at
the same nesting level as the module containing the actual kernel. This
relation is only effective in one direction: from actual kernel function to its
launch_func "caller".
Leverage the recently introduced symbol name attributes on modules to refer to
a specific nested module from `gpu.launch_func`. This removes the implicit
connection between the identically named stub and kernel functions. It also
enables support for `gpu.launch_func`s to call different kernels located in the
same module.
PiperOrigin-RevId: 273491891
Some modules may have extremely large ElementsAttrs, which makes debugging involving IR dumping extremely slow and painful. This change adds a flag that will elide ElementsAttrs with a "large"(as defined by the user) number of elements by printing "..." instead of the element data.
PiperOrigin-RevId: 273413100
Since MLIR integer types don't make a distinction between signed vs
unsigned integers, during deserialization of SPIR-V binaries, the
OpBitcast might result in a cast from/to the same type. Do not add a
spv.Bitcast operation to the spv.module in these cases.
PiperOrigin-RevId: 273381887
This allows for controlling the behavior of the AsmPrinter programmatically, instead of relying exclusively on cl::opt flags. This will also allow for more fine-tuned control of printing behavior per callsite, instead of being applied globally.
PiperOrigin-RevId: 273368361
The SPIR-V spec recommends all OpUndef instructions be generated at
module level. For the SPIR-V dialect its better for UndefOp to produce
an SSA value for use with other instructions. If UndefOp is to be used
at module level, it cannot produce an SSA value (use of this SSA value
within FuncOp would need implicit capture). To satisfy needs of the
SPIR-V spec while making it simpler to represent UndefOp in the SPIR-V
dialect, the serialization is updated to create OpUndef instruction
at module scope.
PiperOrigin-RevId: 273355526
The structured selection/loop's entry block does not have arguments.
If the function's header block is also part of the structured control
flow, we cannot just simply erase it because it may contain arguments
matching the function signature and used by the cloned blocks. Instead,
turn it into a block only containing a spv.Branch op.
Also, we can directly emit instructions for the spv.selection header
block to the block containing the spv.selection op. This eliminates
unnecessary branches in the SPIR-V blob.
Added a test for nested spv.loop.
PiperOrigin-RevId: 273351424
Now that linalg.view and strided memrefs are unified, there is no reason to
disallow AllocOp in alias analysis. This CLs adds support for AllocOp which allows writing shorter tests that do not require explicitly creating a view for
each operation.
PiperOrigin-RevId: 273303060
Add new `typeDescription` (description was already used by base constraint class) field to type to allow writing longer descriptions about a type being defined. This allows for providing additional information/rationale for a defined type. This currently uses `description` as the heading/name for the type in the generated documentation.
PiperOrigin-RevId: 273299332
See RFC: https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/xE2IzfhE3Wg.
Opaque location stores two pointers, one of them points to some data structure that is external to MLIR, and the other one is unique for each type and represents type id of that data structure. OpaqueLoc also stores an optional location that can be used if the first one is not suitable.
OpaqueLoc is managed similar to FileLineColLoc. It is passed around by MLIR transformations and can be used in compound locations like CallSiteLoc.
PiperOrigin-RevId: 273266510
This allows confirming that a scalar argument has the same element type as a shaped one. It's easy to validate a type is shaped on its own if that's desirable, so this shouldn't make that use case harder. This matches the behavior of other traits that operate on element type (e.g. AllElementTypesMatch). Also this makes the code simpler because now we just use getElementTypeOrSelf.
Verified that all uses in core already check the type is shaped in another way.
PiperOrigin-RevId: 273068507
Use `getParentOfType<FunctionOp>()` instead of `cast<FuncOp>(getParentOp())`
to avoid crash when return ops are used inside spv.selection/spv.loop.
PiperOrigin-RevId: 273006041
Adding support for OpUndef instruction. Updating the dialect
generation script to fix a few bugs in the instruction spec
generation.
PiperOrigin-RevId: 272975685