Concatting lists in TableGen is easy; creating unique lists is less so. There is no reason to have duplicated op traits, so we could throw an error instead, but duplicates can occur due to concatting different lists of traits in ODS (e.g., for convenience reasons), so just dedup them during Operator trait construction instead.
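A minimal sketch of the dedup step, assuming traits are keyed by name and their first-seen order should be preserved (the names and container types here are illustrative, not the actual tblgen Operator code):
```
#include <string>
#include <unordered_set>
#include <vector>

// Illustrative stand-in for the trait list gathered from concatenated ODS
// trait lists; keep only the first occurrence of each trait.
std::vector<std::string> dedupTraits(const std::vector<std::string> &traits) {
  std::vector<std::string> result;
  std::unordered_set<std::string> seen;
  for (const std::string &trait : traits)
    if (seen.insert(trait).second)
      result.push_back(trait);
  return result;
}
```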
PiperOrigin-RevId: 286488423
Update vector transfer_read/write ops to operate on memrefs with a vector element type.
This handles cases where the memref vector element type represents the minimal memory transfer unit (or a multiple of the minimal memory transfer unit).
PiperOrigin-RevId: 286482115
This CL allows specifying an additional name for the .td file that is used to generate the doc for a dialect. This is necessary for a dialect like Linalg, which has different "types" of ops that are used in different contexts.
This CL also restructures the Linalg documentation and renames LinalgLibraryOps -> LinalgStructuredOps but is otherwise NFC.
PiperOrigin-RevId: 286450414
Adds vector ReshapeOp to the VectorOps dialect. An aggregate vector reshape operation, which aggregates multiple hardware vectors, can enable optimizations during decomposition (e.g. loading one input hardware vector and performing multiple rotate and scatter store operations to the vector output).
PiperOrigin-RevId: 286440658
Introduces some centralized methods to move towards
consistent use of i32 as vector subscripts.
Note: sizes/strides/offsets attributes are still i64
PiperOrigin-RevId: 286434133
This function template was introduced in the early days of MLIR to work
around the absence of a common type for ranges of values (operands, block
arguments, vectors, etc.). Core IR now provides ValueRange for exactly this
purpose. Use it instead of the template parameter.
PiperOrigin-RevId: 286431338
Added test cases for the newly added LLVM operations and lowering features.
Closes tensorflow/mlir#300
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/300 from dfki-jugr:std_to_llvm da6168bbc1a369ae2e99ad3881fdddd82f075dd4
PiperOrigin-RevId: 286231169
This enables providing a default implementation of an interface method. This method is defined on the Trait that is attached to the operation, and thus has all of the same constraints and properties as any other interface method. This allows for interface authors to provide a conservative default implementation for certain methods, without requiring that all users explicitly define it. The default implementation can be specified via the argument directly after the interface method body:
StaticInterfaceMethod<
  /*desc=*/"Returns whether two arrays of types are compatible result types for an op.",
  /*retTy=*/"bool",
  /*methodName=*/"isCompatibleReturnTypes",
  /*args=*/(ins "ArrayRef<Type>":$lhs, "ArrayRef<Type>":$rhs),
  /*methodBody=*/[{
    return ConcreteOp::isCompatibleReturnTypes(lhs, rhs);
  }],
  /*defaultImplementation=*/[{
    /// Returns whether two arrays are equal as the strongest check for
    /// compatibility by default.
    return lhs == rhs;
  }]
>
PiperOrigin-RevId: 286226054
'```mlir' is used to indicate that a code block is MLIR code and should use MLIR syntax
highlighting, while '{.mlir}' was a markdown extension that used a style file
to color the background of the code block differently. The background color
extension was a custom one that we can retire given we have syntax
highlighting.
Also change '```td' to '```tablegen' to match the chroma syntax highlighting
designation.
PiperOrigin-RevId: 286222976
* Fixes use of anonymous namespace for static methods.
* Uses explicit qualifiers (mlir::) instead of wrapping the definition with the namespace.
PiperOrigin-RevId: 286222654
Introduce affine.prefetch: an op to prefetch using a multi-dimensional
subscript on a memref; similar to affine.load, but it has no effect on
semantics, only on performance.
Provide lowering through std.prefetch and llvm.prefetch, mapping to LLVM's
prefetch intrinsic. All attributes are reflected through the lowering -
locality hint, rw, and instr/data cache.
affine.prefetch %0[%i, %j + 5], false, 3, true : memref<400x400xi32>
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#225
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/225 from bondhugula:prefetch 4c3b4e93bc64d9a5719504e6d6e1657818a2ead0
PiperOrigin-RevId: 286212997
The definition of the function template LLVM::ModuleTranslation::lookupValues
was located in a source file. As long as that was the only file that
actually called into the function, this did not cause any problem. However, it
creates linking issues if the function is used from other translation units.
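A minimal reproduction of the issue, with hypothetical file and function names; a function template whose definition lives only in a .cpp file cannot be instantiated from other translation units:
```
// lookup.h (hypothetical)
template <typename Range>
void lookupValues(const Range &range);  // declaration only

// lookup.cpp (hypothetical)
// The definition is only visible here, so only this file can instantiate it.
// Any other .cpp calling lookupValues<SomeRange>() links against an undefined
// symbol. Moving the definition into the header (or adding explicit
// instantiations) avoids the problem.
template <typename Range>
void lookupValues(const Range &range) {
  for (auto &value : range)
    (void)value; // ... do the lookup ...
}
```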
PiperOrigin-RevId: 286203078
When memory attributions are present in `gpu.func`, require that they are of
memref type and live in memory spaces 3 and 5 for workgroup and private memory
attributions, respectively. Adapt the conversion from the GPU dialect to the
NVVM dialect to drop the private memory space from attributions as NVVM is able
to model them as local `llvm.alloca`s in the default memory space.
PiperOrigin-RevId: 286161763
The inline interface uses two methods to check the legality of inlining:
1) Can a region be inlined into another.
2) Can an operation be inlined into another.
Setting the former to true allows the inliner to use the second for
legality checks. Add this method to the SPIR-V dialect inlining
interface.
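A rough sketch of what such an inliner interface looks like; the hook names and signatures below are recalled from MLIR of this period and may not match the current API exactly:
```
#include "mlir/Transforms/InliningUtils.h"

using namespace mlir;

struct SPIRVInlinerInterface : public DialectInlinerInterface {
  using DialectInlinerInterface::DialectInlinerInterface;

  /// Regions may be inlined into each other; this enables the per-operation
  /// hook below to be consulted.
  bool isLegalToInline(Region *dest, Region *src,
                       BlockAndValueMapping &valueMapping) const final {
    return true;
  }

  /// Per-operation legality check, e.g. to reject ops whose semantics do not
  /// survive inlining.
  bool isLegalToInline(Operation *op, Region *dest,
                       BlockAndValueMapping &valueMapping) const final {
    return true;
  }
};
```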
PiperOrigin-RevId: 286041734
The lowering of MemRef types to the LLVM dialect is connected to the underlying
runtime representation of structured memory buffers. It has changed several
times in the past and reached the current state of a LLVM structured-typed
descriptor containing two pointers and all sizes. In several reported use
cases, a different, often simpler, lowering scheme is required. For example,
lowering statically-shaped memrefs to bare LLVM pointers to simplify aliasing
annotation. Split the pattern population functions into those that include
memref-related operations and the remaining ones. Users are expected to extend
TypeConverter::convertType to handle the memref types differently.
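As a rough C-level analogy of the default descriptor (the exact field set and layout are defined by the MLIR lowering and may differ from this sketch), a rank-2 memref of f32 lowers to something like the struct below, whereas the bare-pointer scheme would pass just a `float *`:
```
#include <cstdint>

// Approximate shape of the default memref descriptor for a rank-2 memref.
struct MemRefDescriptor2D {
  float *allocated;   // pointer returned by the allocator
  float *aligned;     // aligned pointer used for element access
  int64_t offset;     // offset into the aligned buffer
  int64_t sizes[2];   // size per dimension
  int64_t strides[2]; // stride per dimension
};
```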
PiperOrigin-RevId: 286030610
The syntax for LLVM dialect types changed twice since this document was
introduced. First, the quoted types are only prefixed with the dialect name
`!llvm` rather than with `!llvm.type`. Second, for types that are simple enough
(e.g., MLIR identifiers), the pretty form can be used instead of the quoted
form. The relevant commits updated the dialect documentation, but not the
conversion documentation. Use the valid type names in the conversion
documentation.
PiperOrigin-RevId: 286026153
This function has become redundant with MemRefDescriptor::getElementType and is
no longer necessary. Use the MemRefDescriptor pervasively to concentrate
descriptor-related logic in one place and drop the utility function.
PiperOrigin-RevId: 286024168
The conversion procedure has been updated to reflect the most recent MemRef
descriptor proposal, but the documentation was only updated for the type
conversion, omitting the address computation section. Make sure the two
sections agree.
PiperOrigin-RevId: 286022684
This class provides a simplified mechanism for defining a switch over a set of types using llvm casting functionality. More specifically, this allows for defining a switch over a value of type T where each case corresponds to a type (CaseT) that can be used with dyn_cast<CaseT>(...). An example is shown below:
// Traditional piece of code:
Operation *op = ...;
if (auto constant = dyn_cast<ConstantOp>(op))
  ...;
else if (auto returnOp = dyn_cast<ReturnOp>(op))
  ...;
else
  ...;

// New piece of code:
Operation *op = ...;
TypeSwitch<Operation *>(op)
  .Case<ConstantOp>([](ConstantOp constant) { ... })
  .Case<ReturnOp>([](ReturnOp returnOp) { ... })
  .Default([](Operation *op) { ... });
Aside from the above, TypeSwitch supports return values, void return, multiple types per case, etc. The usability is intended to be very similar to StringSwitch.
(Using C++14 generic lambdas makes everything even nicer.)
More complex example of how this makes certain things easier:
LogicalResult process(ConstantOp op);
LogicalResult process(ReturnOp op);
LogicalResult process(FuncOp op);

TypeSwitch<Operation *, LogicalResult>(op)
  .Case<ConstantOp, ReturnOp, FuncOp>([](auto op) { return process(op); })
  .Default([](Operation *op) { return op->emitError() << "could not be processed"; });
PiperOrigin-RevId: 286003613
Scope and Memory Semantics attributes need to be serialized as a
constant integer value and the <id> needs to be used to specify the
value. Fix the auto-generated SPIR-V (de)serialization to handle this.
PiperOrigin-RevId: 285849431
This CL adds more Linalg EDSC ops and tests to support building pointwise operations along with conv and dilated_conv.
This also fixes a bug in the existing linalg_matmul EDSC and beefs up the test.
The current set of ops is already enough to build an interesting, albeit simple, model used internally.
PiperOrigin-RevId: 285838012
This updates the lowering pipelines from the GPU dialect to lower-level
dialects (NVVM, SPIRV) to use the recently introduced gpu.func operation
instead of a standard function annotated with an attribute. In particular, the
kernel outlining is updated to produce gpu.func instead of std.func and the
individual conversions are updated to consume gpu.funcs and disallow standard
funcs after legalization, if necessary. The attribute "gpu.kernel" is preserved
in the generic syntax, but can also be used with the custom syntax on
gpu.funcs. The special kind of function for GPU allows one to use additional
features such as memory attribution.
PiperOrigin-RevId: 285822272
This keeps the IR valid and consistent as it is expected that each block should have a valid parent region/operation. Previously, converted blocks were kept floating without a valid parent region.
PiperOrigin-RevId: 285821687
The conversion from the Loops dialect to the Standard dialect, also known as
loop-to-cfg lowering, has been historically a function pass. It can be required
on non-Standard function Ops, in particular the recently introduced GPU
functions. Make the conversion an operation pass instead of a function pass.
PiperOrigin-RevId: 285814560
This PR targets issue tensorflow/mlir#295. It exposes the already existing
subview promotion pass as a declarative pattern.
Change-Id: If901ebef9fb53fcd0b12ecc536f6b174ce320b92
Closes tensorflow/mlir#315
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/315 from tetuante:issue295 8e5f268b6d85f31015c33505329dbd7a4db97ac5
PiperOrigin-RevId: 285801463
Similar to the insert/extract vector instructions, but these
(1) work on 1-D vectors only, and
(2) allow for a dynamic index.
%c3 = constant 3 : index
%0 = vector.insertelement %arg0, %arg1[%c3 : index] : vector<4xf32>
%1 = vector.extractelement %arg0[%c3 : index] : vector<4xf32>
PiperOrigin-RevId: 285792205
ExtractSlicesOp extracts slices of its vector operand with a specified tiling scheme.
This operation centralizes the tiling scheme around a single op, which simplifies vector op unrolling and subsequent pattern rewrite transformations.
PiperOrigin-RevId: 285761129
During the conversion from the standard dialect to the LLVM dialect,
memref-typed arguments are promoted from registers to memory and passed into
functions by pointer. This had been introduced into the lowering to work around
the absence of calling convention modeling in MLIR to enable better
interoperability with LLVM IR generated from C, and has been exercised for
several months. Make this promotion the default calling convention when
converting to the LLVM dialect. This adds the documentation, simplifies the
code and makes the conversion consistent across function operations and
function types used in other places, e.g. in high-order functions or
attributes, which would not follow the same rule previously.
PiperOrigin-RevId: 285751280
- bring op description comment in sync with the doc
- fix misformat in doc
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#317
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/317 from bondhugula:quickfix 7fcd945b318c973b2488b702874c87526855c8ef
PiperOrigin-RevId: 285574527
This is needed for calling the generator on a .td file that contains both OpInterface definitions and op definitions with DeclareOpInterfaceMethods<...> Traits.
PiperOrigin-RevId: 285465784
This will be evolved into a simple programming model for custom ops and custom layers in followup CLs.
This CL also deletes the obsolete tablegen's reference-impl.td that was using EDSCs.
PiperOrigin-RevId: 285459545
This change allows for DialectConversion to attempt folding as a mechanism to legalize illegal operations. This also expands folding support in OpBuilder::createOrFold to generate new constants when folding, and also enables it to work in the context of a PatternRewriter.
PiperOrigin-RevId: 285448440
The clamp value determines the returned predicate. Previously, the clamp value was fixed to 31 and the predicate was therefore always true. This is incorrect for partial warp reductions, but went unnoticed because the returned values happened to be zero (but it could be anything).
PiperOrigin-RevId: 285343160
This cleans up the implementation of the various operation print methods. This is done via a combination of code cleanup, adding new streaming methods to the printer (e.g. operand ranges), etc.
PiperOrigin-RevId: 285285181
Add a variant that invokes the infer type op interface where defined. Also add an entry function that invokes the different separate-argument builders for the wrapped, unwrapped, and inference variants.
PiperOrigin-RevId: 285220709
This type is not used anymore now that Linalg view and subview have graduated to std and that alignment is supported on alloc.
PiperOrigin-RevId: 285213424
This allows reusing the implementation in various places by just including the header, and permits more easily writing test functions without explicit template instantiations.
This also modifies UnrankedMemRefType to take a template type parameter since it cannot be type agnostic at the moment.
PiperOrigin-RevId: 285187711
Both work for the current use case, but the latter allows implementing
prefix sums and is a little easier to understand for partial warps.
PiperOrigin-RevId: 285145287
It is sometimes useful to create operations separately from the builder before insertion as it may be easier to erase them in isolation if necessary. One example use case for this is folding, as we will only want to insert newly generated constant operations on success. This has the added benefit of fixing some silent PatternRewriter failures related to cloning, as the OpBuilder 'clone' methods don't call createOperation.
PiperOrigin-RevId: 285086242
This CL adds more common information to StructuredOpsUtils.h
The n_view attribute is retired in favor of args_in + args_out but the CL is otherwise NFC.
PiperOrigin-RevId: 285000621
Add one more simplification for floordiv and mod affine expressions.
Examples:
(2*d0 + 1) floordiv 2 is simplified to d0
(8*d0 + 4*d1 + d2) floordiv 4 is simplified to 2*d0 + d1 + d2 floordiv 4.
etc.
Similarly, (4*d1 + 1) mod 2 is simplified to 1,
(2*d0 + 8*d1) mod 8 simplified to 2*d0 mod 8.
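The identities above can be spot-checked numerically; a small self-contained check over non-negative values (illustrative only, not the affine simplifier itself):
```
#include <cassert>
#include <cstdint>

// Floor division and modulo; for non-negative operands they coincide with
// C++ integer division and remainder.
int64_t floordiv(int64_t a, int64_t b) { return a / b; }
int64_t mod(int64_t a, int64_t b) { return a % b; }

int main() {
  for (int64_t d0 = 0; d0 < 50; ++d0)
    for (int64_t d1 = 0; d1 < 50; ++d1)
      for (int64_t d2 = 0; d2 < 50; ++d2) {
        assert(floordiv(2 * d0 + 1, 2) == d0);
        assert(floordiv(8 * d0 + 4 * d1 + d2, 4) ==
               2 * d0 + d1 + floordiv(d2, 4));
        assert(mod(4 * d1 + 1, 2) == 1);
        assert(mod(2 * d0 + 8 * d1, 8) == mod(2 * d0, 8));
      }
  return 0;
}
```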
Change getLargestKnownDivisor to return int64_t to be consistent and
to avoid casting at call sites (since the return value is used in expressions
of int64_t/index type).
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#202
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/202 from bondhugula:affine b13fcb2f1c00a39ca5434613a02408e085a80e77
PiperOrigin-RevId: 284866710
Move the definition of the gpu.launch_func operation from the hand-rolled C++
implementation to the ODS framework. Also move the documentation. This only
performs the move and remains a non-functional change; a follow-up will clean
up the custom functions that can be auto-generated using ODS.
PiperOrigin-RevId: 284842252
This has several benefits:
* The implementation is much cleaner and more efficient.
* The ranges now have support for many useful operations: operator[], slice, drop_front, size, etc.
* Value ranges can now directly query a range for their types via 'getTypes()': e.g:
void foo(Operation::operand_range operands) {
  auto operandTypes = operands.getTypes();
}
PiperOrigin-RevId: 284834912
This patch closes issue tensorflow/mlir#272
We add a standalone iterator permutation transformation to Linalg.
This transformation composes a permutation map with the maps in the
"indexing_maps" attribute. It also permutes "iterator_types"
accordingly.
Change-Id: I7c1e693b8203aeecc595a7c012e738ca1100c857
Closes tensorflow/mlir#307
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/307 from tetuante:issue272 f7908d58792f4111119721885e247045104f1131
PiperOrigin-RevId: 284824102
This reorganizes the vector transformations to be more easily testable as patterns and more easily composable into fused passes in the future.
PiperOrigin-RevId: 284817474
Add some convenience build methods to SPIR-V ops and update the
lowering to use these methods where possible.
For SPIRV::CompositeExtractOp, move the method to deduce the type of an
element based on the base and indices into a convenience function. Some
additional functionality is needed to handle differences between parsing
and verification methods.
PiperOrigin-RevId: 284794404
These come from a non-standard extension that is not available on GitHub, so they
only clutter the documentation source with {.mlir} or {.ebnf} tags.
PiperOrigin-RevId: 284733003
Avoid `error: could not convert '(const char*)"reduction"' from 'const char*' to 'llvm::StringLiteral'`. Tested with gcc-5.5.
PiperOrigin-RevId: 284677810
For example:
%0 = vector.shuffle %x, %y [3 : i32, 2 : i32, 1 : i32, 0 : i32] : vector<2xf32>, vector<2xf32>
yields a vector<4xf32> result with a permutation of the elements of %x and %y.
PiperOrigin-RevId: 284657191
Each of the support classes for Block is now moved into a new header, BlockSupport.h. The successor iterator class is also reimplemented as an indexed_accessor_range. This makes the class more efficient, and expands on its available functionality.
PiperOrigin-RevId: 284646792
Many ranges want similar functionality from a range type (e.g. slice/drop_front/operator[]/etc.), so these classes provide a generic implementation that may be used by many different types of ranges. This removes some code duplication, and also empowers many of the existing range types in MLIR (e.g. result type ranges, operand ranges, ElementsAttr ranges, etc.). This change only updates RegionRange and ValueRange; more ranges will be updated in followup commits.
PiperOrigin-RevId: 284615679
This CL starts extracting commonalities between dialects that use the structured ops abstractions. Also fixes an OSS build issue where StringRef was incorrectly used with constexpr.
PiperOrigin-RevId: 284591114
Currently, named accessors are generated for attributes returning a
consumer-friendly type. But sometimes the attributes are used while transforming
an existing op and then the returned type has to be converted back into an
attribute, or the raw `getAttr` needs to be used. Generate raw named accessors
for attributes to reference the raw attributes without having to use the string
interface, which gives better compile-time verification. This allows calling
`blahAttr()` instead of `getAttr("blah")`.
Raw here refers to returning the underlying storage attribute.
PiperOrigin-RevId: 284583426
The existing GPU to SPIR-V lowering created a spv.module for every
function with gpu.kernel attribute. A better approach is to lower the
module that the function lives in (which has the attribute
gpu.kernel_module) to a spv.module operation. This better captures the
host-device separation modeled by GPU dialect and simplifies the
lowering as well.
PiperOrigin-RevId: 284574688
Unifies the vector op unrolling transformation by using the same unrolling implementation for contraction and elementwise operations.
Removes the fakefork/join operations, which are no longer needed now that we have the InsertStridedSlice operation.
PiperOrigin-RevId: 284570784
This CL uses the newly expanded matcher support to easily detect when a linalg.generic has a multiply-accumulate body. A linalg.generic with such a body is rewritten as a vector contraction.
This CL additionally limits the rewrite to the case of matrix multiplication on contiguous and statically shaped memrefs for now.
Before expanding further, we should harden the infrastructure for expressing custom ops with the structured ops abstraction.
PiperOrigin-RevId: 284566659
Follows ValueRange in representing a generic abstraction over the different
ways to represent a range of Regions. This wrapper is not as generic as ValueRange and only
considers the current cases of interest: MutableArrayRef<Region> and
ArrayRef<std::unique_ptr<Region>>, as occur during op construction vs. op region
querying.
Note: ArrayRef<std::unique_ptr<Region>> allows for unset regions, so this range
returns a pointer to a Region instead of a Region.
PiperOrigin-RevId: 284563229
This CL adds support for building matchers recursively.
The following matchers are provided:
1. `m_any()` can match any value
2. `m_val(Value *)` binds to a value and must match it
3. `RecursivePatternMatcher<OpType, Matchers...>` n-arity pattern that matches `OpType` and whose operands must be matched exactly by `Matchers...`.
This allows building expression templates for patterns, declaratively, in a very natural fashion.
For example pattern `p9` defined as follows:
```
auto mul_of_muladd = m_Op<MulFOp>(m_Op<MulFOp>(), m_Op<AddFOp>());
auto mul_of_anyadd = m_Op<MulFOp>(m_any(), m_Op<AddFOp>());
auto p9 = m_Op<MulFOp>(m_Op<MulFOp>(
mul_of_muladd, m_Op<MulFOp>()),
m_Op<MulFOp>(mul_of_anyadd, mul_of_anyadd));
```
Successfully matches `%6` in:
```
%0 = addf %a, %b: f32
%1 = addf %a, %c: f32 // matched
%2 = addf %c, %b: f32
%3 = mulf %a, %2: f32 // matched
%4 = mulf %3, %1: f32 // matched
%5 = mulf %4, %4: f32 // matched
%6 = mulf %5, %5: f32 // matched
```
Note that 0-ary matchers can be used as leaves in place of n-ary matchers. This alleviates the need to pass explicit `m_any()` leaves.
In the future, we may add extra patterns to specify that operands may be matched in any order.
PiperOrigin-RevId: 284469446
This allows for users to provide operand_range and result_range in builder.create<> calls, instead of requiring an explicit copy into a separate data structure like SmallVector/std::vector.
PiperOrigin-RevId: 284360710
This class represents a generic abstraction over the different ways to represent a range of Values: ArrayRef<Value *>, operand_range, result_range. This class will allow for removing the many instances of explicit SmallVector<Value *, N> construction. It has the same memory cost as ArrayRef, and only suffers cost from indexing (if+else-ing the different underlying representations).
This change only updates a few of the existing usages, with more to be changed in followups; e.g. 'build' API.
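A self-contained toy illustrating the underlying idea (a pointer-plus-length view whose element access branches on the stored representation); this is only a sketch of the concept, not the actual ValueRange implementation:
```
#include <cassert>
#include <cstddef>
#include <vector>

struct Value {};
using ValueVec = std::vector<Value *>;      // stand-in for ArrayRef<Value *>
struct OpOperand { Value *value; };
using OperandVec = std::vector<OpOperand>;  // stand-in for operand_range

// Roughly pointer + length in storage; indexing pays an if/else to dispatch
// on the underlying representation, which is the trade-off described above.
class ToyValueRange {
public:
  ToyValueRange(const ValueVec &values)
      : ptr(values.data()), length(values.size()), isOperandList(false) {}
  ToyValueRange(const OperandVec &operands)
      : ptr(operands.data()), length(operands.size()), isOperandList(true) {}

  size_t size() const { return length; }
  Value *operator[](size_t i) const {
    assert(i < length);
    if (isOperandList)
      return static_cast<const OpOperand *>(ptr)[i].value;
    return static_cast<Value *const *>(ptr)[i];
  }

private:
  const void *ptr;
  size_t length;
  bool isOperandList;
};

int main() {
  Value a, b;
  ValueVec values = {&a, &b};
  OperandVec operands = {{&a}, {&b}};
  assert(ToyValueRange(values)[1] == &b);
  assert(ToyValueRange(operands)[0] == &a);
}
```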
PiperOrigin-RevId: 284307996
This adds an additional filtering mode for printing after a pass that checks to see if the pass actually changed the IR before printing it. This "change" detection is implemented using a SHA1 hash of the current operation and its children.
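The gist of the check, in a self-contained sketch that uses std::hash over a printed form as a stand-in for the SHA1 fingerprint the pass manager actually computes:
```
#include <functional>
#include <iostream>
#include <string>

int main() {
  // Stand-ins for "print the operation and its children to a string"
  // before and after running a pass.
  std::string before = "func @f() { return }";
  std::string after = "func @f() { return }"; // pass made no change

  std::hash<std::string> fingerprint;
  if (fingerprint(before) != fingerprint(after))
    std::cout << "IR after pass:\n" << after << "\n";
  else
    std::cout << "pass made no change, skipping print\n";
}
```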
PiperOrigin-RevId: 284291089
- for the symbol rules, the code was updated but the doc wasn't.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#284
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/284 from bondhugula:doc 9aad8b8a715559f7ce61265f3da3f8a3c11b45ea
PiperOrigin-RevId: 284283712
Previously the error case was using a sentinel, which was bad. Also make the one `build` invoke the other `build` to reuse the verification there.
Also follow up on the suggestion to use formatv, which I missed during the previous review.
PiperOrigin-RevId: 284265762
During lowering, spv.module might be within other modules (for example,
a gpu kernel module). Walk the module op to find the spirv module to
serialize.
PiperOrigin-RevId: 284262550
Move the definition of the GPU launch operation from hand-rolled C++ code to
the ODS framework. This only does the moves; a follow-up is necessary to clean up
users of custom functions that could be auto-generated by ODS.
PiperOrigin-RevId: 284261856
The "FunctionLike" and "IsIsolatedFromAbove" op traits are now defined as named
records in base ODS file. Use those instead of NativeOpTrait referring to the
C++ class name in the ODS definition of LLVMFuncOp. NFC.
PiperOrigin-RevId: 284260891
Since these operations lower to [insert|extract][element|value] at LLVM
dialect level, neither element nor value would correctly reflect the meaning.
PiperOrigin-RevId: 284240727
Accept the address space of the global as a builder argument when constructing
an LLVM::GlobalOp instance. This decreases the reliance of LLVM::GlobalOp users
on the internal name of the attribute used for this purpose. Update several
uses of the address space in GPU to NVVM conversion.
PiperOrigin-RevId: 284233254
Move the definition of the GPU function operation from hand-rolled C++ code to
the ODS framework. This only does the moves; a follow-up is necessary to clean up
users of custom functions that could be auto-generated by ODS.
PiperOrigin-RevId: 284233245
For ops with the infer type op interface defined, generate a build variant that calls the inference method. This is an intermediate step to removing the special casing of SameOperandsAndResultType & FirstAttrDerivedResultType. After that, the next step would be generating the inference code, with the initial focus on shaped container types. In between, I plan to refactor these a bit to reuse generated paths. The intention is not to add the type inference trait in multiple places, but rather to take advantage of the current modelling in ODS where possible to emit it instead.
Switch the `inferReturnTypes` method to be static.
Skipping ops with regions here as I don't like the Region vs unique_ptr<Region> difference at the moment, and I want the infer return type trait to be useful for verification too. So instead, just skip it for now to avoid churn.
PiperOrigin-RevId: 284217913
GPU functions use memory attributions, a combination of Op attributes and
region arguments, to specify function-wide buffers placed in workgroup or
private memory spaces. Introduce a lowering pattern for GPU functions to be
converted to LLVM functions taking into account memory attributions. Workgroup
attributions get transformed into module-level globals with unique names
derived from function names. Private attributions get converted into
llvm.allocas inside the function body. In both cases, we inject at the
beginning of the function the IR that obtains the raw pointer to the data and
populates a MemRef descriptor based on the MemRef type of the buffer, making
attributions compose with the rest of the MemRef lowering and transparent for
use with std.load and std.store. While using raw pointers instead of
descriptors might have been more efficient, it is better implemented as a
canonicalization or a separate transformation so that non-attribution memrefs
could also benefit from it.
PiperOrigin-RevId: 284208396
It would be nice if we could detect if stats were enabled or not and use 'Requires', but this isn't possible to do at configure time.
Fixes tensorflow/mlir#296
PiperOrigin-RevId: 284200271
Updates vector ContractionOp to use proper vector masks (produced by CreateMaskOp/ConstantMaskOp).
Leverages the following canonicalizations in the unrolling unit test: CreateMaskOp -> ConstantMaskOp, StridedSliceOp(ConstantMaskOp) -> ConstantMaskOp
Removes IndexTupleOp (no longer needed now that we have vector mask ops).
Updates all unit tests.
PiperOrigin-RevId: 284182168
The entry pointed to by the iterator should be erased before adding a new entry
into blockMergeInfo to avoid iterator invalidation.
Closes tensorflow/mlir#299
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/299 from denis0x0D:sandbox/reoder_erase 983be565809aa0aadfc7e92962e4d4b282f63c66
PiperOrigin-RevId: 284173235
The AddressOf operation in the LLVM dialect returns a pointer to a global
variable. The latter may be in a non-default address space as indicated by the
"addr_space" attribute. Check that the address space of the pointer returned by
AddressOfOp matches that of the referenced GlobalOp. Update the AddressOfOp
builder to respect this constraint.
PiperOrigin-RevId: 284138860
This patch closes issue tensorflow/mlir#271.
It adds an optional permutation map to declarative tiling transformations.
The map is expressed as a list of integers.
Closes tensorflow/mlir#288
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/288 from tetuante:issue271 2df2938d6a1f01b3bc404ded08dea2dd1e10b588
PiperOrigin-RevId: 284064151
This allows for more interesting behavior from users, e.g. enabling the ability to dump the IR to a separate file for each pass invocation.
PiperOrigin-RevId: 284059447
A CompositeInsertOp operation makes a copy of a composite object,
while modifying one part of it.
Closes tensorflow/mlir#292
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/292 from denis0x0D:sandbox/composite_insert 2200962b9057bda53cd2f2866b461e2797196380
PiperOrigin-RevId: 284036551
Statistics are a way to keep track of what the compiler is doing and how effective various optimizations are. It is useful to see what optimizations are contributing to making a particular program run faster. Pass-instance specific statistics take this even further as you can see the effect of placing a particular pass at specific places within the pass pipeline, e.g. they could help answer questions like "what happens if I run CSE again here".
Statistics can be added to a pass by simply adding members of type 'Pass::Statistic'. This class takes as constructor arguments: the parent pass pointer, a name, and a description. Statistics can be dumped by the pass manager in a similar manner to how pass timing information is dumped, i.e. via PassManager::enableStatistics programmatically, or via the -pass-statistics and -pass-statistics-display command line pass manager options.
Below is an example:
struct MyPass : public OperationPass<MyPass> {
  Statistic testStat{this, "testStat", "A test statistic"};

  void runOnOperation() {
    ...
    ++testStat;
    ...
  }
};
$ mlir-opt -pass-pipeline='func(my-pass,my-pass)' foo.mlir -pass-statistics
Pipeline Display:
===-------------------------------------------------------------------------===
... Pass statistics report ...
===-------------------------------------------------------------------------===
'func' Pipeline
  MyPass
    (S) 15 testStat - A test statistic
  MyPass
    (S) 6 testStat - A test statistic
List Display:
===-------------------------------------------------------------------------===
... Pass statistics report ...
===-------------------------------------------------------------------------===
MyPass
  (S) 21 testStat - A test statistic
PiperOrigin-RevId: 284022014
The SPIR-V/Vulkan spec requires the workgroup size to be specified with
the spv.ExecutionMode operation. This was hard-wired to be set to a
particular value. It is now changed to be configurable by clients of
the pass or of the patterns that implement the lowering from GPU to
SPIR-V.
PiperOrigin-RevId: 284017482
It is often desirable to know where within the program a diagnostic was emitted, without resorting to assert/unreachable, which crash the program. This change adds a flag `mlir-print-stacktrace-on-diagnostic` that attaches the current stack trace as a note to every diagnostic that gets emitted.
PiperOrigin-RevId: 283996373
For serialization, when we have nested ops, the inner loop will create multiple
SPIR-V blocks. If the outer loop has block arguments (which corresponds to
OpPhi instructions), we defer the handling of the OpPhi's parent block
until we have serialized all blocks and then fix it up with the result <id>. These two
cases happening together were generating an invalid SPIR-V blob because we
previously assumed the parent block to be the block containing the terminator.
That is not true anymore when the block contains structured control flow ops.
If that happens, it should be fixed to use the structured control flow op's
merge block.
For deserialization, we record a map from header blocks to their corresponding
merge and continue blocks during the initial deserialization and then use the
info to construct spv.selection/spv.loop. The existing implementation will also
fall apart when we have nested loops. If so, we clone all blocks for the outer
loop, including the ones for the inner loop, to the spv.loop's region. So the map
for header blocks' merge info needs to be updated; otherwise we are operating
on already deleted blocks.
PiperOrigin-RevId: 283949230
This change adds support for non-congruent indices in the operation ordering within a basic block. The effect of this is that insertions are less likely to cause an invalidation of the ordering within a block. This has a big effect on modules that have very large basic blocks.
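A self-contained sketch of the general idea behind non-congruent indices: space the order indices apart so an insertion can usually pick a value between its neighbors instead of renumbering the whole block (the stride and renumbering policy here are illustrative, not the actual implementation):
```
#include <cassert>
#include <iterator>
#include <list>

struct Op { unsigned orderIndex = 0; };

constexpr unsigned kOrderStride = 8; // gap left between consecutive indices

void renumber(std::list<Op> &block) {
  unsigned index = 0;
  for (Op &op : block)
    op.orderIndex = (index += kOrderStride);
}

// Insert `newOp` before `pos`; only renumber when the gap is exhausted.
void insertBefore(std::list<Op> &block, std::list<Op>::iterator pos, Op newOp) {
  unsigned next = pos->orderIndex;
  unsigned prev = (pos == block.begin()) ? 0 : std::prev(pos)->orderIndex;
  if (next - prev <= 1) {
    block.insert(pos, newOp);
    renumber(block);
    return;
  }
  newOp.orderIndex = prev + (next - prev) / 2;
  block.insert(pos, newOp);
}

int main() {
  std::list<Op> block(3);
  renumber(block); // indices 8, 16, 24
  insertBefore(block, std::next(block.begin()), Op()); // gets index 12
  // Ordering queries reduce to comparing the cached indices.
  assert(block.begin()->orderIndex < std::next(block.begin())->orderIndex);
}
```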
PiperOrigin-RevId: 283858136
In some situations a diagnostic may optionally be emitted by the presence of a location, e.g. attribute and type verification. These situations currently require extra 'if(loc) emitError(...); return failure()' wrappers that make verification clunky. These new overloads take an optional location and a list of arguments to the diagnostic, and return a LogicalResult. We take the arguments directly and return LogicalResult instead of returning InFlightDiagnostic because we cannot create a valid diagnostic with a null location. This creates an awkward situation where a user may try to treat the, potentially null, diagnostic as a valid one and encounter crashes when attaching notes/etc. Below is an example of how these methods simplify some existing usages:
Before:
  if (loc)
    emitError(*loc, "this is my diagnostic with argument: ") << 5;
  return failure();
After:
  return emitOptionalError(loc, "this is my diagnostic with argument: ", 5);
PiperOrigin-RevId: 283853599
In the future, a more configurable malloc and free interface should be used and exposed via
extra parameters to the `createLowerToLLVMPass`. Until requirements are gathered, a simple CL flag allows generating code that runs successfully on hardware that cannot use the stdlib.
PiperOrigin-RevId: 283833424
Adds a ConstantMaskOp to the vector ops dialect.
Adds the following canonicalization patterns:
CreateMaskOp -> ConstantMaskOp
StridedSliceOp(ConstantMaskOp) -> ConstantMaskOp
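The essence of the CreateMaskOp -> ConstantMaskOp rewrite for the 1-D case, as a self-contained sketch (the real pattern works on vector ops and attributes; the function here is illustrative):
```
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// A create_mask whose size operand is a known constant is equivalent to a
// constant mask: the first `maskedSize` lanes are true, the rest false.
std::vector<bool> constantMask1D(int64_t vectorSize, int64_t maskedSize) {
  std::vector<bool> mask(vectorSize, false);
  for (int64_t i = 0, e = std::min(vectorSize, maskedSize); i < e; ++i)
    mask[i] = true;
  return mask;
}

int main() {
  // vector.create_mask %c3 : vector<5xi1>  ~  constant mask [1, 1, 1, 0, 0]
  assert(constantMask1D(5, 3) ==
         std::vector<bool>({true, true, true, false, false}));
}
```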
PiperOrigin-RevId: 283816752
Now that we have unrolling as a declarative pattern, we can drop a full pass that has gone stale. In the future we may want to add specific unrolling patterns for VectorTransferReadOp.
PiperOrigin-RevId: 283806880
I found that when running crash reproducers, the elided ElementsAttrs
would prevent parsing the IR repro. I found myself manually going and
replacing the "..." with some valid IR.
With this change, we now print elided attrs as `opaque<"", "0xDEADBEEF">`
to clearly delineate them as being elided while still being parseable.
PiperOrigin-RevId: 283781806
- the name was misleading; this is really checking if a Value being used
to index was loop IV invariant. Update comment.
- the method is only used locally; what can be exposed in the future is
isAccessInvariant(LoadOrStoreOp op, Value *iv)
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#285
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/285 from bondhugula:quickfix fe5837abe987980c4ab469a9aa7de8e4f0007d9f
PiperOrigin-RevId: 283771923
In the replaceAllUsesExcept utility function called from loop coalescing, the
iteration over the use-chain is incorrect. The use list nodes (IROperands) have
next/prev links, and bluntly resetting the use would make the loop continue
on uses of the value that was replaced instead of the original one. As a
result, it could miss the existing uses and update the wrong ones. Make sure we
increment the iterator before updating the use in the loop body.
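The same pitfall in a self-contained form, with std::list standing in for the intrusive use list: advance the iterator before mutating the element it points at (here, splicing it into another list), otherwise the loop keeps walking the wrong chain:
```
#include <cassert>
#include <list>

int main() {
  std::list<int> usesOfOldValue = {1, 2, 3, 4};
  std::list<int> usesOfNewValue;

  // "Replace" every use except the one with value 3. Incrementing `it` before
  // splicing keeps the iteration on the original list; splicing first and then
  // incrementing would continue on the new list instead.
  for (auto it = usesOfOldValue.begin(); it != usesOfOldValue.end();) {
    auto cur = it++;
    if (*cur != 3)
      usesOfNewValue.splice(usesOfNewValue.end(), usesOfOldValue, cur);
  }

  assert(usesOfOldValue == std::list<int>({3}));
  assert((usesOfNewValue == std::list<int>{1, 2, 4}));
}
```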
Reported-by: Uday Bondhugula <uday@polymagelabs.com>
Closes tensorflow/mlir#291.
PiperOrigin-RevId: 283754195
This CL refactors some of the MLIR vector dependencies to allow decoupling VectorOps, vector analysis, vector transformations and vector conversions from each other.
This makes the system more modular and allows extracting VectorToVector into VectorTransforms that do not depend on vector conversions.
This refactoring exhibited a bunch of cyclic library dependencies that have been cleaned up.
PiperOrigin-RevId: 283660308
This CL also did the following cleanup:
- Moved the test for spv.SubgroupBallotKHR to its own file
- Wrapped generated canonicalization patterns in anonymous namespace
- Updated header comments in SPVOps.td
PiperOrigin-RevId: 283650091
Not all StandardOps can be lowered to SPIR-V. For example, the subview op
implementation requires the use of pointer bitcasts, which is not valid
according to the SPIR-V spec (or at least the spec is ambiguous about it). Such ops
need to be removed/transformed before lowering to SPIR-V. The
SPIRVLegalizationPass is added as a place where such legalizations can be
added. The current implementation folds the subview ops with loads/stores
so that the lowering itself does not have to convert a subview op.
PiperOrigin-RevId: 283642981
The hook has the following form:
* `bool isInvalidated(const AnalysisManager::PreservedAnalyses &)`
Given a preserved analysis set, the analysis returns true if it should truly be
invalidated. This allows for more fine-tuned invalidation in cases where an
analysis wasn't explicitly marked preserved, but may be preserved (or
invalidated) based upon other properties, such as analysis sets.
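A sketch of an analysis opting into this hook; the hook signature is the one quoted above, while the header path and the query method on the preserved set are recalled from memory and may differ:
```
#include "mlir/Pass/AnalysisManager.h"

using namespace mlir;

struct MyBaseAnalysis {
  MyBaseAnalysis(Operation *op) { /* ... */ }
};

struct MyDerivedAnalysis {
  MyDerivedAnalysis(Operation *op) { /* ... */ }

  /// Stay valid as long as the analysis this one derives from is preserved,
  /// even if this analysis itself was not explicitly marked as preserved.
  bool isInvalidated(const AnalysisManager::PreservedAnalyses &pa) {
    return !pa.isPreserved<MyDerivedAnalysis>() &&
           !pa.isPreserved<MyBaseAnalysis>();
  }
};
```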
PiperOrigin-RevId: 283582889
The SPIR-V lowering used nested !spv.arrays to represent
multi-dimensional arrays, with the hope that, in conjunction with the
layout annotations, the shape and layout of a memref can be represented
directly. It is unclear though how portable this representation will
end up being. It will rely on driver compilers implementing complex
index computations faithfully. A more portable approach is to use
linearized arrays to represent memrefs and explicitly instantiate all
the index computation in SPIR-V. This gives the added benefit that we can
further optimize the generated code in MLIR before generating the
SPIR-V binary.
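The index computation that gets materialized explicitly is the usual strided linearization; a minimal self-contained sketch of what the generated code computes (assuming known strides and no exotic layout):
```
#include <cassert>
#include <cstdint>
#include <vector>

// Linear offset into a flat array for a multi-dimensional access:
//   offset + sum_i(indices[i] * strides[i])
int64_t linearIndex(int64_t offset, const std::vector<int64_t> &indices,
                    const std::vector<int64_t> &strides) {
  int64_t linear = offset;
  for (size_t i = 0; i < indices.size(); ++i)
    linear += indices[i] * strides[i];
  return linear;
}

int main() {
  // memref<4x8xf32> with row-major strides [8, 1]: element (2, 5) -> 21.
  assert(linearIndex(/*offset=*/0, {2, 5}, {8, 1}) == 21);
}
```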
PiperOrigin-RevId: 283571167
As described in the documentation, ViewOp is expected to take an optional
dynamic offset followed by a list of dynamic sizes. However, the ViewOp parser
did not include a check for the offset being a single value and accepted a
list of values instead.
Furthermore, several tests have been exercising the wrong syntax of a ViewOp,
passing multiple values to the dynamic stride list, which was not caught by the
parser. The trailing values could have been erroneously interpreted as dynamic
sizes. This is likely due to resyntaxing of the ViewOp, with the previous
syntax taking the list of sizes before the offset. Update the tests to use the
syntax with the offset preceding the sizes.
Worse, the conversion of ViewOp to the LLVM dialect assumed the wrong order of
operands with the offset in the trailing position, and erroneously relied on the
permissive parsing that interpreted trailing dynamic offset values as leading
dynamic sizes. Fix the lowering to use the correct order of operands.
PiperOrigin-RevId: 283532506
tensorflow/mlir#162 introduced a bug that
incorrectly allowed fusion of producer loops with multiple outgoing
edges. This commit fixes that problem. It also introduces a new flag to
disable sibling loop fusion so that we can test producer-consumer fusion
in isolation.
Closes tensorflow/mlir#259
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/259 from dcaballe:dcaballe/fix_multi_out_edge_producer_fusion 578d5661705fd5c56c555832d5e0528df88c5282
PiperOrigin-RevId: 283531105
are constant (i.e., there are no size and stride operands).
We recently added canonicalization that rewrites constant size and stride operands to
SubViewOp into static information in the type, so these patterns now occur during code
generation.
PiperOrigin-RevId: 283524688
A recent commit introduced the Linkage attribute to the LLVM dialect and used
it in the Global Op. Also use it in LLVMFuncOp. As per the LLVM Language Reference,
if the linkage attribute is omitted, the function is assumed to have external
linkage.
PiperOrigin-RevId: 283493299
Existing builders generated by ODS require attributes to be passed
in as mlir::Attribute or its subclasses. This is okay for aggregate-
parameter builders, which are primarily to be used by programmatic
C++ code generation; it is inconvenient for separate-parameter
builders meant to be called in manually written C++ code because
it requires developers to wrap raw values into mlir::Attribute by
themselves.
This CL extends ODS to generate additional builder methods that
take raw values for attributes and handle the wrapping in the
builder implementation. Additionally, if an attribute appears
late in the arguments list and has a default value, the default
value is supplied in the declaration if possible.
PiperOrigin-RevId: 283355919
Right now op argument matching in DRR is position-based, meaning we need to
specify N arguments for an op with N ODS-declared arguments. This can be annoying
when we don't want to capture all the arguments. `$_` is to remedy the situation.
PiperOrigin-RevId: 283339992
LLVM IR supports linkage on global objects such as global variables and
functions. Introduce the Linkage attribute into the LLVM dialect, backed by an
integer storage. Use this attribute on LLVM::GlobalOp and make it mandatory.
Implement parsing/printing of the attribute and conversion to LLVM IR.
See tensorflow/mlir#277.
PiperOrigin-RevId: 283309328