Tensor inputs, if not used in the body of TiledLoopOp, can be removed.
memref::CastOp can be folded into TiledLoopOp as well.
Differential Revision: https://reviews.llvm.org/D101445
Both, `shape.broadcast` and `shape.cstr_broadcastable` accept dynamic and static
extent tensors. If their operands are casted, we can use the original value
instead.
Differential Revision: https://reviews.llvm.org/D101376
Add a section attribute to LLVM_GlobalOp, during module translation attribute value is propagated to llvm
Reviewed By: sgrechanik, ftynse, mehdi_amini
Differential Revision: https://reviews.llvm.org/D100947
This adds `mlirOperationSetOperand` to the IR C API, similar to the
function to get an operand.
In the Python API, this adds `operands[index] = value` syntax, similar
to the syntax to get an operand with `operands[index]`.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D101398
Add the `getCapsule()` and `createFromCapsule()` methods to the
PyValue class, as well as the necessary interoperability.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D101090
MatMul and FullyConnected have transposed dimensions for the weights.
Also, removed uneeded tensor reshape for bias.
Differential Revision: https://reviews.llvm.org/D101220
Quantized negation can be performed using higher bits operations.
Minimal bits are picked to perform the operation.
Differential Revision: https://reviews.llvm.org/D101225
Explicitly check for uninitialized to prevent crashes in edge cases where the derived analysis creates a lattice element for a value that hasn't been visited yet.
Like `print-ir-after-all` and `-before-all`, this allows to inspect IR for
debug purposes. While the former allow to inspect only between passes, this
change allows to follow the rewrites that happen within passes.
Differential Revision: https://reviews.llvm.org/D100940
Empty extent tensor operands were only removed when they were defined as a
constant. Additionally, we can remove them if they are known to be empty by
their type `tensor<0xindex>`.
Differential Revision: https://reviews.llvm.org/D101351
Splat constant folding was limited to `std.constant` operations. Instead, use
the constant matcher and apply splat constant folding to any constant-like
operation that holds a splat attribute.
Differential Revision: https://reviews.llvm.org/D101301
This revision takes the forward value propagation engine in SCCP and refactors it into a more generalized forward dataflow analysis framework. This framework allows for propagating information about values across the various control flow constructs in MLIR, and removes the need for users to reinvent the traversal (often not as completely). There are a few aspects of the traversal, that were conservative for SCCP, that should be relaxed to support the needs of different value analyses. To keep this revision simple, these conservative behaviors will be left in (Note that this won't produce an incorrect result, but may produce more conservative results than necessary in certain edge cases. e.g. region entry arguments for non-region branch interface operations). The framework also only focuses on computing lattices for values, given the SCCP origins, but this is something to relax as needed in the future.
Given that this logic is already in SCCP, a majority of this commit is NFC. The more interesting parts are the interface glue that clients interact with.
Differential Revision: https://reviews.llvm.org/D100915
The new "encoding" field in tensor types so far had no meaning. This revision introduces:
1. an encoding attribute interface in IR: for verification between tensors and encodings in general
2. an attribute in Tensor dialect; #tensor.sparse<dict> + concrete sparse tensors API
Active discussion:
https://llvm.discourse.group/t/rfc-introduce-a-sparse-tensor-type-to-core-mlir/2944/
Reviewed By: silvas, penpornk, bixia
Differential Revision: https://reviews.llvm.org/D101008
Add two canoncalizations for scf.if.
1) A canonicalization that allows users of a condition within an if to assume the condition
is true if in the true region, etc.
2) A canonicalization that removes yielded statements that are equivalent to the condition
or its negation
Differential Revision: https://reviews.llvm.org/D101012
In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic`
but has 3 mostly unused pointers. GlobalOpt considers that the pointers can
potentially retain allocated objects, so GlobalOpt cannot optimize out the
`NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage
2 clang.
This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The
clang size is even smaller than applying D69428 (slightly smaller in both .bss and
.text).
```
# This means the D69428 optimization on clang is mostly nullified by this patch.
HEAD+D69428: size(.bss) = 0x0725a8
HEAD+D101211: size(.bss) = 0x072238
# bloaty - HEAD+D69428 vs HEAD+D101211
# With D101211, we also save a lot of string table space (.rodata).
FILE SIZE VM SIZE
-------------- --------------
-0.0% -32 -0.0% -24 .eh_frame
-0.0% -336 [ = ] 0 .symtab
-0.0% -360 [ = ] 0 .strtab
[ = ] 0 -0.2% -880 .bss
-0.0% -2.11Ki -0.0% -2.11Ki .rodata
-0.0% -2.89Ki -0.0% -2.89Ki .text
-0.0% -5.71Ki -0.0% -5.88Ki TOTAL
```
Note: LoopFuse is a disabled pass. For now this patch adds
`#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in
LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful in
LLVM_ENABLE_STATS==0 builds, we can replace `llvm::Statistic` with
`llvm::TrackingStatistic`, or use a different abstraction to keep track of the strings.
Similarly, skip the code in `mlir/lib/Pass/PassStatistics.cpp` which
calls `getName`/`getDesc`/`getValue`.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D101211
This tidies up the code a bit:
* Eliminate the ctx member, which doesn't need to be stored.
* Rename verify(Operation) to make it more clear that it is
doing more than verifyOperation and that the dominance check
isn't being done multiple times.
* Rename mayNotHaveTerminator which was confusing about whether
it wasn't known whether it had a terminator, when it is really
about whether it is legal to have a terminator.
* Some minor optimizations: don't check for RegionKindInterface
if there are no regions. Don't do two passes over the
operations in a block in OperationVerifier::verifyDominance when
one will do.
The optimizations are actually a measurable (but minor) win in some
CIRCT cases.
Differential Revision: https://reviews.llvm.org/D101267
Ensure to preserve the correct type during when folding and canonicalization.
`shape.broadcast` of of a single operand can only be folded away if the argument
type is correct.
Differential Revision: https://reviews.llvm.org/D101158
Includes tests and implementation for both integer and floating point values.
Both nearest neighbor and bilinear interpolation is included.
Differential Revision: https://reviews.llvm.org/D101009
Mask vectors are handled similar to data vectors in N-D TransferWriteOp. They are copied into a temporary memory buffer, which can be indexed into with non-constant values.
Differential Revision: https://reviews.llvm.org/D101136
This commit adds support for broadcast dimensions in permutation maps of vector transfer ops.
Also fixes a bug in VectorToSCF that generated incorrect in-bounds checks for broadcast dimensions.
Differential Revision: https://reviews.llvm.org/D101019
This commit adds support for dimension permutations in permutation maps of vector transfer ops.
Differential Revision: https://reviews.llvm.org/D101007
Fix style/clang-tidy warning, trim stale includes and forward
declarations, and cleanup/fix stale comments.
Differential Revision: https://reviews.llvm.org/D101021
Strided 1D vector transfer ops are 1D transfers operating on a memref dimension different from the last one. Such transfer ops do not accesses contiguous memory blocks (vectors), but access memory in a strided fashion. In the absence of a mask, strided 1D vector transfer ops can also be lowered using matrix.column.major.* LLVM instructions (in a later commit).
Subsequent commits will extend the pass to handle the remaining missing permutation maps (broadcasts, transposes, etc.).
Differential Revision: https://reviews.llvm.org/D100946
Eliminate empty shapes from the operands, partially fold all constant shape
operands, and fix normal folding.
Differential Revision: https://reviews.llvm.org/D100634
The interchange option attached to the linalg to loop lowering affects only the loops and does not update the memory accesses generated in to body of the operation. Instead of performing the interchange during the loop lowering use the interchange pattern.
Differential Revision: https://reviews.llvm.org/D100758