- this change is already consistent with the current code
- having no constraints made the integer set spec look odd - as nothing appears
between ':' and the closing parenthesis
- there is no loss in representational power - an unconstrained set can always
be represented by a trivially true constraint
PiperOrigin-RevId: 229307353
DenseElementAttr currently does not support value bitwidths of > 64. This can result in asan failures and crashes when trying to invoke DenseElementsAttr::writeBits/DenseElementsAttr::readBits.
PiperOrigin-RevId: 229241125
*) LoopFusion: Adds fusion cost function which compares the cost of the fused loop nest, with the cost of the two unfused loop nests to determine if it is profitable to fuse the candidate loop nests. The fusion cost function is run for various combinations for src/dst loop depths attempting find the minimum cost setting for src/dst loop depths which does not increase the computational cost when the loop nests are fused. Combinations of src/dst loop depth are evaluated attempting to maximize loop depth (i.e. take a bigger computation slice from the source loop nest, and insert it deeper in the destination loop nest for better locality).
*) LoopFusion: Adds utility to compute op instance count for loop nests, sliced loop nests, and to compute the cost of a loop nest fused with another sliced loop nest.
*) LoopFusion: canonicalizes slice bound AffineMaps (and updates related tests).
*) Analysis::Utils: Splits getBackwardComputationSlice into two functions: one which calculates and returns the slice loop bounds for analysis by LoopFusion, and the other for insertion of the computation slice (ones fusion has calculated the min-cost src/dst loop depths).
*) Test: Adds multiple unit tests to test the new functionality.
PiperOrigin-RevId: 229219757
This CL adds a short term remedy to an issue that was found during execution
tests.
Lowering of vector transfer ops uses the permutation map to determine which
ForInst have been super-vectorized. During materialization to HW vector sizes
however, some of those dimensions may be fully unrolled and do not appear in
the permutation map.
Such dimensions were then not clipped and may have accessed out of bounds.
This CL conservatively clips all dimensions to ensure no out of bounds access.
The longer term solution is still up for debate but will probably require
either passing more information between Materialization and lowering, or just
merging the 2 passes.
PiperOrigin-RevId: 228980787
Arguably the dependence of EDSCs on Analysis is not great but on the other
hand this is a strict improvement in the emitted IR and since EDSCs are an
alternative to builders it makes sense that they have as much access to
Analysis as Transforms.
PiperOrigin-RevId: 228967624
This CL is the 6th and last on the path to simplifying AffineMap composition.
This removes `AffineValueMap::forwardSubstitutions` and replaces it by simple
calls to `fullyComposeAffineMapAndOperands`.
PiperOrigin-RevId: 228962580
- should be testing on the output of -dma-generate and not '-dma-generate
-canonicalize'; save trouble for those updating -canonicalize in the future!
PiperOrigin-RevId: 228915192
The const folding logic is structurally similar, so use a template
to abstract the common part.
Moved mul(x, 0) to a legalization pattern to be consistent with
mul(x, 1).
Also promoted getZeroAttr() to be a method on Builder since it is
expected to be frequently used.
PiperOrigin-RevId: 228891989
Expand type matcher template generator to consider a set of predicates that are known to
hold. This avoids inserting redundant checking for trivially true predicates
(for example predicate that hold according to the op definition). This only targets predicates that trivially holds and does not attempt any logic equivalence proof.
PiperOrigin-RevId: 228880468
Multiple binaries have the needs to open input files. Use this function
to de-duplicate the code.
Also changed openOutputFile() to return errors using std::string since
it is a library call and accessing I/O in library call is not friendly.
PiperOrigin-RevId: 228878221
This CL is the 5th on the path to simplifying AffineMap composition.
This removes the distinction between normalized single-result AffineMap and
more general composed multi-result map.
One nice byproduct of making the implementation driven by single-result is
that the multi-result extension is a trivial change: the implementation is
still single-result and we just use:
```
unsigned idx = getIndexOf(...);
map.getResult(idx);
```
This CL also fixes an AffineNormalizer implementation issue related to symbols.
Namely it stops performing substitutions on symbols in AffineNormalizer and
instead concatenates them all to be consistent with the call to
`AffineMap::compose(AffineMap)`. This latter call to `compose` cannot perform
simplifications of symbols coming from different maps based on positions only:
i.e. dims are applied and renumbered but symbols must be concatenated.
The only way to determine whether symbols from different AffineApply are the
same is to look at the concrete values. The canonicalizeMapAndOperands is thus
extended with behavior to support replacing operands that appear multiple
times.
Lastly, this CL demonstrates that the implementation is correct by rewriting
ComposeAffineMaps using only `makeComposedAffineApply`. The implementation
uses a matcher because AffineApplyOp are introduced as composed operations on
the fly instead of iteratively forwardSubstituting. For this purpose, a walker
would revisit freshly introduced AffineApplyOp. Regardless, ComposeAffineMaps
is scheduled to disappear, this CL replaces the implementation based on
iterative `forwardSubstitute` by a composed-by-construction
`makeComposedAffineApply`.
Remaining calls to `forwardSubstitute` will be removed in the next CL.
PiperOrigin-RevId: 228830443
Previously these were all defined as operating on Tensors which is not true in general. These don't serve much now so just inline it and we can extract it out again.
PiperOrigin-RevId: 228827011
* Check was returning success instead of failure when reporting illegal type;
* TFL ops only support tensor types so update tests with corrected logic.
- Removed some checks in broadcasting that don't work with tensor input requirement;
PiperOrigin-RevId: 228770184
- FM has a worst case exponential complexity. For our purposes, this worst case
is rarely expected, but could still appear due to improperly constructed
constraints (a logical/memory error in other methods for eg.) or artificially
created arbitrarily complex integer sets (adversarial / fuzz tests).
Add a check to detect such an explosion in the number of constraints and
conservatively return false from isEmpty() (instead of running out of memory
or running for too long).
- Add an artifical virus test case.
PiperOrigin-RevId: 228753496
This implements the lowering of `floordiv`, `ceildiv` and `mod` operators from
affine expressions to the arithmetic primitive operations. Integer division
rules in affine expressions explicitly require rounding towards either negative
or positive infinity unlike machine implementations that round towards zero.
In the general case, implementing `floordiv` and `ceildiv` using machine signed
division requires computing both the quotient and the remainder. When the
divisor is positive, this can be simplified by adjusting the dividend and the
quotient by one and switching signs.
In the current use cases, we are unlikely to encounter affine expressions with
negative divisors (affine divisions appear in loop transformations such as
tiling that guarantee that divisors are positive by construction). Therefore,
it is reasonable to use branch-free single-division implementation. In case of
affine maps, divisors can only be literals so we can check the sign and
implement the case for negative divisors when the need arises.
The affine lowering pass can still fail when applied to semi-affine maps
(division or modulo by a symbol).
PiperOrigin-RevId: 228668181
* Get a specific successor operand.
* Iterator support for non successor operands.
* Fix bug when removing the last operand from the operand list of an Instruction.
* Get the argument number for a BlockArgument.
PiperOrigin-RevId: 228660898
- the double buffer should be indexed (iv floordiv step) % 2 and NOT (iv % 2);
step wasn't being accounted for.
- fix test cases, enable failing test cases
PiperOrigin-RevId: 228635726
This CL added a tblgen::Attribute class to wrap around raw TableGen
Record getValue*() calls on Attr defs, which will provide a nicer
API for handling TableGen Record.
PiperOrigin-RevId: 228581107
Originally, terminators were special kinds of operation and could not be
extended by dialects. Only builtin terminators were supported and they had
custom parsers and printers. Currently, "terminator" is a property of an
operation, making it possible for dialects to define custom terminators.
However, verbose forms of operation syntax were not designed to support
terminators that may have a list of successors (each successor contains a block
name and an optional operand list). Calling printDefaultOp on a terminator
drops all successor information. Dialects are thus required to provide custom
parsers and printers for their terminators.
Introduce the syntax for the list of successors in the verbose from of the
operation. Add support for printing and parsing verbose operations with
successors.
Note that this does not yet add support for unregistered terminators since
"terminator" is a property stored in AsbtractOperation and therefore is only
available for registered operations that have an instance of AbstractOperation.
Add tests for verbose parsing. It is currently impossible to test round-trip
for verbose terminators because none of the known dialects use verbose syntax
for printing terminators by default, however the printer was exercised on the
LLVM IR dialect prototype.
PiperOrigin-RevId: 228566453
- fix visitDivExpr: constraints constructed for localVarCst used the original
divisor instead of the simplified divisor; fix this. Add a simple test case
in memref-bound-check that reproduces this bug - although this was encountered in the
context of slicing for fusion.
- improve mod expr flattening: when flattening mod expressions,
cancel out the GCD of the numerator and denominator so that we can get a
simpler flattened form along with a simpler floordiv local var for it
PiperOrigin-RevId: 228539928
Supervectorization does not plan on handling multi-result AffineMaps and
non-canonical chains of > 1 AffineApplyOp.
This CL uses the simpler single-result unbounded AffineApplyOp in the
MaterializeVectors pass.
PiperOrigin-RevId: 228469085
This CL added a tblgen::Type class to wrap around raw TableGen
Record getValue*() calls on Type defs, which will provide a
nicer API for handling TableGen Record.
The PredCNF class is also updated to work together with
tblgen::Type.
PiperOrigin-RevId: 228429090
clients. Let's re-add it in the future if there is ever a reason to. NFC.
Unrelatedly, add a use of a variable to unbreak the non-assert build.
PiperOrigin-RevId: 228284026
This CL is the 4th on the path to simplifying AffineMap composition.
This CL extract canonicalizeMapAndOperands so it can be reused by other
functions; in particular, this will be used in
`makeNormalizedAffineApply`.
PiperOrigin-RevId: 228277890
This CL is the 3rd on the path to simplifying AffineMap composition.
This CL just moves `makeNormalizedAffineApply` from VectorAnalysis to
AffineAnalysis where it more naturally belongs.
PiperOrigin-RevId: 228277182
This CL is the 2nd on the path to simplifying AffineMap composition.
This CL uses the now accepted `AffineExpr::compose(AffineMap)` to
implement `AffineMap::compose(AffineMap)`.
Implications of keeping the simplification function in
Analysis are documented where relevant.
PiperOrigin-RevId: 228276646
Alias identifiers can be used in the place of the types that they alias, and are defined as:
type-alias-def ::= '!' alias-name '=' 'type' type
type-alias ::= '!' alias-name
Example:
!avx.m128 = type vector<4 x f32>
...
"foo"(%x) : vector<4 x f32> -> ()
// becomes:
"foo"(%x) : !avx.m128 -> ()
PiperOrigin-RevId: 228271372