Summary: The native alignment may generally not be used when lowering a vector.transfer to the underlying load/store operation. This revision fixes the unmasked load/store alignment to match that of the masked path.
Differential Revision: https://reviews.llvm.org/D83684
Per the Vulkan's SPIR-V environment spec, "for the OpSRem and OpSMod
instructions, if either operand is negative the result is undefined."
So we cannot directly use spv.SRem/spv.SMod if either operand can be
negative. Emulate it via spv.UMod.
Because the emulation uses spv.SNegate, this commit also defines
spv.SNegate.
Differential Revision: https://reviews.llvm.org/D83679
Summary:
These are semantically equivalent, but fmuladd allows decaying the op
into fmul+fadd if there is no fma instruction available. llvm.fma lowers
to scalar calls to libm fmaf, which is a lot slower.
Reviewers: nicolasvasilache, aartbik, ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D83666
This patch introduces type conversion for SPIR-V structs. Since
handling offset case requires thorough testing, it was left out
for now. Hence, only structs with no offset are currently
supported. Also, structs containing member decorations cannot
be translated.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D83403
This patch adds type conversion for 4 SPIR-V types: array, runtime array, pointer
and struct. This conversion is integrated using a separate function
`populateSPIRVToLLVMTypeConversion()` that adds new type conversions. At the moment,
this is a basic skeleton that allows to perfom conversion from SPIR-V array,
runtime array and pointer types to LLVM typesystem. There is no support of array
strides or storage classes. These will be supported on the case by case basis.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D83399
This patch adds conversion patterns for `spv.BitFieldSExtract` and `spv.BitFieldUExtract`.
As in the patch for `spv.BitFieldInsert`, `offset` and `count` have to be broadcasted in
vector case and casted to match the type of the base.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D82640
This patch introduces 3 new direct conversions for SPIR-V ops:
- `spv.Select`
- `spv.Undef`
- `spv.FMul` that was skipped in the patch with arithmetic ops
Differential Revision: https://reviews.llvm.org/D83291
scf.if currently lacks folding on true / false conditionals.
Such foldings are a bit more involved than can be addressed immediately.
This revision introduces an eager folding for lowering vector.transfer operations in the presence of unrolling.
Differential revision: https://reviews.llvm.org/D83146
This patch introduces conversion pattern for `spv.constant` with scalar
and vector types. There is a special case when the constant value is a
signed/unsigned integer (vector of integers). Since LLVM dialect does not
have signedness semantics, the types had to be converted to signless ints.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D82936
Added conversion pattern for SPIR-V `FunctionCallOp`. Based on
specification, it returns no results or a single result, so
can be mapped directly to LLVM dialect's `llvm.call`.
Reviewed By: antiagainst, ftynse
Differential Revision: https://reviews.llvm.org/D83030
This patch introduces conversion pattern for `spv.BitFiledInsert` op,
as well as some utility functions to facilitate code reading.
Since `spv.BitFiledInsert` may take both vector and integer operands,
this case was specifically handled by broadcasting values (`count`
and `offset` here) to vectors. Moreover, the types had to be converted
to same bitwidth in order to conform with LLVM dialect rules.
This was done with `zext` when extending (Note that `count` and
`offset` are treated as unsigned) and `trunc` in the opposite case.
For the latter one, truncation is safe since the op is defined only when
`count`/`offset`/their sum is less than the bitwidth of the result.
This introduces a natural bound of the value of 64, which can be
expressed as `i8`.
Reviewed By: antiagainst, ftynse
Differential Revision: https://reviews.llvm.org/D82639
This allow lowering to support scf.for and scf.if with results. As right now
spv region operations don't have return value the results are demoted to
Function memory. We create one allocation per result right before the region
and store the yield values in it. Then we can load back the value from
allocation to be able to use the results.
Differential Revision: https://reviews.llvm.org/D82246
Added conversion pattern and tests for `spv.Bitcast` op. This one has
a direct mapping in LLVM dialect so `DirectConversionPattern` was used.
Differential Revision: https://reviews.llvm.org/D82748
This patch introduces new conversion patterns for bit and logical
negation op: `spv.Not` and `spv.LogicalNot`. They are implemented
by applying xor on the operand and mask with all bits set.
Differential Revision: https://reviews.llvm.org/D82637
Summary:
The patch makes the index type lowering of the GPU to NVVM/ROCDL conversion configurable. It introduces a pass option that controls the bitwidth used when lowering index computations and uses the LowerToLLVMOptions structure to control the Standard to LLVM lowering.
This commit fixes a use-after-free bug introduced by the reverted commit d10b1a3. It implements the following changes:
- Added a getDefaultOptions method to the LowerToLLVMOptions struct that returns a reference to statically allocated default options.
- Use the getDefaultOptions method to provide default LowerToLLVMOptions (instead of an initializer list).
- Added comments to clarify the required lifetime of the LowerToLLVMOptions
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D82475
`llvm.mlir.constant` was originally introduced as an LLVM dialect counterpart
to `std.constant`. As such, it was supporting "function pointer" constants
derived from the symbol name. This is different from `std.constant` that allows
for creation of a "function" constant since MLIR, unlike LLVM IR, supports
this. Later, `llvm.mlir.addressof` was introduced as an Op that obtains a
constant pointer to a global in the LLVM dialect. It naturally extends to
functions (in LLVM IR, functions are globals) and should be used for defining
"function pointer" values instead.
Fixes PR46344.
Differential Revision: https://reviews.llvm.org/D82667
When the origin of a shape is an extent tensor the operation `get_extent` can be
lowered directly to `extract_element`.
This choice circumvents the necessity to materialize the shape in memory.
Differential Revision: https://reviews.llvm.org/D82645
When the shape is derived from a tensor argument the shape extent can be derived
directly from that tensor with `std.dim`.
This lowering pattern circumvents the necessity to materialize the shape in
memory.
Differential Revision: https://reviews.llvm.org/D82644
Rationale:
In general, passing "fastmath" from MLIR to LLVM backend is not supported, and even just providing such a feature for experimentation is under debate. However, passing fine-grained fastmath related attributes on individual operations is generally accepted. This CL introduces an option to instruct the vector-to-llvm lowering phase to annotate floating-point reductions with the "reassociate" fastmath attribute, which allows the LLVM backend to use SIMD implementations for such constructs. Oher lowering passes can start using this mechanism right away in cases where reassociation is allowed.
Benefit:
For some microbenchmarks on x86-avx2, speedups over 20 were observed for longer vector (due to cleaner, spill-free and SIMD exploiting code).
Usage:
mlir-opt --convert-vector-to-llvm="reassociate-fp-reductions"
Reviewed By: ftynse, mehdi_amini
Differential Revision: https://reviews.llvm.org/D82624
Implemented conversion for `spv.BitReverse` and `spv.BitCount`. Since ODS
generates builders in a different way for LLVM dialect intrinsics, I
added attributes to build method in `DirectConversionPattern` class. The
tests for these ops are in `bitwise-ops-to-llvm.mlir`.
Differential Revision: https://reviews.llvm.org/D82286
Initially, unranked memref descriptors in the LLVM dialect were designed only
to be passed into functions. An assertion was guarding against returning
unranked memrefs from functions in the standard-to-LLVM conversion. This is
insufficient for functions that wish to return an unranked memref such that the
caller does not know the rank in advance, and hence cannot allocate the
descriptor and pass it in as an argument.
Introduce a calling convention for returning unranked memref descriptors as
follows. An unranked memref descriptor always points to a ranked memref
descriptor stored on stack of the current function. When an unranked memref
descriptor is returned from a function, the ranked memref descriptor it points
to is copied to dynamically allocated memory, the ownership of which is
transferred to the caller. The caller is responsible for deallocating the
dynamically allocated memory and for copying the pointed-to ranked memref
descriptor onto its stack.
Provide default lowerings for std.return, std.call and std.indirect_call that
maintain the conversion defined above.
This convention is additionally exercised by a runtime test to guard against
memory errors.
Differential Revision: https://reviews.llvm.org/D82647
Lower `shape.rank` to standard dialect.
A shape's size is the same as the extent of the first and only dimension of the
`tensor<?xindex>` it is represented by.
Differential Revision: https://reviews.llvm.org/D82080
This patch introduces conversion patterns for `spv.module` and `spv._module_end`.
SPIR-V module is converted into `ModuleOp`. This will play a role of enclosing
scope to LLVM ops. At the moment, SPIR-V module attributes (such as memory model,
etc) are ignored.
Differential Revision: https://reviews.llvm.org/D82468
This patch provides an implementation for `spv.func` conversion. The pattern
is populated in a separate method added to the pass. At the moment, the type
signature conversion only includes the supported types. The conversion pattern
also matches SPIR-V function control attributes to LLVM function attributes.
Those are modelled as `passthrough` attributes in LLVM dialect. The following
mapping are used:
- None: no attributes passed
- Inline: `alwaysinline` seems to be the right equivalent (`inlinehint` is
semantically weaker in my opinion)
- DontInline: `noinline`
- Pure and Const: I think those can be modelled as `readonly` and `readnone`
attributes respectively.
Also, 2 patterns added for return ops conversion (`spv.Return` for void return
and `spv.ReturnValue` for a single value return).
Differential Revision: https://reviews.llvm.org/D81931
This patch extends the AccessChainOp index type handling to be able to deal with
all Integer type indices (i.e., all bit-widths and signedness symantics).
There were two ways of achieving this:
1- Backward compatible: The new way of handling the indices will assume that
an index type is i32 by default if not specified in the assembly format,
this way all the old tests would pass correctly.
2- Enforce the format: This unifies the spv.AccessChain Op format and all the old
tests had to be updated to reflect this change or else they fail.
I picked option-2 to unify the Op format and avoid having optional index-type fields
that can lead to somewhat confusing tests format and multiple representations for
the same Op with undocumented assumption that an index is i32 unless stated.
Nonetheless, reverting to option-1 should be straightforward if preferred or needed.
Differential Revision: https://reviews.llvm.org/D81763
The patch makes the index type lowering of the GPU to NVVM/ROCDL
conversion configurable. It introduces a pass option that controls the
bitwidth used when lowering index computations.
Differential Revision: https://reviews.llvm.org/D80285
Subview operations are not natively supported downstream in the spirv path.
This change allows removing subview when used by vector transfer the same way
we already do it when they are used by LoadOp/StoreOp
Differential Revision: https://reviews.llvm.org/D82106
Use direct vector constants for the 1-D case. This approach
scales much better than generating elaborate insertion operations
that are eventually folded into a constant. We could of course
generalize the 1-D case to higher ranks, but this simplification
already helps in scaling some microbenchmarks that would formerly
crash on the intermediate IR length.
Reviewed By: reidtatge
Differential Revision: https://reviews.llvm.org/D82144
Lower `shape.shape_of` to standard dialect.
This lowering supports statically and dynamically shaped tensors.
Support for unranked tensors will be added as part of the lowering to `scf`.
Differential Revision: https://reviews.llvm.org/D82098
Summary:
The "i1" (viz. bool) type does not have a proper equivalent on the "C"
size. So, to avoid any ABIs issues, we simply use print_i32 on an i32
value of one or zero for true and false. This has the added advantage
that one less function needs to be implemented when porting the runtime
support library.
Reviewers: ftynse, bkramer, nicolasvasilache
Reviewed By: ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D82048
Added support of simple logical ops: `LogicalAnd`, `LogicalOr`,
`LogicalEqual` and `LogicalNotEqual`. Added a missing conversion
for `UMod` op.
Also, implemented SPIR-V cast ops conversion. There are 4 simple
case where there is a clear equivalent in LLVM (e.g. `ConvertFToS`
is `fptosi`). For `FConvert`, `SConvert` and `UConvert` we
distinguish between truncation and extension based on the bit
width of the operand.
Differential Revision: https://reviews.llvm.org/D81812
Implement the missing lowering from `std.dim` to the LLVM dialect in case of a
dynamic dimension.
Differential Revision: https://reviews.llvm.org/D81834
This patch has shift ops conversion implementation. In SPIR-V dialect,
`Shift` and `Base` may have different bit width. On the contrary,
in LLVM dialect both `Base` and `Shift` have to be of the same bit width.
This leads to the following cases:
- if `Base` has the same bit width as `Shift`, the conversion is
straightforward.
- if `Base` has a greater bit width than `Shift`, shift is sign/zero
extended first. Then the extended value is passed to the shift.
- otherwise the conversion is considered to be illegal.
Differential Revision: https://reviews.llvm.org/D81546
This option avoids to accidentally reuse variable across -LABEL match,
it can be explicitly opted-in by prefixing the variable name with $
Differential Revision: https://reviews.llvm.org/D81531
Implemented `FComparePattern` and `IComparePattern` classes
that provide conversion of SPIR-V comparison ops (such as
`spv.FOrdGreaterThanEqual` and others) to LLVM dialect.
Also added tests in `comparison-ops-to-llvm.mlir`.
Differential Revision: https://reviews.llvm.org/D81487
Following the previous revision `D81100`, this commit implements a templated class
that would provide conversion patterns for “straightforward” SPIR-V ops into
LLVM dialect. Templating allows to abstract away from concrete implementation
for each specific op. Those are mainly binary operations. Currently supported
and tested ops are:
- Arithmetic ops: `IAdd`, `ISub`, `IMul`, `FAdd`, `FSub`, `FMul`, `FDiv`, `FNegate`,
`SDiv`, `SRem` and `UDiv`
- Bitwise ops: `BitwiseAnd`, `BitwiseOr`, `BitwiseXor`
The implementation relies on `SPIRVToLLVMConversion` class that makes use of
`OpConversionPattern`.
Differential Revision: https://reviews.llvm.org/D81305
Allow for dynamic indices in the `dim` operation.
Rather than an attribute, the index is now an operand of type `index`.
This allows to apply the operation to dynamically ranked tensors.
The correct lowering of dynamic indices remains to be implemented.
Differential Revision: https://reviews.llvm.org/D81551
Having the input dumped on failure seems like a better
default: I debugged FileCheck tests for a while without knowing
about this option, which really helps to understand failures.
Remove `-dump-input-on-failure` and the environment variable
FILECHECK_DUMP_INPUT_ON_FAILURE which are now obsolete.
Differential Revision: https://reviews.llvm.org/D81422