OSS build was broken (missing CMakeLists.txt changes and compilation failures on Ubuntu)
Automated rollback of changelist 247564213.
PiperOrigin-RevId: 247713812
The idea is to lower `gpu.launch` operations into `gpu.launch_func` operations by outlining the kernel body into a function, which is closer to the NVVM model.
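A minimal sketch of the outlining idea, assuming a toy dictionary representation of operations rather than the real MLIR data structures. The field names (`grid`, `block`, `operands`, `body`) and the helper `outline_kernel` are hypothetical and exist only for illustration.

```python
def outline_kernel(launch, kernel_name):
    """Split a toy `gpu.launch` into a kernel function and a `gpu.launch_func`.

    `launch` is a hypothetical dict with keys 'grid', 'block', 'operands'
    and 'body'; it does not mirror the actual MLIR operation classes.
    """
    # The body moves into a standalone function whose arguments are the
    # values that were forwarded to the kernel.
    kernel_func = {
        "name": kernel_name,
        "args": list(launch["operands"]),
        "body": list(launch["body"]),
    }
    # The launch site keeps the grid/block sizes and refers to the kernel by name.
    launch_func = {
        "op": "gpu.launch_func",
        "kernel": kernel_name,
        "grid": launch["grid"],
        "block": launch["block"],
        "operands": list(launch["operands"]),
    }
    return kernel_func, launch_func
```

After outlining, the launch site no longer owns a region; it only names the kernel function and forwards operands to it.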
--
PiperOrigin-RevId: 246806890
This syntax removes the boilerplate and the verbose list of region arguments in the
header of the entry block. It groups operands into segments related to GPU
blocks, GPU threads as well as the operands that are forwarded to the kernel.
The two former segments are also used to give names to the region arguments
that are used for GPU blocks and threads inside the kernel body region.
--
PiperOrigin-RevId: 246792329
Define a new dialect related to GPU kernels. Currently, it only contains a
single operation for launching a kernel on a three-dimensional grid of thread
blocks, following a model similar to that of CUDA. In particular, the body of
the kernel contains operations executed by each thread and uses region
arguments to accept thread and block identifiers (similar to how the loop body
region accepts the induction value).
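The execution model described above can be approximated by the following sequential sketch. It is an illustration only: a real launch runs threads in parallel, and the function name `launch` and the tuple layout `(x, y, z)` are assumptions made for this example.

```python
from itertools import product


def launch(grid, block, body):
    """Call `body` once per thread of a 3-D grid of 3-D thread blocks.

    `grid` and `block` are (x, y, z) size tuples. `body` receives the block
    identifier and the thread identifier, playing the role of the region
    arguments of the kernel body.
    """
    for bz, by, bx in product(range(grid[2]), range(grid[1]), range(grid[0])):
        for tz, ty, tx in product(range(block[2]), range(block[1]), range(block[0])):
            body((bx, by, bz), (tx, ty, tz))


# Usage: record which (block, thread) pairs along x executed the body.
visited = []
launch((2, 1, 1), (3, 1, 1), lambda b, t: visited.append((b[0], t[0])))
```

The body is written once and parameterized by identifiers, in the same way a loop body is parameterized by its induction value.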
--
PiperOrigin-RevId: 245713728
making the IR dumps much nicer.
This is part 2/3 of the path to making dialect types nicer. Part 3/3 will
slightly generalize the set of characters allowed in pretty types and make it
more principled.
--
PiperOrigin-RevId: 242249955
Historically, the LLVM IR dialect has been using the generic form of MLIR
operation syntax. It is verbose and often redundant. Introduce the custom
printing and parsing for all existing operations in the LLVM IR dialect.
Update the relevant documentation and tests.
--
PiperOrigin-RevId: 241617393
Due to legacy reasons (ML/CFG function separation), regions in affine control
flow operations require contained blocks not to have terminators. This is
inconsistent with the notion of the block and may complicate code motion
between regions of affine control operations and other regions.
Introduce `affine.terminator`, a special terminator operation that must be used
to terminate blocks inside affine operations and transfers the control back to
the region enclosing the affine operation. For brevity and readability reasons,
allow `affine.for` and `affine.if` to omit the `affine.terminator` in their
regions when using custom printing and parsing format. The custom parser
injects the `affine.terminator` if it is missing so as to always have it
present in constructed operations.
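A minimal sketch of the injection step, assuming a block is modeled as a plain list of operation names. The helper `ensure_terminator` is hypothetical and is not the parser's actual entry point.

```python
AFFINE_TERMINATOR = "affine.terminator"


def ensure_terminator(block_ops):
    """Return a copy of `block_ops` that ends with the affine terminator.

    Blocks that already end with the terminator are returned unchanged, so
    the function can be applied to both the short and the explicit form.
    """
    ops = list(block_ops)
    if not ops or ops[-1] != AFFINE_TERMINATOR:
        ops.append(AFFINE_TERMINATOR)
    return ops
```

Because the step is idempotent, constructed operations always carry the terminator regardless of which textual form was parsed.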
Update transformations to account for the presence of the terminator. In
particular, most code motion transformations between loops should leave the
terminator in place, and code motion between loops and non-affine blocks should
drop the terminator.
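The two cases can be sketched as follows, again modeling blocks as lists of operation names. Both helpers are hypothetical illustrations of the rule stated above, not the actual transformation utilities.

```python
TERMINATOR = "affine.terminator"


def move_between_loops(src, dst, count):
    """Move the first `count` operations of loop body `src` into loop body `dst`.

    Both bodies keep their own terminator; moved operations are placed just
    before the terminator of `dst`.
    """
    assert src[-1] == TERMINATOR and dst[-1] == TERMINATOR
    count = min(count, len(src) - 1)  # never move the terminator itself
    moved, rest = src[:count], src[count:]
    return rest, dst[:-1] + moved + [TERMINATOR]


def inline_into_plain_block(loop_body, block, position):
    """Splice a loop body into a non-affine block, dropping the terminator."""
    assert loop_body[-1] == TERMINATOR
    return block[:position] + loop_body[:-1] + block[position:]
```

In the second helper the destination block keeps its own terminator (for example a `return`), so the affine one would be redundant there.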
PiperOrigin-RevId: 240536998
These cleanups reflect some recent changes to the LLVM IR Dialect and the
infrastructure that affects it. In particular, add documentation on direct and
indirect function calls as well as remove the `call` and `call0` separation.
Change the prefix of custom types from `!llvm.type` to `!llvm` so that it
matches the IR. Remove the verifier check disallowing conditional branches to
the same block with arguments: identical arguments are now supported, and
different arguments will be caught later.
PiperOrigin-RevId: 237203452
Since the goal of the LLVM IR dialect is to reflect LLVM IR in MLIR, the
dialect and the conversion procedure must account for the differences between
block arguments and LLVM IR PHI nodes. In particular, LLVM IR disallows PHI
nodes with different values coming from the same source. Therefore, the LLVM IR
dialect now disallows `cond_br` operations that have identical successors
accepting arguments, which would lead to invalid PHI nodes. The conversion
process resolves the potential PHI source ambiguity by injecting dummy blocks
if the same block is used more than once as a successor in an instruction.
These dummy blocks branch unconditionally to the original successors, pass them
the original operands (available in the dummy block because it is dominated by
the original block) and are used instead of them in the original terminator
operation.
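A minimal sketch of the dummy-block injection, assuming a toy CFG in which each block name maps to the list of `(successor, operands)` edges of its terminator. The function name, the dummy naming scheme, and the choice to keep the first occurrence of a repeated successor are assumptions made for this example.

```python
def ensure_distinct_successors(blocks):
    """Return a CFG in which no terminator lists the same successor twice.

    `blocks` maps a block name to the list of (successor, operands) edges of
    its terminator. Repeated successors are redirected through fresh dummy
    blocks that forward the original operands.
    """
    result = {}
    counter = 0
    for name, edges in blocks.items():
        seen = set()
        new_edges = []
        for succ, operands in edges:
            if succ in seen:
                # Hypothetical naming scheme; assumed not to clash with real blocks.
                dummy = f"{name}.dummy{counter}"
                counter += 1
                # The dummy block branches unconditionally to the original
                # successor and passes it the original operands.
                result[dummy] = [(succ, list(operands))]
                # The original terminator now targets the dummy, with no operands.
                new_edges.append((dummy, []))
            else:
                seen.add(succ)
                new_edges.append((succ, list(operands)))
        result[name] = new_edges
    return result
```

After the rewrite, every value reaching the merge block arrives from a distinct predecessor, so each PHI source is unambiguous.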
PiperOrigin-RevId: 235682798
Addressing post-submit comments. The `getelementptr` operation now supports
non-constant indexes, similarly to LLVM, and this functionality is exercised by
the lowering to the dialect. Update the documentation accordingly.
List the values of integer comparison predicates, which currently correspond to
those of CmpIOp in MLIR. Ideally, we would use strings instead, but it
requires additional support for argument conversion in both the dialect
lowering pass and the LLVM translator.
PiperOrigin-RevId: 235678877
The LLVM IR pass was bootstrapped without user documentation, following LLVM's
language reference and existing conversions between MLIR standard operations
and LLVM IR instructions. Provide concise documentation of the LLVM IR dialect
operations. This documentation does not describe the semantics of the
operations, which should match that of LLVM IR, but highlights the structural
differences in operation definitions, in particular using attributes instead of
constant-only values. It also describes pseudo-operations that exist only to
make the LLVM IR dialect self-contained within MLIR.
While it could have been possible to generate operation descriptions from
TableGen, this opts for a more concise format where groups of related
operations are described together.
PiperOrigin-RevId: 235149136
This operation is produced and used by the super-vectorization passes and has
been emitted as an abstract unregistered operation until now. For end-to-end
testing purposes, it has to be eventually lowered to LLVM IR. Matching an
abstract operation by name goes in the opposite direction of the generic
lowering approach that is expected to be used for LLVM IR lowering in the
future. Register the vector_type_cast operation as part of the SuperVector
dialect.
Arguably, this operation is a special case of the `view` operation from the
Standard dialect. The semantics of `view` is not fully specified at this point
so it is safer to rely on a custom operation. Additionally, using a custom
operation may help to achieve clear dialect separation.
PiperOrigin-RevId: 225887305
From the beginning, vector_transfer_read and vector_transfer_write operations
were intended as a mid-level vectorization abstraction. In particular, they
are lowered to the StandardOps dialect before further processing. As such, it
does not make sense to keep them at the same level as StandardOps. Introduce
the new SuperVectorOps dialect and move vector_transfer_* operations there.
This will be used as a testbed for the generic lowering/legalization pass.
PiperOrigin-RevId: 225554492