Update a few APIs to return non-const references instead of pointers,
and remove associated `const_cast`s and non-null assertions.
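As a hedged illustration of the pattern (the class and names below are hypothetical, not the APIs actually touched by this change):

```cpp
#include <cassert>

struct Widget { int Value = 0; };

class Container {
  Widget *TheWidget; // never null by construction

public:
  explicit Container(Widget *W) : TheWidget(W) { assert(W); }

  // Before: a pointer return forced callers to assert non-null and
  // sometimes const_cast away constness.
  // const Widget *getWidgetPtr() const { return TheWidget; }

  // After: a non-const reference encodes non-nullness in the type and
  // removes the need for const_cast and assertions at call sites.
  Widget &getWidget() { return *TheWidget; }
};
```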
Differential Revision: https://reviews.llvm.org/D90067
Based on a Discourse discussion, fix the doc string and remove examples with
wrong semantics. Also fix the insert_map semantics by adding the missing
operand for the vector we are inserting into.
Differential Revision: https://reviews.llvm.org/D89563
We do not need to use the implicit cast here; we can instead rely on a
comparison between two TypeSize objects. This algorithm will work fine with
scalable vectors.
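A minimal sketch of the idea, assuming LLVM's TypeSize API (the helper itself is hypothetical):

```cpp
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Type.h"
using namespace llvm;

// Hypothetical helper: checks whether two types have the same store size.
// Comparing the TypeSize objects directly avoids the deprecated implicit
// conversion to uint64_t and stays correct for scalable vectors.
static bool haveSameStoreSize(const DataLayout &DL, Type *A, Type *B) {
  return DL.getTypeStoreSize(A) == DL.getTypeStoreSize(B);
}
```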
Reviewed By: DavidTruby
Differential Revision: https://reviews.llvm.org/D90146
The warning would fire when calling canReplaceGEPIdxWithZero on a GEP
whose source element type is a scalable vector. The size of scalable
vector types is not known, so this optimization cannot be performed.
This patch fixes the issue (sketched below) by:
- bailing out early in this routine if the GEP instruction's source
element type is a scalable vector.
- making use of getFixedSize -- this removes the dependency on the
deprecated interface.
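A sketch of the shape of the fix, written as a standalone function for readability; the real check lives inside canReplaceGEPIdxWithZero, and getFixedSize is the 2020-era spelling of the accessor:

```cpp
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Hypothetical standalone version of the guard.
static bool mayReplaceGEPIdxWithZero(const DataLayout &DL,
                                     GetElementPtrInst *GEPI) {
  // Bail out early: the size of a scalable vector is not known at
  // compile time, so this optimization cannot be performed.
  if (isa<ScalableVectorType>(GEPI->getSourceElementType()))
    return false;

  // getFixedSize() replaces the deprecated implicit TypeSize ->
  // uint64_t conversion; it asserts if handed a scalable type.
  uint64_t AllocSize =
      DL.getTypeAllocSize(GEPI->getSourceElementType()).getFixedSize();
  return AllocSize != 0; // placeholder for the real profitability logic
}
```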
Reviewed By: fpetrogalli
Differential Revision: https://reviews.llvm.org/D89968
The warning would fire when calling getGEPCost for analyzing the cost of
a GEP instruction. This would result in the use of the now deprecated
implicit cast of TypeSize to uint64_t through the overloaded operator.
This patch fixes the issue by using getKnownMinSize instead of the
implicit cast. This is possible because the code is already
scalable-vector aware. The semantic behaviour of the code is unchanged
by this patch.
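A small sketch of the substitution, with hypothetical surrounding code (getKnownMinSize is the 2020-era accessor name):

```cpp
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Type.h"
using namespace llvm;

// Hypothetical cost helper: the code is already scalable-vector aware, so
// the known minimum size preserves its behaviour while avoiding the
// deprecated implicit cast.
static uint64_t allocSizeForCost(const DataLayout &DL, Type *Ty) {
  // Before: uint64_t Size = DL.getTypeAllocSize(Ty); // deprecated cast
  return DL.getTypeAllocSize(Ty).getKnownMinSize();
}
```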
Reviewed By: sdesmalen, fpetrogalli
Differential Revision: https://reviews.llvm.org/D89872
The warning would fire when calling isDereferenceableAndAlignedInLoop with a
scalable load; this would result in the use of the now deprecated implicit
cast of TypeSize to uint64_t through the overloaded operator.
This patch fixes this issue (sketched below) by:
- no longer considering vector loads as candidates in
canVectorizeWithIfConvert. This doesn't make sense in the context of
identifying scalar loads to vectorize.
- making use of getFixedSize inside isDereferenceableAndAlignedInLoop --
this removes the dependency on the deprecated interface, and will
trigger an assertion error if the function is ever called with a
scalable type.
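A rough sketch of both changes, written as standalone helpers (names assumed, not copied from the source):

```cpp
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// 1) In the if-conversion legality scan: ignore loads that are already
//    vectors, since we are looking for scalar loads to vectorize.
static bool isCandidateLoad(LoadInst *LI) {
  return !LI->getType()->isVectorTy();
}

// 2) In isDereferenceableAndAlignedInLoop: getFixedSize() asserts on
//    scalable types instead of silently converting TypeSize to uint64_t.
static uint64_t loadSizeInBytes(const DataLayout &DL, LoadInst *LI) {
  return DL.getTypeStoreSize(LI->getType()).getFixedSize();
}
```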
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D89798
Recently there has been renewed interest in improving debug-info for variables
that (partially or otherwise) live on the stack in optimised code.
At the moment instcombine speculates that stack slots are probably going to be
promoted to registers, and prepares the debug-info accordingly. It runs a
function called LowerDbgDeclare which converts dbg.declares to a set of
dbg.values after loads, and before stores and calls. Sometimes the stack
location remains (e.g. for escaped locals). If any dbg.values become undef
where the stack location is still valid, we end up unnecessarily reducing
variable location coverage due to our inability to track multiple locations
simultaneously. There is a flag to disable this feature
(-instcombine-lower-dbg-declare=0), which prevents this conversion at the cost
of sometimes providing incorrect location info in the face of DSE, DCE, GVN,
CSE etc.
This has been discussed fairly extensively on PR34136.
The idea of these tests is to provide examples of situations that we should
consider when designing a new system, to aid discussions and eventually help
evaluate the implementation.
Dexter isn't ideal for observing specific optimisation behaviour. Writing an
exhaustive test suite would be difficult, and the resultant suite would be
fragile. However, I think having some concrete executable examples is useful
at least as a reference.
Differential Revision: https://reviews.llvm.org/D89543
This revision allows fusing the producers of input tensors into their consumer under a tiling transformation (which produces subtensors).
Many pieces are still missing (e.g. supporting init_tensors, better refactoring of the LinalgStructuredOp interface support, merging implementations and reusing code), but this still allows getting started.
The greedy pass itself is just for testing purposes and will be extracted into a separate test pass.
Differential revision: https://reviews.llvm.org/D89491
The local variable CmpResult added in that change shadowed the
type CmpResult, which confused an older gcc. Rename the variable
CmpResult to APFloatCmpResult.
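A distilled example of the shadowing problem (the enum and functions here are illustrative, not the actual declarations):

```cpp
enum class CmpResult { Less, Equal, Greater };

CmpResult compare() { return CmpResult::Equal; }

void use() {
  // Before: the local shared its type's name, so later mentions of
  // CmpResult in this scope resolved to the variable, and an older gcc
  // rejected the code.
  // CmpResult CmpResult = compare();
  // CmpResult Other = compare(); // older gcc: 'CmpResult' is the variable

  // After: renaming the local keeps the type name unambiguous.
  CmpResult APFloatCmpResult = compare();
  (void)APFloatCmpResult;
}
```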
The modified code in visitSTORE was missing a scalable vector check and was
still using the now deprecated implicit cast of TypeSize to uint64_t through
the overloaded operator. This patch fixes these issues.
This brings the logic in line with the comment on the context line immediately
above the added precondition (sketched below).
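A hedged sketch of the precondition's shape, as a standalone helper rather than the actual visitSTORE code (getFixedSize is the 2020-era accessor):

```cpp
#include "llvm/Support/TypeSize.h"
using namespace llvm;

// Hypothetical guard mirroring the added precondition: only proceed for
// fixed-size stores, and only then convert to a plain byte count.
static bool getFixedStoreSize(TypeSize StoreSize, uint64_t &Bytes) {
  if (StoreSize.isScalable())
    return false; // scalable sizes are unknown at compile time; bail out
  Bytes = StoreSize.getFixedSize(); // safe: no deprecated implicit cast
  return true;
}
```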
Add a test to sve-redundant-store.ll checking that the warning is not
triggered.
Differential Revision: https://reviews.llvm.org/D89701
The modified code in visitSTORE was missing a scalable vector check and was
still using the now deprecated implicit cast of TypeSize to uint64_t through
the overloaded operator. This patch fixes these issues.
This brings the logic in line with the comment on the context line immediately
above the added precondition.
Add a test to Redundantstores.ll checking that the warning is not triggered.
Summary: A method for obtaining a MemRegion from a LocAsInteger/MemRegionVal already exists in the SVal::getAsRegion function. Replace the repetitive conditions in SVal::getAsLocSymbol with a call to SVal::getAsRegion.
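A simplified sketch of the consolidation (the real method also honours an IncludeBaseRegions flag, and the free function below is hypothetical):

```cpp
#include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h"
using namespace clang;
using namespace ento;

// getAsRegion already handles both the LocAsInteger and MemRegionVal
// cases, replacing the previously duplicated conditions.
static SymbolRef getAsLocSymbolSketch(SVal V) {
  if (const MemRegion *R = V.getAsRegion())
    if (const SymbolicRegion *SymR = R->getSymbolicBase())
      return SymR->getSymbol();
  return nullptr;
}
```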
Differential Revision: https://reviews.llvm.org/D89982
Make both `SymbolKind` and `indexSymbolKindToSymbolKind` support constructors,
and separate them into a different category from regular methods.
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D89935
The previous attempt (15f6bad6d7) introduced add_dependencies, but
unfortunately it does not actually add a dependency between RemoteIndexProto
and RemoteIndexServiceProto. This is likely because clang_add_library violates
some of add_dependencies' requirements.
As a workaround, we will link the RemoteIndexProto library into
RemoteIndexServiceProto, which is logical because the latter can not be built
without linking to RemoteIndexProto anyway.
This patch introduces a SPIR-V runner. The aim is to run a gpu
kernel on a CPU via GPU -> SPIRV -> LLVM conversions. This is a first
prototype, so more features will be added in due time.
- Overview
The runner follows a similar flow to the other runners in-tree. However,
having converted the kernel to SPIR-V, we encode the bind attributes of
global variables that represent kernel arguments. Then the SPIR-V module is
converted to LLVM. On the host side, we emulate passing the data to the
device by creating, in the main module, globals with the same symbolic names
as in the kernel module. These global variables are later linked with the
ones from the nested module. We copy data from the kernel arguments to the
globals, call the kernel function from the nested module, and then copy the
data back.
- Current state
At the moment, the runner is capable of running two modules, one nested in
the other. The kernel module must contain exactly one kernel function. Also,
the runner currently supports only rank-1 integer memref types as arguments
(support to be extended).
- Enhancement of JitRunner and ExecutionEngine
To translate nested modules to LLVM IR, JitRunner and ExecutionEngine were
altered to take an optional function reference (defaulting to `nullptr`) that
acts as a custom LLVM IR module builder. This allows customizing LLVM IR
module creation from MLIR modules, as sketched below.
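A hedged approximation of the hook's shape; the exact signature in D86108 may differ:

```cpp
#include "llvm/ADT/STLExtras.h" // llvm::function_ref
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include <memory>

namespace mlir {
class ModuleOp; // forward declaration, sufficient for this sketch
} // namespace mlir

// Approximate shape of the optional builder: given an MLIR module, produce
// the llvm::Module that the ExecutionEngine will JIT; passing nullptr
// keeps the default MLIR-to-LLVM-IR translation.
using ModuleBuilderRef = llvm::function_ref<std::unique_ptr<llvm::Module>(
    mlir::ModuleOp, llvm::LLVMContext &)>;
```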
Reviewed By: ftynse, mravishankar
Differential Revision: https://reviews.llvm.org/D86108
The code to detect the requirement for 64-bit offsets in the archive
symbol table was not correctly accounting for the archive file signature
and the size of all the contents of the symbol table itself, e.g. the
symbol table's header and string table. It also was not considering the
variation in symbol table formats. This could result in the creation of
large archives with a corrupt symbol table.
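A hedged sketch of the corrected accounting; the struct and threshold plumbing below are simplified stand-ins for what ArchiveWriter.cpp computes per member, not the actual code:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for the per-member data computed by the writer.
struct MemberData {
  std::string Header;
  std::string Data;
  std::string Padding;
};

// The offset of the last member must include the archive signature
// ("!<arch>\n") and the full symbol table (header, offsets, and string
// table), not just the member payloads.
static bool needsSym64(uint64_t SymtabSize,
                       const std::vector<MemberData> &Members,
                       uint64_t Threshold /* normally 1ULL << 32 */) {
  uint64_t Offset = 8 /* strlen("!<arch>\n") */ + SymtabSize;
  for (const MemberData &M : Members)
    Offset += M.Header.size() + M.Data.size() + M.Padding.size();
  return Offset >= Threshold;
}
```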
Change the testing environment variable SYM64_THRESHOLD to be an
absolute value rather than a power of 2 in order to enable precise
testing of this detection code.
Differential Revision: https://reviews.llvm.org/D89891
This patch introduces LowerHostCodeToLLVMPass, a pass needed to run
`mlir-spirv-cpu-runner`.
This pass emulates the `gpu.launch_func` call in the LLVM dialect and lowers
the host module code to LLVM. It removes the `gpu.module`, creates a sequence
of global variables that are later linked to the variables in the kernel
module, as well as a series of copies to/from them to emulate the memory
transfers to/from the host or device sides. It also converts the remaining
Standard dialect into the LLVM dialect, emitting C wrappers.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D86112