Loops with irreducible cycles may loop infinitely. Those cannot be
removed, unless the loop/function is marked as mustprogress.
Also discussed in D103382.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D104238
The const qualifier was a hangover from an earlier iteration that allowed
wrapper functions to return pointers to const memory. This feature has
been removed, so there's no reason for this to be const any more, and
removing it eliminates const-cast warnings.
Replace the existing WrapperFunctionResult type in
llvm/include/ExecutionEngine/Orc/Shared/TargetProcessControlTypes.h with a
version adapted from the ORC runtime's implementation.
Also introduce the SimplePackedSerialization scheme (also adapted from the ORC
runtime's implementation) for wrapper functions to avoid manual serialization
and deserialization for calls to runtime functions involving common types.
This commit mostly just replaces bad uses of `NDEBUG` with uses of
`LLVM_ENABLE_ABI_BREAKING_CHANGES` - the safe way to include ABI
breaking changes (normally extra struct elements in headers).
Differential Revision: https://reviews.llvm.org/D104216
libstdc++ since version 11 has a conditional compilation based on
[[no_unique_address]] availability whether one element is either
inherited or put there as a field with [[no_unique_address]].
The code comment is by teemperor.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D104283
Much like `mulx`'s `WriteIMulH`, there are two outputs of
AVX2 GATHER instructions. This was changed back in rL160110,
but the sched model change wasn't present.
So right now, for sched models that are marked as complete
(`znver3` only now), codegen'ning `GATHER` results in a crash:
```
DefIdx 1 exceeds machine model writes for early-clobber renamable $ymm3, dead early-clobber renamable $ymm2 = VPGATHERDDYrm killed renamable $ymm3(tied-def 0), undef renamable $rax, 4, renamable $ymm0, 0, $noreg, killed renamable $ymm2(tied-def 1) :: (load 32, align 1)
```
https://godbolt.org/z/Ks7zW7WGh
I'm guessing we need to deal with this like we deal with `WriteIMulH`.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D104205
This patch fixes the logic that checks for variadic register definitions,
Before llvm-svn 348114 (commit 4cf35b4ab0), it was not possible to explicitly
mark variadic operands as definitions. By default, variadic operands of an
MCInst were always assumed to be uses. A number of had-hoc checks were
introduced in the InstrBuilder to fix the processing of variadic register
operands of ARM ldm/stm variants.
This patch simply replaces those old (and buggy) checks with a much simpler (and
correct) check for MCID::Flag::VariadicOpsAreDefs.
Apparently __builtin_abort() is not supported when targetting Windows.
This should fix the following builder errors:
clang_rt.builtins-x86_64.lib(int_util.c.obj) : error LNK2019: unresolved
external symbol __builtin_abort referenced in function __compilerrt_abort_impl
One interesting problem was discovered here. When we do interrupt
Tracker's track flow, we want to interrupt only it and not all the
other flows recursively.
Differential Revision: https://reviews.llvm.org/D103914
21c18d5a04
improved the detection of multiplication in function call argument lists,
but unintentionally regressed the handling of function type casts (there
were no tests covering those).
This patch improves the detection of function type casts and adds a few tests.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D104209
This has been unnecessary since r352353 removed GraphTraits
specializations for Type, except that a couple of other headers were
accidentally relying on this declaration.
Differential Revision: https://reviews.llvm.org/D104119
When compiled with -ffreestanding, we should not assume that headers
declaring functions such as abort() are available. While the compiler may
still emit calls to those functions [1], we should not require the headers
to build compiler-rt since that can result in a cyclic dependency graph:
The compiler-rt functions might be required to build libc.so, but the libc
headers such as stdlib.h might only be available once libc has been built.
[1] From https://gcc.gnu.org/onlinedocs/gcc/Standards.html:
GCC requires the freestanding environment provide memcpy, memmove,
memset and memcmp. Finally, if __builtin_trap is used, and the target
does not implement the trap pattern, then GCC emits a call to abort.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D103876
Resubmission of D100646 now making sure that we handle cases were `__builtin_memcpy_inline` is not available.
Original commit message:
Each of these elementary operations can be assembled to support higher order constructs (Overlapping access, Loop, Aligned Loop).
The patch does not compile yet as it depends on other ones (D100571, D100631) but it allows to get the conversation started.
A self-contained version of this code is available at https://godbolt.org/z/e1x6xdaxM
This doesn't add any canonicalizations, but executes the same
simplification on bufferSemantic linalg.generic ops by using
linalg::ReshapeOp instead of linalg::TensorReshapeOp.
Differential Revision: https://reviews.llvm.org/D103513
This patch is to address https://bugs.llvm.org/show_bug.cgi?id=50459.
YAML:455:28: error: GUID strings are 38 characters long
The valid format for a GUID is {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
where X is a hex digit (0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F).
The length of the individual components must be: 8, 4, 4, 4, 12.
For some cases, the converted string generated by obj2yaml, does not
comply with those lengths. yaml2obj checks that the GUID string must
be 38 characters including the dashes and braces.
Reviewed By: amccarth
Differential Revision: https://reviews.llvm.org/D103089
This patch includes some changes which deletes the code accessing
g_atl_machine global. Some accesses related to memory_pools are
still remaining.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D103813
Changing vector element type doesn't work for v6i32->v6i16 now
that v6i32 is an MVT and v6i16 is not.
I would like to fix this in changeVectorElementType, but you
need a LLVMContext to call getVectorVT which we can't get from
an MVT.
Fixes PR50709.
A lambda in a function template may be recursively instantiated. The recursive
lambda will cause a lambda function instantiated multiple times, one inside another.
The inner LocalInstantiationScope should not be marked as MergeWithParentScope
since it already has references to locals properly substituted, otherwise it causes
assertion due to the check for duplicate locals in merged LocalInstantiationScope.
Reviewed by: Richard Smith
Differential Revision: https://reviews.llvm.org/D98068
The runtimes build has assertions enabled, which is necessary to catch
some of the modules-related issues we've been seeing recently. This
patch enables testing with modules in the runtimes build so as to cover
those cases.
In the future, a better solution would be to systematically use versions
of Clang that have assertions enabled. However, the Clangs we release
currently don't have assertions enabled by default, which causes a
challenge for the CI (we could try to build our own Clang from ToT with
assertions in the CI, but that poses some problems).
Differential Revision: https://reviews.llvm.org/D104252
Export `lq`, `stq`, `lqarx` and `stqcx.` in preparation for implementing 16-byte lock free atomic operations on AIX.
Add a new register class `g8prc` for these instructions, since these instructions require even-odd register pair.
Reviewed By: nemanjai, jsji, #powerpc
Differential Revision: https://reviews.llvm.org/D103010
This is useful for "build tuple" type ops. In my case, in npcomp, I have
an op:
```
// Result type is `!torch.tuple<!torch.tensor, !torch.tensor>`.
torch.prim.TupleConstruct %0, %1 : !torch.tensor, !torch.tensor
```
and the context is required for the `Torch::TupleType::get` call (for
the case of an empty tuple).
The handling of these FmtContext's in the code is pretty ad-hoc -- I didn't
attempt to rationalize it and just made a targeted fix. As someone
unfamiliar with the code I had a hard time seeing how to more broadly fix
the situation.
Differential Revision: https://reviews.llvm.org/D104274
If the flag is not set, the script saved_model_aot_compile.py in tensorflow will
default it to the correct value. However, in TF 2.5, the way the value is set in
TensorFlowCompile.cmake file triggers a build error.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D103972
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.
This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.
This reapplies the previously reverted patch with additional include
order fixes for non-modular builds of LLDB.
Differential Revision: https://reviews.llvm.org/D103575
Currently, Loop strengh reduce is not handling loops with scalable stride very well.
Take loop vectorized with scalable vector type <vscale x 8 x i16> for instance,
(refer to test/CodeGen/AArch64/sve-lsr-scaled-index-addressing-mode.ll added).
Memory accesses are incremented by "16*vscale", while induction variable is incremented
by "8*vscale". The scaling factor "2" needs to be extracted to build candidate formula
i.e., "reg(%in) + 2*reg({0,+,(8 * %vscale)}". So that addrec register reg({0,+,(8*vscale)})
can be reused among Address and ICmpZero LSRUses to enable optimal solution selection.
This patch allow LSR getExactSDiv to recognize special cases like "C1*X*Y /s C2*X*Y",
and pull out "C1 /s C2" as scaling factor whenever possible. Without this change, LSR
is missing candidate formula with proper scaled factor to leverage target scaled-index
addressing mode.
Note: This patch doesn't fully fix AArch64 isLegalAddressingMode for scalable
vector. But allow simple valid scale to pass through.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D103939