These were all changed in 32b1b06b70 (as
discussed in D133967) but some intrinsics introduced since have
re-introduced `undef` as the masked-off value.
Reviewed By: reames, eopXD
Differential Revision: https://reviews.llvm.org/D135244
Polymorphic and unlimited polymorphic entities should be handled by the runtime. This patch
updates the condition in `genDeallocate` to force polymorphic and unlimited polymorphic entities
to be deallocated through a runtime call and not inlined.
Depends on D135143
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D135144
Setting up a lazy-save mechanism around calls is done during SelectionDAG
because calls to intrinsics may be expanded into an actual function call
(e.g. calls to @llvm.cos()), and maintaining an allowed-list in the SMEABI
pass is not feasible.
The approach for conditionally restoring the lazy-save based on the runtime
value of TPIDR2_EL0 is similar to how we handle conditional smstart/smstop.
We create a pseudo-node which gets expanded into a conditional branch and
a call to __arm_tpidr2_restore(%tpidr2_object_ptr).
The lazy-save buffer and TPIDR2 block are only allocated once at the start
of the function. For each call, the TPIDR2 block is initialised, and at
the end of the call, a pseudo node (RestoreZA) is planted.
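The per-call sequence described above can be sketched as a small simulation. This is only an illustration: `gTpidr2El0` models the TPIDR2_EL0 register, `fakeTpidr2Restore` stands in for __arm_tpidr2_restore, and the struct layout and callee functions are hypothetical. It assumes that a null TPIDR2_EL0 after the call means the callee committed the lazy save.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the TPIDR2 block; the real layout is defined by the
// SME ABI and is not reproduced here.
struct Tpidr2Block {
  void *zaSaveBuffer = nullptr;
  std::uint16_t numZaSaveSlices = 0;
};

// Stand-in for the TPIDR2_EL0 system register (assumption for illustration).
static Tpidr2Block *gTpidr2El0 = nullptr;
static int gRestoreCalls = 0;

// Hypothetical stand-in for __arm_tpidr2_restore; it only counts calls.
void fakeTpidr2Restore(Tpidr2Block *) { ++gRestoreCalls; }

// A callee that needs ZA commits the lazy save and clears the register.
void calleeThatUsesZa() { gTpidr2El0 = nullptr; }
// A callee that never touches ZA leaves the register alone.
void calleeThatIgnoresZa() {}

// The block is allocated once by the caller; it is (re)initialised per call.
void callWithLazySave(Tpidr2Block &block, void (*callee)()) {
  gTpidr2El0 = &block; // set up the lazy save for this call
  callee();
  // Models what the RestoreZA pseudo expands to: branch on the runtime
  // value of the register and call the restore routine only if needed.
  if (gTpidr2El0 == nullptr)
    fakeTpidr2Restore(&block);
  gTpidr2El0 = nullptr; // the lazy save is no longer active
}
```

Calling `callWithLazySave` with `calleeThatIgnoresZa` leaves the restore counter untouched, while `calleeThatUsesZa` triggers exactly one restore.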
Patch by Sander de Smalen.
Differential Revision: https://reviews.llvm.org/D133900
If a call base use will not capture a pointer we can approximate the
effects. This is especially important for readnone/readonly uses. Even
may-write uses are not too bad with reachability in place. Capturing
is the problem as we lose track of update sites.
LookupSpecialMember might fail, so change the cast to cast_or_null.
Inside Sema, skip a particular base, similar to other cases, rather than
asserting on the dtor showing up.
The other option would be to mark classes with invalid destructors as invalid, but
that seems a lot more invasive and we would lose lots of diagnostics that
currently work on classes with broken members.
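The difference between the two casts can be sketched without LLVM headers. `toyCast` and `toyCastOrNull` below are simplified analogues of llvm::cast and llvm::cast_or_null (no isa<> check), and `lookupDestructor` and `processBase` are hypothetical names, not the Sema code touched by this patch.

```cpp
#include <cassert>

struct Decl {
  virtual ~Decl() = default;
};
struct DestructorDecl : Decl {};

// Toy analogue of llvm::cast: the input must not be null.
template <typename To, typename From> To *toyCast(From *v) {
  assert(v && "toyCast on a null pointer");
  return static_cast<To *>(v);
}

// Toy analogue of llvm::cast_or_null: a null input is passed through.
template <typename To, typename From> To *toyCastOrNull(From *v) {
  return v ? static_cast<To *>(v) : nullptr;
}

// Hypothetical lookup that may fail, like a special-member lookup on a
// class with a broken destructor.
Decl *lookupDestructor(bool succeed) {
  static DestructorDecl dtor;
  return succeed ? &dtor : nullptr;
}

// Returns true if the base was handled, false if it was skipped.
bool processBase(bool lookupSucceeds) {
  auto *dtor = toyCastOrNull<DestructorDecl>(lookupDestructor(lookupSucceeds));
  if (!dtor)
    return false; // skip this base instead of asserting
  return true;
}
```

With `toyCast` in place of `toyCastOrNull`, the failing lookup would hit the assertion instead of letting the caller skip the base.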
Differential Revision: https://reviews.llvm.org/D135254
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
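As a rough illustration of why both pieces of information are needed, the sketch below reads a field out of the raw bytes of an initializer. The byte-vector model, the little-endian layout, and the name `extractField` are assumptions for this example and do not mirror the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Read `size` bytes starting at `offset` from the raw bytes of a constant
// initializer and assemble them as a little-endian unsigned integer.
// Returns nullopt if the requested range does not fit in the initializer.
std::optional<std::uint64_t>
extractField(const std::vector<std::uint8_t> &init, std::size_t offset,
             std::size_t size) {
  if (size == 0 || size > 8 || offset > init.size() ||
      size > init.size() - offset)
    return std::nullopt;
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < size; ++i)
    value |= static_cast<std::uint64_t>(init[offset + i]) << (8 * i);
  return value;
}
```

Without the offset, every access would see the first field; without the size, a two-byte field could not be told apart from the four-byte field that precedes it.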
This was already handled correctly below, but not checked for the
original store pointer operand. Encountered when converting tests
to opaque pointers, where the intermediate bitcast goes away.
Currently the following case fails:
```
template<typename Ty>
Ty foo(Ty *addr, Ty val) {
  Ty v;
  #pragma omp atomic compare capture
  {
    v = *addr;
    if (*addr > val)
      *addr = val;
  }
  return v;
}
```
The compiler complains that `addr` is not an lvalue. That's because when an expression
is instantiation-dependent, we cannot tell whether it is an lvalue or not.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D135224
In the case of non-opaque pointers, when combining consecutive loads, we
need to bitcast the pointer source to the combined type size, otherwise
asserts are triggered.
Differential Revision: https://reviews.llvm.org/D135249
This supports the lowering of private and firstprivate clauses in the single
construct. The alloca ops are emitted in the entry block according to
https://llvm.org/docs/Frontend/PerformanceTips.html#use-of-allocas, and
the load/store ops are emitted in the single region. The data race
problem is handled in OMPIRBuilder. That is, the barrier is emitted in
OMPIRBuilder.
Co-authored-by: Nimish Mishra <neelam.nimish@gmail.com>
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D128596
The infinite loop seen on buildbots should be fixed by
11897708c0 (assuming there are not
multiple infinite combine loops...)
-----
foldOpIntoPhi() currently only folds operations into the phi if all
but one of the operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.
This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.
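The decision rule can be sketched on a toy model. Here an incoming phi value is either a known constant or unknown, and a "simplifier" plays the role of simplifying the instruction for one incoming value. All names below are hypothetical and the real transform works on IR, not on integers.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <vector>

using Incoming = std::optional<int>; // nullopt models a non-constant value
using Simplifier = std::function<std::optional<int>(const Incoming &)>;

// Try to simplify the operation for every incoming value of the phi.
// At most one incoming value may fail to simplify; its slot stays nullopt,
// which models cloning the instruction into that predecessor block.
// Returns nullopt if more than one incoming value fails to simplify.
std::optional<std::vector<std::optional<int>>>
foldOpIntoPhiSketch(const std::vector<Incoming> &phi,
                    const Simplifier &simplify) {
  std::vector<std::optional<int>> result;
  bool sawUnsimplified = false;
  for (const Incoming &in : phi) {
    std::optional<int> s = simplify(in);
    if (!s) {
      if (sawUnsimplified)
        return std::nullopt; // a second non-simplified operand: give up
      sawUnsimplified = true;
    }
    result.push_back(s);
  }
  return result;
}

// "x + 1" only simplifies when x is a constant.
std::optional<int> simplifyAddOne(const Incoming &in) {
  return in ? std::optional<int>(*in + 1) : std::nullopt;
}

// "x * 0" simplifies to 0 even when x is not a constant.
std::optional<int> simplifyMulZero(const Incoming &) { return 0; }
```

The second simplifier shows the point of being simplification-based: a phi with two unknown incoming values is rejected for `simplifyAddOne` but still folds for `simplifyMulZero`.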
This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
Differential Revision: https://reviews.llvm.org/D134954
The old approach (a dedicated ExecXXX for each instruction) is not flexible and results in duplicated code when RVC kicks in.
According to the spec, every compressed instruction can be decoded to a non-compressed one. So we can lower compressed instructions to instructions we already have, which requires decoupling the decoder from the executor.
This patch:
- uses llvm::Optional and its combinators as much as possible.
- uses template constraints on common instructions.
- makes instructions strongly-typed (no uint32_t everywhere, because it is error-prone and burdens the developer when lowering RVC) with the help of an algebraic datatype (std::variant).
Note:
This is marked (NFC) because it is more of a refactoring in preparation for RVC.
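A minimal sketch of the decoder/executor split, using std::optional and std::variant from the standard library rather than the types in this patch. The struct names, `decode32`, `decode16`, `Cpu`, and `Executor` are hypothetical. Only ADDI, ADD, and C.ADDI are handled, and the C.ADDI special case with rd = x0 is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <variant>

// Strongly-typed decoded instructions (two of them, for illustration).
struct Addi { std::uint8_t rd, rs1; std::int32_t imm; };
struct Add { std::uint8_t rd, rs1, rs2; };
using Inst = std::variant<Addi, Add>;

// Sign-extend the low `bits` bits of `value`.
std::int32_t signExtend(std::uint32_t value, unsigned bits) {
  std::uint32_t sign = 1u << (bits - 1);
  return static_cast<std::int32_t>((value ^ sign) - sign);
}

// Decoder for 32-bit encodings; knows nothing about execution.
std::optional<Inst> decode32(std::uint32_t raw) {
  std::uint8_t rd = (raw >> 7) & 0x1F;
  std::uint8_t rs1 = (raw >> 15) & 0x1F;
  std::uint32_t funct3 = (raw >> 12) & 0x7;
  switch (raw & 0x7F) {
  case 0x13: // OP-IMM
    if (funct3 == 0)
      return Addi{rd, rs1, signExtend(raw >> 20, 12)};
    break;
  case 0x33: // OP
    if (funct3 == 0 && (raw >> 25) == 0)
      return Add{rd, rs1, static_cast<std::uint8_t>((raw >> 20) & 0x1F)};
    break;
  }
  return std::nullopt;
}

// Decoder for 16-bit encodings: C.ADDI is lowered to the existing Addi.
std::optional<Inst> decode16(std::uint16_t raw) {
  std::uint32_t funct3 = (raw >> 13) & 0x7;
  if ((raw & 0x3) == 0x1 && funct3 == 0) {
    std::uint8_t rd = (raw >> 7) & 0x1F;
    std::uint32_t imm = (((raw >> 12) & 0x1) << 5) | ((raw >> 2) & 0x1F);
    return Addi{rd, rd, signExtend(imm, 6)};
  }
  return std::nullopt;
}

struct Cpu { std::uint32_t x[32] = {}; };

// Executor: one overload per instruction type, shared by both decoders.
struct Executor {
  Cpu &cpu;
  void write(std::uint8_t rd, std::uint32_t v) const {
    if (rd != 0) // x0 is hard-wired to zero
      cpu.x[rd] = v;
  }
  void operator()(const Addi &i) const {
    write(i.rd, cpu.x[i.rs1] + static_cast<std::uint32_t>(i.imm));
  }
  void operator()(const Add &i) const {
    write(i.rd, cpu.x[i.rs1] + cpu.x[i.rs2]);
  }
};

void execute(Cpu &cpu, const Inst &inst) { std::visit(Executor{cpu}, inst); }
```

Because `decode16` returns an `Addi`, supporting the compressed form needs no new executor code, which is the duplication the refactoring aims to avoid.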
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D135015
Rather than inserting a ptrtoint + inttoptr pair, directly replace
the inttoptr with the new phi node. This ensures that no other
transform can undo it before the pair gets folded away.
This avoids the infinite loop when combined with D134954.
This is NFCI in the sense that it shouldn't make a difference, but
could due to different worklist order.
The GPU transform dialect currently has restrictions, and there are several situations where the transform dialect cannot be used.
This update includes a method to test failing cases in the GPU transform dialect.
Differential Revision: https://reviews.llvm.org/D135063
The new pass implements the following:
* Inserts code at the start of an arm_new_za function to
commit a lazy-save when the lazy-save mechanism is active.
* Adds a smstart intrinsic at the start of the function.
* Adds a smstop intrinsic at the end of the function.
Patch co-authored by kmclaughlin.
Differential Revision: https://reviews.llvm.org/D133896
SimpleLoopUnswitch may remove blocks from loops. Clear block and loop
dispositions in that case, to clean up invalid entries in the cache.
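The underlying problem is ordinary cache invalidation, sketched below on a toy cache. The class, its loop-membership query, and `forgetAll` are hypothetical stand-ins for the cached block and loop dispositions and are much simpler than the real analysis.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>

using Block = int;
using Loop = int;

// Toy analysis that memoizes whether a block belongs to a loop.
class LoopMembershipCache {
public:
  explicit LoopMembershipCache(const std::map<Loop, std::set<Block>> &loops)
      : loops_(loops) {}

  bool contains(Loop l, Block b) {
    auto key = std::make_pair(l, b);
    auto it = cache_.find(key);
    if (it != cache_.end())
      return it->second; // may be stale if the loop was modified
    auto loopIt = loops_.find(l);
    bool result = loopIt != loops_.end() && loopIt->second.count(b) != 0;
    cache_[key] = result;
    return result;
  }

  // Must be called by any transform that removes blocks from loops.
  void forgetAll() { cache_.clear(); }

private:
  const std::map<Loop, std::set<Block>> &loops_;
  std::map<std::pair<Loop, Block>, bool> cache_;
};
```

If a block is erased from a loop after a query was cached, the cache keeps answering with the old result until `forgetAll` is called.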
Fixes #58158.
Fixes #58159.
This simplifies the test case added in e399dd601 to only require indvars
and simple-loop-unswitch. This allows adding the test case for #58158 to
the same file, keeping related tests together.
This patch introduces a new AArch64 ISD node (OBSCURE_COPY) that can
be used when we want to prevent SVE object address calculations
from being rematerialised between a smstop/smstart and a call.
At the moment we use COPY to copy the frame index to a register,
which leads to problems because the "simple register coalescing"
pass understands the COPY instruction and attempts to rematerialise
an address calculation with 'addvl' between an smstop and a call.
When in streaming mode the 'addvl' instruction may have different
behaviour because the streaming SVE vector length is not guaranteed
to equal the normal SVE vector length.
The new ISD opcode OBSCURE_COPY gets lowered to a new pseudo
instruction also called OBSCURE_COPY. This ensures it cannot be
rematerialised and we expand this into a simple move very late in
the machine instruction pipeline.
A new test is added here:
CodeGen/AArch64/sme-streaming-interface.ll
Differential Revision: https://reviews.llvm.org/D134940
This patch updates the fir::isUnlimitedPolymorphicType function
to reflect the chosen design. It also adds a fir::isPolymorphicType
function.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D135143
This makes sure that the instructions of the prologue match the
SEH opcodes.
Also remove a couple of redundant cases of setting HasWinCFI; it was
already set unconditionally after the conditional cases.
Differential Revision: https://reviews.llvm.org/D135101
When trying to debug some `compiler-rt` unittests, I initially had a hard
time because
- even in a `Debug` build one needs to set `COMPILER_RT_DEBUG` to get
debugging info for some of the code and
- even so the unittests used a hardcoded `-O2` which often makes debugging
impossible.
This patch addresses this by instead using `-O0` if `COMPILER_RT_DEBUG` is set.
Two tests in `sanitizer_type_traits_test.cpp` need to be disabled since
they have undefined references to `__sanitizer::integral_constant<bool,
true>::value`.
Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D91620
We often have constraints for array attributes that they are sorted
non-decreasing or strictly increasing. This change adds AttrConstraint classes
that support DenseArrayAttr for integer types.
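The two properties being checked can be sketched in plain C++ on a vector of integers. The function names are hypothetical and this is not the TableGen constraint itself, only the predicate it expresses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Non-decreasing: every element is >= its predecessor (duplicates allowed).
bool isSortedNonDecreasing(const std::vector<std::int64_t> &v) {
  return std::is_sorted(v.begin(), v.end());
}

// Strictly increasing: no adjacent pair (a, b) with a >= b exists.
bool isStrictlyIncreasing(const std::vector<std::int64_t> &v) {
  return std::adjacent_find(v.begin(), v.end(),
                            std::greater_equal<std::int64_t>()) == v.end();
}
```

The two predicates differ only on repeated values: an array with duplicates can pass the first check but never the second.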
Differential Revision: https://reviews.llvm.org/D134944