This intrinsic blocks floating point transformations by the optimizer.
Author: Pengfei
Reviewed By: LuoYuanke, Andy Kaylor, Craig Topper, kpn
Differential Revision: https://reviews.llvm.org/D99675
MemRefDataFlow performs mem2reg style operations for affine load/stores. Unfortunately, it is not presently correct in the presence of external operations such as memref.cast, or function calls. This diff extends the functionality of the pass to remain correct in the presence of such ops.
Differential Revision: https://reviews.llvm.org/D104053
Added the option `-altivec-src-compat=[mixed,gcc,xl]`. The default at this time is `mixed`.
The default behavior for clang is for all vector compares to return a scalar unless the vectors being
compared are vector bool or vector pixel. In that case the compare returns a
vector. With the gcc case all vector compares return vectors and in the xl case
all vector compares return scalars.
This patch does not change the default behavior of clang.
This option will be used in future patches to implement behaviour compatibility for the vector bool/pixel types.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D103615
A canonicalization accidentally will remove a memref allocation if it is only stored into. However, this is incorrect if the allocation is the value being stored, not the allocation being stored into.
Differential Revision: https://reviews.llvm.org/D104947
This reverts commit 6f3b775c3e.
Test fails flakily, see comments on https://reviews.llvm.org/D103967
Also revert follow-up "[Analyzer] Attempt to fix windows bots test
failure b/c of new-line"
This reverts commit fe0e861a4d.
Previously xscale was known to everything apart
from the ELF streamer so we would crash as soon
as you tried to output an object file.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D104776
Given a select that returns the logical negation of the condition, replace it with a not of the condition.
Differential Revision: https://reviews.llvm.org/D104966
On AIX the alignment implementation has the storage aligned to the
preferred alignment instead of the alignment of a type. Macro guard
these tests for AIX and have them pass when the "reference alignment" is
less than or equal to the alignment observed. In other words, the
alignment applied is at least as strict as the required alignment.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D104786
This patch relands https://reviews.llvm.org/D104454, but fixes some failing
builds on Mac OS which apparently has a different definition for size_t,
that caused 'ambiguous operator overload' for the implicit conversion
of TypeSize to a scalar value.
This reverts commit b732e6c9a8.
Peephole optimizer should not be introducing sub-reg definitions
as they are illegal in machine SSA phase. This patch modifies
the optimizer to not emit sub-register definitions.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D103408
In typeinfo there is a reinterpret_cast between a uintptr_t and size_t. These are two integer types and therefore a reinterpret_cast is not right for this situation. It looks like it may have been copied and pasted from above in the file. An implicit cast works in it's place.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D104814
Compiler crashes at an assertion while casting operands to PtrToIntInst at some cases when
ptrtoint is present as an explicit operand to inttoptr. Explicit instruction operator as
operand can not be casted to an Instruction.
This patch replaces cast from PtrToInst to Operator which are later checked for constant
expressions.
Differential Revision: https://reviews.llvm.org/D105002
Adds legalizer, register bank select, and instruction
select support for G_SBFX and G_UBFX. These opcodes generate
scalar or vector ALU bitfield extract instructions for
AMDGPU. The instructions allow both constant or register
values for the offset and width operands.
The 32-bit scalar version is expanded to a sequence that
combines the offset and width into a single register.
There are no 64-bit vgpr bitfield extract instructions, so the
operations are expanded to a sequence of instructions that
implement the operation. If the width is a constant,
then the 32-bit bitfield extract instructions are used.
Moved the AArch64 specific code for creating G_SBFX to
CombinerHelper.cpp so that it can be used by other targets.
Only bitfield extracts with constant offset and width values
are handled currently.
Differential Revision: https://reviews.llvm.org/D100149
Increase the number of attributor iterations on a GPU target. I forgot to
change this in D104416.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D104920
This adds support for Armv9-A's Realm Management Extension, including
three new system registers - MFAR_EL3, GPCCR_EL3 and GPTBR_EL3 - and
four new TLBI instructions.
The reference for the Realm Management Extension can be found at: https://developer.arm.com/documentation/ddi0615/aa.
Based on patches by Victor Campos.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D104773
Prior to the changes from D52010, clobbering Thumb's high registers in
inline asm would cause incorrect code to be generated - or an assertion
failure for debug builds. Now that the issue is no longer reproducible,
this patch adds a MIR test to cover that scenario.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D96335
The logic is almost similar to that of system.cpp with one change that
instead of adding all the memory pools to a device struct it only
keeps a single pool. The existing approach also always allocated memory on
the first HSA pool found for a GPU.
This depends on D104691. The goal of this series of patches is to remove
_atl_machine global. The next patch will drop g_atl_machine entirely.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D104695
Since RangeSet::Factory actually contains BasicValueFactory, we can
remove value factory from many function signatures inside the solver.
Differential Revision: https://reviews.llvm.org/D105005
I can't be sure of the cause but I believe these fail
due to to fast unwinding not working on Thumb.
Whatever the case, they have been failing on our bots
for a long time:
https://lab.llvm.org/buildbot/#/builders/170/builds/46
Require fast-unwinder-works for both.
This change modifies the existing check-debuginfo target to only run the
debuginfo tests within the cross-project-tests, and adds a new target
(check-cross-project) which runs all the tests. The former has also been
modified to not be included in check-all (since the check-cross-project
target covers them).
Differential Revision: https://reviews.llvm.org/D96513
Reviewed by: aprantl
Also mark debuginfo_tests as UNSUPPORTED if clang can't be found and
remove it from the list of test dependencies if not in
LLVM_ENABLE_PROJECTS.
Differential Revision: https://reviews.llvm.org/D96511
Reviewed by: aprantl
Currently we will allow loops with a fixed width VF of 1 to vectorize
if the -enable-strict-reductions flag is set. However, the loop vectorizer
will not use ordered reductions if `VF.isScalar()` and the resulting
vectorized loop will be out of order.
This patch removes `VF.isVector()` when checking if ordered reductions
should be used. Also, instead of converting the FAdds to reductions if the
VF = 1, operands of the FAdds are changed such that the order is preserved.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D104533
Sinking scalar operands into predicated-triangle regions may allow
merging regions. This patch adds a VPlan-to-VPlan transform that tries
to merge predicate-triangle regions after sinking.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D100260
This test has always failed on 32 bit armv8 bots:
https://lab.llvm.org/buildbot/#/builders/178/builds/42
Due to the output order of some symbols changing.
I don't think this is an Arm specific issue so disabling
on 32 bit while it's investigated.
While the original check's purpose is to identify potentially dangerous
functions based on the parameter types (as identifier names do not mean
anything when it comes to the language rules), unfortunately, such a plain
interface check rule can be incredibly noisy. While the previous
"filtering heuristic" is able to find many similar usages, there is an entire
class of parameters that should not be warned about very easily mixed by that
check: parameters that have a name and their name follows a pattern,
e.g. `text1, text2, text3, ...`.`
This patch implements a simple, but powerful rule, that allows us to detect
such cases and ensure that no warnings are emitted for parameter sequences that
follow a pattern, even if their types allow for them to be potentially mixed at a call site.
Given a threshold `k`, warnings about two parameters are filtered from the
result set if the names of the parameters are either prefixes or suffixes of
each other, with at most k letters difference on the non-common end.
(Assuming that the names themselves are at least `k` long.)
- The above `text1, text2` is an example of this. (Live finding from Xerces.)
- `LHS` and `RHS` are also fitting the bill here. (Live finding from... virtually any project.)
- So does `Qmat, Tmat, Rmat`. (Live finding from I think OpenCV.)
Reviewed By: aaron.ballman
Differential Revision: http://reviews.llvm.org/D97297
There are several types of functions and various reasons why some
"swappable parameters" cannot be fixed with changing the parameters' types, etc.
The most common example might be int `min(int a, int b)`... no matter what you
do, the two parameters must remain the same type.
The **filtering heuristic** implemented in this patch deals with trying to find
such functions during the modelling and building of the swappable parameter
range.
If the parameter currently scrutinised matches either of the predicates below,
it will be regarded as **not swappable** even if the type of the parameter
matches.
Reviewed By: aaron.ballman
Differential Revision: http://reviews.llvm.org/D78652