This moves WidenIV from IndVarSimplify to Utils/SimplifyIndVar so that we have
createWideIV available as a generic helper utility. I.e., this is not only
useful in IndVarSimplify, but could be useful for loop transformations. For
example, motivation for this refactoring is the loop flatten transformation: if
induction variables in a loop nest can be widened, we can avoid having to
perform certain overflow checks, enabling this transformation.
Differential Revision: https://reviews.llvm.org/D90421
To accommodate frame layouts that have both fixed and scalable objects
on the stack, describing a stack location or offset using a pointer + uint64_t
is not sufficient. For this reason, we've introduced the StackOffset class,
which models both the fixed- and scalable sized offsets.
The TargetFrameLowering::getFrameIndexReference is made to return a StackOffset,
so that this can be used in other interfaces, such as to eliminate frame indices
in PEI or to emit Debug locations for variables on the stack.
This patch is purely mechanical and doesn't change the behaviour of how
the result of this function is used for fixed-sized offsets. The patch adds
various checks to assert that the offset has no scalable component, as frame
offsets with a scalable component are not yet supported in various places.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D90018
Some targets may add required passes via
TargetMachine::registerPassBuilderCallbacks(). We need to run those even
under -O0. As an example, BPFTargetMachine adds
BPFAbstractMemberAccessPass, a required pass.
This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
usage of the NPM) by allowing us to share added passes like coroutines
and sanitizers between -O0 and other optimization levels.
Tests are a continuation of those added in
https://reviews.llvm.org/D89083.
In order to prevent TargetMachines from adding unnecessary optimization
passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be
changed to take an OptimizationLevel, but that will be done separately.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D89158
This patch adds the llvm.loop.mustprogress loop metadata. This is to be
added to loops where the frontend language requires that the loop makes
observable interactions with the environment. This is the loop-level
equivalent to the function attribute `mustprogress` defined in D86233.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D88464
Allow single-quoted strings and double-quoted character values, as well as doubled-quote escaping.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D89731
This patch adds the `async` lowering of coroutines.
This will be used by the Swift frontend to lower async functions. In
contrast to the `retcon` lowering the frontend needs to be in control
over control-flow at suspend points as execution might be suspended at
these points.
This is very much work in progress and the implementation will change as
it evolves with the frontend. As such the documentation is lacking
detail as some of it might change.
rdar://70097093
Reapply with fix for memory sanitizer failure and sphinx failure.
Differential Revision: https://reviews.llvm.org/D90612
This allows those instrumentation to log when they decide to skip a
pass. This provides extra helpful info for optnone functions and also
will help with opt-bisect.
Have OptNoneInstrumentation print when it skips due to seeing optnone.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D90545
This patch adds the `async` lowering of coroutines.
This will be used by the Swift frontend to lower async functions. In
contrast to the `retcon` lowering the frontend needs to be in control
over control-flow at suspend points as execution might be suspended at
these points.
This is very much work in progress and the implementation will change as
it evolves with the frontend. As such the documentation is lacking
detail as some of it might change.
rdar://70097093
Differential Revision: https://reviews.llvm.org/D90612
Allows the MACRO directive to define macro procedures with parameters and macro-local symbols.
Supports required and optional parameters (including default values), and matches ml64.exe for its macro-local symbol handling (up to 65536 macro-local symbols in any translation unit).
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D89729
This patch replaces the AArch64StackOffset class by the generic one
defined in TypeSize.h.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D88983
- Basically iterate each pair of memory operands from both instructions
and return true if any of them may alias.
- The exception are memory instructions without any memory operand. They
may touch everything and could alias to any memory instruction.
Differential Revision: https://reviews.llvm.org/D89447
As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.
We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.
Differential Revision: https://reviews.llvm.org/D90419
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
This patch adds some helper in the DirectiveLanguage wrapper to initialize it from
the RecordKeeper and validate the records. This simplify arguments in lots of function
since only the DirectiveLanguge is passed.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D90358
As noted in D90554, there's an opcode typo in using an easily
misused cost model API: getCmpSelInstrCost(). Beyond that, the
assumed sequence of ops is questionable, but that would be
another patch.
My guess is that the x86 test diffs show that we are probably
wrong both before and after this change, so there will be no
practical difference.
As an example, I tried this test which shows a cost of '7'
either way:
define <4 x i32> @sadd(<4 x i32> %va, <4 x i32> %vb) {
%V4I32 = call {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %va, <4 x i32> %vb)
%ov = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 1
%r = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 0
%z = select <4 x i1> %ov, <4 x i32> <i32 42, i32 42, i32 42, i32 42>, <4 x i32> %r
ret <4 x i32> %z
}
$ llc -o - sadd.ll -mattr=avx
vpaddd %xmm1, %xmm0, %xmm2
vpcmpgtd %xmm2, %xmm0, %xmm0
vpxor %xmm0, %xmm1, %xmm0
vblendvps %xmm0, LCPI0_0(%rip), %xmm2, %xmm0a
Differential Revision: https://reviews.llvm.org/D90681
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed, this just cleans up the code
so that it can be exposed publically.
Replaces https://reviews.llvm.org/D74158
Differential Revision: https://reviews.llvm.org/D89613
Adds a method called pop_back_n to SmallVector.
This is more readable and less error prone than the alternatives of using
```lang=c++
Vector.resize(Vector.size() - N);
Vector.erase(Vector.end() - N, Vector.end());
for (unsigned I = 0;I<N;++I) Vector.pop_back();
```
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D90576
As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.
We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
This patch changes the intrinsics cost model to assume that by default
target intrinsics are cheap. This didn't seem to be the case for all
intrinsics, and is potentially an MVE problem due to our scalarization
overheads. Cheap seems to be a good default in general though.
Differential Revision: https://reviews.llvm.org/D90597
This patch adds a linear polynomial base class, called LinearPolyBase, which
serves as a base class for StackOffset. It tries to represent a linear
polynomial like:
c0 * scale0 + c1 * scale1 + ... + cK * scaleK
where the scale is implicit, meaning that only the coefficients are
encoded.
This patch also adds a univariate linear polynomial, which serves as
a base class for ElementCount and TypeSize. This tries to represent a
linear polynomial where only one dimension can be set at any one time,
i.e. a TypeSize is either fixed-sized, or scalable-sized, but cannot be
a combination of the two.
class LinearPolyBase
^
|
+---- class StackOffset (dimensions = 2 (fixed/scalable), type = int64_t)
class UnivariateLinearPolyBase
|
|
+---- class LinearPolySize (dimensions = 2 (fixed/scalable))
^
|
+-------- class ElementCount (type = unsigned)
|
|
+-------- class TypeSize (type = uint64_t)
Reviewed By: ctetreau, david-arm
Differential Revision: https://reviews.llvm.org/D88982
Currently it is impossible to create an instance of ELFObjectFile when the
SHT_SYMTAB_SHNDX can't be read. We error out when fail to parse the
SHT_SYMTAB_SHNDX section in the factory method.
This change delays reading of the SHT_SYMTAB_SHNDX section entries,
with it llvm-readobj is now able to work with such inputs.
Differential revision: https://reviews.llvm.org/D89379
parallelTransformReduce is modelled on the C++17 pstl API of
std::transform_reduce, except our wrappers do not use execution policy
parameters.
parallelForEachError allows loops that contain potentially failing
operations to propagate errors out of the loop. This was one of the
major challenges I encountered while parallelizing PDB type merging in
LLD. Parallelizing a loop with parallelForEachError is not behavior
preserving: the loop will no longer stop on the first error, it will
continue working and report all errors it encounters in a list.
I plan to use this to propagate errors out of LLD's
coff::TpiSource::remapTpiWithGHashes, which currently stores errors an
error in the TpiSource object.
Differential Revision: https://reviews.llvm.org/D90639
MC currently produces monolithic .gcc_except_table section. GCC can split up .gcc_except_table:
* if comdat: `.section .gcc_except_table._Z6comdatv,"aG",@progbits,_Z6comdatv,comdat`
* otherwise, if -ffunction-sections: `.section .gcc_except_table._Z3fooi,"a",@progbits`
This ensures that (a) non-prevailing copies are discarded and (b)
.gcc_except_table associated to discarded text sections can be discarded by a
.gcc_except_table-aware linker (GNU ld, but not gold or LLD)
This patches matches the GCC behavior. If -fno-unique-section-names is
specified, we don't append the suffix. If -ffunction-sections is additionally specified,
use `.section ...,unique`.
Note, if clang driver communicates that the linker is LLD and we know it
is new (11.0.0 or later) we can use SHF_LINK_ORDER to avoid string table
costs, at least in the -fno-unique-section-names case. We cannot use it on GNU
ld because as of binutils 2.35 it does not support mixed SHF_LINK_ORDER &
non-SHF_LINK_ORDER components in an output section
https://sourceware.org/bugzilla/show_bug.cgi?id=26256
For RISC-V -mrelax, this patch additionally fixes an assembler-linker
interaction problem: because a section is shrinkable, the length of a call-site
code range is not a constant. Relocations referencing the associated text
section (STT_SECTION) are needed. However, a STB_LOCAL relocation referencing a
discarded section group member from outside the group is disallowed by the ELF
specification (PR46675):
```
// a.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int main() { return comdat(); }
// b.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int foo() { return comdat(); }
clang++ -target riscv64-linux -c a.cc b.cc -fPIC -mno-relax
ld.lld -shared a.o b.o => ld.lld: error: relocation refers to a symbol in a discarded section:
```
-fbasic-block-sections= is similar to RISC-V -mrelax: there are outstanding relocations.
Reviewed By: jrtc27, rahmanl
Differential Revision: https://reviews.llvm.org/D83655
A SMLoc allows MCStreamer to report location-aware diagnostics, which
were previously done by adding SMLoc to various methods (e.g. emit*) in an ad-hoc way.
Since the file:line is most important, the column is less important and
the start token location suffices in many cases, this patch reverts
b7e7131af2
```
// old
symbol-binding-changed.s:6:8: error: local changed binding to STB_GLOBAL
.globl local
^
// new
symbol-binding-changed.s:6:1: error: local changed binding to STB_GLOBAL
.globl local
^
```
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D90511
Running `-fsyntax-only` on UniqueID.h is 2x faster with this patch
(which avoids calling `std::tie` for `operator<`). Since the transitive
includers of this file will go up as `FileEntryRef` gets used in more
places, avoid that compile-time hit. This is a follow-up to
23ed570af1 (suggested by Reid Kleckner).
Also drop the `<tuple>` include from FileSystem.h (which was vestigal
from before UniqueID.h was split out).
Differential Revision: https://reviews.llvm.org/D90471
This reverts the revert commit 408c4408fa.
This version of the patch includes a fix for a crash caused by
treating ICmp/FCmp constant expressions as instructions.
Original message:
On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.
This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.
This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.
I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.
For PS4 development we support dllimport/export annotations in
source code. This patch enables the dllimport/export attributes
on PS4 by adding a new function to query the triple for whether
dllimport/export are used and using that function to decide
whether these attributes are supported. This replaces the current
method of checking if the target is Windows.
This means we can drop the use of "TargetArch" in the .td file
(which is an improvement as dllimport/export support isn't really
a function of the architecture).
I have included a simple codgen test to show that the attributes
are accepted and have an effect on codegen for PS4. I have also
enabled the DLLExportStaticLocal and DLLImportStaticLocal
attributes, which we support downstream. However, I am unable to
write a test for these attributes until other patches for PS4
dllimport/export handling land upstream. Whilst writing this
patch I noticed that, as these attributes are internal, they do
not need to be target specific (when these attributes are added
internally in Clang the target specific checks have already been
run); however, I think leaving them target specific is fine
because it isn't harmful and they "really are" target specific
even if that has no functional impact.
Differential Revision: https://reviews.llvm.org/D90442
This reverts the revert commit a1b53db324.
This patch includes a fix for a reported issue, caused by
matchSelectPattern returning UMIN for selects of pointers in
some cases by looking to some connected casts.
For now, ensure integer instrinsics are only returned for selects of
ints or int vectors.
This patch mainly made the following changes:
1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.
Differential Revision: https://reviews.llvm.org/D89105
This reverts commit 1922570489.
This appears to cause a crash in the following example
a, b, c;
l() {
int e = a, f = l, g, h, i, j;
float *d = c, *k = b;
for (;;)
for (; g < f; g++) {
k[h] = d[i];
k[h - 1] = d[j];
h += e << 1;
i += e;
}
}
clang -cc1 -triple i386-unknown-linux-gnu -emit-obj -target-cpu pentium-m -O1 -vectorize-loops -vectorize-slp reduced.c
llvm::Type *llvm::Type::getWithNewBitWidth(unsigned int) const: Assertion `isIntOrIntVectorTy() && "Original type expected to be a vector of integers or a scalar integer."' failed.