The DXC driver tests don't really belong mixed in with the toolchain
tests. This pulls them out to their own file and moves the
SimpleDiagnosticConsumer into a header so it can be used by both DXC and
toolchain tests.
fast-forwarded.
This avoids deprecation warning:
```
warning: definition of implicit copy assignment operator for 'AddrInfo'
is deprecated because it has a user-declared copy constructor
[-Wdeprecated-copy]
```
This fixes https://github.com/llvm/llvm-project/issues/57229
When we ship LLVM 16, <ranges> won't be considered experimental anymore.
We might as well do this sooner rather than later.
Differential Revision: https://reviews.llvm.org/D132151
A debug message in `LoopAccessAnalysis` did not have a newline in it, causing printed debug messages to be formatted incorrectly.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D132172
When checking if we are cross-compiling, use `CMAKE_HOST_SYSTEM_NAME`
rather than `CMAKE_HOST_SYSTEM` which seems to have the full version
number attached.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D132130
On known hardware, reductions, gather, and scatter operations have execution latencies which correlated with the vector length (VL) of the operation. Most other operations (e.g. simply arithmetic) don't correlated in this way, and instead essentially fixed cost as VL varies.
When I'd implemented initial scalable cost model support for reductions, gather, and scatter operations, I had used an upper bound on the statically unknown VL. The argument at the time was that this prevented falsely low costs, and biased the vectorizer away from generating bad (on some hardware) code. Unfortunately, practical experience shows we were a bit too effective at that goal, and the high costs defacto prevents vectorization using these constructs at all.
This patch reverses course, and ties the returned cost not to the maximum possible VL, but the VL which would correspond to VScaleForTuning. This parameter is the same one the vectorizer uses when normalizing loop costs, so the term effectively cancels out. The result is that the vectorizer now sees these constructs as comparable in cost to their fixed length variants.
This does introduce the possibility of the cost for these operations being a significant under estimate on platforms where actual VLEN is far from that implied by VScaleForTuning. On such platforms, we might make poor heuristic choices. Probably not in LV itself (due to the cancellation mentioned above), but possibly during e.g. lowering. I'm not currently aware of any concrete examples of this, but this patch does open a concern which did not previously exist.
Previously, we had the problem of overestimating costs causing the same problem on machines much closer to default values for vscale for tuning. With this patch, we still have that problem potentially if vscale for tuning is set high (manually), and then the code is run on a narrow VLEN machine.
Differential Revision: https://reviews.llvm.org/D131519
This patch adds OMPIRBuilder support for the safelen clause for the
simd directive.
Reviewed By: shraiysh, Meinersbur
Differential Revision: https://reviews.llvm.org/D131526
We held off on this before as `LLVM_LIBDIR_SUFFIX` conflicted with it.
Now we return this.
`LLVM_LIBDIR_SUFFIX` is kept as a deprecated way to set
`CMAKE_INSTALL_LIBDIR`. The other `*_LIBDIR_SUFFIX` are just removed
entirely.
I imagine this is too potentially-breaking to make LLVM 15. That's fine.
I have a more minimal version of this in the disto (NixOS) patches for
LLVM 15 (like previous versions). This more expansive version I will
test harder after the release is cut.
Reviewed By: sebastian-ne, ldionne, #libc, #libc_abi
Differential Revision: https://reviews.llvm.org/D130586
The methods to perform such operations have been implemented for the DynamicMemRefType in a way that is similar to the implementation for StridedMemRefType. Up until here one could pass an unranked memref to the library, and thus obtain a “dynamic” memref descriptor, but then there would have been no possibility to operate on its content.
to match the emitted warning, which has changed. I push this without
review as it is a one-line NFC. I have discussed it in D130386, which
altered the warning in question.
Add the clang option -finline-max-stacksize=<N> to suppress inlining
of functions whose stack size exceeds the given value.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D131986
This patch "modernizes" the implementation of these operations by
switching them to assembly formats and type inference traits.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D131971
Function returning CHARACTER with adjustable length or dynamic arrays
can have negative length or extents passed to them. This patch makes sure
any negative inputs is rounded to 0.
Reviewed By: vdonaldson
Differential Revision: https://reviews.llvm.org/D132139
The TargetRewrite pass can change the number of argument of a function.
An extra llvm.nest attribute is added and was not set at the correct position
if an extra argument was inserted before.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D132113
If the incoming previous value of a fixed-order recurrence is a phi in
the header, go through incoming values from the latch until we find a
non-phi value. Use this as the new Previous, all uses in the header
will be dominated by the original phi, but need to be moved after
the non-phi previous value.
At the moment, fixed-order recurrences are modeled as a chain of
first-order recurrences.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D119661
Make tests expect the (correctly) emitted warnings using the WARNING
directive. This directive is non-functional now, but will be recognised
by test_errors.py when D125804 is landed. This patch is a preparation
for D125804.
For most tests, we add missing WARNING directives for emitted warnings,
but there are exceptions:
- for int-literals.f90 and resolve31.f90 we pass -pedantic to the
frontend driver, so that the expected warnings are actually emitted.
- for block-data01.f90 and resolve42.f90 we change the tests so that
warnings, which appear unintentional, are not emitted. While testing
the warning in question (padding added for alignment in common block)
would be desired, that is beyond the scope of this patch. This
warning is target-dependent.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D131987
Extends findMoreOptimalIndexType to allow ISD::BUILD_VECTOR based
indices to be truncated when such truncation is lossless. This can
enable the use of 32bit gather/scatter indices thus making it less
likely to have to split a gather/scatter in two.
Depends on D125194
Differential Revision: https://reviews.llvm.org/D130533
This patch moves `LLVM::ShuffleVectorOp` to assembly format and in the
process drops the extra type that can be inferred (both operand types
are required to be the same) and switches to a dense integer array.
The syntax change:
```
// Before
%0 = llvm.shufflevector %0, %1 [0 : i32, 0 : i32, 0 : i32, 0 : i32] : vector<4xf32>, vector<4xf32>
// After
%0 = llvm.shufflevector %0, %1 [0, 0, 0, 0] : vector<4xf32>
```
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D132038
The m_objects store in this class is only used to check whether
this ClusterManager already owns this pointer. With a SmallVector
the is_contained call was non-linear in number of children, and for
instance printing all the elements of a 16M element std::vector
didn't complete in the time I was willing to wait for it (hours).
Since we are only doing insert & contains, some kind of set is a
better data structure. In this patch I used SmallPtrSet. With
that, the same array prints out in 30 seconds. I also tried a
std::unordered_set but that was slightly slower and used a fair bit
more memory.
Other than performance, this is NFC.
Differential Revision: https://reviews.llvm.org/D131996
The code would use first non-phi instruction as an insertion point, however
this could lead to freeze getting inserted between phi and landingpad
causing a verifier assert.
Differential Revision: https://reviews.llvm.org/D132105
This change reorganizes the code and comments to make the expected semantics of these routines more clear. However, this is *not* an NFC change. The functional change is having isScalarWithPredication return false if the instruction does not need predicated. Specifically, for the case of a uniform memory operation we were previously considering it *not* to be a predicated instruction, but *were* considering it to be scalable with predication.
As can be seen with the test changes, this causes uniform memory ops which should have been lowered as uniform-per-parts values to instead be lowering via naive scalarization or if scalarization is infeasible (i.e. scalable vectors) aborted entirely. I also don't trust the code to bail out correctly 100% of the time, so it's possible we had a crash or miscompile from trying to scalarize something which isn't scalaralizable. I haven't found a concrete example here, but I am suspicious.
Differential Revision: https://reviews.llvm.org/D131093