While a collection of allocas are technically vectorizeable - by forming a wider alloca - this was not a transform SLP actually knows how to do. Instead, we were forming a bundle with missing dependencies, and then relying on the scheduling code to preserve program order if multiple instructions were scheduleable at once. I haven't been able to write a test case, but I'm 99% sure this was wrong in some edge case.
The unknown op case was flowing down the shufflevector path. This did result in some splat handling being lost with this change, but the same lack of splat handling is visible in a whole bunch of simple examples for the gather path. I didn't consider this interesting to fix given how narrow the splat of allocas case is.
Summary:
This patch correctly handles the `--sysroot=` option when passed to the
linker wrapper. This allows users to correctly find libraries that may
contain offloading code if using this option.
LLVM defines several default datalayouts for integer and floating point types that are not being considered when importing into MLIR. This patch remedies this.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120832
The verify call was taking 50% of the compile time in our internal LLVM
fork when trying to unroll many loops.
Differential Revision: https://reviews.llvm.org/D113028
Instead of emitting 0 > Hi, emit Hi < 0. If Hi needs to be expanded again
this will allow the special case for sign bit tests in ExpandIntOp_SETCC
to trigger.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D120761
This miscompile was introduced in D119527.
This was a special pattern for rotate+bswap on RV32. It doesn't
work for RV64 since the rotate needs to be half the bitwidth. The
equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match
that instead.
This could be generalized further as noted in the new FIXME.
Reviewed By: Chenbing.Zheng
Differential Revision: https://reviews.llvm.org/D120686
https://reviews.llvm.org/D120423 replaced the use of stacksave/restore with memref.alloca_scope, but kept the save/restore at the same location. This PR places the allocation scope within the wsloop, thus keeping the same allocation scope as the original scf.parallel (e.g. no longer over stack allocating).
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120772
In Streaming SVE mode full NEON is not available, even though this is
implied from armv8-a. LLVM previously inferred that NEON needed to be
disabled when setting +streaming-sve, but there is no need to infer
this from +streaming-sve, because we can explicitly disable NEON using
LLVM's attribute mechanism. This is specifically relevant because
+streaming-sve is not a user-facing feature, but rather an LLVM internal
feature.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D120809
This patch enables the lowering of basic modules and functions/subroutines
in modules.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D120819
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
This patch adds the lowering of the `inquire` statement.
This patch is part of the upstreaming effort from fir-dev branch.
Depends on D120822
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D120823
Co-authored-by: Jean Perier <jperier@nvidia.com>
This patches adds lowering for couple of basic io statements such as `flush`,
`endfile`, `backspace` and `rewind`
This patch is part of the upstreaming effort from fir-dev branch.
Depends on D120821
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D120822
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
This patch adds the lowering of open and close statements
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D120821
Co-authored-by: Jean Perier <jperier@nvidia.com>
Also, changes how the CSR loop is indexed, which should avoid bugs like the one fixed by rG4a57bb5a3b74bdad9b0518009a7d7ac7ca2ac650
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D120668
Instead of relying on underlying instructions, this patch updates
VPScalarIVStepsRecipe to only store the required type information.
This removes access to unrelated information, as well as avoiding issues
with the same underlying instruction being shared by multiple recipes.
This change should only change the debug output and not cause any
codegen changes, hence NFCI.
SIInstrInfo::FoldImmediate tried to delete move-immediate instructions
after folding them into their only use. This did not work because it was
checking hasOneNonDBGUse after doing the fold, at which point there
should be no uses. This seems to have no effect on codegen, it just
means less stuff for DCE to clean up later.
Differential Revision: https://reviews.llvm.org/D120815
This is an alternative to D120330, which disables MachineSink for
functions with irreducible cycles entirely. This avoids both the
correctness problem, and ensures we don't perform non-profitable
sinks into cycles. At the same time, it may also disable
profitable sinks in the same function. This can be made more
precise by using MachineCycleInfo in the future.
Fixes https://github.com/llvm/llvm-project/issues/53990.
Differential Revision: https://reviews.llvm.org/D120800
This is a recommit without changes. I originally reverted this
due to a significant code-size regression on tramp3d-v4, however
further investigation showed that in the tramp3d-v4 case this
change enables additional optimizations (in particular more
jump threading), which happens to reduce the size of a function
just enough to be eligible for inlining at hot callsites, which
results in the code size increase. As such, this was just bad
luck.
-----
This one-use limitation is artificial, we do not increase
instruction count if we perform the fold with multiple uses. The
motivating case is shown in @sub_eq_zero_select, where the one-use
limitation causes us to miss a subsequent select fold.
I believe the backend is pretty good about reusing flag-producing
subs for cmps with same operands, so I think doing this is fine.
Differential Revision: https://reviews.llvm.org/D120337
Currently we only check for splat shuffles, this extends it to see if the source operand is a splat across the demanded elts based upon the shuffle mask
The patch adds very basic cost model for masked memory op on scalable vector.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D117884
This patch moves all functionality from IntegerPolyhedron to IntegerRelation.
IntegerPolyhedron is now implemented as a relation with no domain. All existing
functionality is extended to work on relations.
This patch does not affect external users like FlatAffineConstraints as they
can still continue to use IntegerPolyhedron abstraction.
This patch is part of a series of patches to support relations in Presburger
library.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120652
A using directive in a header pollutes the namespace of all files which
include that header. It seems this snuck in in D115764 by moving some
code from a cpp file.
This patch changes the return value of Platform::GetName() to a
StringRef, and uses the opportunity (compile errors) to change some
callsites to use GetPluginName() instead. The two methods still remain
hardwired to return the same thing, but this will change once the ideas
in
<https://discourse.llvm.org/t/multiple-platforms-with-the-same-name/59594>
are implemented.
Differential Revision: https://reviews.llvm.org/D119146
Add support for translating data layout specifications for integer and float
types between MLIR and LLVM IR. This is a first step towards removing the
string-based LLVM dialect data layout attribute on modules. The latter is still
available and will remain so until the first-class MLIR modeling can fully
replace it.
Depends On D120739
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D120740
Add support for integer and float types into the data layout subsystem with
default logic similar to LLVM IR. Given the flexibility of the sybsystem, the
logic can be easily overwritten by operations if necessary. This provides the
connection necessary, e.g., for the GPU target where alignment requirements for
integers and floats differ from those provided by default (although still
compatible with the LLVM IR model). Previously, it was impossible to use
non-default alignment requirements for integer and float types, which could
lead to incorrect address and size calculations when targeting GPUs.
Depends On D120737
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D120739