D128285 only changed the stable (v1) layout, so the matching change in
D128694 broke the formatting of the unstable strings. This fixes that,
and ensures compatibility with all older layouts as well.
Migrate all binops to use FoldXYZ rather than CreateXYZ APIs,
which are compatible with InstSimplifyFolder and fallible constant
folding.
Rather than continuing to add one method for every single operator,
add a generic FoldBinOp (plus variants for nowrap, exact and fmf
operators), which we would need anyway for CreateBinaryOp.
This change is not NFC because IRBuilder with InstSimplifyFolder
may perform more folding. However, this patch changes SCEVExpander
to not use the folder in InsertBinOp to minimize practical impact
and keep this change as close to NFC as possible.
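To illustrate the shape of the change, here is a minimal toy sketch of a
builder that asks a single generic, fallible fold function before emitting
an instruction. It is not LLVM code: Opcode, Value, foldBinOp and Builder
are hypothetical stand-ins, and folding is modelled only for 32-bit
unsigned constants.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical stand-ins, not LLVM classes.
enum class Opcode { Add, Sub, Mul, UDiv };

struct Value {
  bool IsConstant;
  uint32_t Imm;     // meaningful only when IsConstant is true
  std::string Text; // printable form of the value
};

Value constant(uint32_t V) { return {true, V, std::to_string(V)}; }
Value variable(const std::string &Name) { return {false, 0, Name}; }

// One generic entry point instead of one method per operator. Returns
// std::nullopt when no fold applies (non-constant operand, division by
// zero), which models fallible folding.
std::optional<Value> foldBinOp(Opcode Op, const Value &L, const Value &R) {
  if (!L.IsConstant || !R.IsConstant)
    return std::nullopt;
  switch (Op) {
  case Opcode::Add:
    return constant(L.Imm + R.Imm);
  case Opcode::Sub:
    return constant(L.Imm - R.Imm);
  case Opcode::Mul:
    return constant(L.Imm * R.Imm);
  case Opcode::UDiv:
    if (R.Imm == 0)
      return std::nullopt; // refuse to fold instead of producing a value
    return constant(L.Imm / R.Imm);
  }
  return std::nullopt;
}

struct Builder {
  std::vector<std::string> Insts; // "instructions" emitted so far

  // Try the folder first; emit an instruction only if folding failed.
  Value createBinOp(Opcode Op, const Value &L, const Value &R) {
    if (std::optional<Value> Folded = foldBinOp(Op, L, R))
      return *Folded;
    Insts.push_back("binop " + L.Text + ", " + R.Text);
    return variable("%t" + std::to_string(Insts.size()));
  }
};
```

In this sketch, createBinOp(Opcode::Add, constant(2), constant(3)) yields
the constant 5 without emitting anything, while a division by a zero
constant falls through to instruction emission.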
PDB/func-symbols.test was originally written for 32-bit x86, with the
cdecl and stdcall calling conventions in mind. Those conventions mangle
names, for example by adding an underscore ("_") before the function name.
This mangling is x86-specific, and the purpose of pointers.test is NOT to
test calling conventions.
I have made a minor change to make this test pass on Windows/Arm.
TestCommandScript.py fails on Arm/Windows due to the following issues:
https://llvm.org/pr56288
https://llvm.org/pr56292
LLDB fails to skip the prologue, and stepping over library or nodebug
functions also fails due to a PDB/DWARF mismatch.
This patch replaces the function breakpoint with a line breakpoint so that
we can expect LLDB to stop on the desired line. It also replaces DWARF with
PDB debug info for this test only.
The original assertion is not necessarily correct since the shape
argument may involve a slice of an array (an expression) and not a whole
vector with constant length. In the presence of a slice operation, the
size must be computed (left as a TODO for now).
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128894
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
This means we no longer need to have the same API between IRBuilder
and IRBuilderFolder.
The constant case is substantially simpler, so implementing it
separately isn't an undue burden.
Fix the issue "incorrect -Winfinite-recursion warning on
potentially-unevaluated operand".
We add a dedicated visit function (VisitCXXTypeidExpr) for typeid,
instead of using the default (VisitStmt). In this new function we skip
over building the CFG for unevaluated operands of typeid.
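A small sketch of the kind of code involved, written for illustration and
not taken from the bug report: the function below mentions itself only
inside a typeid whose operand has a non-polymorphic type, so the operand is
unevaluated and no recursion happens at run time. Plain and makePlain are
hypothetical names.

```cpp
#include <cassert>
#include <typeinfo>

struct Plain { int X; }; // hypothetical non-polymorphic type

// The typeid operand is a prvalue of a non-polymorphic class type, so it
// is an unevaluated operand: the nested call is never executed.
Plain makePlain() {
  const std::type_info &TI = typeid(makePlain());
  (void)TI; // only the static type is inspected
  return Plain{42};
}
```

Calling makePlain() returns normally, which is why a warning about
infinite recursion on such a function would be a false positive.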
Fixes #21668
Differential Revision: https://reviews.llvm.org/D128747
Nowadays we have a generic constant folding API to load a type from
an offset. It should be able to do anything that VNCoercion can do.
This avoids the weird templating between IRBuilder and ConstantFolder
in one function, which will stop working as the IRBuilderFolder
moves from CreateXYZ to FoldXYZ APIs.
Unfortunately, this doesn't eliminate this pattern from VNCoercion
entirely yet.
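As a rough illustration of what "load a type from an offset" means for a
constant, here is a toy sketch that reads a little-endian integer out of a
constant's raw bytes and fails cleanly when the access does not fit. It
does not use the LLVM API: ConstantBytes and foldLoadAtOffset are
hypothetical names, and only unsigned integers of up to 8 bytes are
modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a constant initializer: just its raw bytes.
using ConstantBytes = std::vector<uint8_t>;

// Try to "load" a Width-byte little-endian unsigned integer at Offset.
// Returns std::nullopt when Width is unsupported or the access is not
// fully inside the constant.
std::optional<uint64_t> foldLoadAtOffset(const ConstantBytes &Init,
                                         std::size_t Offset,
                                         std::size_t Width) {
  if (Width == 0 || Width > 8)
    return std::nullopt;
  if (Offset > Init.size() || Width > Init.size() - Offset)
    return std::nullopt;
  uint64_t Result = 0;
  for (std::size_t I = 0; I < Width; ++I)
    Result |= static_cast<uint64_t>(Init[Offset + I]) << (8 * I);
  return Result;
}
```

With the bytes 78 56 34 12 EF BE, a 4-byte load at offset 0 gives
0x12345678 and a 2-byte load at offset 4 gives 0xBEEF, while a 4-byte load
at offset 4 is rejected.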
It took me multiple hours of debugging plus asking an expert for help to
figure out why this function didn't do what it promised to do. It turns
out there is a flag that needs to be set. Document this, in an attempt
to save the next person the surprise.
Reviewed By: ymandel
Differential Revision: https://reviews.llvm.org/D128774
At the moment LoopVersioning is only created for inner-loop
vectorization. This patch moves it to LVP::execute, which means it will
also be added for epilogue vectorization. As a consequence, the proper
noalias metadata is now also added to epilogue vector loops.
LVer will be moved to VPTransformState as follow-up.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D127966
This revision merges the 2 split_reduction transforms and adds extra control by using attributes.
SplitReduction is known to require a concrete additional buffer to store temporary information.
Add an option to introduce a `bufferization.alloc_tensor` instead of `linalg.init_tensor`.
This behaves better with subset-based tiling and bufferization.
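For readers unfamiliar with the idea, the following scalar sketch shows why
a split reduction needs an extra buffer: partial results are accumulated
into a temporary array, which is then reduced in a second step. This is a
plain C++ analogy and not the MLIR transform; splitReduceSum and the
round-robin assignment of elements to partial accumulators are assumptions
made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: sum Input by first splitting the work into `Parts`
// partial accumulators and then combining them.
long splitReduceSum(const std::vector<int> &Input, std::size_t Parts) {
  assert(Parts > 0 && "need at least one partial accumulator");
  // The additional temporary buffer: one partial result per part.
  std::vector<long> Partial(Parts, 0);
  // First reduction: element I contributes to accumulator I % Parts.
  for (std::size_t I = 0; I < Input.size(); ++I)
    Partial[I % Parts] += Input[I];
  // Second reduction: combine the partial results into the final value.
  return std::accumulate(Partial.begin(), Partial.end(), 0L);
}
```

In this analogy the Partial vector plays the role of the tensor that the
transform has to materialize.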
Differential Revision: https://reviews.llvm.org/D128722
The assert was added with 0399473de8 and is correct for that
pattern, but it is off-by-1 with the enhancement in d4f39d8333.
The transforms are still correct with the new pre-condition:
https://alive2.llvm.org/ce/z/6_6ghm
https://alive2.llvm.org/ce/z/_GTBUt
And as shown in the new test, the transform is expected with
'ult' - in that case, the icmp reduces to test if the shift
amount is 0.
This allows all constant folding to happen through a single
function, without requiring special handling for loads at each
call-site.
This may not be NFC because some callers currently don't do that
special handling.
This is a follow-up to my previous commit, where TestSTL.py got broken
due to 9c6e043592.
Now that we force DWARF symbols by default on Windows, we don't need to
specifically put -gdwarf O0 in the debug flags for this test.
These are not mentioned in the OpenCL C Specification nor in the
OpenCL Extension Specification.
Differential Revision: https://reviews.llvm.org/D128434
The ArgumentPromotion pass uses Mem2Reg promotion at the end to cut
down on generated alloca instructions as well as meaningless stores, and
this behavior can leave unused (dead) arguments.
The test shows that the arguments are not removed in the current
optimization pipeline.
Use a common ConstantFoldInstOperands-based constant folding
implementation, instead of specifying the folding function for
each function individually. Going through the generic handling
doesn't appear to have any significant compile-time impact.
As the test change shows, this is not NFC, because we now use
DataLayout-aware constant folding, which can do slightly better
in some cases (e.g. those involving GEPs).
For instructions that don't need any special handling, use
ConstantFoldInstOperands(), rather than re-implementing individual
cases.
This is probably not NFC because it can handle cases the previous
code missed (e.g. vector operations).
Support compares in ConstantFoldInstOperands(), instead of
forcing the use of ConstantFoldCompareInstOperands(). Also handle
insertvalue (extractvalue was already handled).
This removes a footgun, where many uses of ConstantFoldInstOperands()
need a separate check for compares beforehand. It's particularly
insidious if called on a constant expression, because it doesn't
fail in that case, but will just not do DL-dependent folding.
This test checks one of the problematic cases outlined in D128006, which
led to the patch's reversal. I thought it best to add a test just in case
this sort of optimization is attempted again in the future in some
fashion.
Even though the array is declared with '*' upper bounds, it has an
initial value that has a statically known shape. Use the shape from
the type of the initializer when the declared size is '*'.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128889
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128888
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
In some cases, there may be widened users of inductions even though the
plan includes the scalar VF. In those cases, make sure we still replace
the VPWidenIntOrFpInductionRecipe with scalar steps, as otherwise we may
try to execute a VPWidenIntOrFpInductionRecipe with a scalar VF.
Alternatively the patch could also split the range if needed.
This fixes a crash exposed by D123720.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D128755
The names of CHARACTER strings were being truncated, leading to invalid
collisions and other failures. This change makes sure to use the entire
string as the seed for the unique name.
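The collision is easy to reproduce in a toy model. The sketch below is not
the flang implementation: the "str." prefix, the hex encoding and the
truncation length of 4 are all assumptions chosen for illustration. It only
shows that seeding the name from a prefix makes distinct strings collide,
while seeding from the whole string does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical: hex-encode at most Limit bytes of Str as a name seed.
std::string hexSeed(const std::string &Str, std::size_t Limit) {
  static const char Digits[] = "0123456789ABCDEF";
  std::string Out;
  for (std::size_t I = 0; I < Str.size() && I < Limit; ++I) {
    unsigned char C = static_cast<unsigned char>(Str[I]);
    Out += Digits[C >> 4];
    Out += Digits[C & 0xF];
  }
  return Out;
}

// Seed from the first 4 bytes only (models the truncation bug).
std::string truncatedName(const std::string &Str) {
  return "str." + hexSeed(Str, 4);
}

// Seed from the entire string (models the fix).
std::string fullName(const std::string &Str) {
  return "str." + hexSeed(Str, Str.size());
}
```

Here "hello" and "hellfire" get the same truncated name but different full
names.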
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128884
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>