When an aggregate contains an opaque type its size cannot be
determined. This triggers an "Invalid GetElementPtrInst indices for type" assert
in function checkGEPType. The fix suppresses the conversion in this case.
http://reviews.llvm.org/D20319
llvm-svn: 270479
This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703.
If there is a sequence of one or more load instructions, each loaded value is used as address of later load instruction, bitcast is necessary to change the value type, don't optimize it.
llvm-svn: 270135
We neglected to transfer operand bundles for some transforms. These
were found via inspection, I'll try to come up with some test cases.
llvm-svn: 268010
As noted in the code comment, I don't think we can do the same transform that we do for
*scalar* integers comparisons to *vector* integers comparisons because it might pessimize
the general case.
Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT
for integer vectors.
But we should now recognize all the variants of this construct and produce the optimal code
for the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701
llvm-svn: 262424
This change was discussed in D15392. It allows us to remove the fold that was added
in:
http://reviews.llvm.org/r255261
...and it will allow us to generalize this fold:
http://reviews.llvm.org/rL112232
while preserving the order of bitcast + extract that it produces and testing shows
is better handled by the backend.
Note that the existing check for "isVectorTy()" wasn't strong enough in general
and specifically because: x86_mmx. It's not a vector, but it's not vectorizable
either. So here we check VectorType::isValidElementType() directly before
proceeding with the transform.
llvm-svn: 255433
This is a redo of r255137 (reverted at r255227) which was a redo of
r255124 (reverted at r255126) with a fixed check for a scalar source
type and an added test for the failure that caused the revert.
Original commit message:
Example:
bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
--->
extractelement <2 x float> %X, i32 1
This is part of fixing PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543
The next step will be to generalize this fold:
trunc ( lshr ( bitcast X) ) -> extractelement (X)
Ie, I'm hoping to replace the existing transform of:
bitcast ( trunc ( lshr ( bitcast X)))
added by:
http://reviews.llvm.org/rL112232
with 2 less specific transforms to catch the case in the bug report.
Differential Revision: http://reviews.llvm.org/D14879
llvm-svn: 255261
This is a redo of r255124 (reverted at r255126) with an added check for a
scalar destination type and an added test for the failure seen in Clang's
test/CodeGen/vector.c. The extra test shows a different missing optimization.
Original commit message:
Example:
bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
--->
extractelement <2 x float> %X, i32 1
This is part of fixing PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543
The next step will be to generalize this fold:
trunc ( lshr ( bitcast X) ) -> extractelement (X)
Ie, I'm hoping to replace the existing transform of:
bitcast ( trunc ( lshr ( bitcast X)))
added by:
http://reviews.llvm.org/rL112232
with 2 less specific transforms to catch the case in the bug report.
Differential Revision: http://reviews.llvm.org/D14879
llvm-svn: 255137
Example:
bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
--->
extractelement <2 x float> %X, i32 1
This is part of fixing PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543
The next step will be to generalize this fold:
trunc ( lshr ( bitcast X) ) -> extractelement (X)
Ie, I'm hoping to replace the existing transform of:
bitcast ( trunc ( lshr ( bitcast X)))
added by:
http://reviews.llvm.org/rL112232
with 2 less specific transforms to catch the case in the bug report.
Differential Revision: http://reviews.llvm.org/D14879
llvm-svn: 255124
The logic for handling the pattern without a shift is identical
to the logic for handling the pattern with a shift if you set
the shift amount to zero for the former.
This should make it easier to see that we probably don't even need
optimizeIntToFloatBitCast().
If we call something like foldVecTruncToExtElt() from visitTrunc(),
we'll solve PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543
llvm-svn: 253403
removes cast by performing the lshr on smaller types. However, currently there
is no trunc(lshr (sext A), Cst) variant.
This patch add such optimization by transforming trunc(lshr (sext A), Cst)
to ashr A, Cst.
Differential Revision: http://reviews.llvm.org/D12520
llvm-svn: 247271
removes cast by performing the lshr on smaller types. However, currently there
is no trunc(lshr (sext A), Cst) variant.
This patch add such optimization by transforming trunc(lshr (sext A), Cst)
to ashr A, Cst.
Differential Revision: http://reviews.llvm.org/D12520
llvm-svn: 246997
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.
matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.
C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.
ARM and AArch64 at least have different instructions for these different cases.
llvm-svn: 244580
Make sure if we're truncating a constant that would then be sign extended
that the sign extension of the truncated constant is the same as the
original constant.
> Canonicalize min/max expressions correctly.
>
> This patch introduces a canonical form for min/max idioms where one operand
> is extended or truncated. This often happens when the other operand is a
> constant. For example:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = sext i32 %a to i64
> %3 = select i1 %1, i64 %2, i64 0
>
> Would now be canonicalized into:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = select i1 %1, i32 %a, i32 0
> %3 = sext i32 %2 to i64
>
> This builds upon a patch posted by David Majenemer
> (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
> passively stopped instcombine from ruining canonical patterns. This
> patch additionally actively makes instcombine canonicalize too.
>
> Canonicalization of expressions involving a change in type from int->fp
> or fp->int are not yet implemented.
llvm-svn: 237821
SimplifyDemandedBits was "simplifying" a constant by removing just sign bits.
This caused a canonicalization race between different parts of instcombine.
Fix and regression test added - third time lucky?
llvm-svn: 237539
The AArch64 LNT bot is unhappy - I've found that the problem is in
SimpliftDemandedBits, but that's going to require another code review
so reverting in the meantime.
llvm-svn: 237528
The test timeouts were due to instcombine fighting itself. Regression test added.
Original log message:
Canonicalize min/max expressions correctly.
This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:
%1 = icmp slt i32 %a, i32 0
%2 = sext i32 %a to i64
%3 = select i1 %1, i64 %2, i64 0
Would now be canonicalized into:
%1 = icmp slt i32 %a, i32 0
%2 = select i1 %1, i32 %a, i32 0
%3 = sext i32 %2 to i64
This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.
Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.
llvm-svn: 237520
We already had a method to iterate over all the incoming values of a PHI. This just changes all eligible code to use it.
Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB.
llvm-svn: 237169
This just didn't need to be here at all, but the assertion I tried to
add wasn't appropriate either - the circumstance isn't impossible, it's
just not important to deal with it here - the gep-rooted version of this
instcombine will handle this case, we don't need to duplicate it for the
case where the gep happens to be used in a bitcast.
llvm-svn: 233404
The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)
SCEV looks like it'll need some restructuring - we'll have to do a bit
more work for GEP canonicalization, since it'll depend on how it's used
if we can even manage to canonicalize it to a non-ugly GEP. I guess we
can do some fun stuff like voting (do 2 out of 3 load from the GEP with
a certain type that gives a pretty GEP? Does every typed use of the GEP
use either a specific type or a generic type (i8*, etc)?)
llvm-svn: 233131
Assert that this doesn't fire - I'll remove all of this later, but just
leaving it in for a while in case this is firing & we just don't have
test coverage.
llvm-svn: 233116
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.
This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.
I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.
I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.
Test Plan:
Reviewers: echristo
Subscribers: llvm-commits
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead. This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?). We already apply a similar transform in DAGCombine, this just extends that to the IR level.
This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types.
Differential Revision: http://reviews.llvm.org/D7255
llvm-svn: 229189
creating a non-internal header file for the InstCombine pass.
I thought about calling this InstCombiner.h or in some way more clearly
associating it with the InstCombiner clas that it is primarily defining,
but there are several other utility interfaces defined within this for
InstCombine. If, in the course of refactoring, those end up moving
elsewhere or going away, it might make more sense to make this the
combiner's header alone.
Naturally, this is a bikeshed to a certain degree, so feel free to lobby
for a different shade of paint if this name just doesn't suit you.
llvm-svn: 226783
While the term "Target" is in the name, it doesn't really have to do
with the LLVM Target library -- this isn't an abstraction which LLVM
targets generally need to implement or extend. It has much more to do
with modeling the various runtime libraries on different OSes and with
different runtime environments. The "target" in this sense is the more
general sense of a target of cross compilation.
This is in preparation for porting this analysis to the new pass
manager.
No functionality changed, and updates inbound for Clang and Polly.
llvm-svn: 226078
Summary:
InstCombine infinite-loops for the testcase added
It is because InstCombine is generating instructions that can be
optimized by itself. Fix by not optimizing frem if the optimized
type is the same as original type.
rdar://problem/19150820
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D6634
llvm-svn: 224097