For function-scope variables with large initialisation list, FE usually
generates a global variable to hold the initializer, then generates
memcpy intrinsic to initialize the alloca. InstCombiner::visitAllocaInst
identifies such allocas which are accessed only by reading and replaces
them with the global variable. This is done by casting the global variable
to the type of the alloca and replacing all references.
However, when the global variable is in a different address space which
is disjoint with addr space 0 (e.g. for IR generated from OpenCL,
global variable cannot be in private addr space i.e. addr space 0), casting
the global variable to addr space 0 results in invalid IR for certain
targets (e.g. amdgpu).
To fix this issue, when the global variable is not in addr space 0,
instead of casting it to addr space 0, this patch chases down the uses
of alloca until reaching the load instructions, then replaces load from
alloca with load from the global variable. If during the chasing
bitcast and GEP are encountered, new bitcast and GEP based on the global
variable are generated and used in the load instructions.
Differential Revision: https://reviews.llvm.org/D27283
llvm-svn: 294786
The code comments didn't match the code logic, and we didn't actually distinguish the fake unary (not/neg/fneg)
operators from arguments. Adding another level to the weighting scheme provides more structure and can help
simplify the pattern matching in InstCombine and other places.
I fixed regressions that would have shown up from this change in:
rL290067
rL290127
But that doesn't mean there are no pattern-matching logic holes left; some combines may just be missing regression tests.
Should fix:
https://llvm.org/bugs/show_bug.cgi?id=28296
Differential Revision: https://reviews.llvm.org/D27933
llvm-svn: 294049
Some of the callers are artificially limiting this transform to integer types;
this should make it easier to incrementally remove that restriction.
llvm-svn: 291620
A number of new patterns for simplifying and/xor of icmp:
(icmp ne %x, 0) ^ (icmp ne %y, 0) => icmp ne %x, %y if the following is true:
1- (%x = and %a, %mask) and (%y = and %b, %mask)
2- %mask is a power of 2.
(icmp eq %x, 0) & (icmp ne %y, 0) => icmp ult %x, %y if the following is true:
1- (%x = and %a, %mask1) and (%y = and %b, %mask2)
2- Let %t be the smallest power of 2 where %mask1 & %t != 0. Then for any
%s that is a power of 2 and %s & %mask2 != 0, we must have %s <= %t.
For example if %mask1 = 24 and %mask2 = 16, setting %s = 16 and %t = 8
violates condition (2) above. So this optimization cannot be applied.
llvm-svn: 289813
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...
llvm-svn: 289756
If all the operands to a phi node are of the same operation, instcombine
will try to pull them through the phi node, combining them into a single
operation. When it does this, the debug location of the operation should
be the merged debug locations of the phi node arguments.
Patch 2 of 8 for D26256. Folding of a binary operation.
Differential Revision: https://reviews.llvm.org/D26256
llvm-svn: 289679
The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996.
The problem is with following code
xB = load (type B);
xA = load (type A);
+yA = (A)xB; B -> A
+zAn = PHI[yA, xA]; PHI
+zBn = (B)zAn; // A -> B
store zAn;
store zBn;
optimizeBitCastFromPhi generates
+zBn = (B)zAn; // A -> B
and expects it will be combined with the following store instruction to another
store zAn
Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely.
optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions.
Differential Revision: https://reviews.llvm.org/D23896
llvm-svn: 285116
Also, make foldSelectExtConst() a member of InstCombiner, remove
unnecessary parameters from its interface, and group visitSelectInst
helpers together in the header file.
llvm-svn: 282796
These 2 helper functions were already using APInt internally, so just
change the API and caller to allow folds for splats. The scalar
regression tests look quite thorough, so I just added a couple of
tests to prove that vectors are handled too.
These folds should be grouped with the other cmp+shift folds though.
That can be an NFC follow-up.
llvm-svn: 281663
This is a big glob of transforms that probably should work for vectors,
but currently they are disallowed because of ConstantInt guards.
llvm-svn: 281614
Everything under foldICmpInstWithConstant() should now be working for
splat vectors via m_APInt matchers. Ie, I've removed all of the FIXMEs
that I added while cleaning that section up. Note that not all of the
associated FIXMEs in the regression tests are gone though, because some
of the tests require earlier folds that are still scalar-only.
llvm-svn: 281139
This is prep work before changing the callers to also use APInt which will
allow folds for splat vectors. Currently, the callers have ConstantInt
guards in place, so no functional change intended with this commit.
llvm-svn: 280282
Like other recent changes near here, the goal is to allow vector types for
all of these folds. Splitting things up makes it easier to incrementally
enhance the code and easier to read.
llvm-svn: 279851
There was no logic in foldICmpDivConstant, so no need for a separate function.
The code is directly copy/pasted, so further cleanups to follow.
llvm-svn: 279685
There will only be 3 lines of code in foldICmpShrConst() when the cleanup is done,
so it doesn't make much sense to have a separate function for a single fold.
llvm-svn: 279575
Besides breaking up a 700 line function to improve readability,
this sinks the 'FIXME: ConstantInt' check into each helper. So
now we can independently break that restriction within any of the
helper functions.
As much as possible, the code was only {cut/paste/clang-format}'ed
to minimize risk (no functional changes intended), so several more
readability improvements are still possible.
llvm-svn: 278828
There's some formatting and pointer deref ugliness here that I intend to fix in
subsequent patches. The overall goal is to refactor the obnoxiously long switch
and incrementally remove the restriction to scalar types (allow folds for vector
splats). This patch introduces the use of m_APInt which means the RHSV reference
is now a pointer (and may have matched a vector splat), but the check of 'RHS'
remains, so vector folds are disallowed and no functional change is intended.
llvm-svn: 278816
This is part of an effort to constify ValueTracking.cpp. This change is
to methods which need const Value* instead of Value* to go with the upcoming
changes to ValueTracking.
llvm-svn: 278528
Add a generalized IRBuilderCallbackInserter, which is just given a
callback to execute after insertion. This can be used to get rid of
the custom inserter in InstCombine, which will in turn allow me to add
target specific InstCombineCalls API for intrinsics without horrible
layering violations.
llvm-svn: 277784
Making smaller pieces out of some of these ~1000 line functions should make
it easier to incrementally upgrade them to handle vector types.
llvm-svn: 276304
Summary:
This patch cleans up parts of InstCombine to raise its compliance with the LLVM coding standards and to increase its readability. The changes and according rationale are summarized in the following:
- Rename `ShouldOptimizeCast()` to `shouldOptimizeCast()` since functions should start with a lower case letter.
- Move `shouldOptimizeCast()` from InstCombineCasts.cpp to InstCombineAndOrXor.cpp since it's only used there.
- Simplify interface of `shouldOptimizeCast()`.
- Minor code style adaptions in `shouldOptimizeCast()`.
- Remove the documentation on the function definition of `shouldOptimizeCast()` since it just repeats the documentation on its declaration. Also enhance the documentation on its declaration with more information describing its intended use and make it doxygen-compliant.
- Change a comment in `foldCastedBitwiseLogic()` from `fold (logic (cast A), (cast B)) -> (cast (logic A, B))` to `fold logic(cast(A), cast(B)) -> cast(logic(A, B))` since the surrounding comments use this format.
- Remove comment `Only do this if the casts both really cause code to be generated.` in `foldCastedBitwiseLogic()` since it just repeats parts of the documentation of `shouldOptimizeCast()` and does not help to improve readability.
- Simplify the interface of `isEliminableCastPair()`.
- Removed the documentation on the function definition of `isEliminableCastPair()` which only contained obvious statements about its implementation. Instead added more general doxygen-compliant documentation to its declaration.
- Renamed parameter `DoXform` of `transformZExtIcmp()` to `DoTransform` to make its intention clearer.
- Moved documentation of `transformZExtIcmp()` from its definition to its declaration and made it doxygen-compliant.
Reviewers: vtjnash, grosser
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D22449
Contributed-by: Matthias Reisinger
llvm-svn: 275964
Revert "[InstCombine] Combine A->B->A BitCast"
as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847
This reverts commits r270135 and r263734.
llvm-svn: 274094
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments are
consistent with the function name.
llvm-svn: 270715
When a va_start or va_copy is immediately followed by a va_end (ignoring
debug information or other start/end in between), then it is safe to
remove the pair. As this code shares some commonalities with the lifetime
markers, this has been factored to helper functions.
This InstCombine pattern kicks-in 3 times when running the LLVM test
suite.
llvm-svn: 269033