llvm-project/llvm/test/Transforms/IndVarSimplify/pr24783.ll

; RUN: opt -S -indvars < %s | FileCheck %s

target datalayout = "E-m:e-i64:64-n32:64"
target triple = "powerpc64-unknown-linux-gnu"

define void @f(i32* %end.s, i8** %loc, i32 %p) {
; CHECK-LABEL: @f(
entry:
  %end = getelementptr inbounds i32, i32* %end.s, i32 %p
  %init = bitcast i32* %end.s to i8*
  br label %while.body.i

while.body.i:
  %ptr = phi i8* [ %ptr.inc, %while.body.i ], [ %init, %entry ]
  %ptr.inc = getelementptr inbounds i8, i8* %ptr, i8 1
  %ptr.inc.cast = bitcast i8* %ptr.inc to i32*
  %cmp.i = icmp eq i32* %ptr.inc.cast, %end
  br i1 %cmp.i, label %loop.exit, label %while.body.i

loop.exit:
; CHECK: loop.exit:
; CHECK: [[END_BCASTED:%[a-z0-9]+]] = bitcast i32* %end to i8*
; CHECK: store i8* [[END_BCASTED]], i8** %loc
  %ptr.inc.lcssa = phi i8* [ %ptr.inc, %while.body.i ]
  store i8* %ptr.inc.lcssa, i8** %loc
  ret void
}

[IndVars] Fix PR24783.

In `IndVarSimplify::ExpandSCEVIfNeeded`, `SCEVExpander::findExistingExpansion` may return an `llvm::Value` that differs in type from the SCEV it was asked to find an expansion for (but computes the same value). In such cases, we fall back on `expandCodeFor` and rely on LLVM to CSE the two equivalent expressions (different only by a no-op cast) into a single computation.

I tried a few other approaches to fixing PR24783, all of which turned out to be more complex than the current version:

1. Move the `ExpandSCEVIfNeeded` logic into `expandCodeFor`. This got problematic because we currently do not pass the `Loop *` into `expandCodeFor`. Changing the interface to do this is a more invasive change, and does not make much semantic sense unless the SCEV being passed in is an add recurrence. There is also the problem of `expandCodeFor` being used in places other than `indvars` -- there may be performance / correctness issues elsewhere if `expandCodeFor` is moved from always generating IR from scratch to a cache-like model.

2. Have `findExistingExpansion` only return expressions with the correct type. This would make `isHighCostExpansionHelper`, and thus `isHighCostExpansion`, more conservative than necessary.

3. Insert casts on the value returned by `findExistingExpansion`, if needed, using `InsertNoopCastOfTo`. This is complicated because `InsertNoopCastOfTo` depends on internal state of its `SCEVExpander` (specifically `Builder.GetInsertPoint()`), and this may not be set up when `ExpandSCEVIfNeeded` is called.

4. Manually insert casts on the value returned by `findExistingExpansion`, if needed, via `CastInst::Create` rather than `InsertNoopCastOfTo`. This is probably workable, but figuring out where the cast instruction needs to be inserted has enough edge cases (arguments, constants, invokes, LCSSA must be preserved) that what I have right now seems to be the simplest solution.

llvm-svn: 247749 2015-09-16 07:45:39 +08:00
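
The fallback described in this commit message can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyValue` and `expandSCEVIfNeeded` below are hypothetical stand-ins written for this sketch, not LLVM's actual classes or the code in `IndVarSimplify`. The optional argument plays the role of whatever `findExistingExpansion` returned, and the "fresh" value plays the role of what `expandCodeFor` would emit.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Toy stand-in for an IR value: a name plus a textual type such as "i8*".
// Hypothetical; this is not llvm::Value.
struct ToyValue {
  std::string name;
  std::string type;
};

// Sketch of the decision only: reuse an existing value when one was found
// AND its type matches the requested type; otherwise return a freshly
// "expanded" value of the requested type. In the real pass, later cleanup
// is relied on to merge the equivalent expressions.
ToyValue expandSCEVIfNeeded(const std::optional<ToyValue>& existing,
                            const std::string& wantedType,
                            const std::string& freshName) {
  if (existing && existing->type == wantedType)
    return *existing;                     // same type: safe to reuse as-is
  return ToyValue{freshName, wantedType}; // type mismatch or nothing found
}
```

In the test above, the existing value `%end` has type `i32*` while the store needs an `i8*`, so under this model the mismatch branch is taken, which corresponds to the `bitcast i32* %end to i8*` that the CHECK lines expect.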

[SCEV] Try to reuse existing value during SCEV expansion

Currently, SCEV expansion expands a SCEV into a sequence of operations and does not reuse values that already exist. This introduces redundant computation that may not be cleaned up thoroughly by subsequent optimizations.

This patch introduces an ExprValueMap, which is a map from a SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations.

The original commit triggered regressions in Polly tests. The regressions exposed two problems, which have been fixed in the current version:

1. Polly generates a new function based on the old one. To generate an instruction for the new function, it builds the SCEV for the old instruction, applies some transformation to the generated SCEV, then expands the transformed SCEV and inserts the expanded value into the new function. Because SCEV expansion may reuse a value cached in ExprValueMap, a value from the old function may be inserted into the new function, which is wrong. SCEVExpander::expand contains logic to check that the cached value to be used dominates the insertion point. However, in the above case the check always passes: the insertion point is in a new function, which is unreachable from the old function, and DominatorTreeBase::dominates treats an unreachable node as dominated by any other node. The fix is simply to add a check that the cached value to be used in expansion is in the same function as the insertion point instruction.

2. When the SCEV is of scConstant type, expanding it directly is cheaper than reusing a normal cached value. Although the cached value set in ExprValueMap contains a Constant-type value, it is not easy to find -- the cached Value set is not sorted according to potential cost. The existing reuse logic in SCEVExpander::expand simply chooses the first legal element from the cached value set. The fix is that when the SCEV is of scConstant type, the reuse logic is skipped and the SCEV is simply expanded.

Differential Revision: http://reviews.llvm.org/D12090
llvm-svn: 259736 2016-02-04 09:27:38 +08:00
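
The two fixes described above can be sketched as a small selection routine. This is a toy model under stated assumptions: `ToyValue`, `ToySCEVKind`, and `pickReusable` are hypothetical names invented for this sketch, functions are identified by plain strings, and the dominance check performed by the real expander is deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a cached IR value and the function it lives in.
// Hypothetical; not LLVM's data structures.
struct ToyValue {
  std::string name;
  std::string function;
};

enum class ToySCEVKind { Constant, Other };

// Returns the first cached value that is legal to reuse, or nullptr when the
// caller should expand the expression from scratch.
const ToyValue* pickReusable(ToySCEVKind kind,
                             const std::vector<ToyValue>& cached,
                             const std::string& insertFunction) {
  // Fix 2: constants are cheaper to expand directly, so skip reuse entirely.
  if (kind == ToySCEVKind::Constant)
    return nullptr;
  for (const ToyValue& v : cached) {
    // Fix 1: never reuse a value that belongs to a different function than
    // the insertion point (the dominance check is omitted in this sketch).
    if (v.function == insertFunction)
      return &v; // first legal element wins; the set is not sorted by cost
  }
  return nullptr;
}
```

With a cache holding one value from the old function and one from the new function, this model skips the old-function value when inserting into the new function, and returns nothing at all for a constant.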