Now that Reassociate's LinearizeExprTree can look through arbitrary expression
topologies, it is quite possible for a leaf node to have huge multiplicity, for
example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x
raised to a vast power (the multiplicity, or weight, of x). This patch fixes
the computation of weights by correctly computing them no matter how big they
are, rather than just overflowing and getting a wrong value. It turns out that
the weight for a value never needs more bits to represent than the value itself,
so it is enough to represent weights as APInts of the same bitwidth and do the
right overflow-avoiding dance steps when computing weights. As a side-effect it
reduces the number of multiplies needed in some cases of large powers. While
there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree
static, pushing the rank computation out into users. This is progress towards
fixing PR13021.
llvm-svn: 158358
2012-06-12 22:33:56 +08:00
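The claim that a huge weight can be replaced by a smaller one without changing the result can be checked by brute force at the small bit widths these tests use. The sketch below is not the Reassociate implementation. The helper names (`wrapped_pow`, `powers_agree`, `to_signed`) are hypothetical, and the exponent pairs are simply ones that match the "Can be done with N multiplies" comments in the tests.

```python
def wrapped_pow(x: int, k: int, bits: int) -> int:
    """Compute x**k by repeated multiplication, wrapping to `bits` bits."""
    mask = (1 << bits) - 1
    result = 1
    for _ in range(k):
        result = (result * x) & mask
    return result


def powers_agree(bits: int, k_big: int, k_small: int) -> bool:
    """Return True if x**k_big == x**k_small for every `bits`-bit value x."""
    return all(
        wrapped_pow(x, k_big, bits) == wrapped_pow(x, k_small, bits)
        for x in range(1 << bits)
    )


def to_signed(value: int, bits: int) -> int:
    """Interpret a `bits`-bit pattern as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


if __name__ == "__main__":
    # i3: x^5 and x^7 behave like x^3, and x^6 like x^4 (two multiplies each).
    print(powers_agree(3, 5, 3), powers_agree(3, 6, 4), powers_agree(3, 7, 3))
    # i4: x^8 and x^12 behave like x^4 (two multiplies each).
    print(powers_agree(4, 8, 4), powers_agree(4, 12, 4))
    # Constant cases: 3^5 wrapped to i8, and 1+1+1 wrapped to i2.
    print(to_signed(wrapped_pow(3, 5, 8), 8), to_signed(1 + 1 + 1, 2))
```

Exhaustive checking is only practical because the tests use 2- to 8-bit types. It confirms that the smaller exponents are equivalent, not that they are the ones the pass actually picks.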
; RUN: opt < %s -reassociate -S | FileCheck %s

; Tests involving repeated operations on the same value.

define i8 @nilpotent(i8 %x) {
; CHECK-LABEL: @nilpotent(
  %tmp = xor i8 %x, %x
  ret i8 %tmp
; CHECK: ret i8 0
}

define i2 @idempotent(i2 %x) {
; CHECK-LABEL: @idempotent(
  %tmp1 = and i2 %x, %x
  %tmp2 = and i2 %tmp1, %x
  %tmp3 = and i2 %tmp2, %x
  ret i2 %tmp3
; CHECK: ret i2 %x
}

define i2 @add(i2 %x) {
; CHECK-LABEL: @add(
  %tmp1 = add i2 %x, %x
  %tmp2 = add i2 %tmp1, %x
  %tmp3 = add i2 %tmp2, %x
  ret i2 %tmp3
; CHECK: ret i2 0
}

define i2 @cst_add() {
; CHECK-LABEL: @cst_add(
  %tmp1 = add i2 1, 1
  %tmp2 = add i2 %tmp1, 1
  ret i2 %tmp2
; CHECK: ret i2 -1
}

define i8 @cst_mul() {
; CHECK-LABEL: @cst_mul(
  %tmp1 = mul i8 3, 3
  %tmp2 = mul i8 %tmp1, 3
  %tmp3 = mul i8 %tmp2, 3
  %tmp4 = mul i8 %tmp3, 3
  ret i8 %tmp4
; CHECK: ret i8 -13
}

define i3 @foo3x5(i3 %x) {
; Can be done with two multiplies.
; CHECK-LABEL: @foo3x5(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i3 %x, %x
  %tmp2 = mul i3 %tmp1, %x
  %tmp3 = mul i3 %tmp2, %x
  %tmp4 = mul i3 %tmp3, %x
  ret i3 %tmp4
}

define i3 @foo3x6(i3 %x) {
; Can be done with two multiplies.
; CHECK-LABEL: @foo3x6(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i3 %x, %x
  %tmp2 = mul i3 %tmp1, %x
  %tmp3 = mul i3 %tmp2, %x
  %tmp4 = mul i3 %tmp3, %x
  %tmp5 = mul i3 %tmp4, %x
  ret i3 %tmp5
}

define i3 @foo3x7(i3 %x) {
; Can be done with two multiplies.
; CHECK-LABEL: @foo3x7(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i3 %x, %x
  %tmp2 = mul i3 %tmp1, %x
  %tmp3 = mul i3 %tmp2, %x
  %tmp4 = mul i3 %tmp3, %x
  %tmp5 = mul i3 %tmp4, %x
  %tmp6 = mul i3 %tmp5, %x
  ret i3 %tmp6
}

define i4 @foo4x8(i4 %x) {
; Can be done with two multiplies.
; CHECK-LABEL: @foo4x8(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i4 %x, %x
  %tmp2 = mul i4 %tmp1, %x
  %tmp3 = mul i4 %tmp2, %x
  %tmp4 = mul i4 %tmp3, %x
  %tmp5 = mul i4 %tmp4, %x
  %tmp6 = mul i4 %tmp5, %x
  %tmp7 = mul i4 %tmp6, %x
  ret i4 %tmp7
}

define i4 @foo4x9(i4 %x) {
; Can be done with three multiplies.
; CHECK-LABEL: @foo4x9(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i4 %x, %x
  %tmp2 = mul i4 %tmp1, %x
  %tmp3 = mul i4 %tmp2, %x
  %tmp4 = mul i4 %tmp3, %x
  %tmp5 = mul i4 %tmp4, %x
  %tmp6 = mul i4 %tmp5, %x
  %tmp7 = mul i4 %tmp6, %x
  %tmp8 = mul i4 %tmp7, %x
  ret i4 %tmp8
}

define i4 @foo4x10(i4 %x) {
; Can be done with three multiplies.
; CHECK-LABEL: @foo4x10(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i4 %x, %x
  %tmp2 = mul i4 %tmp1, %x
  %tmp3 = mul i4 %tmp2, %x
  %tmp4 = mul i4 %tmp3, %x
  %tmp5 = mul i4 %tmp4, %x
  %tmp6 = mul i4 %tmp5, %x
  %tmp7 = mul i4 %tmp6, %x
  %tmp8 = mul i4 %tmp7, %x
  %tmp9 = mul i4 %tmp8, %x
  ret i4 %tmp9
}

define i4 @foo4x11(i4 %x) {
; Can be done with four multiplies.
; CHECK-LABEL: @foo4x11(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i4 %x, %x
  %tmp2 = mul i4 %tmp1, %x
  %tmp3 = mul i4 %tmp2, %x
  %tmp4 = mul i4 %tmp3, %x
  %tmp5 = mul i4 %tmp4, %x
  %tmp6 = mul i4 %tmp5, %x
  %tmp7 = mul i4 %tmp6, %x
  %tmp8 = mul i4 %tmp7, %x
  %tmp9 = mul i4 %tmp8, %x
  %tmp10 = mul i4 %tmp9, %x
  ret i4 %tmp10
}

define i4 @foo4x12(i4 %x) {
; Can be done with two multiplies.
; CHECK-LABEL: @foo4x12(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i4 %x, %x
  %tmp2 = mul i4 %tmp1, %x
  %tmp3 = mul i4 %tmp2, %x
  %tmp4 = mul i4 %tmp3, %x
  %tmp5 = mul i4 %tmp4, %x
  %tmp6 = mul i4 %tmp5, %x
  %tmp7 = mul i4 %tmp6, %x
  %tmp8 = mul i4 %tmp7, %x
  %tmp9 = mul i4 %tmp8, %x
  %tmp10 = mul i4 %tmp9, %x
  %tmp11 = mul i4 %tmp10, %x
  ret i4 %tmp11
}

define i4 @foo4x13(i4 %x) {
; Can be done with three multiplies.
; CHECK-LABEL: @foo4x13(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i4 %x, %x
  %tmp2 = mul i4 %tmp1, %x
  %tmp3 = mul i4 %tmp2, %x
  %tmp4 = mul i4 %tmp3, %x
  %tmp5 = mul i4 %tmp4, %x
  %tmp6 = mul i4 %tmp5, %x
  %tmp7 = mul i4 %tmp6, %x
  %tmp8 = mul i4 %tmp7, %x
  %tmp9 = mul i4 %tmp8, %x
  %tmp10 = mul i4 %tmp9, %x
  %tmp11 = mul i4 %tmp10, %x
  %tmp12 = mul i4 %tmp11, %x
  ret i4 %tmp12
}

define i4 @foo4x14(i4 %x) {
; Can be done with three multiplies.
; CHECK-LABEL: @foo4x14(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i4 %x, %x
  %tmp2 = mul i4 %tmp1, %x
  %tmp3 = mul i4 %tmp2, %x
  %tmp4 = mul i4 %tmp3, %x
  %tmp5 = mul i4 %tmp4, %x
  %tmp6 = mul i4 %tmp5, %x
  %tmp7 = mul i4 %tmp6, %x
  %tmp8 = mul i4 %tmp7, %x
  %tmp9 = mul i4 %tmp8, %x
  %tmp10 = mul i4 %tmp9, %x
  %tmp11 = mul i4 %tmp10, %x
  %tmp12 = mul i4 %tmp11, %x
  %tmp13 = mul i4 %tmp12, %x
  ret i4 %tmp13
}

define i4 @foo4x15(i4 %x) {
; Can be done with four multiplies.
; CHECK-LABEL: @foo4x15(
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: mul
; CHECK-NEXT: ret
  %tmp1 = mul i4 %x, %x
  %tmp2 = mul i4 %tmp1, %x
  %tmp3 = mul i4 %tmp2, %x
  %tmp4 = mul i4 %tmp3, %x
  %tmp5 = mul i4 %tmp4, %x
  %tmp6 = mul i4 %tmp5, %x
  %tmp7 = mul i4 %tmp6, %x
  %tmp8 = mul i4 %tmp7, %x
  %tmp9 = mul i4 %tmp8, %x
  %tmp10 = mul i4 %tmp9, %x
  %tmp11 = mul i4 %tmp10, %x
  %tmp12 = mul i4 %tmp11, %x
  %tmp13 = mul i4 %tmp12, %x
  %tmp14 = mul i4 %tmp13, %x
  ret i4 %tmp14
}