[GVN] Initial check-in of a new global value numbering algorithm.
The code has been developed by Daniel Berlin over the years, and
the goal of the new implementation is to address shortcomings of
the current GVN infrastructure, e.g. long compile time for large
testcases, lack of phi predication, no load/store value numbering,
etc.
The current code just implements the "core" GVN algorithm, although
other pieces (load coercion, phi handling, predicate system) are
already implemented in a branch out of tree. Once the core is stable,
we'll start adding pieces on top of the base framework.
The tests currently living in test/Transform/NewGVN are copies
of the ones in GVN, with the proper `XFAIL` (missing features in NewGVN).
A flag will be added in a future commit to enable NewGVN, so that
interested parties can exercise this code easily.
Differential Revision: https://reviews.llvm.org/D26224
llvm-svn: 290346
2016-12-23 00:03:48 +08:00
; XFAIL: *
; RUN: opt -S -basicaa -newgvn < %s | FileCheck %s

@a = external constant i32
; We can value forward across the fence since we can (semantically)
; reorder the following load before the fence.
define i32 @test(i32* %addr.i) {
; CHECK-LABEL: @test
; CHECK: store
; CHECK: fence
; CHECK-NOT: load
; CHECK: ret
  store i32 5, i32* %addr.i, align 4
  fence release
  %a = load i32, i32* %addr.i, align 4
  ret i32 %a
}

; Same as above
define i32 @test2(i32* %addr.i) {
; CHECK-LABEL: @test2
; CHECK-NEXT: fence
; CHECK-NOT: load
; CHECK: ret
  %a = load i32, i32* %addr.i, align 4
  fence release
  %a2 = load i32, i32* %addr.i, align 4
  %res = sub i32 %a, %a2
  ret i32 %res
}

; We can not value forward across an acquire barrier since we might
; be synchronizing with another thread storing to the same variable
; followed by a release fence.  This is not so much enforcing an
; ordering property (though it is that too), but a liveness
; property.  We expect to eventually see the value of a store by
; another thread when spinning on that location.
define i32 @test3(i32* noalias %addr.i, i32* noalias %otheraddr) {
; CHECK-LABEL: @test3
; CHECK: load
; CHECK: fence
; CHECK: load
; CHECK: ret i32 %res
  ; The following code is intended to model the unrolling of
  ; two iterations of a spin loop of the form:
  ;   do { fence acquire; tmp = *%addr.i; } while (!tmp);
  ; It's hopefully clear that allowing PRE to turn this into:
  ;   if (!*%addr.i) while(true) {}
  ; would be unfortunate.
  fence acquire
  %a = load i32, i32* %addr.i, align 4
  fence acquire
  %a2 = load i32, i32* %addr.i, align 4
  %res = sub i32 %a, %a2
  ret i32 %res
}

; We can forward the value of the load
; across both the fences, because the load is from
; a constant memory location.
define i32 @test4(i32* %addr) {
; CHECK-LABEL: @test4
; CHECK-NOT: load
; CHECK: fence release
; CHECK: store
; CHECK: fence seq_cst
; CHECK: ret i32 0
  %var = load i32, i32* @a
  fence release
  store i32 42, i32* %addr, align 8
  fence seq_cst
  %var2 = load i32, i32* @a
  %var3 = sub i32 %var, %var2
  ret i32 %var3
}

; Another example of why forwarding across an acquire fence is problematic
; can be seen in a normal locking operation.  Say we had:
;   *p = 5; unlock(l); lock(l); use(p);
; Forwarding the store to p would be invalid.  A reasonable implementation
; of unlock and lock might be:
;   unlock() { atomicrmw sub %l, 1 unordered; fence release }
;   lock() {
;     do {
;       %res = cmpxchg %l, 0, 1, monotonic monotonic
;     } while(!%res.success)
;     fence acquire;
;   }
; Given we chose to forward across the release fence, we clearly can't forward
; across the acquire fence as well.
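
; The locking sequence described above could be sketched in IR roughly as
; follows.  This is only an illustration and not part of the original test:
; it carries no FileCheck directives, the function name @lock_sketch and the
; block names are hypothetical, the lock word is assumed to live at %l, and
; monotonic is used where the pseudocode says "unordered" because atomicrmw
; does not accept unordered.
define i32 @lock_sketch(i32* %p, i32* %l) {
entry:
  ; *p = 5
  store i32 5, i32* %p, align 4
  ; unlock(l)
  %old = atomicrmw sub i32* %l, i32 1 monotonic
  fence release
  br label %spin

spin:
  ; lock(l): retry until the compare-and-swap on the lock word succeeds
  %res = cmpxchg i32* %l, i32 0, i32 1 monotonic monotonic
  %success = extractvalue { i32, i1 } %res, 1
  br i1 %success, label %locked, label %spin

locked:
  fence acquire
  ; use(p): another thread may have stored to %p while the lock was free,
  ; so this load should not be replaced by the constant 5.
  %v = load i32, i32* %p, align 4
  ret i32 %v
}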