[AutoFDO] Top-down Inlining for specialization with context-sensitive profile
Summary:
AutoFDO's sample profile loader processes functions in arbitrary source code order, so swapping two functions in the source can change the inlining decisions. This also prevents using a context-sensitive profile to specialize while inlining. This commit enforces SCC top-down order for the sample profile loader. With this change, we can now do specialization, as illustrated by the added test case:
Say we have call paths A->B->C and D->B->C, and we want to inline C into B when the root inliner is B, but not when the root inliner is A or D. This is not possible without enforcing top-down order: once C is inlined into B, A and D can only choose to inline (B->C) as a whole or nothing, whereas what we want is to inline only B into A and D, not B's callee C. If we process functions in top-down order, this is no longer a problem, which is what this commit does.
This change is guarded by a new switch, "-sample-profile-top-down-load", for tuning, and it depends on D70653. Eventually, top-down can become the default order for the sample profile loader.
Reviewers: wmi, davidxl
Subscribers: hiraditya, llvm-commits, tejohnson
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70655
2019-11-25 15:54:07 +08:00
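The A->B->C / D->B->C scenario from the commit summary above can be sketched with a toy model. This is not LLVM's implementation: the names `CALLS`, `HOT`, and `run_inliner` are hypothetical, and the "profile" is simply a set of call paths assumed to be hot. The sketch only shows why the processing order matters: inlining a callee copies whatever that callee's body looks like at that moment.

```python
from typing import Dict, List, Set, Tuple

Context = Tuple[str, ...]

# Hypothetical call graph: A->B, D->B, B->C.
CALLS: Dict[str, List[str]] = {"A": ["B"], "D": ["B"], "B": ["C"], "C": []}

# Hypothetical context-sensitive profile: call paths (root first) that are hot.
# C is hot under root B, but not under roots A or D.
HOT: Set[Context] = {("A", "B"), ("D", "B"), ("B", "C")}


def run_inliner(order: List[str],
                calls: Dict[str, List[str]],
                hot: Set[Context]) -> Dict[str, Set[Context]]:
    """Process functions in `order`; return the contexts inlined into each."""
    # bodies[f]: remaining call sites in f, each as an inline chain ending
    # in the callee (relative to f).
    bodies: Dict[str, List[Context]] = {
        f: [(callee,) for callee in callees] for f, callees in calls.items()
    }
    inlined: Dict[str, Set[Context]] = {f: set() for f in calls}

    for root in order:
        worklist = list(bodies[root])
        remaining: List[Context] = []
        while worklist:
            chain = worklist.pop(0)
            callee = chain[-1]
            context = (root,) + chain
            if context not in hot:
                remaining.append(chain)  # keep the call
                continue
            inlined[root].add(context)
            # Inlining copies the callee's *current* body: calls that are
            # still present become new candidates in the root...
            for sub in bodies[callee]:
                worklist.append(chain + sub)
            # ...while anything already inlined into the callee comes along
            # unconditionally.
            for done in inlined[callee]:
                inlined[root].add((root,) + chain[:-1] + done)
        bodies[root] = remaining
    return inlined


if __name__ == "__main__":
    print("top-down:", run_inliner(["A", "D", "B", "C"], CALLS, HOT))
    print("B first: ", run_inliner(["B", "A", "D", "C"], CALLS, HOT))
```

With the callers processed first, A and D receive only B, and C is later inlined into B alone. When B is processed before its callers, C has already been merged into B, so A and D pick up the (B->C) pair as a whole, which is the loss of specialization the summary describes.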
; Note that this needs new pass manager for now. Passing `-sample-profile-top-down-load` to legacy pass manager is a no-op.
; Test we aren't doing specialization for inlining with default source order
2020-07-01 05:32:46 +08:00
; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/inline-topdown.prof -sample-profile-top-down-load=false -S | FileCheck -check-prefix=DEFAULT %s
; Test we specialize based on call path with context-sensitive profile while inlining with '-sample-profile-top-down-load'
; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/inline-topdown.prof -sample-profile-merge-inlinee -sample-profile-top-down-load=true -S | FileCheck -check-prefix=TOPDOWN %s
@.str = private unnamed_addr constant [ 11 x i8 ] c"sum is %d\0A\00" , align 1
2020-05-15 03:05:49 +08:00
define i32 @_Z3sumii ( i32 %x , i32 %y ) #0 !dbg !6 {
entry:
%x.addr = alloca i32 , align 4
%y.addr = alloca i32 , align 4
store i32 %x , i32 * %x.addr , align 4
store i32 %y , i32 * %y.addr , align 4
%tmp = load i32 , i32 * %x.addr , align 4 , !dbg !8
%tmp1 = load i32 , i32 * %y.addr , align 4 , !dbg !8
%add = add nsw i32 %tmp , %tmp1 , !dbg !8
%tmp2 = load i32 , i32 * %x.addr , align 4 , !dbg !8
%tmp3 = load i32 , i32 * %y.addr , align 4 , !dbg !8
%call = call i32 @_Z3subii ( i32 %tmp2 , i32 %tmp3 ) , !dbg !8
ret i32 %add , !dbg !8
}
define i32 @_Z3subii ( i32 %x , i32 %y ) #0 !dbg !9 {
entry:
%x.addr = alloca i32 , align 4
%y.addr = alloca i32 , align 4
store i32 %x , i32 * %x.addr , align 4
store i32 %y , i32 * %y.addr , align 4
%tmp = load i32 , i32 * %x.addr , align 4 , !dbg !10
%tmp1 = load i32 , i32 * %y.addr , align 4 , !dbg !10
%add = sub nsw i32 %tmp , %tmp1 , !dbg !10
ret i32 %add , !dbg !11
}
define i32 @main ( ) #0 !dbg !12 {
entry:
%retval = alloca i32 , align 4
%s = alloca i32 , align 4
%i = alloca i32 , align 4
store i32 0 , i32 * %retval
store i32 0 , i32 * %i , align 4 , !dbg !13
br label %while.cond , !dbg !14
while.cond: ; preds = %if.end, %entry
%tmp = load i32 , i32 * %i , align 4 , !dbg !15
%inc = add nsw i32 %tmp , 1 , !dbg !15
store i32 %inc , i32 * %i , align 4 , !dbg !15
%cmp = icmp slt i32 %tmp , 400000000 , !dbg !15
br i1 %cmp , label %while.body , label %while.end , !dbg !15
while.body: ; preds = %while.cond
%tmp1 = load i32 , i32 * %i , align 4 , !dbg !17
%cmp1 = icmp ne i32 %tmp1 , 100 , !dbg !17
br i1 %cmp1 , label %if.then , label %if.else , !dbg !17
if.then: ; preds = %while.body
%tmp2 = load i32 , i32 * %i , align 4 , !dbg !19
%tmp3 = load i32 , i32 * %s , align 4 , !dbg !19
%call = call i32 @_Z3sumii ( i32 %tmp2 , i32 %tmp3 ) , !dbg !19
store i32 %call , i32 * %s , align 4 , !dbg !19
br label %if.end , !dbg !19
if.else: ; preds = %while.body
store i32 30 , i32 * %s , align 4 , !dbg !21
br label %if.end
if.end: ; preds = %if.else, %if.then
br label %while.cond , !dbg !23
while.end: ; preds = %while.cond
%tmp4 = load i32 , i32 * %s , align 4 , !dbg !25
%call2 = call i32 ( i8 * , ... ) @printf ( i8 * getelementptr inbounds ( [ 11 x i8 ] , [ 11 x i8 ] * @.str , i32 0 , i32 0 ) , i32 %tmp4 ) , !dbg !25
ret i32 0 , !dbg !26
}
declare i32 @printf ( i8 * , ... )
attributes #0 = { "use-sample-profile" }
!llvm.dbg.cu = ! { !0 }
!llvm.module.flags = ! { !3 , !4 }
!llvm.ident = ! { !5 }
!0 = distinct !DICompileUnit ( language: DW_LANG_C_plus_plus , file: !1 , producer: "clang version 3.5 " , isOptimized: false , runtimeVersion: 0 , emissionKind: NoDebug , enums: !2 , retainedTypes: !2 , globals: !2 , imports: !2 )
!1 = !DIFile ( filename: "calls.cc" , directory: "." )
!2 = ! { }
!3 = ! { i32 2 , !"Dwarf Version" , i32 4 }
!4 = ! { i32 1 , !"Debug Info Version" , i32 3 }
!5 = ! { !"clang version 3.5 " }
!6 = distinct !DISubprogram ( name: "sum" , scope: !1 , file: !1 , line: 3 , type: !7 , scopeLine: 3 , virtualIndex: 6 , flags: DIFlagPrototyped , spFlags: DISPFlagDefinition , unit: !0 , retainedNodes: !2 )
!7 = !DISubroutineType ( types: !2 )
!8 = !DILocation ( line: 4 , scope: !6 )
!9 = distinct !DISubprogram ( name: "sub" , scope: !1 , file: !1 , line: 20 , type: !7 , scopeLine: 20 , virtualIndex: 6 , flags: DIFlagPrototyped , spFlags: DISPFlagDefinition , unit: !0 , retainedNodes: !2 )
!10 = !DILocation ( line: 20 , scope: !9 )
!11 = !DILocation ( line: 21 , scope: !9 )
!12 = distinct !DISubprogram ( name: "main" , scope: !1 , file: !1 , line: 7 , type: !7 , scopeLine: 7 , virtualIndex: 6 , flags: DIFlagPrototyped , spFlags: DISPFlagDefinition , unit: !0 , retainedNodes: !2 )
!13 = !DILocation ( line: 8 , scope: !12 )
!14 = !DILocation ( line: 9 , scope: !12 )
!15 = !DILocation ( line: 9 , scope: !16 )
!16 = !DILexicalBlockFile ( scope: !12 , file: !1 , discriminator: 2 )
!17 = !DILocation ( line: 10 , scope: !18 )
!18 = distinct !DILexicalBlock ( scope: !12 , file: !1 , line: 10 )
!19 = !DILocation ( line: 10 , scope: !20 )
!20 = !DILexicalBlockFile ( scope: !18 , file: !1 , discriminator: 2 )
!21 = !DILocation ( line: 10 , scope: !22 )
!22 = !DILexicalBlockFile ( scope: !18 , file: !1 , discriminator: 4 )
!23 = !DILocation ( line: 10 , scope: !24 )
!24 = !DILexicalBlockFile ( scope: !18 , file: !1 , discriminator: 6 )
!25 = !DILocation ( line: 11 , scope: !12 )
!26 = !DILocation ( line: 12 , scope: !12 )
; DEFAULT: @_Z3sumii
; DEFAULT-NOT: call i32 @_Z3subii
; DEFAULT: @main()
; DEFAULT-NOT: call i32 @_Z3subii
; TOPDOWN: @_Z3sumii
; TOPDOWN-NOT: call i32 @_Z3subii
; TOPDOWN: @main()
; TOPDOWN: call i32 @_Z3subii