llvm-project/llvm/test/Bitcode/thinlto-function-summary-ca...

; Test to check the callgraph in summary when there is PGO
; RUN: opt -module-summary %s -o %t.o
; RUN: llvm-bcanalyzer -dump %t.o | FileCheck %s

; RUN: opt -module-summary %p/Inputs/thinlto-function-summary-callgraph.ll -o %t2.o
; RUN: llvm-lto -thinlto -o %t3 %t.o %t2.o
; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED

; Check parsing for old summary versions generated from this file.
; RUN: llvm-lto -thinlto-index-stats %p/Inputs/thinlto-function-summary-callgraph-pgo.1.bc | FileCheck %s --check-prefix=OLD
; RUN: llvm-lto -thinlto-index-stats %p/Inputs/thinlto-function-summary-callgraph-pgo-combined.1.bc | FileCheck %s --check-prefix=OLD-COMBINED

; CHECK: <SOURCE_FILENAME
; CHECK-NEXT: <FUNCTION
; "func"
; CHECK-NEXT: <FUNCTION op0=4 op1=4
; CHECK:       <GLOBALVAL_SUMMARY_BLOCK
; CHECK-NEXT:    <VERSION
; CHECK-NEXT:    <FLAGS
; See if the call to func is registered, using the expected hotness type.
; CHECK-NEXT:    <PERMODULE_PROFILE {{.*}} op7=1 op8=2/>
; CHECK-NEXT:  </GLOBALVAL_SUMMARY_BLOCK>
; CHECK: <STRTAB_BLOCK
; CHECK-NEXT: blob data = 'mainfunc{{.*}}'

; COMBINED:       <GLOBALVAL_SUMMARY_BLOCK
; COMBINED-NEXT:    <VERSION
; COMBINED-NEXT:    <FLAGS
; COMBINED-NEXT:    <VALUE_GUID op0=[[FUNCID:[0-9]+]] op1=7289175272376759421/>
; COMBINED-NEXT:    <VALUE_GUID
; COMBINED-NEXT:    <COMBINED
; See if the call to func is registered, using the expected hotness type.
; op10=2, which is HotnessType::None.
; COMBINED-NEXT:    <COMBINED_PROFILE {{.*}} op9=[[FUNCID]] op10=2/>
; COMBINED-NEXT:  </GLOBALVAL_SUMMARY_BLOCK>

; ModuleID = 'thinlto-function-summary-callgraph.ll'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nounwind uwtable
define i32 @main() #0 !prof !2 {
entry:
    call void (...) @func()
    ret i32 0
}

declare void @func(...) #1

!2 = !{!"function_entry_count", i64 1}

; OLD: Index {{.*}} contains 1 nodes (1 functions, 0 alias, 0 globals) and 1 edges (0 refs and 1 calls)
; OLD-COMBINED: Index {{.*}} contains 2 nodes (2 functions, 0 alias, 0 globals) and 1 edges (0 refs and 1 calls)