llvm-project/llvm/tools
Hongtao Yu b9db70369b [CSSPGO] Split context string to deduplicate function name used in the context.
Currently, context strings contain many duplicated function names, which significantly increases the profile size. This change splits the context into a series of {name, offset, discriminator} tuples so that function names used in the context can be replaced by an index into the name table, which significantly reduces the size consumed by contexts.
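
As a rough illustration of the split described above, the sketch below breaks a context string into {name, offset, discriminator} tuples. It assumes a textual layout such as `main:3 @ foo:2.1 @ bar` (frames joined by ` @ `, each written as `name:offset.discriminator`, with the leaf frame carrying no location). The `Frame` struct and `splitContext` function are hypothetical names for this example, not the types or functions used in LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for one context frame; not LLVM's actual type.
struct Frame {
  std::string Name;
  uint32_t Offset = 0;
  uint32_t Discriminator = 0;
};

// Splits a context string into frames. Assumes well-formed input in the
// "name:offset.discriminator @ ..." layout described in the lead-in.
std::vector<Frame> splitContext(std::string_view Ctx) {
  constexpr std::string_view Sep = " @ ";
  std::vector<Frame> Frames;
  while (!Ctx.empty()) {
    // Peel off the next frame, up to the separator or the end of the string.
    size_t Pos = Ctx.find(Sep);
    std::string_view Part = Ctx.substr(0, Pos);
    Ctx = (Pos == std::string_view::npos) ? std::string_view()
                                          : Ctx.substr(Pos + Sep.size());
    assert(!Part.empty() && "empty frame in context string");

    Frame F;
    size_t Colon = Part.rfind(':');
    F.Name = std::string(Part.substr(0, Colon));
    if (Colon != std::string_view::npos) {
      // Parse "offset" or "offset.discriminator" after the colon.
      std::string Loc(Part.substr(Colon + 1));
      size_t Dot = Loc.find('.');
      F.Offset = static_cast<uint32_t>(std::stoul(Loc.substr(0, Dot)));
      if (Dot != std::string::npos)
        F.Discriminator = static_cast<uint32_t>(std::stoul(Loc.substr(Dot + 1)));
    }
    Frames.push_back(std::move(F));
  }
  return Frames;
}
```

Once a context is held as separate frames, each `Name` can be swapped for a name-table index instead of being repeated in every context string.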

A follow-up improvement made in the compiler and profiling tools is to avoid reconstructing full context strings, which is time- and memory-consuming. Instead, a context vector of `StringRef` is adopted to represent the full context in all scenarios. As a result, the previously prevalent profile map, which was keyed by `StringRef`, is now engineered as an unordered map keyed by `SampleContext`. `SampleContext` is reshaped to use an `ArrayRef` to represent a full context for CS profiles. For non-CS profiles, it falls back to using a `StringRef` to represent a contextless function name. Both the `ArrayRef` and `StringRef` objects are backed by real array and string objects that are stored in producer buffers. For the compiler, they are maintained by the sample reader. For llvm-profgen, they are maintained in `ProfiledBinary` and `ProfileGenerator`. Full context strings are generated only for debugging and printing.
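
To make the keying scheme above concrete, here is a minimal standard-library sketch of an unordered map keyed by a non-owning view over a frame array, so that lookups hash and compare frames without ever building a full context string. `Frame`, `ContextKey`, `ContextHash`, `ProfileMap`, and `makeKey` are hypothetical names for this example; they only mimic the idea of `SampleContext` holding an `ArrayRef`, and the hash function is an arbitrary choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical frame; the name is a view into a buffer owned elsewhere.
struct Frame {
  std::string_view Name;
  uint32_t Offset;
  uint32_t Discriminator;
  bool operator==(const Frame &O) const {
    return Name == O.Name && Offset == O.Offset &&
           Discriminator == O.Discriminator;
  }
};

// Non-owning view of a context. The producer's buffer must outlive the key.
struct ContextKey {
  const Frame *Data = nullptr;
  size_t Size = 0;
  bool operator==(const ContextKey &O) const {
    if (Size != O.Size)
      return false;
    for (size_t I = 0; I < Size; ++I)
      if (!(Data[I] == O.Data[I]))
        return false;
    return true;
  }
};

// Simple multiplicative hash over all frames (illustrative only).
struct ContextHash {
  size_t operator()(const ContextKey &K) const {
    size_t H = 0;
    for (size_t I = 0; I < K.Size; ++I) {
      H = H * 31 + std::hash<std::string_view>()(K.Data[I].Name);
      H = H * 31 + K.Data[I].Offset;
      H = H * 31 + K.Data[I].Discriminator;
    }
    return H;
  }
};

// Maps a context to a sample count; stands in for the profile map.
using ProfileMap = std::unordered_map<ContextKey, uint64_t, ContextHash>;

// Builds a key that views the frames stored in Buffer.
ContextKey makeKey(const std::vector<Frame> &Buffer) {
  return ContextKey{Buffer.data(), Buffer.size()};
}
```

Because the key is only a view, two keys built over different buffers with equal contents refer to the same map entry, and the buffers must stay alive for as long as the map does. This mirrors the producer-buffer ownership described above.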

As for the profile format, the text format is unchanged, though internally a CS context is implemented as a vector. The extbinary format changes only for CS profiles, with an additional `SecCSNameTable` section that stores all full contexts logically in the form of `vector<int>`, with each element being an offset that points into `SecNameTable`. All occurrences of contexts elsewhere are redirected to use an offset into `SecCSNameTable`.
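
The two-level indirection can be sketched as follows: a name table that stores each function name once, and a list of contexts where each context is a vector of indices into that table. This is an in-memory illustration only, not the on-disk encoding. `NameTable`, `EncodedContext`, and `encodeContext` are hypothetical names, and the per-frame line offset and discriminator are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Stores each distinct function name once and hands out stable indices.
class NameTable {
public:
  // Returns the index of Name, adding it to the table on first use.
  uint32_t intern(const std::string &Name) {
    auto [It, Inserted] =
        Index.try_emplace(Name, static_cast<uint32_t>(Names.size()));
    if (Inserted)
      Names.push_back(Name);
    return It->second;
  }
  const std::string &at(uint32_t I) const {
    assert(I < Names.size() && "name index out of range");
    return Names[I];
  }
  size_t size() const { return Names.size(); }

private:
  std::vector<std::string> Names;
  std::unordered_map<std::string, uint32_t> Index;
};

// One full context, stored as indices into the name table.
using EncodedContext = std::vector<uint32_t>;

// Replaces every function name in Ctx by its name-table index.
EncodedContext encodeContext(NameTable &Table,
                             const std::vector<std::string> &Ctx) {
  EncodedContext Out;
  Out.reserve(Ctx.size());
  for (const std::string &Name : Ctx)
    Out.push_back(Table.intern(Name));
  return Out;
}
```

Encoding the contexts `main, foo, bar` and `main, foo, baz` through one shared table leaves four names in the table, and the two contexts differ only in their last index. Other sections can then refer to a context by its position in the context list.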

Testing
This is a no-diff change in terms of code quality and profile content (for text profiles).

For our internal large service (aka ads), profile generation is cut in half, and the generated string-based extbinary profile is 20x smaller.

The compile time of ads drops by 25%.

Differential Revision: https://reviews.llvm.org/D107299
2021-08-30 20:09:29 -07:00
..
bugpoint [NFC] Cleanup more AttributeList::addAttribute() 2021-08-17 21:05:41 -07:00
bugpoint-passes Add missed rename of getFnAttributes() -> getFnAttrs() 2021-08-13 11:29:20 -07:00
dsymutil [OptTable] Rename PrintHelp to printHelp 2021-06-24 14:47:03 -07:00
gold [LTO] Add SelectionKind to IRSymtab and use it in ld.lld/LLVMgold 2021-07-20 13:22:00 -07:00
llc [llc] Initialize context for parsing options 2021-08-28 22:37:26 +02:00
lli [llvm] Replace LLVM_ATTRIBUTE_NORETURN with C++11 [[noreturn]] 2021-07-28 09:31:14 -07:00
llvm-ar [llvm] Replace LLVM_ATTRIBUTE_NORETURN with C++11 [[noreturn]] 2021-07-28 09:31:14 -07:00
llvm-as [OpaquePtr] Introduce option to force all pointers to be opaque pointers 2021-06-24 13:32:31 -07:00
llvm-as-fuzzer
llvm-bcanalyzer Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer 2021-07-16 07:38:16 +00:00
llvm-c-test [ADT] Move DenseMapInfo for ArrayRef/StringRef into respective headers (NFC) 2021-06-03 18:34:36 +02:00
llvm-cat [tools] Use llvm::append_range (NFC) 2021-01-05 21:15:56 -08:00
llvm-cfi-verify [llvm][tools] Hide unrelated llvm-cfi-verify options 2021-07-16 10:43:52 +02:00
llvm-config
llvm-cov [Coverage][llvm-cov] Correctly export branch coverage in LCOV format 2021-08-20 13:44:25 -05:00
llvm-cvtres [OptTable] Refine how `printHelp` treats empty help texts 2021-08-19 09:30:15 +00:00
llvm-cxxdump [llvm] Replace LLVM_ATTRIBUTE_NORETURN with C++11 [[noreturn]] 2021-07-28 09:31:14 -07:00
llvm-cxxfilt [llvm-cxxfilt] Switch command line parsing from llvm::cl to OptTable 2021-07-09 10:10:45 -07:00
llvm-cxxmap [llvm][tools] Hide more unrelated tool options 2021-07-20 13:27:33 +02:00
llvm-diff [llvm-diff] correct variable typo 2021-08-12 11:29:48 -07:00
llvm-dis [llvm][tools] Hide more unrelated tool options 2021-07-20 13:27:33 +02:00
llvm-dwarfdump [DWARF] Don't process .debug_info relocations for DWO Context 2021-08-02 10:41:47 -07:00
llvm-dwp [DWP] Refactoring llvm-dwp in to a library part 2 2021-07-22 14:23:29 -07:00
llvm-exegesis Revert "[asan] Implemented intrinsic for the custom calling convention similar used by HWASan for X86." 2021-08-24 13:21:20 -07:00
llvm-extract llvmbuildectomy - replace llvm-build by plain cmake 2020-11-13 10:35:24 +01:00
llvm-go
llvm-gsymutil [llvm][clang][NFC] updates inline licence info 2021-08-11 02:48:53 +00:00
llvm-ifs [ifs] Add option to hide undefined symbols 2021-08-27 11:15:56 -07:00
llvm-isel-fuzzer
llvm-itanium-demangle-fuzzer
llvm-jitlink [ORC][ORC-RT] Reapply "Introduce ELF/*nix Platform and runtime..." with fixes. 2021-08-27 14:41:58 +10:00
llvm-jitlistener [MCJIT] Profile the code generated by MCJIT engine using Intel VTune profiler 2020-11-16 19:28:14 +11:00
llvm-libtool-darwin Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer 2021-07-16 07:38:16 +00:00
llvm-link [llvm][tools] Hide more unrelated LLVM tool options 2021-07-21 09:14:04 +02:00
llvm-lipo [llvm] Replace LLVM_ATTRIBUTE_NORETURN with C++11 [[noreturn]] 2021-07-28 09:31:14 -07:00
llvm-lto [LTO][Legacy] Add new API to check presence of ctor/dtor functions. 2021-07-28 12:41:56 +00:00
llvm-lto2 [LTO] Add SelectionKind to IRSymtab and use it in ld.lld/LLVMgold 2021-07-20 13:22:00 -07:00
llvm-mc [llvm][tools] Hide more unrelated LLVM tool options 2021-07-21 09:14:04 +02:00
llvm-mc-assemble-fuzzer [llvm-mc-assemble-fuzzer] Initialize MCTargetOptions. 2021-07-22 14:36:37 +08:00
llvm-mc-disassemble-fuzzer
llvm-mca [MCA][NFC] Removed unused method, and fixed a coverity issue. 2021-08-27 12:49:49 +01:00
llvm-microsoft-demangle-fuzzer
llvm-ml [ms] [llvm-ml] Support built-in text macros 2021-07-21 11:44:09 -04:00
llvm-modextract [llvm][tools] Hide more unrelated LLVM tool options 2021-07-21 09:14:04 +02:00
llvm-mt Make WindowsManifestMerger::merge() take a MemoryBufferRef 2021-08-24 16:39:20 -04:00
llvm-nm [Object] Move llvm-nm's symbol version utility to ELFObjectFile::readDynsymVersions 2021-08-17 09:06:39 -07:00
llvm-objcopy [llvm-objcopy] [COFF] Consider section flags when adding section 2021-08-25 23:11:41 +03:00
llvm-objdump [llvm-objdump] -T: print symbol versions 2021-08-17 09:10:50 -07:00
llvm-opt-fuzzer [NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose 2021-05-07 21:51:47 -07:00
llvm-opt-report [SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text 2021-04-06 07:23:31 -04:00
llvm-pdbutil [llvm][tools] Hide more unrelated LLVM tool options 2021-07-21 09:14:04 +02:00
llvm-profdata [CSSPGO] Split context string to deduplicate function name used in the context. 2021-08-30 20:09:29 -07:00
llvm-profgen [CSSPGO] Split context string to deduplicate function name used in the context. 2021-08-30 20:09:29 -07:00
llvm-rc [llvm-rc] Allow specifying language with a leading 0x prefix 2021-08-05 10:19:55 +03:00
llvm-readobj [llvm-readobj][XCOFF] Add support for `--needed-libs` option. 2021-08-26 07:17:06 +00:00
llvm-reduce [llvm-reduce] Check if module data strings are empty before attempting to reduce 2021-08-24 10:23:00 -07:00
llvm-rtdyld [DWARF] Don't process .debug_info relocations for DWO Context 2021-08-02 10:41:47 -07:00
llvm-rust-demangle-fuzzer Fix implicit dependency on <string> header. NFCI. 2021-06-11 10:24:14 +01:00
llvm-shlib [CMake] Don't use -Bsymbolic-functions for MinGW targets 2021-06-30 22:54:26 +03:00
llvm-sim [IRSim] Adding basic implementation of llvm-sim. 2021-06-23 14:38:58 -05:00
llvm-size [llvm-size] Switch command line parsing from llvm::cl to OptTable 2021-07-09 10:26:53 -07:00
llvm-special-case-list-fuzzer
llvm-split [llvm][tools] Hide remaining unrelated llvm- tool options 2021-07-22 09:47:55 +02:00
llvm-stress [llvm][tools] Hide remaining unrelated llvm- tool options 2021-07-22 09:47:55 +02:00
llvm-strings [llvm] Replace LLVM_ATTRIBUTE_NORETURN with C++11 [[noreturn]] 2021-07-28 09:31:14 -07:00
llvm-symbolizer [llvm-symbolizer] Remove one-dash long options 2021-07-23 08:35:45 -07:00
llvm-tapi-diff [llvm-tapi-diff] Wrap empty string around StringLiteral NFC 2021-06-23 11:41:03 -07:00
llvm-undname [llvm][tools] Hide remaining unrelated llvm- tool options 2021-07-22 09:47:55 +02:00
llvm-xray llvm-xray {convert,extract}: Add --demangle 2021-08-24 13:35:19 -07:00
llvm-yaml-numeric-parser-fuzzer [llvm] NFC: Cleanup llvm-yaml-numeric-parser-fuzzer 2021-02-15 14:52:53 +01:00
llvm-yaml-parser-fuzzer [llvm] Use llvm::erase_value and llvm::erase_if (NFC) 2021-01-02 09:24:15 -08:00
lto [LTO][Legacy] Add new API to check presence of ctor/dtor functions. 2021-07-28 12:41:56 +00:00
msbuild
obj2yaml [yaml2obj][MachO] Rename PayloadString to Content 2021-07-26 09:04:51 -07:00
opt [NewPM] Use parameterized syntax for a couple of more passes 2021-08-20 14:59:21 +02:00
opt-viewer
remarks-shlib
sancov [MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo 2021-05-23 14:15:23 -07:00
sanstats [NFC] Reordering parameters in getFile and getFileOrSTDIN 2021-03-25 09:47:49 -04:00
split-file [split-file] Default to --no-leading-lines 2021-08-16 19:23:11 -07:00
verify-uselistorder [SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text 2021-04-06 07:23:31 -04:00
vfabi-demangle-fuzzer
xcode-toolchain
yaml2obj [llvm] Make obj2yaml and yaml2obj LLVM utilities instead of tools 2020-10-19 10:21:21 -07:00
CMakeLists.txt