llvm-project/llvm/lib/DebugInfo/CodeView
Reid Kleckner 783db78835 [PDB] Print the most redundant type record indices with /summary
Summary:
I used this information to motivate splitting up the Intrinsic::ID enum
(5d986953c8) and adding a key method to
clang::Sema (586f65d31f) which saved a
fair amount of object file size.

Example output for clang.pdb:

  Top 10 types responsible for the most TPI input bytes:
         index     total bytes   count     size
        0x3890:      8,671,220 = 1,805 *  4,804
       0xE13BE:      5,634,720 =   252 * 22,360
       0x6874C:      5,181,600 =   408 * 12,700
        0x2A1F:      4,520,528 = 1,574 *  2,872
       0x64BFF:      4,024,020 =   469 *  8,580
        0x1123:      4,012,020 = 2,157 *  1,860
        0x6952:      3,753,792 =   912 *  4,116
        0xC16F:      3,630,888 =   633 *  5,736
        0x69DD:      3,601,160 =   985 *  3,656
        0x678D:      3,577,904 =   319 * 11,216

In this case, we can see that record 0x3890 alone accounts for about
8.7 MB of type record input (1,805 copies of 4,804 bytes each) across
the object files that make up clang.
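
The "total bytes" column is simply count * size, sorted in descending
order. A minimal sketch of that ranking follows; the names TypeSizeInfo
and topTypes are hypothetical, chosen for illustration, and are not
claimed to match the code in this patch:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of one type record across all inputs.
struct TypeSizeInfo {
  uint32_t TypeIndex; // index of the record in the merged TPI stream
  uint32_t Size;      // size of one copy of the record, in bytes
  uint32_t DupCount;  // number of copies seen in the input object files

  // Total input bytes attributable to this record: count * size.
  uint64_t totalInputSize() const { return uint64_t(DupCount) * Size; }
};

// Returns the N records with the largest total input size, largest first.
// If fewer than N records exist, all of them are returned.
std::vector<TypeSizeInfo> topTypes(std::vector<TypeSizeInfo> Infos, size_t N) {
  N = std::min(N, Infos.size());
  std::partial_sort(Infos.begin(), Infos.begin() + N, Infos.end(),
                    [](const TypeSizeInfo &L, const TypeSizeInfo &R) {
                      return L.totalInputSize() > R.totalInputSize();
                    });
  Infos.erase(Infos.begin() + N, Infos.end());
  return Infos;
}
```

Ranking by the product rather than by count or size alone is what lets
both a small, widely duplicated record and a large, rarely duplicated
one show up in the same list.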

The user can then use llvm-pdbutil to find out what the record is:

  $ llvm-pdbutil dump -types -type-index 0x3890
                       Types (TPI Stream)
  ============================================================
    Showing 1 records.
       0x3890 | LF_FIELDLIST [size = 4804]
                - LF_STMEMBER [name = `WORDTYPE_MAX`, type = 0x1001, attrs = public]
                - LF_MEMBER [name = `U`, Type = 0x37F0, offset = 0, attrs = private]
                - LF_MEMBER [name = `BitWidth`, Type = 0x0075 (unsigned), offset = 8, attrs = private]
                - LF_METHOD [name = `APInt`, # overloads = 8, overload list = 0x3805]
  ...

These are the members of the APInt class, which is emitted in 1,805
object files.

The next largest type is ASTContext:

  $ llvm-pdbutil dump -types -type-index 0xE13BE bin/clang.pdb
      0xE13BE | LF_FIELDLIST [size = 22360]
                - LF_BCLASS
                  type = 0x653EA, offset = 0, attrs = public
                - LF_MEMBER [name = `Types`, Type = 0x653EB, offset = 8, attrs = private]
                - LF_MEMBER [name = `ExtQualNodes`, Type = 0x653EC, offset = 24, attrs = private]
                - LF_MEMBER [name = `ComplexTypes`, Type = 0x653ED, offset = 48, attrs = private]
                - LF_MEMBER [name = `PointerTypes`, Type = 0x653EE, offset = 72, attrs = private]
  ...

ASTContext only appears 252 times, but the list of members is long, and
must be repeated everywhere it is used.

This was the output before I split Intrinsic::ID:

  Top 10 types responsible for the most TPI input:
        0x686C:     69,823,920 = 1,070 * 65,256
        0x686D:     69,819,640 = 1,070 * 65,252
        0x686E:     69,819,640 = 1,070 * 65,252
        0x686B:     16,371,000 = 1,070 * 15,300
        ...

These records were all lists of intrinsic enums.

Reviewers: MaskRay, ruiu

Subscribers: mgrang, zturner, thakis, hans, akhuang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71437
2020-01-02 16:10:36 -08:00
AppendingTypeTableBuilder.cpp [codeview] Remove Type member from CVRecord 2019-04-04 00:28:48 +00:00
CMakeLists.txt [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries" 2019-11-21 10:48:08 -08:00
CVSymbolVisitor.cpp [codeview] Remove Type member from CVRecord 2019-04-04 00:28:48 +00:00
CVTypeVisitor.cpp Removing block comments from CodeView records in assembly files & related code cleanup 2019-08-25 01:09:11 +00:00
CodeViewError.cpp Move some classes into anonymous namespaces. NFC. 2019-02-11 15:16:21 +00:00
CodeViewRecordIO.cpp Improving CodeView debug info type record's inline comments 2019-08-21 15:19:58 +00:00
ContinuationRecordBuilder.cpp [codeview] Remove Type member from CVRecord 2019-04-04 00:28:48 +00:00
DebugChecksumsSubsection.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugCrossExSubsection.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugCrossImpSubsection.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugFrameDataSubsection.cpp Reverted r361134 because of a failing test left unattended for a long time. 2019-05-22 20:42:56 +00:00
DebugInlineeLinesSubsection.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugLinesSubsection.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugStringTableSubsection.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugSubsection.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugSubsectionRecord.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugSubsectionVisitor.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugSymbolRVASubsection.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
DebugSymbolsSubsection.cpp Fix a few 'no newline at end of file' warnings that Xcode emits 2019-07-11 15:26:45 +00:00
EnumTables.cpp Improving CodeView debug info type record's inline comments 2019-08-21 15:19:58 +00:00
Formatters.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
GlobalTypeTableBuilder.cpp [codeview] Remove Type member from CVRecord 2019-04-04 00:28:48 +00:00
LLVMBuild.txt Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
LazyRandomTypeCollection.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
Line.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
MergingTypeTableBuilder.cpp [codeview] Remove Type member from CVRecord 2019-04-04 00:28:48 +00:00
RecordName.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
RecordSerialization.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
SimpleTypeSerializer.cpp [codeview] Remove Type member from CVRecord 2019-04-04 00:28:48 +00:00
StringsAndChecksums.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
SymbolDumper.cpp Changing representation of .cv_def_range directives in Codeview debug info assembly format for better readability 2019-08-05 14:16:58 +00:00
SymbolRecordHelpers.cpp Hide implementation details in anonymous namespaces. NFC. 2019-10-24 10:48:43 +02:00
SymbolRecordMapping.cpp Changing representation of .cv_def_range directives in Codeview debug info assembly format for better readability 2019-08-05 14:16:58 +00:00
SymbolSerializer.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
TypeDumpVisitor.cpp [codeview] Remove Type member from CVRecord 2019-04-04 00:28:48 +00:00
TypeHashing.cpp [CodeView] Fix cycles in debug info when merging Types with global hashes 2019-02-07 15:24:18 +00:00
TypeIndex.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
TypeIndexDiscovery.cpp [llvm-pdbutil] Add -type-ref-stats to help find unused type info 2019-03-21 18:02:34 +00:00
TypeRecordHelpers.cpp Update the file headers across all of the LLVM projects in the monorepo 2019-01-19 08:50:56 +00:00
TypeRecordMapping.cpp Improving CodeView debug info type record's inline comments 2019-08-21 15:19:58 +00:00
TypeStreamMerger.cpp [PDB] Print the most redundant type record indices with /summary 2020-01-02 16:10:36 -08:00
TypeTableCollection.cpp [codeview] Remove Type member from CVRecord 2019-04-04 00:28:48 +00:00